2025-10-10T00:42:56.5175753Z Current runner version: '2.328.0'
2025-10-10T00:42:56.5181786Z Runner name: 'i-088ba17e0301f2c3f'
2025-10-10T00:42:56.5182571Z Runner group name: 'default'
2025-10-10T00:42:56.5183434Z Machine name: 'ip-10-0-20-73'
2025-10-10T00:42:56.5186209Z ##[group]GITHUB_TOKEN Permissions
2025-10-10T00:42:56.5188465Z Contents: read
2025-10-10T00:42:56.5189028Z Metadata: read
2025-10-10T00:42:56.5189518Z ##[endgroup]
2025-10-10T00:42:56.5191438Z Secret source: Actions
2025-10-10T00:42:56.5192097Z Prepare workflow directory
2025-10-10T00:42:56.5704140Z Prepare all required actions
2025-10-10T00:42:56.5741922Z Getting action download info
2025-10-10T00:42:56.8643641Z Download action repository 'pytorch/test-infra@main' (SHA:264eed5d70b428e3aa5c1a7c98e4330f866e183f)
2025-10-10T00:42:59.1629969Z Download action repository 'pytorch/pytorch@main' (SHA:a6fa4f9c283971c0fb6f60a89674a1f35370ac79)
2025-10-10T00:43:15.0118628Z Download action repository 'actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065' (SHA:a26af69be951a213d495a4c3e4e4022e16d87065)
2025-10-10T00:43:15.3573114Z Download action repository 'aws-actions/configure-aws-credentials@ececac1a45f3b08a01d2dd070d28d111c5fe6722' (SHA:ececac1a45f3b08a01d2dd070d28d111c5fe6722)
2025-10-10T00:43:15.6247641Z Download action repository 'aws-actions/amazon-ecr-login@062b18b96a7aff071d4dc91bc00c4c1a7945b076' (SHA:062b18b96a7aff071d4dc91bc00c4c1a7945b076)
2025-10-10T00:43:15.8514144Z Download action repository 'seemethere/upload-artifact-s3@baba72d0712b404f646cebe0730933554ebce96a' (SHA:baba72d0712b404f646cebe0730933554ebce96a)
2025-10-10T00:43:16.1405600Z Getting action download info
2025-10-10T00:43:16.2645171Z Download action repository 'actions/checkout@v4' (SHA:08eba0b27e820071cde6df949e0beb9ba4906955)
2025-10-10T00:43:16.5646335Z Getting action download info
2025-10-10T00:43:16.6708867Z Download action repository 'nick-fields/retry@v3.0.0' (SHA:7152eba30c6575329ac0576536151aca5a72780e)
2025-10-10T00:43:16.8871544Z Getting action download info
2025-10-10T00:43:17.0107601Z Download action repository 'nick-fields/retry@3e91a01664abd3c5cd539100d10d33b9c5b68482' (SHA:3e91a01664abd3c5cd539100d10d33b9c5b68482)
2025-10-10T00:43:17.2042974Z Getting action download info
2025-10-10T00:43:17.3450116Z Uses: pytorch/pytorch/.github/workflows/_linux-test.yml@refs/heads/main (344e6365a0068c2d2847fcec0c55dd53291d475e)
2025-10-10T00:43:17.3453733Z ##[group] Inputs
2025-10-10T00:43:17.3454105Z build-environment: linux-jammy-cuda12.8-py3.10-gcc11-sm86
2025-10-10T00:43:17.3455171Z test-matrix: {"include": [{"config": "slow", "shard": 1, "num_shards": 3, "runner": "linux.g5.4xlarge.nvidia.gpu"}, {"config": "slow", "shard": 2, "num_shards": 3, "runner": "linux.g5.4xlarge.nvidia.gpu"}, {"config": "slow", "shard": 3, "num_shards": 3, "runner": "linux.g5.4xlarge.nvidia.gpu"}]}
2025-10-10T00:43:17.3456632Z docker-image: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-d8be0384e085f551506bd739678109fa0f5ee7ac
2025-10-10T00:43:17.3457427Z sync-tag:
2025-10-10T00:43:17.3458141Z timeout-minutes: 240
2025-10-10T00:43:17.3458399Z use-gha:
2025-10-10T00:43:17.3458620Z dashboard-tag:
2025-10-10T00:43:17.3458871Z s3-bucket: gha-artifacts
2025-10-10T00:43:17.3459144Z aws-role-to-assume:
2025-10-10T00:43:17.3459815Z disable-monitor: false
2025-10-10T00:43:17.3460108Z monitor-log-interval: 5
2025-10-10T00:43:17.3460400Z monitor-data-collect-interval: 1
2025-10-10T00:43:17.3460715Z ##[endgroup]
2025-10-10T00:43:17.3461179Z Complete job name: linux-jammy-cuda12.8-py3.10-gcc11-sm86 / test (slow, 2, 3, linux.g5.4xlarge.nvidia.gpu)
2025-10-10T00:43:17.4133239Z A job started hook has been configured by the self-hosted runner administrator
2025-10-10T00:43:17.4234481Z ##[group]Run '/home/ec2-user/runner-scripts/before_job.sh'
2025-10-10T00:43:17.4246109Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0}
2025-10-10T00:43:17.4246702Z ##[endgroup]
2025-10-10T00:43:18.8336195Z Runner Type: linux.g5.4xlarge.nvidia.gpu
2025-10-10T00:43:18.8337105Z Instance Type: g5.4xlarge
2025-10-10T00:43:18.8337361Z AMI Name: unknown
2025-10-10T00:43:18.8380819Z AMI ID: ami-08982f1c5bf93d976
2025-10-10T00:43:24.3517707Z ##[group]Run pytorch/test-infra/.github/actions/setup-ssh@main
2025-10-10T00:43:24.3518118Z with:
2025-10-10T00:43:24.3518635Z github-secret: ***
2025-10-10T00:43:24.3519291Z instructions: All testing is done inside the container, to start an interactive session run: docker exec -it $(docker container ps --format '{{.ID}}') bash
2025-10-10T00:43:24.3519993Z activate-with-label: false
2025-10-10T00:43:24.3520261Z label: with-ssh
2025-10-10T00:43:24.3520503Z remove-existing-keys: true
2025-10-10T00:43:24.3520767Z fail-silently: true
2025-10-10T00:43:24.3521005Z env:
2025-10-10T00:43:24.3521231Z GIT_DEFAULT_BRANCH: main
2025-10-10T00:43:24.3521490Z ##[endgroup]
2025-10-10T00:43:24.4976827Z Please see https://github.com/pytorch/pytorch/wiki/Debugging-using-with-ssh-for-Github-Actions for more info.
2025-10-10T00:43:24.4979050Z Not on pull request and ciflow reference could not be extracted, skipping adding ssh keys
2025-10-10T00:43:24.5164693Z ##[group]Run pytorch/pytorch/.github/actions/checkout-pytorch@main
2025-10-10T00:43:24.5165115Z with:
2025-10-10T00:43:24.5165332Z no-sudo: true
2025-10-10T00:43:24.5165565Z submodules: recursive
2025-10-10T00:43:24.5165814Z fetch-depth: 0
2025-10-10T00:43:24.5166038Z env:
2025-10-10T00:43:24.5166255Z GIT_DEFAULT_BRANCH: main
2025-10-10T00:43:24.5166505Z ##[endgroup]
2025-10-10T00:43:24.5247283Z ##[group]Run echo "IN_CONTAINER_RUNNER=$(if [ -f /.inarc ] || [ -f /.incontainer ]; then echo true ; else echo false; fi)" >> "$GITHUB_OUTPUT"
2025-10-10T00:43:24.5248185Z echo "IN_CONTAINER_RUNNER=$(if [ -f /.inarc ] || [ -f /.incontainer ]; then echo true ; else echo false; fi)" >> "$GITHUB_OUTPUT"
2025-10-10T00:43:24.5263152Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0}
2025-10-10T00:43:24.5263525Z env:
2025-10-10T00:43:24.5263761Z GIT_DEFAULT_BRANCH: main
2025-10-10T00:43:24.5264055Z ##[endgroup]
2025-10-10T00:43:24.5365458Z ##[group]Run # Use all available CPUs for fetching
2025-10-10T00:43:24.5365929Z # Use all available CPUs for fetching
2025-10-10T00:43:24.5366278Z cd "${GITHUB_WORKSPACE}"
2025-10-10T00:43:24.5366608Z git config --global fetch.parallel 0
2025-10-10T00:43:24.5366990Z git config --global submodule.fetchJobs 0
2025-10-10T00:43:24.5367442Z
2025-10-10T00:43:24.5367797Z # Clean workspace. The default checkout action should also do this, but
2025-10-10T00:43:24.5368239Z # do it here as well just in case
2025-10-10T00:43:24.5368556Z if [[ -d .git ]]; then
2025-10-10T00:43:24.5368850Z  if [ -z "${NO_SUDO}" ]; then
2025-10-10T00:43:24.5369160Z  sudo git clean -ffdx
2025-10-10T00:43:24.5369434Z  else
2025-10-10T00:43:24.5369671Z  git clean -ffdx
2025-10-10T00:43:24.5369933Z  fi
2025-10-10T00:43:24.5370148Z fi
2025-10-10T00:43:24.5379612Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0}
2025-10-10T00:43:24.5379977Z env:
2025-10-10T00:43:24.5380260Z GIT_DEFAULT_BRANCH: main
2025-10-10T00:43:24.5380540Z NO_SUDO: true
2025-10-10T00:43:24.5380764Z ##[endgroup]
2025-10-10T00:43:24.5530632Z ##[group]Run actions/checkout@v4
2025-10-10T00:43:24.5530929Z with:
2025-10-10T00:43:24.5531178Z ref: 344e6365a0068c2d2847fcec0c55dd53291d475e
2025-10-10T00:43:24.5531493Z fetch-depth: 0
2025-10-10T00:43:24.5531735Z submodules: recursive
2025-10-10T00:43:24.5531997Z show-progress: false
2025-10-10T00:43:24.5532265Z repository: pytorch/pytorch
2025-10-10T00:43:24.5532660Z token: ***
2025-10-10T00:43:24.5532887Z ssh-strict: true
2025-10-10T00:43:24.5533118Z ssh-user: git
2025-10-10T00:43:24.5533360Z persist-credentials: true
2025-10-10T00:43:24.5533624Z clean: true
2025-10-10T00:43:24.5533881Z sparse-checkout-cone-mode: true
2025-10-10T00:43:24.5534177Z fetch-tags: false
2025-10-10T00:43:24.5534637Z lfs: false
2025-10-10T00:43:24.5534862Z set-safe-directory: true
2025-10-10T00:43:24.5535132Z env:
2025-10-10T00:43:24.5535346Z GIT_DEFAULT_BRANCH: main
2025-10-10T00:43:24.5535600Z ##[endgroup]
2025-10-10T00:43:24.6639421Z Syncing repository: pytorch/pytorch
2025-10-10T00:43:24.6640714Z ##[group]Getting Git version info
2025-10-10T00:43:24.6641170Z Working directory is '/home/ec2-user/actions-runner/_work/pytorch/pytorch'
2025-10-10T00:43:24.6641799Z [command]/usr/bin/git version
2025-10-10T00:43:24.6843389Z git version 2.50.1
2025-10-10T00:43:24.6868798Z ##[endgroup]
2025-10-10T00:43:24.6879841Z Copying '/home/ec2-user/.gitconfig' to '/home/ec2-user/actions-runner/_work/_temp/0531a6fd-7294-46ef-b761-bcb516037a51/.gitconfig'
2025-10-10T00:43:24.6899255Z Temporarily overriding HOME='/home/ec2-user/actions-runner/_work/_temp/0531a6fd-7294-46ef-b761-bcb516037a51' before making global git config changes
2025-10-10T00:43:24.6900358Z Adding repository directory to the temporary git global config as a safe directory
2025-10-10T00:43:24.6905019Z [command]/usr/bin/git config --global --add safe.directory /home/ec2-user/actions-runner/_work/pytorch/pytorch
2025-10-10T00:43:24.6969473Z Deleting the contents of '/home/ec2-user/actions-runner/_work/pytorch/pytorch'
2025-10-10T00:43:24.6972577Z ##[group]Initializing the repository
2025-10-10T00:43:24.6976641Z [command]/usr/bin/git init /home/ec2-user/actions-runner/_work/pytorch/pytorch
2025-10-10T00:43:24.7059002Z hint: Using 'master' as the name for the initial branch. This default branch name
2025-10-10T00:43:24.7059925Z hint: is subject to change. To configure the initial branch name to use in all
2025-10-10T00:43:24.7060637Z hint: of your new repositories, which will suppress this warning, call:
2025-10-10T00:43:24.7061045Z hint:
2025-10-10T00:43:24.7061361Z hint: git config --global init.defaultBranch
2025-10-10T00:43:24.7061709Z hint:
2025-10-10T00:43:24.7062046Z hint: Names commonly chosen instead of 'master' are 'main', 'trunk' and
2025-10-10T00:43:24.7062636Z hint: 'development'. The just-created branch can be renamed via this command:
2025-10-10T00:43:24.7063067Z hint:
2025-10-10T00:43:24.7063296Z hint: git branch -m
2025-10-10T00:43:24.7063562Z hint:
2025-10-10T00:43:24.7063936Z hint: Disable this message with "git config set advice.defaultBranchName false"
2025-10-10T00:43:24.7068032Z Initialized empty Git repository in /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/
2025-10-10T00:43:24.7080903Z [command]/usr/bin/git remote add origin https://github.com/pytorch/pytorch
2025-10-10T00:43:24.7130574Z ##[endgroup]
2025-10-10T00:43:24.7131275Z ##[group]Disabling automatic garbage collection
2025-10-10T00:43:24.7135144Z [command]/usr/bin/git config --local gc.auto 0
2025-10-10T00:43:24.7172766Z ##[endgroup]
2025-10-10T00:43:24.7173400Z ##[group]Setting up auth
2025-10-10T00:43:24.7180105Z [command]/usr/bin/git config --local --name-only --get-regexp core\.sshCommand
2025-10-10T00:43:24.7217441Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'core\.sshCommand' && git config --local --unset-all 'core.sshCommand' || :"
2025-10-10T00:43:24.7641227Z [command]/usr/bin/git config --local --name-only --get-regexp http\.https\:\/\/github\.com\/\.extraheader
2025-10-10T00:43:24.7677405Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'http\.https\:\/\/github\.com\/\.extraheader' && git config --local --unset-all 'http.https://github.com/.extraheader' || :"
2025-10-10T00:43:24.8098078Z [command]/usr/bin/git config --local http.https://github.com/.extraheader AUTHORIZATION: basic ***
2025-10-10T00:43:24.8160562Z ##[endgroup]
2025-10-10T00:43:24.8161238Z ##[group]Fetching the repository
2025-10-10T00:43:24.8169195Z [command]/usr/bin/git -c protocol.version=2 fetch --prune --no-recurse-submodules origin +refs/heads/*:refs/remotes/origin/* +refs/tags/*:refs/tags/*
2025-10-10T00:44:16.0665623Z From https://github.com/pytorch/pytorch
2025-10-10T00:44:16.0666720Z * [new branch] 2.6.0.dev20241004+ -> origin/2.6.0.dev20241004+ 2025-10-10T00:44:16.0667322Z * [new branch] AaronWang04_addmmfusion_perftest -> origin/AaronWang04_addmmfusion_perftest 2025-10-10T00:44:16.0667979Z * [new branch] BootcampDynamo -> origin/BootcampDynamo 2025-10-10T00:44:16.0668482Z * [new branch] DynamoFixGit -> origin/DynamoFixGit 2025-10-10T00:44:16.0669607Z * [new branch] DynamoVariaT -> origin/DynamoVariaT 2025-10-10T00:44:16.0672577Z * [new branch] HDCharles-2.6.0-release-notes -> origin/HDCharles-2.6.0-release-notes 2025-10-10T00:44:16.0674646Z * [new branch] ISSUE-154849 -> origin/ISSUE-154849 2025-10-10T00:44:16.0678305Z * [new branch] IvanKobzarev/stack/1 -> origin/IvanKobzarev/stack/1 2025-10-10T00:44:16.0680201Z * [new branch] IvanKobzarev/stack/2 -> origin/IvanKobzarev/stack/2 2025-10-10T00:44:16.0682416Z * [new branch] NicoshevSVE128 -> origin/NicoshevSVE128 2025-10-10T00:44:16.0684228Z * [new branch] PR-AOTInductorNoneBug -> origin/PR-AOTInductorNoneBug 2025-10-10T00:44:16.0686186Z * [new branch] PR-AOTInductorNoneBugFix -> origin/PR-AOTInductorNoneBugFix 2025-10-10T00:44:16.0688046Z * [new branch] PR-FixConfigsIssue -> origin/PR-FixConfigsIssue 2025-10-10T00:44:16.0689903Z * [new branch] PR-NoneBugFix-viable -> origin/PR-NoneBugFix-viable 2025-10-10T00:44:16.0691888Z * [new branch] PR-ResetToZero -> origin/PR-ResetToZero 2025-10-10T00:44:16.0693870Z * [new branch] Update-Flash-Packaging -> origin/Update-Flash-Packaging 2025-10-10T00:44:16.0698226Z * [new branch] VLA_exp -> origin/VLA_exp 2025-10-10T00:44:16.0699154Z * [new branch] actually-run-mps-aot-inductor -> origin/actually-run-mps-aot-inductor 2025-10-10T00:44:16.0700308Z * [new branch] add_compile_benchmarking -> origin/add_compile_benchmarking 2025-10-10T00:44:16.0701345Z * [new branch] add_op_tests -> origin/add_op_tests 2025-10-10T00:44:16.0703641Z * [new branch] add_op_to_dashboard -> origin/add_op_to_dashboard 2025-10-10T00:44:16.0705519Z * [new 
branch] addmm-heuristic -> origin/addmm-heuristic 2025-10-10T00:44:16.0707528Z * [new branch] addremovefunction -> origin/addremovefunction 2025-10-10T00:44:16.0709334Z * [new branch] addvllmtest -> origin/addvllmtest 2025-10-10T00:44:16.0711962Z * [new branch] adi/test -> origin/adi/test 2025-10-10T00:44:16.0713820Z * [new branch] adi/test_bgemm -> origin/adi/test_bgemm 2025-10-10T00:44:16.0715650Z * [new branch] adi/test_fusions -> origin/adi/test_fusions 2025-10-10T00:44:16.0717474Z * [new branch] adi/test_onednn -> origin/adi/test_onednn 2025-10-10T00:44:16.0719477Z * [new branch] adi/test_onednn_v3.9 -> origin/adi/test_onednn_v3.9 2025-10-10T00:44:16.0720932Z * [new branch] adi/test_presve_change -> origin/adi/test_presve_change 2025-10-10T00:44:16.0722945Z * [new branch] adi/test_timm -> origin/adi/test_timm 2025-10-10T00:44:16.0725464Z * [new branch] adi/testpresve_change -> origin/adi/testpresve_change 2025-10-10T00:44:16.0728659Z * [new branch] aditew01/test/vec_bf16 -> origin/aditew01/test/vec_bf16 2025-10-10T00:44:16.0730526Z * [new branch] ah-globalfeedback-hook -> origin/ah-globalfeedback-hook 2025-10-10T00:44:16.0732352Z * [new branch] alt-disable -> origin/alt-disable 2025-10-10T00:44:16.0735037Z * [new branch] angelayi/allow_fake -> origin/angelayi/allow_fake 2025-10-10T00:44:16.0736974Z * [new branch] angelayi/aoti_additional_files -> origin/angelayi/aoti_additional_files 2025-10-10T00:44:16.0738870Z * [new branch] angelayi/benchmark -> origin/angelayi/benchmark 2025-10-10T00:44:16.0740745Z * [new branch] angelayi/benchmark2 -> origin/angelayi/benchmark2 2025-10-10T00:44:16.0742545Z * [new branch] angelayi/benchmark3 -> origin/angelayi/benchmark3 2025-10-10T00:44:16.0744463Z * [new branch] angelayi/change_pytree_serialization -> origin/angelayi/change_pytree_serialization 2025-10-10T00:44:16.0745859Z * [new branch] angelayi/cpp_loader -> origin/angelayi/cpp_loader 2025-10-10T00:44:16.0748320Z * [new branch] angelayi/customop -> 
origin/angelayi/customop 2025-10-10T00:44:16.0750700Z * [new branch] angelayi/fix_mps -> origin/angelayi/fix_mps 2025-10-10T00:44:16.0752812Z * [new branch] angelayi/lint -> origin/angelayi/lint 2025-10-10T00:44:16.0755000Z * [new branch] angelayi/no_so_weight -> origin/angelayi/no_so_weight 2025-10-10T00:44:16.0756730Z * [new branch] angelayi/opaque_obj_v2 -> origin/angelayi/opaque_obj_v2 2025-10-10T00:44:16.0758597Z * [new branch] angelayi/pattern -> origin/angelayi/pattern 2025-10-10T00:44:16.0760573Z * [new branch] angelayi/pattern_in_out_2 -> origin/angelayi/pattern_in_out_2 2025-10-10T00:44:16.0762460Z * [new branch] angelayi/post_grad -> origin/angelayi/post_grad 2025-10-10T00:44:16.0764380Z * [new branch] angelayi/pytree -> origin/angelayi/pytree 2025-10-10T00:44:16.0766257Z * [new branch] angelayi/scan_layers -> origin/angelayi/scan_layers 2025-10-10T00:44:16.0768405Z * [new branch] angelayi/symint_input -> origin/angelayi/symint_input 2025-10-10T00:44:16.0770326Z * [new branch] angelayi/symm_mem -> origin/angelayi/symm_mem 2025-10-10T00:44:16.0772201Z * [new branch] angelayi/test_cpp -> origin/angelayi/test_cpp 2025-10-10T00:44:16.0774070Z * [new branch] angelayi/torch_size -> origin/angelayi/torch_size 2025-10-10T00:44:16.0775912Z * [new branch] angelayi/wrap_grad -> origin/angelayi/wrap_grad 2025-10-10T00:44:16.0777783Z * [new branch] annotate_1 -> origin/annotate_1 2025-10-10T00:44:16.0779676Z * [new branch] annotation_bw -> origin/annotation_bw 2025-10-10T00:44:16.0781412Z * [new branch] annotation_dynamo -> origin/annotation_dynamo 2025-10-10T00:44:16.0783335Z * [new branch] aot_eager_stack_trace -> origin/aot_eager_stack_trace 2025-10-10T00:44:16.0785251Z * [new branch] aoti-cuda-alloc -> origin/aoti-cuda-alloc 2025-10-10T00:44:16.0787079Z * [new branch] aoti_fqn_name_interface -> origin/aoti_fqn_name_interface 2025-10-10T00:44:16.0788975Z * [new branch] aoti_metal_shimify -> origin/aoti_metal_shimify 2025-10-10T00:44:16.0790813Z * [new branch] 
aoti_package_weights_binary -> origin/aoti_package_weights_binary 2025-10-10T00:44:16.0792659Z * [new branch] aoti_target_windows -> origin/aoti_target_windows 2025-10-10T00:44:16.0794531Z * [new branch] aoti_weight_sharing -> origin/aoti_weight_sharing 2025-10-10T00:44:16.0796430Z * [new branch] aoti_windows_mingw -> origin/aoti_windows_mingw 2025-10-10T00:44:16.0798252Z * [new branch] aoti_windows_mingw_2 -> origin/aoti_windows_mingw_2 2025-10-10T00:44:16.0801879Z * [new branch] arsh/feat/inductor_check_profiling -> origin/arsh/feat/inductor_check_profiling 2025-10-10T00:44:16.0803664Z * [new branch] async_tp -> origin/async_tp 2025-10-10T00:44:16.0805788Z * [new branch] atalman-inductor-perf-cu124 -> origin/atalman-inductor-perf-cu124 2025-10-10T00:44:16.0807829Z * [new branch] atalman-inductor-perf-cu124.1 -> origin/atalman-inductor-perf-cu124.1 2025-10-10T00:44:16.0809579Z * [new branch] atalman-patch-1 -> origin/atalman-patch-1 2025-10-10T00:44:16.0811434Z * [new branch] atalman-patch-2 -> origin/atalman-patch-2 2025-10-10T00:44:16.0813448Z * [new branch] atalman-patch-3 -> origin/atalman-patch-3 2025-10-10T00:44:16.0815340Z * [new branch] atalman-patch-4 -> origin/atalman-patch-4 2025-10-10T00:44:16.0817280Z * [new branch] atalman-patch-5 -> origin/atalman-patch-5 2025-10-10T00:44:16.0819218Z * [new branch] atalman-patch-6 -> origin/atalman-patch-6 2025-10-10T00:44:16.0821166Z * [new branch] atalman-patch-7 -> origin/atalman-patch-7 2025-10-10T00:44:16.0823292Z * [new branch] atalman_inductor_2.3.0 -> origin/atalman_inductor_2.3.0 2025-10-10T00:44:16.0825076Z * [new branch] atalman_inductor_2.3.1 -> origin/atalman_inductor_2.3.1 2025-10-10T00:44:16.0826891Z * [new branch] atalman_inductor_2.4.0 -> origin/atalman_inductor_2.4.0 2025-10-10T00:44:16.0828851Z * [new branch] atalman_inductor_2.4.x -> origin/atalman_inductor_2.4.x 2025-10-10T00:44:16.0830659Z * [new branch] attention_benchmark -> origin/attention_benchmark 2025-10-10T00:44:16.0832652Z * [new 
branch] attention_benchmarking_clean -> origin/attention_benchmarking_clean 2025-10-10T00:44:16.0834375Z * [new branch] b200_op_bench -> origin/b200_op_bench 2025-10-10T00:44:16.0836958Z * [new branch] bahuang/annotation -> origin/bahuang/annotation 2025-10-10T00:44:16.0838740Z * [new branch] bahuang/debug_mode -> origin/bahuang/debug_mode 2025-10-10T00:44:16.0840558Z * [new branch] bahuang/debug_mode_default -> origin/bahuang/debug_mode_default 2025-10-10T00:44:16.0842362Z * [new branch] bahuang/debug_mode_fix -> origin/bahuang/debug_mode_fix 2025-10-10T00:44:16.0844174Z * [new branch] bahuang/dt_fix_scalar_add -> origin/bahuang/dt_fix_scalar_add 2025-10-10T00:44:16.0845927Z * [new branch] bahuang/dt_reduce_mean -> origin/bahuang/dt_reduce_mean 2025-10-10T00:44:16.0847847Z * [new branch] bahuang/dtensor_demo -> origin/bahuang/dtensor_demo 2025-10-10T00:44:16.0849952Z * [new branch] bahuang/export_dtensor -> origin/bahuang/export_dtensor 2025-10-10T00:44:16.0852189Z * [new branch] bahuang/fix_debug_mode -> origin/bahuang/fix_debug_mode 2025-10-10T00:44:16.0854215Z * [new branch] bahuang/fix_debug_mode2 -> origin/bahuang/fix_debug_mode2 2025-10-10T00:44:16.0856018Z * [new branch] bahuang/fix_expand -> origin/bahuang/fix_expand 2025-10-10T00:44:16.0857949Z * [new branch] bahuang/noop_redistribute -> origin/bahuang/noop_redistribute 2025-10-10T00:44:16.0859930Z * [new branch] bahuang/reland -> origin/bahuang/reland 2025-10-10T00:44:16.0861842Z * [new branch] bahuang/reland_fake_export -> origin/bahuang/reland_fake_export 2025-10-10T00:44:16.0863581Z * [new branch] bahuang/rename -> origin/bahuang/rename 2025-10-10T00:44:16.0865618Z * [new branch] bahuang/test -> origin/bahuang/test 2025-10-10T00:44:16.0868281Z * [new branch] base/1.5 -> origin/base/1.5 2025-10-10T00:44:16.0870314Z * [new branch] batching_sdpa_efficient_attention -> origin/batching_sdpa_efficient_attention 2025-10-10T00:44:16.0872117Z * [new branch] bc-lint-test-new-config -> 
origin/bc-lint-test-new-config 2025-10-10T00:44:16.0874102Z * [new branch] benchmark-updates -> origin/benchmark-updates 2025-10-10T00:44:16.0876012Z * [new branch] benchmarking-script -> origin/benchmarking-script 2025-10-10T00:44:16.0878627Z * [new branch] bertmaher/pinbump26 -> origin/bertmaher/pinbump26 2025-10-10T00:44:16.0881175Z * [new branch] bertrand/cutlass -> origin/bertrand/cutlass 2025-10-10T00:44:16.0883725Z * [new branch] bf/cg-custom-wrapper -> origin/bf/cg-custom-wrapper 2025-10-10T00:44:16.0885556Z * [new branch] bf/cg-error-re-record -> origin/bf/cg-error-re-record 2025-10-10T00:44:16.0887501Z * [new branch] bf/cg-partition-custom-op-mutation -> origin/bf/cg-partition-custom-op-mutation 2025-10-10T00:44:16.0889055Z * [new branch] bf/cg-remove-check -> origin/bf/cg-remove-check 2025-10-10T00:44:16.0891097Z * [new branch] bf/cg-warn-dynamic-shapes -> origin/bf/cg-warn-dynamic-shapes 2025-10-10T00:44:16.0893817Z * [new branch] bf/cherry-pick-partition-share-default-device-context -> origin/bf/cherry-pick-partition-share-default-device-context 2025-10-10T00:44:16.0894939Z * [new branch] bf/clean-hf -> origin/bf/clean-hf 2025-10-10T00:44:16.0897131Z * [new branch] bf/clean-timm -> origin/bf/clean-timm 2025-10-10T00:44:16.0899315Z * [new branch] bf/clean-torchbench -> origin/bf/clean-torchbench 2025-10-10T00:44:16.0901463Z * [new branch] bf/clean-torchbench-hf -> origin/bf/clean-torchbench-hf 2025-10-10T00:44:16.0903138Z * [new branch] bf/cudagraph -> origin/bf/cudagraph 2025-10-10T00:44:16.0905141Z * [new branch] bf/cudagraph-disable-input-mutation -> origin/bf/cudagraph-disable-input-mutation 2025-10-10T00:44:16.0907257Z * [new branch] bf/cudagraph-enable-input-mutation-support-benchmark -> origin/bf/cudagraph-enable-input-mutation-support-benchmark 2025-10-10T00:44:16.0908711Z * [new branch] bf/cudagraph-partition -> origin/bf/cudagraph-partition 2025-10-10T00:44:16.0911155Z * [new branch] bf/donated-buffer-bench -> origin/bf/donated-buffer-bench 
2025-10-10T00:44:16.0913147Z * [new branch] bf/minor-cg-config-doc -> origin/bf/minor-cg-config-doc 2025-10-10T00:44:16.0915156Z * [new branch] bf/minor-fa-tma-config -> origin/bf/minor-fa-tma-config 2025-10-10T00:44:16.0917074Z * [new branch] bf/pa-non-divisible -> origin/bf/pa-non-divisible 2025-10-10T00:44:16.0918937Z * [new branch] bf/partition-custom-op-alias -> origin/bf/partition-custom-op-alias 2025-10-10T00:44:16.0921380Z * [new branch] bf/partition-default-device-context -> origin/bf/partition-default-device-context 2025-10-10T00:44:16.0922957Z * [new branch] bf/partition-move-cpu -> origin/bf/partition-move-cpu 2025-10-10T00:44:16.0925037Z * [new branch] bf/remove-check-55b0c39d -> origin/bf/remove-check-55b0c39d 2025-10-10T00:44:16.0926928Z * [new branch] bf/rope -> origin/bf/rope 2025-10-10T00:44:16.0929164Z * [new branch] bf16_support -> origin/bf16_support 2025-10-10T00:44:16.0931127Z * [new branch] bf16_support_per_channel -> origin/bf16_support_per_channel 2025-10-10T00:44:16.0933068Z * [new branch] bisect_perf_hf_T5_3acc6eac492 -> origin/bisect_perf_hf_T5_3acc6eac492 2025-10-10T00:44:16.0934942Z * [new branch] bisect_perf_hf_T5_3fcf66f61fb -> origin/bisect_perf_hf_T5_3fcf66f61fb 2025-10-10T00:44:16.0936800Z * [new branch] bisect_perf_hf_T5_4009d154129 -> origin/bisect_perf_hf_T5_4009d154129 2025-10-10T00:44:16.0938687Z * [new branch] bisect_perf_hf_T5_40d0740e73d -> origin/bisect_perf_hf_T5_40d0740e73d 2025-10-10T00:44:16.0940468Z * [new branch] bisect_perf_hf_T5_5268754e -> origin/bisect_perf_hf_T5_5268754e 2025-10-10T00:44:16.0942447Z * [new branch] bisect_perf_hf_T5_7d89a8d385c -> origin/bisect_perf_hf_T5_7d89a8d385c 2025-10-10T00:44:16.0944311Z * [new branch] bisect_perf_hf_T5_b7a25c1ee7c -> origin/bisect_perf_hf_T5_b7a25c1ee7c 2025-10-10T00:44:16.0946210Z * [new branch] bisect_perf_hf_T5_c25b201583f -> origin/bisect_perf_hf_T5_c25b201583f 2025-10-10T00:44:16.0948135Z * [new branch] bisect_perf_hf_T5_c93e57efac0 -> 
origin/bisect_perf_hf_T5_c93e57efac0 2025-10-10T00:44:16.0950014Z * [new branch] bisect_perf_hf_T5_ca9813ea149 -> origin/bisect_perf_hf_T5_ca9813ea149 2025-10-10T00:44:16.0951869Z * [new branch] bisect_perf_hf_T5_d65f194a -> origin/bisect_perf_hf_T5_d65f194a 2025-10-10T00:44:16.0953744Z * [new branch] bisect_perf_hf_T5_da94ab0b -> origin/bisect_perf_hf_T5_da94ab0b 2025-10-10T00:44:16.0955806Z * [new branch] bisect_perf_hf_T5_da94ab0b_new -> origin/bisect_perf_hf_T5_da94ab0b_new 2025-10-10T00:44:16.0957620Z * [new branch] bisect_perf_hf_T5_db4e8a1d8a8 -> origin/bisect_perf_hf_T5_db4e8a1d8a8 2025-10-10T00:44:16.0959518Z * [new branch] bisect_perf_hf_T5_e0d97e936a2 -> origin/bisect_perf_hf_T5_e0d97e936a2 2025-10-10T00:44:16.0961376Z * [new branch] bisect_perf_hf_T5_f23621ec563 -> origin/bisect_perf_hf_T5_f23621ec563 2025-10-10T00:44:16.0964038Z * [new branch] bowbao/wip_prs -> origin/bowbao/wip_prs 2025-10-10T00:44:16.0966746Z * [new branch] brister/break_scatter_src_is_tensor -> origin/brister/break_scatter_src_is_tensor 2025-10-10T00:44:16.0968832Z * [new branch] brister/fx_cond -> origin/brister/fx_cond 2025-10-10T00:44:16.0970950Z * [new branch] brister/fx_dynamic_input -> origin/brister/fx_dynamic_input 2025-10-10T00:44:16.0972743Z * [new branch] brister/fx_index_put -> origin/brister/fx_index_put 2025-10-10T00:44:16.0974623Z * [new branch] brister/fx_no_python_slow -> origin/brister/fx_no_python_slow 2025-10-10T00:44:16.0976413Z * [new branch] brister/fx_scatter_reduce -> origin/brister/fx_scatter_reduce 2025-10-10T00:44:16.0978624Z * [new branch] brister/fx_unbacked_symbols -> origin/brister/fx_unbacked_symbols 2025-10-10T00:44:16.0980914Z * [new branch] brister/property_type_check -> origin/brister/property_type_check 2025-10-10T00:44:16.0982802Z * [new branch] brister/test_inductor_all_fx -> origin/brister/test_inductor_all_fx 2025-10-10T00:44:16.0984905Z * [new branch] brister/tiled_reduction_no_numel_check -> origin/brister/tiled_reduction_no_numel_check 
2025-10-10T00:44:16.0986458Z * [new branch] build-aarch64-wheels -> origin/build-aarch64-wheels 2025-10-10T00:44:16.0988669Z * [new branch] bwd-backup -> origin/bwd-backup 2025-10-10T00:44:16.0990841Z * [new branch] c57382a49 -> origin/c57382a49 2025-10-10T00:44:16.0992662Z * [new branch] ca_0431d47eaa -> origin/ca_0431d47eaa 2025-10-10T00:44:16.0994600Z * [new branch] ca_fix_0431d47eaa -> origin/ca_fix_0431d47eaa 2025-10-10T00:44:16.0997523Z * [new branch] camyll/cherrypick_0098e5636d3afa7c75aef8c447a5c402ea9ed524 -> origin/camyll/cherrypick_0098e5636d3afa7c75aef8c447a5c402ea9ed524 2025-10-10T00:44:16.0999481Z * [new branch] camyll/cherrypick_3016616ccbba3dc9bb6a80eb4a81a846ddf49cc9 -> origin/camyll/cherrypick_3016616ccbba3dc9bb6a80eb4a81a846ddf49cc9 2025-10-10T00:44:16.1002266Z * [new branch] camyll/revert-94bc900da97ad7f3c35b3b819bb53b23c74b581a-for-release-2.8 -> origin/camyll/revert-94bc900da97ad7f3c35b3b819bb53b23c74b581a-for-release-2.8 2025-10-10T00:44:16.1003812Z * [new branch] camyll/revert_5d749ceb92c2c28bcfbdf918b4ab99b1a91fcb50 -> origin/camyll/revert_5d749ceb92c2c28bcfbdf918b4ab99b1a91fcb50 2025-10-10T00:44:16.1006903Z * [new branch] camyllh/cherrypick_5e7be988003a38be49227cfaa9bff6a2ea9e6929_v2 -> origin/camyllh/cherrypick_5e7be988003a38be49227cfaa9bff6a2ea9e6929_v2 2025-10-10T00:44:16.1008538Z * [new branch] camyllh/cherrypick_dda071587f0522a16b237f92cbe27fd13a1a1c11 -> origin/camyllh/cherrypick_dda071587f0522a16b237f92cbe27fd13a1a1c11 2025-10-10T00:44:16.1011575Z * [new branch] camyllh/release2_9_cherrypick/dda071587f0522a16b237f92cbe27fd13a1a1c11 -> origin/camyllh/release2_9_cherrypick/dda071587f0522a16b237f92cbe27fd13a1a1c11 2025-10-10T00:44:16.1013142Z * [new branch] camyllh/test_setup_hooks_push -> origin/camyllh/test_setup_hooks_push 2025-10-10T00:44:16.1015546Z * [new branch] cherry-pick-157453-by-pytorch_bot_bot_ -> origin/cherry-pick-157453-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1017717Z * [new branch] 
cherry-pick-157513-by-pytorch_bot_bot_ -> origin/cherry-pick-157513-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1019149Z * [new branch] cherry-pick-157695-by-pytorch_bot_bot_ -> origin/cherry-pick-157695-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1021667Z * [new branch] cherry-pick-157732-by-pytorch_bot_bot_ -> origin/cherry-pick-157732-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1023818Z * [new branch] cherry-pick-158537-by-pytorch_bot_bot_ -> origin/cherry-pick-158537-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1025923Z * [new branch] cherry-pick-159969-by-pytorch_bot_bot_ -> origin/cherry-pick-159969-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1027876Z * [new branch] cherry-pick-160586-by-pytorch_bot_bot_ -> origin/cherry-pick-160586-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1030500Z * [new branch] cherry-pick-161299-by-pytorch_bot_bot_ -> origin/cherry-pick-161299-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1033536Z * [new branch] cherry-pick-161394-by-pytorch_bot_bot_ -> origin/cherry-pick-161394-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1034903Z * [new branch] cherry-pick-161430-by-pytorch_bot_bot_ -> origin/cherry-pick-161430-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1037258Z * [new branch] cherry-pick-162168-by-pytorch_bot_bot_ -> origin/cherry-pick-162168-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1039459Z * [new branch] cherry-pick-162194-by-pytorch_bot_bot_ -> origin/cherry-pick-162194-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1041502Z * [new branch] cherry-pick-162240-by-pytorch_bot_bot_ -> origin/cherry-pick-162240-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1043851Z * [new branch] cherry-pick-162295-by-pytorch_bot_bot_ -> origin/cherry-pick-162295-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1045904Z * [new branch] cherry-pick-162323-by-pytorch_bot_bot_ -> origin/cherry-pick-162323-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1047619Z * [new branch] cherry-pick-162425-by-pytorch_bot_bot_ -> origin/cherry-pick-162425-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1050226Z * [new branch] 
cherry-pick-162530-by-pytorch_bot_bot_ -> origin/cherry-pick-162530-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1051801Z * [new branch] cherry-pick-162555-by-pytorch_bot_bot_ -> origin/cherry-pick-162555-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1054395Z * [new branch] cherry-pick-162566-by-pytorch_bot_bot_ -> origin/cherry-pick-162566-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1055901Z * [new branch] cherry-pick-162587-by-pytorch_bot_bot_ -> origin/cherry-pick-162587-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1058135Z * [new branch] cherry-pick-162622-by-pytorch_bot_bot_ -> origin/cherry-pick-162622-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1060124Z * [new branch] cherry-pick-162657-by-pytorch_bot_bot_ -> origin/cherry-pick-162657-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1062058Z * [new branch] cherry-pick-162680-by-pytorch_bot_bot_ -> origin/cherry-pick-162680-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1064287Z * [new branch] cherry-pick-162693-by-pytorch_bot_bot_ -> origin/cherry-pick-162693-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1066182Z * [new branch] cherry-pick-162744-by-pytorch_bot_bot_ -> origin/cherry-pick-162744-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1068475Z * [new branch] cherry-pick-162764-by-pytorch_bot_bot_ -> origin/cherry-pick-162764-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1070548Z * [new branch] cherry-pick-162865-by-pytorch_bot_bot_ -> origin/cherry-pick-162865-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1072480Z * [new branch] cherry-pick-162866-by-pytorch_bot_bot_ -> origin/cherry-pick-162866-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1074406Z * [new branch] cherry-pick-162877-by-pytorch_bot_bot_ -> origin/cherry-pick-162877-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1076426Z * [new branch] cherry-pick-162950-by-pytorch_bot_bot_ -> origin/cherry-pick-162950-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1078115Z * [new branch] cherry-pick-163008-by-pytorch_bot_bot_ -> origin/cherry-pick-163008-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1080400Z * [new branch] 
cherry-pick-163111-by-pytorch_bot_bot_ -> origin/cherry-pick-163111-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1082388Z * [new branch] cherry-pick-163112-by-pytorch_bot_bot_ -> origin/cherry-pick-163112-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1084381Z * [new branch] cherry-pick-163152-by-pytorch_bot_bot_ -> origin/cherry-pick-163152-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1086358Z * [new branch] cherry-pick-163171-by-pytorch_bot_bot_ -> origin/cherry-pick-163171-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1088557Z * [new branch] cherry-pick-163194-by-pytorch_bot_bot_ -> origin/cherry-pick-163194-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1090562Z * [new branch] cherry-pick-163227-by-pytorch_bot_bot_ -> origin/cherry-pick-163227-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1092662Z * [new branch] cherry-pick-163269-by-pytorch_bot_bot_ -> origin/cherry-pick-163269-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1094285Z * [new branch] cherry-pick-163298-by-pytorch_bot_bot_ -> origin/cherry-pick-163298-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1096459Z * [new branch] cherry-pick-163315-by-pytorch_bot_bot_ -> origin/cherry-pick-163315-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1098614Z * [new branch] cherry-pick-163339-by-pytorch_bot_bot_ -> origin/cherry-pick-163339-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1100812Z * [new branch] cherry-pick-163341-by-pytorch_bot_bot_ -> origin/cherry-pick-163341-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1102221Z * [new branch] cherry-pick-163370-by-pytorch_bot_bot_ -> origin/cherry-pick-163370-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1104534Z * [new branch] cherry-pick-163383-by-pytorch_bot_bot_ -> origin/cherry-pick-163383-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1106572Z * [new branch] cherry-pick-163426-by-pytorch_bot_bot_ -> origin/cherry-pick-163426-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1108520Z * [new branch] cherry-pick-163549-by-pytorch_bot_bot_ -> origin/cherry-pick-163549-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1110448Z * [new branch] 
cherry-pick-163571-by-pytorch_bot_bot_ -> origin/cherry-pick-163571-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1112407Z * [new branch] cherry-pick-163578-by-pytorch_bot_bot_ -> origin/cherry-pick-163578-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1114394Z * [new branch] cherry-pick-163581-by-pytorch_bot_bot_ -> origin/cherry-pick-163581-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1116308Z * [new branch] cherry-pick-163585-by-pytorch_bot_bot_ -> origin/cherry-pick-163585-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1132864Z * [new branch] cherry-pick-163587-by-pytorch_bot_bot_ -> origin/cherry-pick-163587-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1133806Z * [new branch] cherry-pick-163598-by-pytorch_bot_bot_ -> origin/cherry-pick-163598-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1134568Z * [new branch] cherry-pick-163661-by-pytorch_bot_bot_ -> origin/cherry-pick-163661-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1135410Z * [new branch] cherry-pick-163677-by-pytorch_bot_bot_ -> origin/cherry-pick-163677-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1136143Z * [new branch] cherry-pick-163682-by-pytorch_bot_bot_ -> origin/cherry-pick-163682-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1136943Z * [new branch] cherry-pick-163712-by-pytorch_bot_bot_ -> origin/cherry-pick-163712-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1137668Z * [new branch] cherry-pick-163719-by-pytorch_bot_bot_ -> origin/cherry-pick-163719-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1138543Z * [new branch] cherry-pick-163768-by-pytorch_bot_bot_ -> origin/cherry-pick-163768-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1139354Z * [new branch] cherry-pick-163776-by-pytorch_bot_bot_ -> origin/cherry-pick-163776-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1140071Z * [new branch] cherry-pick-163797-by-pytorch_bot_bot_ -> origin/cherry-pick-163797-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1140907Z * [new branch] cherry-pick-163837-by-pytorch_bot_bot_ -> origin/cherry-pick-163837-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1141802Z * [new branch] 
cherry-pick-163886-by-pytorch_bot_bot_ -> origin/cherry-pick-163886-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1142874Z * [new branch] cherry-pick-163903-by-pytorch_bot_bot_ -> origin/cherry-pick-163903-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1145275Z * [new branch] cherry-pick-163956-by-pytorch_bot_bot_ -> origin/cherry-pick-163956-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1146965Z * [new branch] cherry-pick-163988-by-pytorch_bot_bot_ -> origin/cherry-pick-163988-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1149331Z * [new branch] cherry-pick-164093-by-pytorch_bot_bot_ -> origin/cherry-pick-164093-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1151018Z * [new branch] cherry-pick-164108-by-pytorch_bot_bot_ -> origin/cherry-pick-164108-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1153414Z * [new branch] cherry-pick-164138-by-pytorch_bot_bot_ -> origin/cherry-pick-164138-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1154825Z * [new branch] cherry-pick-164190 -> origin/cherry-pick-164190 2025-10-10T00:44:16.1157363Z * [new branch] cherry-pick-164470-by-pytorch_bot_bot_ -> origin/cherry-pick-164470-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1159484Z * [new branch] cherry-pick-164575-by-pytorch_bot_bot_ -> origin/cherry-pick-164575-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1161959Z * [new branch] cherry-pick-164774-by-pytorch_bot_bot_ -> origin/cherry-pick-164774-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1164233Z * [new branch] cherry-pick-164870-by-pytorch_bot_bot_ -> origin/cherry-pick-164870-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1165841Z * [new branch] cherry-pick-164946-by-pytorch_bot_bot_ -> origin/cherry-pick-164946-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1168339Z * [new branch] cherry-pick-165013-by-pytorch_bot_bot_ -> origin/cherry-pick-165013-by-pytorch_bot_bot_ 2025-10-10T00:44:16.1169881Z * [new branch] cherry_pick_graph_custom -> origin/cherry_pick_graph_custom 2025-10-10T00:44:16.1172209Z * [new branch] cherrypick-e88cca0691 -> origin/cherrypick-e88cca0691 2025-10-10T00:44:16.1174328Z * 
[new branch] chuanqi129-patch-1 -> origin/chuanqi129-patch-1 2025-10-10T00:44:16.1176193Z * [new branch] ck_dlpack -> origin/ck_dlpack 2025-10-10T00:44:16.1178117Z * [new branch] codegen_trace -> origin/codegen_trace 2025-10-10T00:44:16.1180260Z * [new branch] codex-testing -> origin/codex-testing 2025-10-10T00:44:16.1183136Z * [new branch] codex/add-metadata-field-for-file-path -> origin/codex/add-metadata-field-for-file-path 2025-10-10T00:44:16.1184836Z * [new branch] codex/add-test-for-inductor-local-cache-behavior -> origin/codex/add-test-for-inductor-local-cache-behavior 2025-10-10T00:44:16.1187516Z * [new branch] codex/create-test-for-tensor-memory-leak-in-cudagraph -> origin/codex/create-test-for-tensor-memory-leak-in-cudagraph 2025-10-10T00:44:16.1189451Z * [new branch] codex/enhance-cuda.matmul-with-allow_splitk-argument -> origin/codex/enhance-cuda.matmul-with-allow_splitk-argument 2025-10-10T00:44:16.1191128Z * [new branch] codex/fix-issue-121219-in-pytorch -> origin/codex/fix-issue-121219-in-pytorch 2025-10-10T00:44:16.1193621Z * [new branch] codex/refactor-dimension-handling-in-shape.cu -> origin/codex/refactor-dimension-handling-in-shape.cu 2025-10-10T00:44:16.1195446Z * [new branch] codex/refactor-lintrunner-config-to-use-uv-run -> origin/codex/refactor-lintrunner-config-to-use-uv-run 2025-10-10T00:44:16.1197265Z * [new branch] codex/remove-allow-untyped-defs-and-fix-type-errors -> origin/codex/remove-allow-untyped-defs-and-fix-type-errors 2025-10-10T00:44:16.1199526Z * [new branch] codex/remove-allow-untyped-defs-and-fix-type-errors-vx0jek -> origin/codex/remove-allow-untyped-defs-and-fix-type-errors-vx0jek 2025-10-10T00:44:16.1201243Z * [new branch] compile_kernel_include_dir -> origin/compile_kernel_include_dir 2025-10-10T00:44:16.1203410Z * [new branch] context_test -> origin/context_test 2025-10-10T00:44:16.1205376Z * [new branch] conv1d_decomp -> origin/conv1d_decomp 2025-10-10T00:44:16.1207344Z * [new branch] conv_autotune -> 
origin/conv_autotune 2025-10-10T00:44:16.1210191Z * [new branch] copilot/fix-157446 -> origin/copilot/fix-157446 2025-10-10T00:44:16.1211685Z * [new branch] copilot/fix-163730 -> origin/copilot/fix-163730 2025-10-10T00:44:16.1214532Z * [new branch] cpio/fix_new_ami_tests -> origin/cpio/fix_new_ami_tests 2025-10-10T00:44:16.1216578Z * [new branch] cpp-docs-dependency-upgrade -> origin/cpp-docs-dependency-upgrade 2025-10-10T00:44:16.1218480Z * [new branch] cpp_head -> origin/cpp_head 2025-10-10T00:44:16.1220517Z * [new branch] crcrpar-patch-1 -> origin/crcrpar-patch-1 2025-10-10T00:44:16.1223109Z * [new branch] csl/add_win_shard -> origin/csl/add_win_shard 2025-10-10T00:44:16.1224953Z * [new branch] csl/always_produce_xml -> origin/csl/always_produce_xml 2025-10-10T00:44:16.1226796Z * [new branch] csl/build_test_more_procs -> origin/csl/build_test_more_procs 2025-10-10T00:44:16.1228362Z * [new branch] csl/build_test_more_procs2 -> origin/csl/build_test_more_procs2 2025-10-10T00:44:16.1230414Z * [new branch] csl/fix_internal_graph_executor -> origin/csl/fix_internal_graph_executor 2025-10-10T00:44:16.1232533Z * [new branch] csl/fix_nightly_docs_push -> origin/csl/fix_nightly_docs_push 2025-10-10T00:44:16.1234839Z * [new branch] csl/inductor_h100_nightly -> origin/csl/inductor_h100_nightly 2025-10-10T00:44:16.1236744Z * [new branch] csl/katex -> origin/csl/katex 2025-10-10T00:44:16.1238709Z * [new branch] csl/larger_runner -> origin/csl/larger_runner 2025-10-10T00:44:16.1240588Z * [new branch] csl/lint_no_submodules -> origin/csl/lint_no_submodules 2025-10-10T00:44:16.1242471Z * [new branch] csl/lint_testing -> origin/csl/lint_testing 2025-10-10T00:44:16.1244169Z * [new branch] csl/lintrunner_stuff -> origin/csl/lintrunner_stuff 2025-10-10T00:44:16.1246305Z * [new branch] csl/mps_sharding -> origin/csl/mps_sharding 2025-10-10T00:44:16.1248467Z * [new branch] csl/multistage_docker -> origin/csl/multistage_docker 2025-10-10T00:44:16.1250363Z * [new branch] 
csl/no_keep_goin_rocm -> origin/csl/no_keep_goin_rocm 2025-10-10T00:44:16.1252240Z * [new branch] csl/reuse_old_whl_fix_metadata -> origin/csl/reuse_old_whl_fix_metadata 2025-10-10T00:44:16.1254061Z * [new branch] csl/revert_open -> origin/csl/revert_open 2025-10-10T00:44:16.1256069Z * [new branch] csl/skip_build -> origin/csl/skip_build 2025-10-10T00:44:16.1257979Z * [new branch] csl/smaller_avx_amx_runenrs -> origin/csl/smaller_avx_amx_runenrs 2025-10-10T00:44:16.1259964Z * [new branch] csl/test_cuda_build_large_runner -> origin/csl/test_cuda_build_large_runner 2025-10-10T00:44:16.1262085Z * [new branch] csl/test_info_status -> origin/csl/test_info_status 2025-10-10T00:44:16.1263597Z * [new branch] csl/test_info_upload_changes -> origin/csl/test_info_upload_changes 2025-10-10T00:44:16.1265646Z * [new branch] csl/test_owners_ao_sparse -> origin/csl/test_owners_ao_sparse 2025-10-10T00:44:16.1267733Z * [new branch] csl/test_owners_autograd_dispatch_nn -> origin/csl/test_owners_autograd_dispatch_nn 2025-10-10T00:44:16.1269296Z * [new branch] csl/test_owners_cuda -> origin/csl/test_owners_cuda 2025-10-10T00:44:16.1271387Z * [new branch] csl/test_owners_distributed -> origin/csl/test_owners_distributed 2025-10-10T00:44:16.1273464Z * [new branch] csl/test_owners_higher_confidence -> origin/csl/test_owners_higher_confidence 2025-10-10T00:44:16.1275056Z * [new branch] csl/testing_better_job_name -> origin/csl/testing_better_job_name 2025-10-10T00:44:16.1277133Z * [new branch] csl/vllm_pin_labeler -> origin/csl/vllm_pin_labeler 2025-10-10T00:44:16.1279004Z * [new branch] csl/win_cpp_tests -> origin/csl/win_cpp_tests 2025-10-10T00:44:16.1280951Z * [new branch] csl/win_sccache -> origin/csl/win_sccache 2025-10-10T00:44:16.1282945Z * [new branch] cu_stream_api -> origin/cu_stream_api 2025-10-10T00:44:16.1285606Z * [new branch] cublasltrelax2 -> origin/cublasltrelax2 2025-10-10T00:44:16.1287721Z * [new branch] cublasnowdeterministic -> origin/cublasnowdeterministic 
2025-10-10T00:44:16.1289699Z * [new branch] cublasrelax2 -> origin/cublasrelax2 2025-10-10T00:44:16.1291635Z * [new branch] cuda-include-paths-fix -> origin/cuda-include-paths-fix 2025-10-10T00:44:16.1293582Z * [new branch] custom_lowering_dict -> origin/custom_lowering_dict 2025-10-10T00:44:16.1296270Z * [new branch] d4l3k/delete_hook -> origin/d4l3k/delete_hook 2025-10-10T00:44:16.1299349Z * [new branch] daxia6/2.8o3 -> origin/daxia6/2.8o3 2025-10-10T00:44:16.1301165Z * [new branch] dcp_zoc -> origin/dcp_zoc 2025-10-10T00:44:16.1303083Z * [new branch] debug-guard -> origin/debug-guard 2025-10-10T00:44:16.1305259Z * [new branch] delete-quant-docs -> origin/delete-quant-docs 2025-10-10T00:44:16.1311442Z * [new branch] dependabot/pip/dot-ci/docker/ci_commit_pins/main/transformers-4.57.0 -> origin/dependabot/pip/dot-ci/docker/ci_commit_pins/main/transformers-4.57.0 2025-10-10T00:44:16.1313732Z * [new branch] desertfire/test_cpp_wrapper -> origin/desertfire/test_cpp_wrapper 2025-10-10T00:44:16.1315764Z * [new branch] desertfire/triton-cpu-for-aarch64 -> origin/desertfire/triton-cpu-for-aarch64 2025-10-10T00:44:16.1318900Z * [new branch] dev/dhruva/flex_attn_opt -> origin/dev/dhruva/flex_attn_opt 2025-10-10T00:44:16.1321755Z * [new branch] dev/joona/MPSNDArrayAdd -> origin/dev/joona/MPSNDArrayAdd 2025-10-10T00:44:16.1323813Z * [new branch] dev/joona/Unranked -> origin/dev/joona/Unranked 2025-10-10T00:44:16.1325952Z * [new branch] dev/joona/cat -> origin/dev/joona/cat 2025-10-10T00:44:16.1328016Z * [new branch] dev/joona/embeddingbag -> origin/dev/joona/embeddingbag 2025-10-10T00:44:16.1330231Z * [new branch] dev/joona/getTensorsString -> origin/dev/joona/getTensorsString 2025-10-10T00:44:16.1332402Z * [new branch] dev/joona/maxpool2dwithindices_errmsg -> origin/dev/joona/maxpool2dwithindices_errmsg 2025-10-10T00:44:16.1334457Z * [new branch] dev/joona/mps_linear_macos14 -> origin/dev/joona/mps_linear_macos14 2025-10-10T00:44:16.1336375Z * [new branch] dev/joona/sdpa -> 
origin/dev/joona/sdpa 2025-10-10T00:44:16.1339132Z * [new branch] dev/joona/topk_newapi -> origin/dev/joona/topk_newapi 2025-10-10T00:44:16.1341420Z * [new branch] dev/joona/type_inf -> origin/dev/joona/type_inf 2025-10-10T00:44:16.1343581Z * [new branch] dev/joona/upsize3d -> origin/dev/joona/upsize3d 2025-10-10T00:44:16.1345579Z * [new branch] disable -> origin/disable 2025-10-10T00:44:16.1347629Z * [new branch] disp_counter -> origin/disp_counter 2025-10-10T00:44:16.1349712Z * [new branch] dtensor-issues -> origin/dtensor-issues 2025-10-10T00:44:16.1351721Z * [new branch] eager_model_benchmarks -> origin/eager_model_benchmarks 2025-10-10T00:44:16.1354459Z * [new branch] embg/test_inductor_ci_128B -> origin/embg/test_inductor_ci_128B 2025-10-10T00:44:16.1356354Z * [new branch] embg/test_inductor_ci_base -> origin/embg/test_inductor_ci_base 2025-10-10T00:44:16.1358279Z * [new branch] embg/test_inductor_ci_control -> origin/embg/test_inductor_ci_control 2025-10-10T00:44:16.1359812Z * [new branch] embg/triton_l2_prefetch_128B -> origin/embg/triton_l2_prefetch_128B 2025-10-10T00:44:16.1362111Z * [new branch] embg/triton_l2_prefetch_256B -> origin/embg/triton_l2_prefetch_256B 2025-10-10T00:44:16.1364332Z * [new branch] enable-keep-going-for-trunk-tags -> origin/enable-keep-going-for-trunk-tags 2025-10-10T00:44:16.1366189Z * [new branch] eqy-patch-3 -> origin/eqy-patch-3 2025-10-10T00:44:16.1368330Z * [new branch] eqy-patch-5 -> origin/eqy-patch-5 2025-10-10T00:44:16.1371039Z * [new branch] exclamaforte/amd-ma -> origin/exclamaforte/amd-ma 2025-10-10T00:44:16.1373049Z * [new branch] exclamaforte/combo-kernels-perf-run -> origin/exclamaforte/combo-kernels-perf-run 2025-10-10T00:44:16.1374637Z * [new branch] exclamaforte/do_bench_refactor -> origin/exclamaforte/do_bench_refactor 2025-10-10T00:44:16.1376698Z * [new branch] exclamaforte/enable-mem-dep-fusion -> origin/exclamaforte/enable-mem-dep-fusion 2025-10-10T00:44:16.1378399Z * [new branch] 
exclamaforte/fix-exhaustive-autotuning -> origin/exclamaforte/fix-exhaustive-autotuning 2025-10-10T00:44:16.1380845Z * [new branch] exclamaforte/fix-exhuastive-autotuning-reland -> origin/exclamaforte/fix-exhuastive-autotuning-reland 2025-10-10T00:44:16.1383235Z * [new branch] exclamaforte/fix-trace-parsing-fx-svg -> origin/exclamaforte/fix-trace-parsing-fx-svg 2025-10-10T00:44:16.1384887Z * [new branch] exclamaforte/force-pointwise-cat-perf-run -> origin/exclamaforte/force-pointwise-cat-perf-run 2025-10-10T00:44:16.1386846Z * [new branch] exclamaforte/fusion-data -> origin/exclamaforte/fusion-data 2025-10-10T00:44:16.1388672Z * [new branch] exclamaforte/gemm-benchmark-run -> origin/exclamaforte/gemm-benchmark-run 2025-10-10T00:44:16.1390697Z * [new branch] exclamaforte/gemm-export-model -> origin/exclamaforte/gemm-export-model 2025-10-10T00:44:16.1392738Z * [new branch] exclamaforte/gemm-model -> origin/exclamaforte/gemm-model 2025-10-10T00:44:16.1395037Z * [new branch] exclamaforte/gemm-model-all-data-collection -> origin/exclamaforte/gemm-model-all-data-collection 2025-10-10T00:44:16.1396489Z * [new branch] exclamaforte/gemm-to-amd -> origin/exclamaforte/gemm-to-amd 2025-10-10T00:44:16.1398714Z * [new branch] exclamaforte/just-gemm-model -> origin/exclamaforte/just-gemm-model 2025-10-10T00:44:16.1401601Z * [new branch] exclamaforte/just-gemm-model-no-refactor -> origin/exclamaforte/just-gemm-model-no-refactor 2025-10-10T00:44:16.1403117Z * [new branch] exclamaforte/profile-diff-algo -> origin/exclamaforte/profile-diff-algo 2025-10-10T00:44:16.1405456Z * [new branch] exclamaforte/profiler-visualization -> origin/exclamaforte/profiler-visualization 2025-10-10T00:44:16.1406992Z * [new branch] exclamaforte/test_cpp_wrapper_mode -> origin/exclamaforte/test_cpp_wrapper_mode 2025-10-10T00:44:16.1409326Z * [new branch] exclamaforte/update-autotune-configs -> origin/exclamaforte/update-autotune-configs 2025-10-10T00:44:16.1411032Z * [new branch] 
exclamaforte/update-autotune-configs-2 -> origin/exclamaforte/update-autotune-configs-2 2025-10-10T00:44:16.1413778Z * [new branch] exclamforte/gemm-model-final -> origin/exclamforte/gemm-model-final 2025-10-10T00:44:16.1415704Z * [new branch] exec -> origin/exec 2025-10-10T00:44:16.1417919Z * [new branch] experimental-mosaic -> origin/experimental-mosaic 2025-10-10T00:44:16.1419882Z * [new branch] export-D58091437 -> origin/export-D58091437 2025-10-10T00:44:16.1422083Z * [new branch] export-D61047529 -> origin/export-D61047529 2025-10-10T00:44:16.1424073Z * [new branch] export-D71412006 -> origin/export-D71412006 2025-10-10T00:44:16.1426240Z * [new branch] export-D73042989 -> origin/export-D73042989 2025-10-10T00:44:16.1428209Z * [new branch] export-D76797250 -> origin/export-D76797250 2025-10-10T00:44:16.1430177Z * [new branch] export-D76885271 -> origin/export-D76885271 2025-10-10T00:44:16.1432367Z * [new branch] export-D76885620 -> origin/export-D76885620 2025-10-10T00:44:16.1434194Z * [new branch] export-D76936623 -> origin/export-D76936623 2025-10-10T00:44:16.1436255Z * [new branch] export-D76958268 -> origin/export-D76958268 2025-10-10T00:44:16.1438243Z * [new branch] export-D78375400 -> origin/export-D78375400 2025-10-10T00:44:16.1440283Z * [new branch] export-D78431305 -> origin/export-D78431305 2025-10-10T00:44:16.1442371Z * [new branch] export-D78580107 -> origin/export-D78580107 2025-10-10T00:44:16.1444365Z * [new branch] export-D78822171 -> origin/export-D78822171 2025-10-10T00:44:16.1446448Z * [new branch] export-D78822351 -> origin/export-D78822351 2025-10-10T00:44:16.1448457Z * [new branch] export-D78822507 -> origin/export-D78822507 2025-10-10T00:44:16.1450403Z * [new branch] export-D78826994 -> origin/export-D78826994 2025-10-10T00:44:16.1452418Z * [new branch] export-D78894324 -> origin/export-D78894324 2025-10-10T00:44:16.1454507Z * [new branch] export-D78929245 -> origin/export-D78929245 2025-10-10T00:44:16.1456512Z * [new branch] 
export-D78934925 -> origin/export-D78934925 2025-10-10T00:44:16.1459098Z * [new branch] export-D78953203 -> origin/export-D78953203 2025-10-10T00:44:16.1461152Z * [new branch] export-D78953229 -> origin/export-D78953229 2025-10-10T00:44:16.1463091Z * [new branch] export-D78957093 -> origin/export-D78957093 2025-10-10T00:44:16.1465020Z * [new branch] export-D78957389 -> origin/export-D78957389 2025-10-10T00:44:16.1467008Z * [new branch] export-D78996107 -> origin/export-D78996107 2025-10-10T00:44:16.1468998Z * [new branch] export-D79026433 -> origin/export-D79026433 2025-10-10T00:44:16.1471046Z * [new branch] export-D79230339 -> origin/export-D79230339 2025-10-10T00:44:16.1473018Z * [new branch] export-D79319835 -> origin/export-D79319835 2025-10-10T00:44:16.1474994Z * [new branch] export-D79328456 -> origin/export-D79328456 2025-10-10T00:44:16.1477090Z * [new branch] export-D79378362 -> origin/export-D79378362 2025-10-10T00:44:16.1478937Z * [new branch] export-D80823877 -> origin/export-D80823877 2025-10-10T00:44:16.1480900Z * [new branch] export-D80948073 -> origin/export-D80948073 2025-10-10T00:44:16.1483089Z * [new branch] export-D80958642 -> origin/export-D80958642 2025-10-10T00:44:16.1485090Z * [new branch] export-D81054193 -> origin/export-D81054193 2025-10-10T00:44:16.1487007Z * [new branch] export-D81204584 -> origin/export-D81204584 2025-10-10T00:44:16.1489145Z * [new branch] export-D81429090 -> origin/export-D81429090 2025-10-10T00:44:16.1491183Z * [new branch] export-D81651226 -> origin/export-D81651226 2025-10-10T00:44:16.1493224Z * [new branch] export-D81698719 -> origin/export-D81698719 2025-10-10T00:44:16.1495209Z * [new branch] export-D82140619 -> origin/export-D82140619 2025-10-10T00:44:16.1497263Z * [new branch] export-D82174075 -> origin/export-D82174075 2025-10-10T00:44:16.1499641Z * [new branch] export-D82232574 -> origin/export-D82232574 2025-10-10T00:44:16.1501802Z * [new branch] export-D82250826 -> origin/export-D82250826 
2025-10-10T00:44:16.1503752Z * [new branch] export-D82253817 -> origin/export-D82253817 2025-10-10T00:44:16.1505835Z * [new branch] export-D82380307 -> origin/export-D82380307 2025-10-10T00:44:16.1507947Z * [new branch] export-D82597111 -> origin/export-D82597111 2025-10-10T00:44:16.1510122Z * [new branch] export-D83023706 -> origin/export-D83023706 2025-10-10T00:44:16.1512033Z * [new branch] export-D83195687 -> origin/export-D83195687 2025-10-10T00:44:16.1514029Z * [new branch] export-D83200714 -> origin/export-D83200714 2025-10-10T00:44:16.1516113Z * [new branch] export-D83378477 -> origin/export-D83378477 2025-10-10T00:44:16.1518156Z * [new branch] export-D83390563 -> origin/export-D83390563 2025-10-10T00:44:16.1520133Z * [new branch] export-D83390775 -> origin/export-D83390775 2025-10-10T00:44:16.1522157Z * [new branch] export-D83391942 -> origin/export-D83391942 2025-10-10T00:44:16.1524144Z * [new branch] export-D83395610 -> origin/export-D83395610 2025-10-10T00:44:16.1526047Z * [new branch] export-D83539263 -> origin/export-D83539263 2025-10-10T00:44:16.1528145Z * [new branch] export-D83541846 -> origin/export-D83541846 2025-10-10T00:44:16.1530262Z * [new branch] export-D83591083 -> origin/export-D83591083 2025-10-10T00:44:16.1532590Z * [new branch] export-D83609850 -> origin/export-D83609850 2025-10-10T00:44:16.1534876Z * [new branch] export-D83627170 -> origin/export-D83627170 2025-10-10T00:44:16.1536853Z * [new branch] export-D83714690 -> origin/export-D83714690 2025-10-10T00:44:16.1538814Z * [new branch] export-D83766701 -> origin/export-D83766701 2025-10-10T00:44:16.1540881Z * [new branch] export-D83768878 -> origin/export-D83768878 2025-10-10T00:44:16.1542885Z * [new branch] export-D83769447 -> origin/export-D83769447 2025-10-10T00:44:16.1544936Z * [new branch] export-D84009392 -> origin/export-D84009392 2025-10-10T00:44:16.1546934Z * [new branch] export-D84089824 -> origin/export-D84089824 2025-10-10T00:44:16.1548971Z * [new branch] export-D84098898 -> 
origin/export-D84098898 2025-10-10T00:44:16.1551281Z * [new branch] export-D84103213 -> origin/export-D84103213 2025-10-10T00:44:16.1553250Z * [new branch] export-D84213020 -> origin/export-D84213020 2025-10-10T00:44:16.1555089Z * [new branch] export-reland -> origin/export-reland 2025-10-10T00:44:16.1557386Z * [new branch] exported-model-train-idempotent -> origin/exported-model-train-idempotent 2025-10-10T00:44:16.1559268Z * [new branch] extend_lift_up_op -> origin/extend_lift_up_op 2025-10-10T00:44:16.1561363Z * [new branch] ezyang-titan-october -> origin/ezyang-titan-october 2025-10-10T00:44:16.1563439Z * [new branch] ezyang-titan-october2 -> origin/ezyang-titan-october2 2025-10-10T00:44:16.1565348Z * [new branch] ezyang-war -> origin/ezyang-war 2025-10-10T00:44:16.1568202Z * [new branch] ezyang/wip-aot-descriptors -> origin/ezyang/wip-aot-descriptors 2025-10-10T00:44:16.1570096Z * [new branch] fa_u8_brgemm -> origin/fa_u8_brgemm 2025-10-10T00:44:16.1572352Z * [new branch] fadeputr-fix-fbgemm_genai-build -> origin/fadeputr-fix-fbgemm_genai-build 2025-10-10T00:44:16.1575004Z * [new branch] fadeputr/sequence_fbgemm -> origin/fadeputr/sequence_fbgemm 2025-10-10T00:44:16.1577011Z * [new branch] fastmath_baseline -> origin/fastmath_baseline 2025-10-10T00:44:16.1579751Z * [new branch] fbcode/warm -> origin/fbcode/warm 2025-10-10T00:44:16.1581876Z * [new branch] fca -> origin/fca 2025-10-10T00:44:16.1583855Z * [new branch] fca2_ca5984c -> origin/fca2_ca5984c 2025-10-10T00:44:16.1585783Z * [new branch] fca5 -> origin/fca5 2025-10-10T00:44:16.1588496Z * [new branch] feature/justknobs-cpp -> origin/feature/justknobs-cpp 2025-10-10T00:44:16.1590964Z * [new branch] ffast_math_baseline -> origin/ffast_math_baseline 2025-10-10T00:44:16.1592961Z * [new branch] ffast_math_target -> origin/ffast_math_target 2025-10-10T00:44:16.1595710Z * [new branch] findhao/base_commit -> origin/findhao/base_commit 2025-10-10T00:44:16.1597526Z * [new branch] findhao/base_commit1 -> 
origin/findhao/base_commit1 2025-10-10T00:44:16.1599692Z * [new branch] findhao/multistream2 -> origin/findhao/multistream2 2025-10-10T00:44:16.1601678Z * [new branch] findhao/multistream5 -> origin/findhao/multistream5 2025-10-10T00:44:16.1603206Z * [new branch] findhao/multistream6 -> origin/findhao/multistream6 2025-10-10T00:44:16.1605233Z * [new branch] findhao/operatorbench3 -> origin/findhao/operatorbench3 2025-10-10T00:44:16.1607133Z * [new branch] findhao/operatorbench5 -> origin/findhao/operatorbench5 2025-10-10T00:44:16.1609213Z * [new branch] findhao/tritonparse -> origin/findhao/tritonparse 2025-10-10T00:44:16.1611402Z * [new branch] fix-ck-gemm-template-format -> origin/fix-ck-gemm-template-format 2025-10-10T00:44:16.1613749Z * [new branch] fix-config-ignore -> origin/fix-config-ignore 2025-10-10T00:44:16.1615761Z * [new branch] fix-dict-guard -> origin/fix-dict-guard 2025-10-10T00:44:16.1617739Z * [new branch] fix-fqn -> origin/fix-fqn 2025-10-10T00:44:16.1619817Z * [new branch] fix-rlease-feature-template -> origin/fix-rlease-feature-template 2025-10-10T00:44:16.1622277Z * [new branch] fix-upload-vllm-wheel-credential -> origin/fix-upload-vllm-wheel-credential 2025-10-10T00:44:16.1624112Z * [new branch] fix_153389 -> origin/fix_153389 2025-10-10T00:44:16.1626127Z * [new branch] fix_nvrtc_discovery -> origin/fix_nvrtc_discovery 2025-10-10T00:44:16.1628290Z * [new branch] fix_op_benchmark -> origin/fix_op_benchmark 2025-10-10T00:44:16.1630103Z * [new branch] fix_op_runner -> origin/fix_op_runner 2025-10-10T00:44:16.1632288Z * [new branch] fix_ubn_159469 -> origin/fix_ubn_159469 2025-10-10T00:44:16.1634249Z * [new branch] fixes -> origin/fixes 2025-10-10T00:44:16.1636276Z * [new branch] fixes-triage -> origin/fixes-triage 2025-10-10T00:44:16.1638243Z * [new branch] fixflashgit -> origin/fixflashgit 2025-10-10T00:44:16.1640323Z * [new branch] fixflashinfer -> origin/fixflashinfer 2025-10-10T00:44:16.1642315Z * [new branch] flash_decoding_cpu -> 
origin/flash_decoding_cpu 2025-10-10T00:44:16.1644201Z * [new branch] flex-flash -> origin/flex-flash 2025-10-10T00:44:16.1646487Z * [new branch] flex_attention_functorch_grad -> origin/flex_attention_functorch_grad 2025-10-10T00:44:16.1648591Z * [new branch] flex_flash -> origin/flex_flash 2025-10-10T00:44:16.1651398Z * [new branch] fmassa/fix_memeff_sharding_rule -> origin/fmassa/fix_memeff_sharding_rule 2025-10-10T00:44:16.1653316Z * [new branch] free-stack2 -> origin/free-stack2 2025-10-10T00:44:16.1656065Z * [new branch] fsdp2_trace_rules -> origin/fsdp2_trace_rules 2025-10-10T00:44:16.1658228Z * [new branch] fsdpv2_3d -> origin/fsdpv2_3d 2025-10-10T00:44:16.1659222Z * [new branch] fsdpv2_3d_m1 -> origin/fsdpv2_3d_m1 2025-10-10T00:44:16.1661642Z * [new branch] fused_moving_avg_obs_fake_quant_half_support -> origin/fused_moving_avg_obs_fake_quant_half_support 2025-10-10T00:44:16.1663523Z * [new branch] fx_cpp -> origin/fx_cpp 2025-10-10T00:44:16.1666245Z * [new branch] fy/fix-win -> origin/fy/fix-win 2025-10-10T00:44:16.1670243Z * [new branch] gh/AlnisM/1/base -> origin/gh/AlnisM/1/base 2025-10-10T00:44:16.1672206Z * [new branch] gh/AlnisM/1/head -> origin/gh/AlnisM/1/head 2025-10-10T00:44:16.1675510Z * [new branch] gh/ColinPeppler/80/base -> origin/gh/ColinPeppler/80/base 2025-10-10T00:44:16.1677354Z * [new branch] gh/ColinPeppler/80/head -> origin/gh/ColinPeppler/80/head 2025-10-10T00:44:16.1679235Z * [new branch] gh/ColinPeppler/80/orig -> origin/gh/ColinPeppler/80/orig 2025-10-10T00:44:16.1681987Z * [new branch] gh/ColinPeppler/81/base -> origin/gh/ColinPeppler/81/base 2025-10-10T00:44:16.1683884Z * [new branch] gh/ColinPeppler/81/head -> origin/gh/ColinPeppler/81/head 2025-10-10T00:44:16.1685610Z * [new branch] gh/ColinPeppler/81/orig -> origin/gh/ColinPeppler/81/orig 2025-10-10T00:44:16.1688502Z * [new branch] gh/ColinPeppler/82/base -> origin/gh/ColinPeppler/82/base 2025-10-10T00:44:16.1690052Z * [new branch] gh/ColinPeppler/82/head -> 
origin/gh/ColinPeppler/82/head 2025-10-10T00:44:16.1692207Z * [new branch] gh/ColinPeppler/82/orig -> origin/gh/ColinPeppler/82/orig 2025-10-10T00:44:16.1694898Z * [new branch] gh/ColinPeppler/83/base -> origin/gh/ColinPeppler/83/base 2025-10-10T00:44:16.1696961Z * [new branch] gh/ColinPeppler/83/head -> origin/gh/ColinPeppler/83/head 2025-10-10T00:44:16.1699152Z * [new branch] gh/ColinPeppler/83/orig -> origin/gh/ColinPeppler/83/orig 2025-10-10T00:44:16.1702098Z * [new branch] gh/ColinPeppler/84/base -> origin/gh/ColinPeppler/84/base 2025-10-10T00:44:16.1703713Z * [new branch] gh/ColinPeppler/84/head -> origin/gh/ColinPeppler/84/head 2025-10-10T00:44:16.1706672Z * [new branch] gh/ColinPeppler/85/base -> origin/gh/ColinPeppler/85/base 2025-10-10T00:44:16.1708113Z * [new branch] gh/ColinPeppler/85/head -> origin/gh/ColinPeppler/85/head 2025-10-10T00:44:16.1710786Z * [new branch] gh/ColinPeppler/86/base -> origin/gh/ColinPeppler/86/base 2025-10-10T00:44:16.1712814Z * [new branch] gh/ColinPeppler/86/head -> origin/gh/ColinPeppler/86/head 2025-10-10T00:44:16.1715057Z * [new branch] gh/ColinPeppler/87/base -> origin/gh/ColinPeppler/87/base 2025-10-10T00:44:16.1717037Z * [new branch] gh/ColinPeppler/87/head -> origin/gh/ColinPeppler/87/head 2025-10-10T00:44:16.1719355Z * [new branch] gh/ColinPeppler/88/base -> origin/gh/ColinPeppler/88/base 2025-10-10T00:44:16.1721267Z * [new branch] gh/ColinPeppler/88/head -> origin/gh/ColinPeppler/88/head 2025-10-10T00:44:16.1723728Z * [new branch] gh/ColinPeppler/89/base -> origin/gh/ColinPeppler/89/base 2025-10-10T00:44:16.1725376Z * [new branch] gh/ColinPeppler/89/head -> origin/gh/ColinPeppler/89/head 2025-10-10T00:44:16.1728154Z * [new branch] gh/ColinPeppler/90/base -> origin/gh/ColinPeppler/90/base 2025-10-10T00:44:16.1729675Z * [new branch] gh/ColinPeppler/90/head -> origin/gh/ColinPeppler/90/head 2025-10-10T00:44:16.1732462Z * [new branch] gh/ColinPeppler/91/base -> origin/gh/ColinPeppler/91/base 2025-10-10T00:44:16.1734047Z * 
[new branch] gh/ColinPeppler/91/head -> origin/gh/ColinPeppler/91/head 2025-10-10T00:44:16.1736782Z * [new branch] gh/ColinPeppler/92/base -> origin/gh/ColinPeppler/92/base 2025-10-10T00:44:16.1738660Z * [new branch] gh/ColinPeppler/92/head -> origin/gh/ColinPeppler/92/head 2025-10-10T00:44:16.1741253Z * [new branch] gh/ColinPeppler/93/base -> origin/gh/ColinPeppler/93/base 2025-10-10T00:44:16.1743200Z * [new branch] gh/ColinPeppler/93/head -> origin/gh/ColinPeppler/93/head 2025-10-10T00:44:16.1745102Z * [new branch] gh/ColinPeppler/93/orig -> origin/gh/ColinPeppler/93/orig 2025-10-10T00:44:16.1747986Z * [new branch] gh/ColinPeppler/94/base -> origin/gh/ColinPeppler/94/base 2025-10-10T00:44:16.1750121Z * [new branch] gh/ColinPeppler/94/head -> origin/gh/ColinPeppler/94/head 2025-10-10T00:44:16.1752131Z * [new branch] gh/ColinPeppler/94/orig -> origin/gh/ColinPeppler/94/orig 2025-10-10T00:44:16.1754914Z * [new branch] gh/ColinPeppler/95/base -> origin/gh/ColinPeppler/95/base 2025-10-10T00:44:16.1757003Z * [new branch] gh/ColinPeppler/95/head -> origin/gh/ColinPeppler/95/head 2025-10-10T00:44:16.1758930Z * [new branch] gh/ColinPeppler/95/orig -> origin/gh/ColinPeppler/95/orig 2025-10-10T00:44:16.1762217Z * [new branch] gh/EikanWang/67/base -> origin/gh/EikanWang/67/base 2025-10-10T00:44:16.1763760Z * [new branch] gh/EikanWang/67/head -> origin/gh/EikanWang/67/head 2025-10-10T00:44:16.1767507Z * [new branch] gh/Gasoonjia/1/base -> origin/gh/Gasoonjia/1/base 2025-10-10T00:44:16.1769502Z * [new branch] gh/Gasoonjia/1/head -> origin/gh/Gasoonjia/1/head 2025-10-10T00:44:16.1772608Z * [new branch] gh/H-Huang/131/base -> origin/gh/H-Huang/131/base 2025-10-10T00:44:16.1774475Z * [new branch] gh/H-Huang/131/head -> origin/gh/H-Huang/131/head 2025-10-10T00:44:16.1776370Z * [new branch] gh/H-Huang/131/orig -> origin/gh/H-Huang/131/orig 2025-10-10T00:44:16.1778924Z * [new branch] gh/H-Huang/132/base -> origin/gh/H-Huang/132/base 2025-10-10T00:44:16.1780808Z * [new branch] 
gh/H-Huang/132/head -> origin/gh/H-Huang/132/head 2025-10-10T00:44:16.1782887Z * [new branch] gh/H-Huang/132/orig -> origin/gh/H-Huang/132/orig 2025-10-10T00:44:16.1785369Z * [new branch] gh/H-Huang/180/base -> origin/gh/H-Huang/180/base 2025-10-10T00:44:16.1787251Z * [new branch] gh/H-Huang/180/head -> origin/gh/H-Huang/180/head 2025-10-10T00:44:16.1789139Z * [new branch] gh/H-Huang/180/orig -> origin/gh/H-Huang/180/orig 2025-10-10T00:44:16.1791634Z * [new branch] gh/H-Huang/182/base -> origin/gh/H-Huang/182/base 2025-10-10T00:44:16.1793504Z * [new branch] gh/H-Huang/182/head -> origin/gh/H-Huang/182/head 2025-10-10T00:44:16.1795327Z * [new branch] gh/H-Huang/182/orig -> origin/gh/H-Huang/182/orig 2025-10-10T00:44:16.1797987Z * [new branch] gh/H-Huang/187/base -> origin/gh/H-Huang/187/base 2025-10-10T00:44:16.1800105Z * [new branch] gh/H-Huang/187/head -> origin/gh/H-Huang/187/head 2025-10-10T00:44:16.1801630Z * [new branch] gh/H-Huang/187/orig -> origin/gh/H-Huang/187/orig 2025-10-10T00:44:16.1804596Z * [new branch] gh/H-Huang/207/base -> origin/gh/H-Huang/207/base 2025-10-10T00:44:16.1806372Z * [new branch] gh/H-Huang/207/head -> origin/gh/H-Huang/207/head 2025-10-10T00:44:16.1808348Z * [new branch] gh/H-Huang/207/orig -> origin/gh/H-Huang/207/orig 2025-10-10T00:44:16.1810873Z * [new branch] gh/H-Huang/210/base -> origin/gh/H-Huang/210/base 2025-10-10T00:44:16.1812754Z * [new branch] gh/H-Huang/210/head -> origin/gh/H-Huang/210/head 2025-10-10T00:44:16.1814601Z * [new branch] gh/H-Huang/210/orig -> origin/gh/H-Huang/210/orig 2025-10-10T00:44:16.1817160Z * [new branch] gh/H-Huang/212/base -> origin/gh/H-Huang/212/base 2025-10-10T00:44:16.1818978Z * [new branch] gh/H-Huang/212/head -> origin/gh/H-Huang/212/head 2025-10-10T00:44:16.1820865Z * [new branch] gh/H-Huang/212/orig -> origin/gh/H-Huang/212/orig 2025-10-10T00:44:16.1823413Z * [new branch] gh/H-Huang/214/base -> origin/gh/H-Huang/214/base 2025-10-10T00:44:16.1825258Z * [new branch] gh/H-Huang/214/head -> 
origin/gh/H-Huang/214/head 2025-10-10T00:44:16.1827080Z * [new branch] gh/H-Huang/214/orig -> origin/gh/H-Huang/214/orig 2025-10-10T00:44:16.1829586Z * [new branch] gh/H-Huang/215/base -> origin/gh/H-Huang/215/base 2025-10-10T00:44:16.1831480Z * [new branch] gh/H-Huang/215/head -> origin/gh/H-Huang/215/head 2025-10-10T00:44:16.1833370Z * [new branch] gh/H-Huang/215/orig -> origin/gh/H-Huang/215/orig 2025-10-10T00:44:16.1835897Z * [new branch] gh/H-Huang/216/base -> origin/gh/H-Huang/216/base 2025-10-10T00:44:16.1837759Z * [new branch] gh/H-Huang/216/head -> origin/gh/H-Huang/216/head 2025-10-10T00:44:16.1839454Z * [new branch] gh/H-Huang/216/orig -> origin/gh/H-Huang/216/orig 2025-10-10T00:44:16.1842437Z * [new branch] gh/H-Huang/217/base -> origin/gh/H-Huang/217/base 2025-10-10T00:44:16.1843967Z * [new branch] gh/H-Huang/217/head -> origin/gh/H-Huang/217/head 2025-10-10T00:44:16.1846051Z * [new branch] gh/H-Huang/217/orig -> origin/gh/H-Huang/217/orig 2025-10-10T00:44:16.1848884Z * [new branch] gh/H-Huang/218/base -> origin/gh/H-Huang/218/base 2025-10-10T00:44:16.1850728Z * [new branch] gh/H-Huang/218/head -> origin/gh/H-Huang/218/head 2025-10-10T00:44:16.1852636Z * [new branch] gh/H-Huang/218/orig -> origin/gh/H-Huang/218/orig 2025-10-10T00:44:16.1855216Z * [new branch] gh/H-Huang/219/base -> origin/gh/H-Huang/219/base 2025-10-10T00:44:16.1857100Z * [new branch] gh/H-Huang/219/head -> origin/gh/H-Huang/219/head 2025-10-10T00:44:16.1859095Z * [new branch] gh/H-Huang/219/orig -> origin/gh/H-Huang/219/orig 2025-10-10T00:44:16.1861532Z * [new branch] gh/H-Huang/220/base -> origin/gh/H-Huang/220/base 2025-10-10T00:44:16.1863110Z * [new branch] gh/H-Huang/220/head -> origin/gh/H-Huang/220/head 2025-10-10T00:44:16.1865216Z * [new branch] gh/H-Huang/220/orig -> origin/gh/H-Huang/220/orig 2025-10-10T00:44:16.1867773Z * [new branch] gh/H-Huang/221/base -> origin/gh/H-Huang/221/base 2025-10-10T00:44:16.1879445Z * [new branch] gh/H-Huang/221/head -> 
origin/gh/H-Huang/221/head 2025-10-10T00:44:16.1880255Z * [new branch] gh/H-Huang/221/orig -> origin/gh/H-Huang/221/orig 2025-10-10T00:44:16.1880934Z * [new branch] gh/H-Huang/222/base -> origin/gh/H-Huang/222/base 2025-10-10T00:44:16.1881610Z * [new branch] gh/H-Huang/222/head -> origin/gh/H-Huang/222/head 2025-10-10T00:44:16.1882152Z * [new branch] gh/H-Huang/222/orig -> origin/gh/H-Huang/222/orig 2025-10-10T00:44:16.1882787Z * [new branch] gh/H-Huang/223/base -> origin/gh/H-Huang/223/base 2025-10-10T00:44:16.1883448Z * [new branch] gh/H-Huang/223/head -> origin/gh/H-Huang/223/head 2025-10-10T00:44:16.1884561Z * [new branch] gh/H-Huang/223/orig -> origin/gh/H-Huang/223/orig 2025-10-10T00:44:16.1888059Z * [new branch] gh/IvanKobzarev/115/base -> origin/gh/IvanKobzarev/115/base 2025-10-10T00:44:16.1889612Z * [new branch] gh/IvanKobzarev/115/head -> origin/gh/IvanKobzarev/115/head 2025-10-10T00:44:16.1891887Z * [new branch] gh/IvanKobzarev/115/orig -> origin/gh/IvanKobzarev/115/orig 2025-10-10T00:44:16.1894780Z * [new branch] gh/IvanKobzarev/116/base -> origin/gh/IvanKobzarev/116/base 2025-10-10T00:44:16.1896341Z * [new branch] gh/IvanKobzarev/116/head -> origin/gh/IvanKobzarev/116/head 2025-10-10T00:44:16.1898865Z * [new branch] gh/IvanKobzarev/116/orig -> origin/gh/IvanKobzarev/116/orig 2025-10-10T00:44:16.1903397Z * [new branch] gh/IvanKobzarev/118/base -> origin/gh/IvanKobzarev/118/base 2025-10-10T00:44:16.1904767Z * [new branch] gh/IvanKobzarev/118/head -> origin/gh/IvanKobzarev/118/head 2025-10-10T00:44:16.1907066Z * [new branch] gh/IvanKobzarev/118/orig -> origin/gh/IvanKobzarev/118/orig 2025-10-10T00:44:16.1909789Z * [new branch] gh/IvanKobzarev/126/base -> origin/gh/IvanKobzarev/126/base 2025-10-10T00:44:16.1911794Z * [new branch] gh/IvanKobzarev/126/head -> origin/gh/IvanKobzarev/126/head 2025-10-10T00:44:16.1913359Z * [new branch] gh/IvanKobzarev/126/orig -> origin/gh/IvanKobzarev/126/orig 2025-10-10T00:44:16.1916395Z * [new branch] 
gh/IvanKobzarev/127/base -> origin/gh/IvanKobzarev/127/base 2025-10-10T00:44:16.1918053Z * [new branch] gh/IvanKobzarev/127/head -> origin/gh/IvanKobzarev/127/head 2025-10-10T00:44:16.1920188Z * [new branch] gh/IvanKobzarev/127/orig -> origin/gh/IvanKobzarev/127/orig 2025-10-10T00:44:16.1922774Z * [new branch] gh/IvanKobzarev/128/base -> origin/gh/IvanKobzarev/128/base 2025-10-10T00:44:16.1924628Z * [new branch] gh/IvanKobzarev/128/head -> origin/gh/IvanKobzarev/128/head 2025-10-10T00:44:16.1926655Z * [new branch] gh/IvanKobzarev/128/orig -> origin/gh/IvanKobzarev/128/orig 2025-10-10T00:44:16.1929719Z * [new branch] gh/IvanKobzarev/135/base -> origin/gh/IvanKobzarev/135/base 2025-10-10T00:44:16.1931175Z * [new branch] gh/IvanKobzarev/135/head -> origin/gh/IvanKobzarev/135/head 2025-10-10T00:44:16.1933628Z * [new branch] gh/IvanKobzarev/135/orig -> origin/gh/IvanKobzarev/135/orig 2025-10-10T00:44:16.1936160Z * [new branch] gh/IvanKobzarev/138/base -> origin/gh/IvanKobzarev/138/base 2025-10-10T00:44:16.1937672Z * [new branch] gh/IvanKobzarev/138/head -> origin/gh/IvanKobzarev/138/head 2025-10-10T00:44:16.1939832Z * [new branch] gh/IvanKobzarev/138/orig -> origin/gh/IvanKobzarev/138/orig 2025-10-10T00:44:16.1942680Z * [new branch] gh/IvanKobzarev/141/base -> origin/gh/IvanKobzarev/141/base 2025-10-10T00:44:16.1944141Z * [new branch] gh/IvanKobzarev/141/head -> origin/gh/IvanKobzarev/141/head 2025-10-10T00:44:16.1946329Z * [new branch] gh/IvanKobzarev/141/orig -> origin/gh/IvanKobzarev/141/orig 2025-10-10T00:44:16.1949737Z * [new branch] gh/IvanKobzarev/142/base -> origin/gh/IvanKobzarev/142/base 2025-10-10T00:44:16.1951252Z * [new branch] gh/IvanKobzarev/142/head -> origin/gh/IvanKobzarev/142/head 2025-10-10T00:44:16.1953379Z * [new branch] gh/IvanKobzarev/142/orig -> origin/gh/IvanKobzarev/142/orig 2025-10-10T00:44:16.1956141Z * [new branch] gh/IvanKobzarev/144/base -> origin/gh/IvanKobzarev/144/base 2025-10-10T00:44:16.1957684Z * [new branch] 
gh/IvanKobzarev/144/head -> origin/gh/IvanKobzarev/144/head 2025-10-10T00:44:16.1959894Z * [new branch] gh/IvanKobzarev/144/orig -> origin/gh/IvanKobzarev/144/orig 2025-10-10T00:44:16.1962616Z * [new branch] gh/IvanKobzarev/145/base -> origin/gh/IvanKobzarev/145/base 2025-10-10T00:44:16.1964135Z * [new branch] gh/IvanKobzarev/145/head -> origin/gh/IvanKobzarev/145/head 2025-10-10T00:44:16.1966377Z * [new branch] gh/IvanKobzarev/145/orig -> origin/gh/IvanKobzarev/145/orig 2025-10-10T00:44:16.1969363Z * [new branch] gh/IvanKobzarev/146/base -> origin/gh/IvanKobzarev/146/base 2025-10-10T00:44:16.1970787Z * [new branch] gh/IvanKobzarev/146/head -> origin/gh/IvanKobzarev/146/head 2025-10-10T00:44:16.1973052Z * [new branch] gh/IvanKobzarev/146/orig -> origin/gh/IvanKobzarev/146/orig 2025-10-10T00:44:16.1975718Z * [new branch] gh/IvanKobzarev/147/base -> origin/gh/IvanKobzarev/147/base 2025-10-10T00:44:16.1977548Z * [new branch] gh/IvanKobzarev/147/head -> origin/gh/IvanKobzarev/147/head 2025-10-10T00:44:16.1979491Z * [new branch] gh/IvanKobzarev/147/orig -> origin/gh/IvanKobzarev/147/orig 2025-10-10T00:44:16.1982170Z * [new branch] gh/IvanKobzarev/148/base -> origin/gh/IvanKobzarev/148/base 2025-10-10T00:44:16.1983618Z * [new branch] gh/IvanKobzarev/148/head -> origin/gh/IvanKobzarev/148/head 2025-10-10T00:44:16.1986445Z * [new branch] gh/IvanKobzarev/149/base -> origin/gh/IvanKobzarev/149/base 2025-10-10T00:44:16.1987995Z * [new branch] gh/IvanKobzarev/149/head -> origin/gh/IvanKobzarev/149/head 2025-10-10T00:44:16.1991158Z * [new branch] gh/IvanKobzarev/150/base -> origin/gh/IvanKobzarev/150/base 2025-10-10T00:44:16.1993319Z * [new branch] gh/IvanKobzarev/150/head -> origin/gh/IvanKobzarev/150/head 2025-10-10T00:44:16.1994898Z * [new branch] gh/IvanKobzarev/150/orig -> origin/gh/IvanKobzarev/150/orig 2025-10-10T00:44:16.1997882Z * [new branch] gh/IvanKobzarev/151/base -> origin/gh/IvanKobzarev/151/base 2025-10-10T00:44:16.2000251Z * [new branch] 
gh/IvanKobzarev/151/head -> origin/gh/IvanKobzarev/151/head 2025-10-10T00:44:16.2001647Z * [new branch] gh/IvanKobzarev/151/orig -> origin/gh/IvanKobzarev/151/orig 2025-10-10T00:44:16.2004706Z * [new branch] gh/IvanKobzarev/152/base -> origin/gh/IvanKobzarev/152/base 2025-10-10T00:44:16.2006193Z * [new branch] gh/IvanKobzarev/152/head -> origin/gh/IvanKobzarev/152/head 2025-10-10T00:44:16.2008140Z * [new branch] gh/IvanKobzarev/152/orig -> origin/gh/IvanKobzarev/152/orig 2025-10-10T00:44:16.2011437Z * [new branch] gh/IvanKobzarev/153/base -> origin/gh/IvanKobzarev/153/base 2025-10-10T00:44:16.2012815Z * [new branch] gh/IvanKobzarev/153/head -> origin/gh/IvanKobzarev/153/head 2025-10-10T00:44:16.2014868Z * [new branch] gh/IvanKobzarev/153/orig -> origin/gh/IvanKobzarev/153/orig 2025-10-10T00:44:16.2017534Z * [new branch] gh/IvanKobzarev/154/base -> origin/gh/IvanKobzarev/154/base 2025-10-10T00:44:16.2019573Z * [new branch] gh/IvanKobzarev/154/head -> origin/gh/IvanKobzarev/154/head 2025-10-10T00:44:16.2021421Z * [new branch] gh/IvanKobzarev/154/orig -> origin/gh/IvanKobzarev/154/orig 2025-10-10T00:44:16.2024057Z * [new branch] gh/IvanKobzarev/155/base -> origin/gh/IvanKobzarev/155/base 2025-10-10T00:44:16.2025969Z * [new branch] gh/IvanKobzarev/155/head -> origin/gh/IvanKobzarev/155/head 2025-10-10T00:44:16.2027872Z * [new branch] gh/IvanKobzarev/155/orig -> origin/gh/IvanKobzarev/155/orig 2025-10-10T00:44:16.2030678Z * [new branch] gh/IvanKobzarev/156/base -> origin/gh/IvanKobzarev/156/base 2025-10-10T00:44:16.2032707Z * [new branch] gh/IvanKobzarev/156/head -> origin/gh/IvanKobzarev/156/head 2025-10-10T00:44:16.2034627Z * [new branch] gh/IvanKobzarev/156/orig -> origin/gh/IvanKobzarev/156/orig 2025-10-10T00:44:16.2037487Z * [new branch] gh/IvanKobzarev/157/base -> origin/gh/IvanKobzarev/157/base 2025-10-10T00:44:16.2039523Z * [new branch] gh/IvanKobzarev/157/head -> origin/gh/IvanKobzarev/157/head 2025-10-10T00:44:16.2041405Z * [new branch] 
gh/IvanKobzarev/157/orig -> origin/gh/IvanKobzarev/157/orig 2025-10-10T00:44:16.2044084Z * [new branch] gh/IvanKobzarev/158/base -> origin/gh/IvanKobzarev/158/base 2025-10-10T00:44:16.2045951Z * [new branch] gh/IvanKobzarev/158/head -> origin/gh/IvanKobzarev/158/head 2025-10-10T00:44:16.2047941Z * [new branch] gh/IvanKobzarev/158/orig -> origin/gh/IvanKobzarev/158/orig 2025-10-10T00:44:16.2050676Z * [new branch] gh/IvanKobzarev/159/base -> origin/gh/IvanKobzarev/159/base 2025-10-10T00:44:16.2052536Z * [new branch] gh/IvanKobzarev/159/head -> origin/gh/IvanKobzarev/159/head 2025-10-10T00:44:16.2054468Z * [new branch] gh/IvanKobzarev/159/orig -> origin/gh/IvanKobzarev/159/orig 2025-10-10T00:44:16.2057140Z * [new branch] gh/IvanKobzarev/160/base -> origin/gh/IvanKobzarev/160/base 2025-10-10T00:44:16.2059086Z * [new branch] gh/IvanKobzarev/160/head -> origin/gh/IvanKobzarev/160/head 2025-10-10T00:44:16.2060613Z * [new branch] gh/IvanKobzarev/160/orig -> origin/gh/IvanKobzarev/160/orig 2025-10-10T00:44:16.2064180Z * [new branch] gh/NikhilAPatel/1/base -> origin/gh/NikhilAPatel/1/base 2025-10-10T00:44:16.2065942Z * [new branch] gh/NikhilAPatel/1/head -> origin/gh/NikhilAPatel/1/head 2025-10-10T00:44:16.2068644Z * [new branch] gh/NikhilAPatel/2/base -> origin/gh/NikhilAPatel/2/base 2025-10-10T00:44:16.2070193Z * [new branch] gh/NikhilAPatel/2/head -> origin/gh/NikhilAPatel/2/head 2025-10-10T00:44:16.2073219Z * [new branch] gh/NikhilAPatel/4/base -> origin/gh/NikhilAPatel/4/base 2025-10-10T00:44:16.2075245Z * [new branch] gh/NikhilAPatel/4/head -> origin/gh/NikhilAPatel/4/head 2025-10-10T00:44:16.2078271Z * [new branch] gh/PaliC/1/base -> origin/gh/PaliC/1/base 2025-10-10T00:44:16.2080088Z * [new branch] gh/PaliC/1/head -> origin/gh/PaliC/1/head 2025-10-10T00:44:16.2082024Z * [new branch] gh/PaliC/1/orig -> origin/gh/PaliC/1/orig 2025-10-10T00:44:16.2084568Z * [new branch] gh/PaliC/17/base -> origin/gh/PaliC/17/base 2025-10-10T00:44:16.2086527Z * [new branch] 
gh/PaliC/17/head -> origin/gh/PaliC/17/head 2025-10-10T00:44:16.2088439Z * [new branch] gh/PaliC/17/orig -> origin/gh/PaliC/17/orig 2025-10-10T00:44:16.2090960Z * [new branch] gh/PaliC/18/base -> origin/gh/PaliC/18/base 2025-10-10T00:44:16.2092855Z * [new branch] gh/PaliC/18/head -> origin/gh/PaliC/18/head 2025-10-10T00:44:16.2094687Z * [new branch] gh/PaliC/18/orig -> origin/gh/PaliC/18/orig 2025-10-10T00:44:16.2097222Z * [new branch] gh/PaliC/2/base -> origin/gh/PaliC/2/base 2025-10-10T00:44:16.2099572Z * [new branch] gh/PaliC/2/head -> origin/gh/PaliC/2/head 2025-10-10T00:44:16.2104776Z * [new branch] gh/PaliC/2/orig -> origin/gh/PaliC/2/orig 2025-10-10T00:44:16.2107401Z * [new branch] gh/PaliC/20/base -> origin/gh/PaliC/20/base 2025-10-10T00:44:16.2109254Z * [new branch] gh/PaliC/20/head -> origin/gh/PaliC/20/head 2025-10-10T00:44:16.2111129Z * [new branch] gh/PaliC/20/orig -> origin/gh/PaliC/20/orig 2025-10-10T00:44:16.2113720Z * [new branch] gh/PaliC/21/base -> origin/gh/PaliC/21/base 2025-10-10T00:44:16.2124303Z * [new branch] gh/PaliC/21/head -> origin/gh/PaliC/21/head 2025-10-10T00:44:16.2124836Z * [new branch] gh/PaliC/21/orig -> origin/gh/PaliC/21/orig 2025-10-10T00:44:16.2125322Z * [new branch] gh/PaliC/22/base -> origin/gh/PaliC/22/base 2025-10-10T00:44:16.2125797Z * [new branch] gh/PaliC/22/head -> origin/gh/PaliC/22/head 2025-10-10T00:44:16.2126274Z * [new branch] gh/PaliC/22/orig -> origin/gh/PaliC/22/orig 2025-10-10T00:44:16.2126756Z * [new branch] gh/PaliC/23/base -> origin/gh/PaliC/23/base 2025-10-10T00:44:16.2128028Z * [new branch] gh/PaliC/23/head -> origin/gh/PaliC/23/head 2025-10-10T00:44:16.2130013Z * [new branch] gh/PaliC/23/orig -> origin/gh/PaliC/23/orig 2025-10-10T00:44:16.2132767Z * [new branch] gh/PaliC/24/base -> origin/gh/PaliC/24/base 2025-10-10T00:44:16.2134455Z * [new branch] gh/PaliC/24/head -> origin/gh/PaliC/24/head 2025-10-10T00:44:16.2136239Z * [new branch] gh/PaliC/24/orig -> origin/gh/PaliC/24/orig 
2025-10-10T00:44:16.2138794Z * [new branch] gh/PaliC/25/head -> origin/gh/PaliC/25/head 2025-10-10T00:44:16.2140624Z * [new branch] gh/PaliC/25/next -> origin/gh/PaliC/25/next 2025-10-10T00:44:16.2142473Z * [new branch] gh/PaliC/25/orig -> origin/gh/PaliC/25/orig 2025-10-10T00:44:16.2145551Z * [new branch] gh/PaliC/26/head -> origin/gh/PaliC/26/head 2025-10-10T00:44:16.2147295Z * [new branch] gh/PaliC/26/next -> origin/gh/PaliC/26/next 2025-10-10T00:44:16.2149211Z * [new branch] gh/PaliC/26/orig -> origin/gh/PaliC/26/orig 2025-10-10T00:44:16.2151772Z * [new branch] gh/PaliC/27/head -> origin/gh/PaliC/27/head 2025-10-10T00:44:16.2153468Z * [new branch] gh/PaliC/27/next -> origin/gh/PaliC/27/next 2025-10-10T00:44:16.2155355Z * [new branch] gh/PaliC/27/orig -> origin/gh/PaliC/27/orig 2025-10-10T00:44:16.2157936Z * [new branch] gh/PaliC/28/head -> origin/gh/PaliC/28/head 2025-10-10T00:44:16.2159707Z * [new branch] gh/PaliC/28/next -> origin/gh/PaliC/28/next 2025-10-10T00:44:16.2161523Z * [new branch] gh/PaliC/28/orig -> origin/gh/PaliC/28/orig 2025-10-10T00:44:16.2164099Z * [new branch] gh/PaliC/29/head -> origin/gh/PaliC/29/head 2025-10-10T00:44:16.2165998Z * [new branch] gh/PaliC/29/next -> origin/gh/PaliC/29/next 2025-10-10T00:44:16.2167899Z * [new branch] gh/PaliC/29/orig -> origin/gh/PaliC/29/orig 2025-10-10T00:44:16.2170503Z * [new branch] gh/PaliC/30/head -> origin/gh/PaliC/30/head 2025-10-10T00:44:16.2172211Z * [new branch] gh/PaliC/30/next -> origin/gh/PaliC/30/next 2025-10-10T00:44:16.2174234Z * [new branch] gh/PaliC/30/orig -> origin/gh/PaliC/30/orig 2025-10-10T00:44:16.2176748Z * [new branch] gh/PaliC/31/head -> origin/gh/PaliC/31/head 2025-10-10T00:44:16.2178458Z * [new branch] gh/PaliC/31/next -> origin/gh/PaliC/31/next 2025-10-10T00:44:16.2180380Z * [new branch] gh/PaliC/31/orig -> origin/gh/PaliC/31/orig 2025-10-10T00:44:16.2183605Z * [new branch] gh/PaulZhang12/22/base -> origin/gh/PaulZhang12/22/base 2025-10-10T00:44:16.2185486Z * [new branch] 
gh/PaulZhang12/22/head -> origin/gh/PaulZhang12/22/head 2025-10-10T00:44:16.2187312Z * [new branch] gh/PaulZhang12/22/orig -> origin/gh/PaulZhang12/22/orig 2025-10-10T00:44:16.2190073Z * [new branch] gh/PaulZhang12/24/base -> origin/gh/PaulZhang12/24/base 2025-10-10T00:44:16.2191936Z * [new branch] gh/PaulZhang12/24/head -> origin/gh/PaulZhang12/24/head 2025-10-10T00:44:16.2193816Z * [new branch] gh/PaulZhang12/24/orig -> origin/gh/PaulZhang12/24/orig 2025-10-10T00:44:16.2196471Z * [new branch] gh/PaulZhang12/25/base -> origin/gh/PaulZhang12/25/base 2025-10-10T00:44:16.2198887Z * [new branch] gh/PaulZhang12/25/head -> origin/gh/PaulZhang12/25/head 2025-10-10T00:44:16.2200536Z * [new branch] gh/PaulZhang12/25/orig -> origin/gh/PaulZhang12/25/orig 2025-10-10T00:44:16.2203211Z * [new branch] gh/PaulZhang12/26/base -> origin/gh/PaulZhang12/26/base 2025-10-10T00:44:16.2205071Z * [new branch] gh/PaulZhang12/26/head -> origin/gh/PaulZhang12/26/head 2025-10-10T00:44:16.2206929Z * [new branch] gh/PaulZhang12/26/orig -> origin/gh/PaulZhang12/26/orig 2025-10-10T00:44:16.2209733Z * [new branch] gh/PaulZhang12/27/base -> origin/gh/PaulZhang12/27/base 2025-10-10T00:44:16.2211728Z * [new branch] gh/PaulZhang12/27/head -> origin/gh/PaulZhang12/27/head 2025-10-10T00:44:16.2213555Z * [new branch] gh/PaulZhang12/27/orig -> origin/gh/PaulZhang12/27/orig 2025-10-10T00:44:16.2216145Z * [new branch] gh/PaulZhang12/28/base -> origin/gh/PaulZhang12/28/base 2025-10-10T00:44:16.2218070Z * [new branch] gh/PaulZhang12/28/head -> origin/gh/PaulZhang12/28/head 2025-10-10T00:44:16.2219994Z * [new branch] gh/PaulZhang12/28/orig -> origin/gh/PaulZhang12/28/orig 2025-10-10T00:44:16.2222789Z * [new branch] gh/PaulZhang12/29/base -> origin/gh/PaulZhang12/29/base 2025-10-10T00:44:16.2224355Z * [new branch] gh/PaulZhang12/29/head -> origin/gh/PaulZhang12/29/head 2025-10-10T00:44:16.2226262Z * [new branch] gh/PaulZhang12/29/orig -> origin/gh/PaulZhang12/29/orig 2025-10-10T00:44:16.2228875Z * [new branch] 
gh/PaulZhang12/30/base -> origin/gh/PaulZhang12/30/base 2025-10-10T00:44:16.2230839Z * [new branch] gh/PaulZhang12/30/head -> origin/gh/PaulZhang12/30/head 2025-10-10T00:44:16.2232676Z * [new branch] gh/PaulZhang12/30/orig -> origin/gh/PaulZhang12/30/orig 2025-10-10T00:44:16.2235464Z * [new branch] gh/PaulZhang12/31/base -> origin/gh/PaulZhang12/31/base 2025-10-10T00:44:16.2237363Z * [new branch] gh/PaulZhang12/31/head -> origin/gh/PaulZhang12/31/head 2025-10-10T00:44:16.2239234Z * [new branch] gh/PaulZhang12/31/orig -> origin/gh/PaulZhang12/31/orig 2025-10-10T00:44:16.2242126Z * [new branch] gh/PaulZhang12/32/base -> origin/gh/PaulZhang12/32/base 2025-10-10T00:44:16.2243763Z * [new branch] gh/PaulZhang12/32/head -> origin/gh/PaulZhang12/32/head 2025-10-10T00:44:16.2245650Z * [new branch] gh/PaulZhang12/32/orig -> origin/gh/PaulZhang12/32/orig 2025-10-10T00:44:16.2249812Z * [new branch] gh/PaulZhang12/33/base -> origin/gh/PaulZhang12/33/base 2025-10-10T00:44:16.2252259Z * [new branch] gh/PaulZhang12/33/head -> origin/gh/PaulZhang12/33/head 2025-10-10T00:44:16.2254112Z * [new branch] gh/PaulZhang12/33/orig -> origin/gh/PaulZhang12/33/orig 2025-10-10T00:44:16.2256775Z * [new branch] gh/PaulZhang12/34/base -> origin/gh/PaulZhang12/34/base 2025-10-10T00:44:16.2258773Z * [new branch] gh/PaulZhang12/34/head -> origin/gh/PaulZhang12/34/head 2025-10-10T00:44:16.2260583Z * [new branch] gh/PaulZhang12/34/orig -> origin/gh/PaulZhang12/34/orig 2025-10-10T00:44:16.2263043Z * [new branch] gh/PaulZhang12/35/base -> origin/gh/PaulZhang12/35/base 2025-10-10T00:44:16.2265116Z * [new branch] gh/PaulZhang12/35/head -> origin/gh/PaulZhang12/35/head 2025-10-10T00:44:16.2266791Z * [new branch] gh/PaulZhang12/35/orig -> origin/gh/PaulZhang12/35/orig 2025-10-10T00:44:16.2269876Z * [new branch] gh/SamGinzburg/11/base -> origin/gh/SamGinzburg/11/base 2025-10-10T00:44:16.2271969Z * [new branch] gh/SamGinzburg/11/head -> origin/gh/SamGinzburg/11/head 2025-10-10T00:44:16.2275615Z * [new branch] 
gh/SherlockNoMad/1/base -> origin/gh/SherlockNoMad/1/base 2025-10-10T00:44:16.2277564Z * [new branch] gh/SherlockNoMad/1/head -> origin/gh/SherlockNoMad/1/head 2025-10-10T00:44:16.2280269Z * [new branch] gh/SherlockNoMad/10/base -> origin/gh/SherlockNoMad/10/base 2025-10-10T00:44:16.2282105Z * [new branch] gh/SherlockNoMad/10/head -> origin/gh/SherlockNoMad/10/head 2025-10-10T00:44:16.2284073Z * [new branch] gh/SherlockNoMad/10/orig -> origin/gh/SherlockNoMad/10/orig 2025-10-10T00:44:16.2287308Z * [new branch] gh/SherlockNoMad/11/base -> origin/gh/SherlockNoMad/11/base 2025-10-10T00:44:16.2289317Z * [new branch] gh/SherlockNoMad/11/head -> origin/gh/SherlockNoMad/11/head 2025-10-10T00:44:16.2291205Z * [new branch] gh/SherlockNoMad/11/orig -> origin/gh/SherlockNoMad/11/orig 2025-10-10T00:44:16.2293693Z * [new branch] gh/SherlockNoMad/12/base -> origin/gh/SherlockNoMad/12/base 2025-10-10T00:44:16.2295691Z * [new branch] gh/SherlockNoMad/12/head -> origin/gh/SherlockNoMad/12/head 2025-10-10T00:44:16.2297527Z * [new branch] gh/SherlockNoMad/12/orig -> origin/gh/SherlockNoMad/12/orig 2025-10-10T00:44:16.2300348Z * [new branch] gh/SherlockNoMad/13/base -> origin/gh/SherlockNoMad/13/base 2025-10-10T00:44:16.2302247Z * [new branch] gh/SherlockNoMad/13/head -> origin/gh/SherlockNoMad/13/head 2025-10-10T00:44:16.2304021Z * [new branch] gh/SherlockNoMad/13/orig -> origin/gh/SherlockNoMad/13/orig 2025-10-10T00:44:16.2306444Z * [new branch] gh/SherlockNoMad/2/base -> origin/gh/SherlockNoMad/2/base 2025-10-10T00:44:16.2308250Z * [new branch] gh/SherlockNoMad/2/head -> origin/gh/SherlockNoMad/2/head 2025-10-10T00:44:16.2310709Z * [new branch] gh/SherlockNoMad/3/base -> origin/gh/SherlockNoMad/3/base 2025-10-10T00:44:16.2312504Z * [new branch] gh/SherlockNoMad/3/head -> origin/gh/SherlockNoMad/3/head 2025-10-10T00:44:16.2314906Z * [new branch] gh/SherlockNoMad/4/base -> origin/gh/SherlockNoMad/4/base 2025-10-10T00:44:16.2316751Z * [new branch] gh/SherlockNoMad/4/head -> 
origin/gh/SherlockNoMad/4/head 2025-10-10T00:44:16.2319273Z * [new branch] gh/SherlockNoMad/5/base -> origin/gh/SherlockNoMad/5/base 2025-10-10T00:44:16.2321027Z * [new branch] gh/SherlockNoMad/5/head -> origin/gh/SherlockNoMad/5/head 2025-10-10T00:44:16.2323352Z * [new branch] gh/SherlockNoMad/6/base -> origin/gh/SherlockNoMad/6/base 2025-10-10T00:44:16.2325279Z * [new branch] gh/SherlockNoMad/6/head -> origin/gh/SherlockNoMad/6/head 2025-10-10T00:44:16.2327161Z * [new branch] gh/SherlockNoMad/6/orig -> origin/gh/SherlockNoMad/6/orig 2025-10-10T00:44:16.2329982Z * [new branch] gh/SherlockNoMad/7/base -> origin/gh/SherlockNoMad/7/base 2025-10-10T00:44:16.2331817Z * [new branch] gh/SherlockNoMad/7/head -> origin/gh/SherlockNoMad/7/head 2025-10-10T00:44:16.2333660Z * [new branch] gh/SherlockNoMad/7/orig -> origin/gh/SherlockNoMad/7/orig 2025-10-10T00:44:16.2336084Z * [new branch] gh/SherlockNoMad/8/base -> origin/gh/SherlockNoMad/8/base 2025-10-10T00:44:16.2338050Z * [new branch] gh/SherlockNoMad/8/head -> origin/gh/SherlockNoMad/8/head 2025-10-10T00:44:16.2339923Z * [new branch] gh/SherlockNoMad/8/orig -> origin/gh/SherlockNoMad/8/orig 2025-10-10T00:44:16.2342318Z * [new branch] gh/SherlockNoMad/9/base -> origin/gh/SherlockNoMad/9/base 2025-10-10T00:44:16.2344167Z * [new branch] gh/SherlockNoMad/9/orig -> origin/gh/SherlockNoMad/9/orig 2025-10-10T00:44:16.2347373Z * [new branch] gh/Sidharth123-cpu/24/base -> origin/gh/Sidharth123-cpu/24/base 2025-10-10T00:44:16.2349823Z * [new branch] gh/Sidharth123-cpu/25/base -> origin/gh/Sidharth123-cpu/25/base 2025-10-10T00:44:16.2352265Z * [new branch] gh/Sidharth123-cpu/26/base -> origin/gh/Sidharth123-cpu/26/base 2025-10-10T00:44:16.2354884Z * [new branch] gh/Sidharth123-cpu/27/base -> origin/gh/Sidharth123-cpu/27/base 2025-10-10T00:44:16.2358105Z * [new branch] gh/StrongerXi/1/base -> origin/gh/StrongerXi/1/base 2025-10-10T00:44:16.2359993Z * [new branch] gh/StrongerXi/1/head -> origin/gh/StrongerXi/1/head 
2025-10-10T00:44:16.2362615Z * [new branch] gh/StrongerXi/133/base -> origin/gh/StrongerXi/133/base 2025-10-10T00:44:16.2364486Z * [new branch] gh/StrongerXi/133/head -> origin/gh/StrongerXi/133/head 2025-10-10T00:44:16.2366428Z * [new branch] gh/StrongerXi/133/orig -> origin/gh/StrongerXi/133/orig 2025-10-10T00:44:16.2369153Z * [new branch] gh/StrongerXi/134/base -> origin/gh/StrongerXi/134/base 2025-10-10T00:44:16.2371518Z * [new branch] gh/StrongerXi/134/head -> origin/gh/StrongerXi/134/head 2025-10-10T00:44:16.2373396Z * [new branch] gh/StrongerXi/134/orig -> origin/gh/StrongerXi/134/orig 2025-10-10T00:44:16.2376040Z * [new branch] gh/StrongerXi/136/base -> origin/gh/StrongerXi/136/base 2025-10-10T00:44:16.2377981Z * [new branch] gh/StrongerXi/136/head -> origin/gh/StrongerXi/136/head 2025-10-10T00:44:16.2379917Z * [new branch] gh/StrongerXi/136/orig -> origin/gh/StrongerXi/136/orig 2025-10-10T00:44:16.2382480Z * [new branch] gh/StrongerXi/137/base -> origin/gh/StrongerXi/137/base 2025-10-10T00:44:16.2384422Z * [new branch] gh/StrongerXi/137/head -> origin/gh/StrongerXi/137/head 2025-10-10T00:44:16.2386234Z * [new branch] gh/StrongerXi/137/orig -> origin/gh/StrongerXi/137/orig 2025-10-10T00:44:16.2388683Z * [new branch] gh/StrongerXi/138/base -> origin/gh/StrongerXi/138/base 2025-10-10T00:44:16.2390568Z * [new branch] gh/StrongerXi/138/head -> origin/gh/StrongerXi/138/head 2025-10-10T00:44:16.2392919Z * [new branch] gh/StrongerXi/138/orig -> origin/gh/StrongerXi/138/orig 2025-10-10T00:44:16.2395460Z * [new branch] gh/StrongerXi/71/base -> origin/gh/StrongerXi/71/base 2025-10-10T00:44:16.2397421Z * [new branch] gh/StrongerXi/71/head -> origin/gh/StrongerXi/71/head 2025-10-10T00:44:16.2400431Z * [new branch] gh/StrongerXi/72/base -> origin/gh/StrongerXi/72/base 2025-10-10T00:44:16.2401649Z * [new branch] gh/StrongerXi/72/head -> origin/gh/StrongerXi/72/head 2025-10-10T00:44:16.2405016Z * [new branch] gh/XilunWu/147/base -> origin/gh/XilunWu/147/base 
2025-10-10T00:44:16.2407006Z * [new branch] gh/XilunWu/147/head -> origin/gh/XilunWu/147/head 2025-10-10T00:44:16.2409168Z * [new branch] gh/XilunWu/147/orig -> origin/gh/XilunWu/147/orig 2025-10-10T00:44:16.2411534Z * [new branch] gh/XilunWu/148/base -> origin/gh/XilunWu/148/base 2025-10-10T00:44:16.2413347Z * [new branch] gh/XilunWu/148/head -> origin/gh/XilunWu/148/head 2025-10-10T00:44:16.2415348Z * [new branch] gh/XilunWu/148/orig -> origin/gh/XilunWu/148/orig 2025-10-10T00:44:16.2417781Z * [new branch] gh/XilunWu/149/base -> origin/gh/XilunWu/149/base 2025-10-10T00:44:16.2419540Z * [new branch] gh/XilunWu/149/head -> origin/gh/XilunWu/149/head 2025-10-10T00:44:16.2421513Z * [new branch] gh/XilunWu/149/orig -> origin/gh/XilunWu/149/orig 2025-10-10T00:44:16.2423830Z * [new branch] gh/XilunWu/150/base -> origin/gh/XilunWu/150/base 2025-10-10T00:44:16.2425799Z * [new branch] gh/XilunWu/150/head -> origin/gh/XilunWu/150/head 2025-10-10T00:44:16.2427662Z * [new branch] gh/XilunWu/150/orig -> origin/gh/XilunWu/150/orig 2025-10-10T00:44:16.2430379Z * [new branch] gh/XilunWu/151/base -> origin/gh/XilunWu/151/base 2025-10-10T00:44:16.2432331Z * [new branch] gh/XilunWu/151/head -> origin/gh/XilunWu/151/head 2025-10-10T00:44:16.2434204Z * [new branch] gh/XilunWu/151/orig -> origin/gh/XilunWu/151/orig 2025-10-10T00:44:16.2436620Z * [new branch] gh/XilunWu/152/base -> origin/gh/XilunWu/152/base 2025-10-10T00:44:16.2438551Z * [new branch] gh/XilunWu/152/head -> origin/gh/XilunWu/152/head 2025-10-10T00:44:16.2440262Z * [new branch] gh/XilunWu/152/orig -> origin/gh/XilunWu/152/orig 2025-10-10T00:44:16.2442937Z * [new branch] gh/XilunWu/153/base -> origin/gh/XilunWu/153/base 2025-10-10T00:44:16.2444802Z * [new branch] gh/XilunWu/153/head -> origin/gh/XilunWu/153/head 2025-10-10T00:44:16.2446672Z * [new branch] gh/XilunWu/153/orig -> origin/gh/XilunWu/153/orig 2025-10-10T00:44:16.2449555Z * [new branch] gh/XilunWu/160/base -> origin/gh/XilunWu/160/base 
2025-10-10T00:44:16.2451388Z * [new branch] gh/XilunWu/160/head -> origin/gh/XilunWu/160/head 2025-10-10T00:44:16.2453323Z * [new branch] gh/XilunWu/160/orig -> origin/gh/XilunWu/160/orig 2025-10-10T00:44:16.2455850Z * [new branch] gh/XilunWu/163/base -> origin/gh/XilunWu/163/base 2025-10-10T00:44:16.2457858Z * [new branch] gh/XilunWu/163/head -> origin/gh/XilunWu/163/head 2025-10-10T00:44:16.2459757Z * [new branch] gh/XilunWu/163/orig -> origin/gh/XilunWu/163/orig 2025-10-10T00:44:16.2462607Z * [new branch] gh/XilunWu/166/base -> origin/gh/XilunWu/166/base 2025-10-10T00:44:16.2464634Z * [new branch] gh/XilunWu/166/head -> origin/gh/XilunWu/166/head 2025-10-10T00:44:16.2466622Z * [new branch] gh/XilunWu/166/orig -> origin/gh/XilunWu/166/orig 2025-10-10T00:44:16.2469265Z * [new branch] gh/XilunWu/168/base -> origin/gh/XilunWu/168/base 2025-10-10T00:44:16.2471093Z * [new branch] gh/XilunWu/168/head -> origin/gh/XilunWu/168/head 2025-10-10T00:44:16.2473055Z * [new branch] gh/XilunWu/168/orig -> origin/gh/XilunWu/168/orig 2025-10-10T00:44:16.2475548Z * [new branch] gh/XilunWu/169/base -> origin/gh/XilunWu/169/base 2025-10-10T00:44:16.2477528Z * [new branch] gh/XilunWu/169/head -> origin/gh/XilunWu/169/head 2025-10-10T00:44:16.2479323Z * [new branch] gh/XilunWu/169/orig -> origin/gh/XilunWu/169/orig 2025-10-10T00:44:16.2481743Z * [new branch] gh/XilunWu/170/base -> origin/gh/XilunWu/170/base 2025-10-10T00:44:16.2483623Z * [new branch] gh/XilunWu/170/head -> origin/gh/XilunWu/170/head 2025-10-10T00:44:16.2485497Z * [new branch] gh/XilunWu/170/orig -> origin/gh/XilunWu/170/orig 2025-10-10T00:44:16.2488651Z * [new branch] gh/XilunWu/171/base -> origin/gh/XilunWu/171/base 2025-10-10T00:44:16.2490560Z * [new branch] gh/XilunWu/171/head -> origin/gh/XilunWu/171/head 2025-10-10T00:44:16.2492373Z * [new branch] gh/XilunWu/171/orig -> origin/gh/XilunWu/171/orig 2025-10-10T00:44:16.2494870Z * [new branch] gh/XilunWu/172/base -> origin/gh/XilunWu/172/base 
2025-10-10T00:44:16.2497089Z * [new branch] gh/XilunWu/172/head -> origin/gh/XilunWu/172/head 2025-10-10T00:44:16.2499176Z * [new branch] gh/XilunWu/172/orig -> origin/gh/XilunWu/172/orig 2025-10-10T00:44:16.2501760Z * [new branch] gh/XilunWu/173/base -> origin/gh/XilunWu/173/base 2025-10-10T00:44:16.2503572Z * [new branch] gh/XilunWu/173/head -> origin/gh/XilunWu/173/head 2025-10-10T00:44:16.2505392Z * [new branch] gh/XilunWu/173/orig -> origin/gh/XilunWu/173/orig 2025-10-10T00:44:16.2508584Z * [new branch] gh/XilunWu/174/base -> origin/gh/XilunWu/174/base 2025-10-10T00:44:16.2510758Z * [new branch] gh/XilunWu/174/head -> origin/gh/XilunWu/174/head 2025-10-10T00:44:16.2512606Z * [new branch] gh/XilunWu/174/orig -> origin/gh/XilunWu/174/orig 2025-10-10T00:44:16.2515105Z * [new branch] gh/XilunWu/175/base -> origin/gh/XilunWu/175/base 2025-10-10T00:44:16.2517045Z * [new branch] gh/XilunWu/175/head -> origin/gh/XilunWu/175/head 2025-10-10T00:44:16.2518888Z * [new branch] gh/XilunWu/175/orig -> origin/gh/XilunWu/175/orig 2025-10-10T00:44:16.2522251Z * [new branch] gh/XuehaiPan/14/base -> origin/gh/XuehaiPan/14/base 2025-10-10T00:44:16.2524267Z * [new branch] gh/XuehaiPan/14/head -> origin/gh/XuehaiPan/14/head 2025-10-10T00:44:16.2526326Z * [new branch] gh/XuehaiPan/14/orig -> origin/gh/XuehaiPan/14/orig 2025-10-10T00:44:16.2529011Z * [new branch] gh/XuehaiPan/179/base -> origin/gh/XuehaiPan/179/base 2025-10-10T00:44:16.2530906Z * [new branch] gh/XuehaiPan/179/head -> origin/gh/XuehaiPan/179/head 2025-10-10T00:44:16.2532764Z * [new branch] gh/XuehaiPan/179/orig -> origin/gh/XuehaiPan/179/orig 2025-10-10T00:44:16.2535551Z * [new branch] gh/XuehaiPan/189/base -> origin/gh/XuehaiPan/189/base 2025-10-10T00:44:16.2537521Z * [new branch] gh/XuehaiPan/189/head -> origin/gh/XuehaiPan/189/head 2025-10-10T00:44:16.2539491Z * [new branch] gh/XuehaiPan/189/orig -> origin/gh/XuehaiPan/189/orig 2025-10-10T00:44:16.2542116Z * [new branch] gh/XuehaiPan/249/base -> 
origin/gh/XuehaiPan/249/base 2025-10-10T00:44:16.2544068Z * [new branch] gh/XuehaiPan/249/head -> origin/gh/XuehaiPan/249/head 2025-10-10T00:44:16.2546001Z * [new branch] gh/XuehaiPan/249/orig -> origin/gh/XuehaiPan/249/orig 2025-10-10T00:44:16.2548446Z * [new branch] gh/XuehaiPan/253/base -> origin/gh/XuehaiPan/253/base 2025-10-10T00:44:16.2550541Z * [new branch] gh/XuehaiPan/253/head -> origin/gh/XuehaiPan/253/head 2025-10-10T00:44:16.2552327Z * [new branch] gh/XuehaiPan/253/orig -> origin/gh/XuehaiPan/253/orig 2025-10-10T00:44:16.2554883Z * [new branch] gh/XuehaiPan/254/base -> origin/gh/XuehaiPan/254/base 2025-10-10T00:44:16.2556884Z * [new branch] gh/XuehaiPan/254/head -> origin/gh/XuehaiPan/254/head 2025-10-10T00:44:16.2558786Z * [new branch] gh/XuehaiPan/254/orig -> origin/gh/XuehaiPan/254/orig 2025-10-10T00:44:16.2561276Z * [new branch] gh/XuehaiPan/255/base -> origin/gh/XuehaiPan/255/base 2025-10-10T00:44:16.2563153Z * [new branch] gh/XuehaiPan/255/head -> origin/gh/XuehaiPan/255/head 2025-10-10T00:44:16.2565048Z * [new branch] gh/XuehaiPan/255/orig -> origin/gh/XuehaiPan/255/orig 2025-10-10T00:44:16.2567956Z * [new branch] gh/XuehaiPan/257/base -> origin/gh/XuehaiPan/257/base 2025-10-10T00:44:16.2569740Z * [new branch] gh/XuehaiPan/257/head -> origin/gh/XuehaiPan/257/head 2025-10-10T00:44:16.2571539Z * [new branch] gh/XuehaiPan/257/orig -> origin/gh/XuehaiPan/257/orig 2025-10-10T00:44:16.2574225Z * [new branch] gh/XuehaiPan/271/base -> origin/gh/XuehaiPan/271/base 2025-10-10T00:44:16.2576231Z * [new branch] gh/XuehaiPan/271/head -> origin/gh/XuehaiPan/271/head 2025-10-10T00:44:16.2577965Z * [new branch] gh/XuehaiPan/271/orig -> origin/gh/XuehaiPan/271/orig 2025-10-10T00:44:16.2580532Z * [new branch] gh/XuehaiPan/290/base -> origin/gh/XuehaiPan/290/base 2025-10-10T00:44:16.2582405Z * [new branch] gh/XuehaiPan/290/head -> origin/gh/XuehaiPan/290/head 2025-10-10T00:44:16.2584300Z * [new branch] gh/XuehaiPan/290/orig -> origin/gh/XuehaiPan/290/orig 
2025-10-10T00:44:16.2586978Z * [new branch] gh/XuehaiPan/343/base -> origin/gh/XuehaiPan/343/base 2025-10-10T00:44:16.2589346Z * [new branch] gh/XuehaiPan/343/head -> origin/gh/XuehaiPan/343/head 2025-10-10T00:44:16.2590566Z * [new branch] gh/XuehaiPan/343/orig -> origin/gh/XuehaiPan/343/orig 2025-10-10T00:44:16.2593506Z * [new branch] gh/XuehaiPan/347/base -> origin/gh/XuehaiPan/347/base 2025-10-10T00:44:16.2595319Z * [new branch] gh/XuehaiPan/347/head -> origin/gh/XuehaiPan/347/head 2025-10-10T00:44:16.2597170Z * [new branch] gh/XuehaiPan/347/orig -> origin/gh/XuehaiPan/347/orig 2025-10-10T00:44:16.2601742Z * [new branch] gh/XuehaiPan/348/base -> origin/gh/XuehaiPan/348/base 2025-10-10T00:44:16.2603647Z * [new branch] gh/XuehaiPan/348/head -> origin/gh/XuehaiPan/348/head 2025-10-10T00:44:16.2605554Z * [new branch] gh/XuehaiPan/348/orig -> origin/gh/XuehaiPan/348/orig 2025-10-10T00:44:16.2608327Z * [new branch] gh/XuehaiPan/350/base -> origin/gh/XuehaiPan/350/base 2025-10-10T00:44:16.2610148Z * [new branch] gh/XuehaiPan/350/head -> origin/gh/XuehaiPan/350/head 2025-10-10T00:44:16.2612120Z * [new branch] gh/XuehaiPan/350/orig -> origin/gh/XuehaiPan/350/orig 2025-10-10T00:44:16.2614783Z * [new branch] gh/XuehaiPan/356/base -> origin/gh/XuehaiPan/356/base 2025-10-10T00:44:16.2616769Z * [new branch] gh/XuehaiPan/356/head -> origin/gh/XuehaiPan/356/head 2025-10-10T00:44:16.2618597Z * [new branch] gh/XuehaiPan/356/orig -> origin/gh/XuehaiPan/356/orig 2025-10-10T00:44:16.2621227Z * [new branch] gh/XuehaiPan/357/base -> origin/gh/XuehaiPan/357/base 2025-10-10T00:44:16.2623464Z * [new branch] gh/XuehaiPan/357/head -> origin/gh/XuehaiPan/357/head 2025-10-10T00:44:16.2625264Z * [new branch] gh/XuehaiPan/357/orig -> origin/gh/XuehaiPan/357/orig 2025-10-10T00:44:16.2627951Z * [new branch] gh/XuehaiPan/358/base -> origin/gh/XuehaiPan/358/base 2025-10-10T00:44:16.2629772Z * [new branch] gh/XuehaiPan/358/head -> origin/gh/XuehaiPan/358/head 2025-10-10T00:44:16.2631626Z * [new 
branch] gh/XuehaiPan/358/orig -> origin/gh/XuehaiPan/358/orig 2025-10-10T00:44:16.2634576Z * [new branch] gh/XuehaiPan/359/base -> origin/gh/XuehaiPan/359/base 2025-10-10T00:44:16.2636498Z * [new branch] gh/XuehaiPan/359/head -> origin/gh/XuehaiPan/359/head 2025-10-10T00:44:16.2638454Z * [new branch] gh/XuehaiPan/359/orig -> origin/gh/XuehaiPan/359/orig 2025-10-10T00:44:16.2641006Z * [new branch] gh/XuehaiPan/360/base -> origin/gh/XuehaiPan/360/base 2025-10-10T00:44:16.2642902Z * [new branch] gh/XuehaiPan/360/head -> origin/gh/XuehaiPan/360/head 2025-10-10T00:44:16.2644740Z * [new branch] gh/XuehaiPan/360/orig -> origin/gh/XuehaiPan/360/orig 2025-10-10T00:44:16.2647443Z * [new branch] gh/XuehaiPan/365/base -> origin/gh/XuehaiPan/365/base 2025-10-10T00:44:16.2649455Z * [new branch] gh/XuehaiPan/365/head -> origin/gh/XuehaiPan/365/head 2025-10-10T00:44:16.2662638Z * [new branch] gh/XuehaiPan/365/orig -> origin/gh/XuehaiPan/365/orig 2025-10-10T00:44:16.2663201Z * [new branch] gh/XuehaiPan/366/base -> origin/gh/XuehaiPan/366/base 2025-10-10T00:44:16.2663748Z * [new branch] gh/XuehaiPan/366/head -> origin/gh/XuehaiPan/366/head 2025-10-10T00:44:16.2664269Z * [new branch] gh/XuehaiPan/370/base -> origin/gh/XuehaiPan/370/base 2025-10-10T00:44:16.2664796Z * [new branch] gh/XuehaiPan/370/head -> origin/gh/XuehaiPan/370/head 2025-10-10T00:44:16.2665317Z * [new branch] gh/XuehaiPan/370/orig -> origin/gh/XuehaiPan/370/orig 2025-10-10T00:44:16.2665860Z * [new branch] gh/XuehaiPan/384/base -> origin/gh/XuehaiPan/384/base 2025-10-10T00:44:16.2666422Z * [new branch] gh/XuehaiPan/384/head -> origin/gh/XuehaiPan/384/head 2025-10-10T00:44:16.2668580Z * [new branch] gh/XuehaiPan/384/orig -> origin/gh/XuehaiPan/384/orig 2025-10-10T00:44:16.2671372Z * [new branch] gh/XuehaiPan/385/base -> origin/gh/XuehaiPan/385/base 2025-10-10T00:44:16.2673172Z * [new branch] gh/XuehaiPan/385/head -> origin/gh/XuehaiPan/385/head 2025-10-10T00:44:16.2675045Z * [new branch] gh/XuehaiPan/385/orig -> 
origin/gh/XuehaiPan/385/orig 2025-10-10T00:44:16.2677748Z * [new branch] gh/XuehaiPan/386/base -> origin/gh/XuehaiPan/386/base 2025-10-10T00:44:16.2679486Z * [new branch] gh/XuehaiPan/386/head -> origin/gh/XuehaiPan/386/head 2025-10-10T00:44:16.2681349Z * [new branch] gh/XuehaiPan/386/orig -> origin/gh/XuehaiPan/386/orig 2025-10-10T00:44:16.2683998Z * [new branch] gh/XuehaiPan/387/base -> origin/gh/XuehaiPan/387/base 2025-10-10T00:44:16.2685851Z * [new branch] gh/XuehaiPan/387/head -> origin/gh/XuehaiPan/387/head 2025-10-10T00:44:16.2687772Z * [new branch] gh/XuehaiPan/387/orig -> origin/gh/XuehaiPan/387/orig 2025-10-10T00:44:16.2690495Z * [new branch] gh/XuehaiPan/388/base -> origin/gh/XuehaiPan/388/base 2025-10-10T00:44:16.2692465Z * [new branch] gh/XuehaiPan/388/head -> origin/gh/XuehaiPan/388/head 2025-10-10T00:44:16.2694306Z * [new branch] gh/XuehaiPan/388/orig -> origin/gh/XuehaiPan/388/orig 2025-10-10T00:44:16.2696966Z * [new branch] gh/XuehaiPan/389/base -> origin/gh/XuehaiPan/389/base 2025-10-10T00:44:16.2698859Z * [new branch] gh/XuehaiPan/389/head -> origin/gh/XuehaiPan/389/head 2025-10-10T00:44:16.2701176Z * [new branch] gh/XuehaiPan/389/orig -> origin/gh/XuehaiPan/389/orig 2025-10-10T00:44:16.2704032Z * [new branch] gh/ZhiweiYan-96/39/base -> origin/gh/ZhiweiYan-96/39/base 2025-10-10T00:44:16.2705884Z * [new branch] gh/ZhiweiYan-96/39/head -> origin/gh/ZhiweiYan-96/39/head 2025-10-10T00:44:16.2707813Z * [new branch] gh/ZhiweiYan-96/39/orig -> origin/gh/ZhiweiYan-96/39/orig 2025-10-10T00:44:16.2710476Z * [new branch] gh/ZhiweiYan-96/44/base -> origin/gh/ZhiweiYan-96/44/base 2025-10-10T00:44:16.2712308Z * [new branch] gh/ZhiweiYan-96/44/head -> origin/gh/ZhiweiYan-96/44/head 2025-10-10T00:44:16.2714836Z * [new branch] gh/ZhiweiYan-96/45/base -> origin/gh/ZhiweiYan-96/45/base 2025-10-10T00:44:16.2716621Z * [new branch] gh/ZhiweiYan-96/45/head -> origin/gh/ZhiweiYan-96/45/head 2025-10-10T00:44:16.2719436Z * [new branch] gh/ZhiweiYan-96/49/base -> 
origin/gh/ZhiweiYan-96/49/base 2025-10-10T00:44:16.2721344Z * [new branch] gh/ZhiweiYan-96/49/head -> origin/gh/ZhiweiYan-96/49/head 2025-10-10T00:44:16.2723847Z * [new branch] gh/ZhiweiYan-96/62/base -> origin/gh/ZhiweiYan-96/62/base 2025-10-10T00:44:16.2725703Z * [new branch] gh/ZhiweiYan-96/62/head -> origin/gh/ZhiweiYan-96/62/head 2025-10-10T00:44:16.2728339Z * [new branch] gh/ZhiweiYan-96/64/base -> origin/gh/ZhiweiYan-96/64/base 2025-10-10T00:44:16.2730216Z * [new branch] gh/ZhiweiYan-96/64/head -> origin/gh/ZhiweiYan-96/64/head 2025-10-10T00:44:16.2732138Z * [new branch] gh/ZhiweiYan-96/64/orig -> origin/gh/ZhiweiYan-96/64/orig 2025-10-10T00:44:16.2734687Z * [new branch] gh/ZhiweiYan-96/66/base -> origin/gh/ZhiweiYan-96/66/base 2025-10-10T00:44:16.2736559Z * [new branch] gh/ZhiweiYan-96/66/head -> origin/gh/ZhiweiYan-96/66/head 2025-10-10T00:44:16.2739066Z * [new branch] gh/ZhiweiYan-96/67/base -> origin/gh/ZhiweiYan-96/67/base 2025-10-10T00:44:16.2741042Z * [new branch] gh/ZhiweiYan-96/67/head -> origin/gh/ZhiweiYan-96/67/head 2025-10-10T00:44:16.2743459Z * [new branch] gh/ZhiweiYan-96/68/base -> origin/gh/ZhiweiYan-96/68/base 2025-10-10T00:44:16.2745214Z * [new branch] gh/ZhiweiYan-96/68/head -> origin/gh/ZhiweiYan-96/68/head 2025-10-10T00:44:16.2747153Z * [new branch] gh/ZhiweiYan-96/68/orig -> origin/gh/ZhiweiYan-96/68/orig 2025-10-10T00:44:16.2750341Z * [new branch] gh/aakhundov/1/base -> origin/gh/aakhundov/1/base 2025-10-10T00:44:16.2752333Z * [new branch] gh/aakhundov/1/head -> origin/gh/aakhundov/1/head 2025-10-10T00:44:16.2754732Z * [new branch] gh/aakhundov/2/base -> origin/gh/aakhundov/2/base 2025-10-10T00:44:16.2756593Z * [new branch] gh/aakhundov/2/head -> origin/gh/aakhundov/2/head 2025-10-10T00:44:16.2759160Z * [new branch] gh/aakhundov/3/base -> origin/gh/aakhundov/3/base 2025-10-10T00:44:16.2760954Z * [new branch] gh/aakhundov/3/head -> origin/gh/aakhundov/3/head 2025-10-10T00:44:16.2762898Z * [new branch] gh/aakhundov/3/orig -> 
origin/gh/aakhundov/3/orig 2025-10-10T00:44:16.2765595Z * [new branch] gh/aditew01/openblas -> origin/gh/aditew01/openblas 2025-10-10T00:44:16.2767436Z * [new branch] gh/aditew01/sbgemm -> origin/gh/aditew01/sbgemm 2025-10-10T00:44:16.2769478Z * [new branch] gh/aditew01/vecbf16 -> origin/gh/aditew01/vecbf16 2025-10-10T00:44:16.2772505Z * [new branch] gh/albanD/1/base -> origin/gh/albanD/1/base 2025-10-10T00:44:16.2774363Z * [new branch] gh/albanD/1/head -> origin/gh/albanD/1/head 2025-10-10T00:44:16.2776215Z * [new branch] gh/albanD/1/orig -> origin/gh/albanD/1/orig 2025-10-10T00:44:16.2779171Z * [new branch] gh/albanD/2/base -> origin/gh/albanD/2/base 2025-10-10T00:44:16.2780402Z * [new branch] gh/albanD/2/head -> origin/gh/albanD/2/head 2025-10-10T00:44:16.2782534Z * [new branch] gh/albanD/2/orig -> origin/gh/albanD/2/orig 2025-10-10T00:44:16.2785137Z * [new branch] gh/albanD/3/base -> origin/gh/albanD/3/base 2025-10-10T00:44:16.2786830Z * [new branch] gh/albanD/3/head -> origin/gh/albanD/3/head 2025-10-10T00:44:16.2789064Z * [new branch] gh/albanD/3/orig -> origin/gh/albanD/3/orig 2025-10-10T00:44:16.2791409Z * [new branch] gh/albanD/4/base -> origin/gh/albanD/4/base 2025-10-10T00:44:16.2793241Z * [new branch] gh/albanD/4/head -> origin/gh/albanD/4/head 2025-10-10T00:44:16.2795153Z * [new branch] gh/albanD/4/orig -> origin/gh/albanD/4/orig 2025-10-10T00:44:16.2798019Z * [new branch] gh/alexbrauckmann/paddedtensor_faketensor_init -> origin/gh/alexbrauckmann/paddedtensor_faketensor_init 2025-10-10T00:44:16.2801132Z * [new branch] gh/alexsamardzic/10/base -> origin/gh/alexsamardzic/10/base 2025-10-10T00:44:16.2803004Z * [new branch] gh/alexsamardzic/10/head -> origin/gh/alexsamardzic/10/head 2025-10-10T00:44:16.2804904Z * [new branch] gh/alexsamardzic/10/orig -> origin/gh/alexsamardzic/10/orig 2025-10-10T00:44:16.2807473Z * [new branch] gh/alexsamardzic/11/base -> origin/gh/alexsamardzic/11/base 2025-10-10T00:44:16.2809654Z * [new branch] gh/alexsamardzic/11/head 
-> origin/gh/alexsamardzic/11/head 2025-10-10T00:44:16.2811519Z * [new branch] gh/alexsamardzic/11/orig -> origin/gh/alexsamardzic/11/orig 2025-10-10T00:44:16.2814206Z * [new branch] gh/alexsamardzic/12/base -> origin/gh/alexsamardzic/12/base 2025-10-10T00:44:16.2816136Z * [new branch] gh/alexsamardzic/12/head -> origin/gh/alexsamardzic/12/head 2025-10-10T00:44:16.2818022Z * [new branch] gh/alexsamardzic/12/orig -> origin/gh/alexsamardzic/12/orig 2025-10-10T00:44:16.2821200Z * [new branch] gh/amjames/18/base -> origin/gh/amjames/18/base 2025-10-10T00:44:16.2823067Z * [new branch] gh/amjames/18/head -> origin/gh/amjames/18/head 2025-10-10T00:44:16.2824863Z * [new branch] gh/amjames/18/orig -> origin/gh/amjames/18/orig 2025-10-10T00:44:16.2828390Z * [new branch] gh/andrewor14/35/base -> origin/gh/andrewor14/35/base 2025-10-10T00:44:16.2830244Z * [new branch] gh/andrewor14/35/head -> origin/gh/andrewor14/35/head 2025-10-10T00:44:16.2832137Z * [new branch] gh/andrewor14/35/orig -> origin/gh/andrewor14/35/orig 2025-10-10T00:44:16.2834901Z * [new branch] gh/andrewor14/50/base -> origin/gh/andrewor14/50/base 2025-10-10T00:44:16.2837000Z * [new branch] gh/andrewor14/50/head -> origin/gh/andrewor14/50/head 2025-10-10T00:44:16.2838946Z * [new branch] gh/andrewor14/50/orig -> origin/gh/andrewor14/50/orig 2025-10-10T00:44:16.2842085Z * [new branch] gh/andyanwang/28/base -> origin/gh/andyanwang/28/base 2025-10-10T00:44:16.2844221Z * [new branch] gh/andyanwang/28/head -> origin/gh/andyanwang/28/head 2025-10-10T00:44:16.2845992Z * [new branch] gh/andyanwang/28/orig -> origin/gh/andyanwang/28/orig 2025-10-10T00:44:16.2848770Z * [new branch] gh/andyanwang/30/base -> origin/gh/andyanwang/30/base 2025-10-10T00:44:16.2850842Z * [new branch] gh/andyanwang/30/orig -> origin/gh/andyanwang/30/orig 2025-10-10T00:44:16.2853487Z * [new branch] gh/andyanwang/31/base -> origin/gh/andyanwang/31/base 2025-10-10T00:44:16.2855518Z * [new branch] gh/andyanwang/31/orig -> 
origin/gh/andyanwang/31/orig 2025-10-10T00:44:16.2858688Z * [new branch] gh/andyanwang/32/base -> origin/gh/andyanwang/32/base 2025-10-10T00:44:16.2860319Z * [new branch] gh/andyanwang/32/head -> origin/gh/andyanwang/32/head 2025-10-10T00:44:16.2862492Z * [new branch] gh/andyanwang/32/orig -> origin/gh/andyanwang/32/orig 2025-10-10T00:44:16.2865152Z * [new branch] gh/andyanwang/39/base -> origin/gh/andyanwang/39/base 2025-10-10T00:44:16.2867089Z * [new branch] gh/andyanwang/39/head -> origin/gh/andyanwang/39/head 2025-10-10T00:44:16.2868863Z * [new branch] gh/andyanwang/39/orig -> origin/gh/andyanwang/39/orig 2025-10-10T00:44:16.2872673Z * [new branch] gh/angelayi/107/base -> origin/gh/angelayi/107/base 2025-10-10T00:44:16.2874486Z * [new branch] gh/angelayi/107/head -> origin/gh/angelayi/107/head 2025-10-10T00:44:16.2877166Z * [new branch] gh/angelayi/114/base -> origin/gh/angelayi/114/base 2025-10-10T00:44:16.2879094Z * [new branch] gh/angelayi/114/head -> origin/gh/angelayi/114/head 2025-10-10T00:44:16.2881029Z * [new branch] gh/angelayi/114/orig -> origin/gh/angelayi/114/orig 2025-10-10T00:44:16.2883574Z * [new branch] gh/angelayi/116/base -> origin/gh/angelayi/116/base 2025-10-10T00:44:16.2885510Z * [new branch] gh/angelayi/116/head -> origin/gh/angelayi/116/head 2025-10-10T00:44:16.2887453Z * [new branch] gh/angelayi/116/orig -> origin/gh/angelayi/116/orig 2025-10-10T00:44:16.2890485Z * [new branch] gh/angelayi/117/base -> origin/gh/angelayi/117/base 2025-10-10T00:44:16.2892314Z * [new branch] gh/angelayi/117/head -> origin/gh/angelayi/117/head 2025-10-10T00:44:16.2894189Z * [new branch] gh/angelayi/117/orig -> origin/gh/angelayi/117/orig 2025-10-10T00:44:16.2896775Z * [new branch] gh/angelayi/118/base -> origin/gh/angelayi/118/base 2025-10-10T00:44:16.2898823Z * [new branch] gh/angelayi/118/head -> origin/gh/angelayi/118/head 2025-10-10T00:44:16.2900690Z * [new branch] gh/angelayi/118/orig -> origin/gh/angelayi/118/orig 2025-10-10T00:44:16.2903381Z * [new 
branch] gh/angelayi/119/base -> origin/gh/angelayi/119/base 2025-10-10T00:44:16.2905288Z * [new branch] gh/angelayi/119/head -> origin/gh/angelayi/119/head 2025-10-10T00:44:16.2907176Z * [new branch] gh/angelayi/119/orig -> origin/gh/angelayi/119/orig 2025-10-10T00:44:16.2909766Z * [new branch] gh/angelayi/120/base -> origin/gh/angelayi/120/base 2025-10-10T00:44:16.2911686Z * [new branch] gh/angelayi/120/head -> origin/gh/angelayi/120/head 2025-10-10T00:44:16.2913656Z * [new branch] gh/angelayi/120/orig -> origin/gh/angelayi/120/orig 2025-10-10T00:44:16.2916162Z * [new branch] gh/angelayi/121/base -> origin/gh/angelayi/121/base 2025-10-10T00:44:16.2918031Z * [new branch] gh/angelayi/121/head -> origin/gh/angelayi/121/head 2025-10-10T00:44:16.2920025Z * [new branch] gh/angelayi/121/orig -> origin/gh/angelayi/121/orig 2025-10-10T00:44:16.2922604Z * [new branch] gh/angelayi/122/base -> origin/gh/angelayi/122/base 2025-10-10T00:44:16.2924378Z * [new branch] gh/angelayi/122/head -> origin/gh/angelayi/122/head 2025-10-10T00:44:16.2926268Z * [new branch] gh/angelayi/122/orig -> origin/gh/angelayi/122/orig 2025-10-10T00:44:16.2928702Z * [new branch] gh/angelayi/123/base -> origin/gh/angelayi/123/base 2025-10-10T00:44:16.2930734Z * [new branch] gh/angelayi/123/head -> origin/gh/angelayi/123/head 2025-10-10T00:44:16.2932730Z * [new branch] gh/angelayi/123/orig -> origin/gh/angelayi/123/orig 2025-10-10T00:44:16.2935415Z * [new branch] gh/angelayi/124/base -> origin/gh/angelayi/124/base 2025-10-10T00:44:16.2937182Z * [new branch] gh/angelayi/124/head -> origin/gh/angelayi/124/head 2025-10-10T00:44:16.2939128Z * [new branch] gh/angelayi/124/orig -> origin/gh/angelayi/124/orig 2025-10-10T00:44:16.2941706Z * [new branch] gh/angelayi/125/base -> origin/gh/angelayi/125/base 2025-10-10T00:44:16.2943661Z * [new branch] gh/angelayi/125/head -> origin/gh/angelayi/125/head 2025-10-10T00:44:16.2945553Z * [new branch] gh/angelayi/125/orig -> origin/gh/angelayi/125/orig 
2025-10-10T00:44:16.2948230Z * [new branch] gh/angelayi/126/base -> origin/gh/angelayi/126/base 2025-10-10T00:44:16.2950147Z * [new branch] gh/angelayi/126/head -> origin/gh/angelayi/126/head 2025-10-10T00:44:16.2952037Z * [new branch] gh/angelayi/126/orig -> origin/gh/angelayi/126/orig 2025-10-10T00:44:16.2954913Z * [new branch] gh/angelayi/127/base -> origin/gh/angelayi/127/base 2025-10-10T00:44:16.2957237Z * [new branch] gh/angelayi/127/head -> origin/gh/angelayi/127/head 2025-10-10T00:44:16.2959431Z * [new branch] gh/angelayi/127/orig -> origin/gh/angelayi/127/orig 2025-10-10T00:44:16.2962160Z * [new branch] gh/angelayi/128/base -> origin/gh/angelayi/128/base 2025-10-10T00:44:16.2963973Z * [new branch] gh/angelayi/128/head -> origin/gh/angelayi/128/head 2025-10-10T00:44:16.2965966Z * [new branch] gh/angelayi/128/orig -> origin/gh/angelayi/128/orig 2025-10-10T00:44:16.2968784Z * [new branch] gh/angelayi/129/base -> origin/gh/angelayi/129/base 2025-10-10T00:44:16.2970591Z * [new branch] gh/angelayi/129/head -> origin/gh/angelayi/129/head 2025-10-10T00:44:16.2972444Z * [new branch] gh/angelayi/129/orig -> origin/gh/angelayi/129/orig 2025-10-10T00:44:16.2975389Z * [new branch] gh/angelayi/130/base -> origin/gh/angelayi/130/base 2025-10-10T00:44:16.2977407Z * [new branch] gh/angelayi/130/head -> origin/gh/angelayi/130/head 2025-10-10T00:44:16.2979111Z * [new branch] gh/angelayi/130/orig -> origin/gh/angelayi/130/orig 2025-10-10T00:44:16.2982340Z * [new branch] gh/anijain2305/753/base -> origin/gh/anijain2305/753/base 2025-10-10T00:44:16.2984366Z * [new branch] gh/anijain2305/753/head -> origin/gh/anijain2305/753/head 2025-10-10T00:44:16.2986174Z * [new branch] gh/anijain2305/753/orig -> origin/gh/anijain2305/753/orig 2025-10-10T00:44:16.2988777Z * [new branch] gh/anijain2305/790/base -> origin/gh/anijain2305/790/base 2025-10-10T00:44:16.2990809Z * [new branch] gh/anijain2305/790/head -> origin/gh/anijain2305/790/head 2025-10-10T00:44:16.2992850Z * [new branch] 
gh/anijain2305/790/orig -> origin/gh/anijain2305/790/orig 2025-10-10T00:44:16.2995941Z * [new branch] gh/anijain2305/792/base -> origin/gh/anijain2305/792/base 2025-10-10T00:44:16.2997863Z * [new branch] gh/anijain2305/792/head -> origin/gh/anijain2305/792/head 2025-10-10T00:44:16.2999891Z * [new branch] gh/anijain2305/792/orig -> origin/gh/anijain2305/792/orig 2025-10-10T00:44:16.3002390Z * [new branch] gh/anijain2305/805/base -> origin/gh/anijain2305/805/base 2025-10-10T00:44:16.3004174Z * [new branch] gh/anijain2305/805/head -> origin/gh/anijain2305/805/head 2025-10-10T00:44:16.3006042Z * [new branch] gh/anijain2305/805/orig -> origin/gh/anijain2305/805/orig 2025-10-10T00:44:16.3008939Z * [new branch] gh/anijain2305/810/base -> origin/gh/anijain2305/810/base 2025-10-10T00:44:16.3011025Z * [new branch] gh/anijain2305/810/head -> origin/gh/anijain2305/810/head 2025-10-10T00:44:16.3012700Z * [new branch] gh/anijain2305/810/orig -> origin/gh/anijain2305/810/orig 2025-10-10T00:44:16.3015400Z * [new branch] gh/anijain2305/812/base -> origin/gh/anijain2305/812/base 2025-10-10T00:44:16.3017456Z * [new branch] gh/anijain2305/812/head -> origin/gh/anijain2305/812/head 2025-10-10T00:44:16.3019307Z * [new branch] gh/anijain2305/812/orig -> origin/gh/anijain2305/812/orig 2025-10-10T00:44:16.3022022Z * [new branch] gh/anijain2305/854/base -> origin/gh/anijain2305/854/base 2025-10-10T00:44:16.3023991Z * [new branch] gh/anijain2305/854/head -> origin/gh/anijain2305/854/head 2025-10-10T00:44:16.3025786Z * [new branch] gh/anijain2305/854/orig -> origin/gh/anijain2305/854/orig 2025-10-10T00:44:16.3028524Z * [new branch] gh/anijain2305/855/base -> origin/gh/anijain2305/855/base 2025-10-10T00:44:16.3030561Z * [new branch] gh/anijain2305/855/head -> origin/gh/anijain2305/855/head 2025-10-10T00:44:16.3032482Z * [new branch] gh/anijain2305/855/orig -> origin/gh/anijain2305/855/orig 2025-10-10T00:44:16.3035568Z * [new branch] gh/anijain2305/864/base -> origin/gh/anijain2305/864/base 
2025-10-10T00:44:16.3037832Z * [new branch] gh/anijain2305/864/head -> origin/gh/anijain2305/864/head 2025-10-10T00:44:16.3038786Z * [new branch] gh/anijain2305/864/orig -> origin/gh/anijain2305/864/orig 2025-10-10T00:44:16.3041773Z * [new branch] gh/anijain2305/867/base -> origin/gh/anijain2305/867/base 2025-10-10T00:44:16.3043842Z * [new branch] gh/anijain2305/867/head -> origin/gh/anijain2305/867/head 2025-10-10T00:44:16.3045669Z * [new branch] gh/anijain2305/867/orig -> origin/gh/anijain2305/867/orig 2025-10-10T00:44:16.3048823Z * [new branch] gh/anijain2305/868/base -> origin/gh/anijain2305/868/base 2025-10-10T00:44:16.3050671Z * [new branch] gh/anijain2305/868/head -> origin/gh/anijain2305/868/head 2025-10-10T00:44:16.3052525Z * [new branch] gh/anijain2305/868/orig -> origin/gh/anijain2305/868/orig 2025-10-10T00:44:16.3055301Z * [new branch] gh/anijain2305/869/base -> origin/gh/anijain2305/869/base 2025-10-10T00:44:16.3057133Z * [new branch] gh/anijain2305/869/head -> origin/gh/anijain2305/869/head 2025-10-10T00:44:16.3058902Z * [new branch] gh/anijain2305/869/orig -> origin/gh/anijain2305/869/orig 2025-10-10T00:44:16.3061621Z * [new branch] gh/anijain2305/870/base -> origin/gh/anijain2305/870/base 2025-10-10T00:44:16.3063379Z * [new branch] gh/anijain2305/870/head -> origin/gh/anijain2305/870/head 2025-10-10T00:44:16.3065251Z * [new branch] gh/anijain2305/870/orig -> origin/gh/anijain2305/870/orig 2025-10-10T00:44:16.3067929Z * [new branch] gh/anijain2305/871/base -> origin/gh/anijain2305/871/base 2025-10-10T00:44:16.3070019Z * [new branch] gh/anijain2305/871/head -> origin/gh/anijain2305/871/head 2025-10-10T00:44:16.3071892Z * [new branch] gh/anijain2305/871/orig -> origin/gh/anijain2305/871/orig 2025-10-10T00:44:16.3074399Z * [new branch] gh/anijain2305/872/base -> origin/gh/anijain2305/872/base 2025-10-10T00:44:16.3076318Z * [new branch] gh/anijain2305/872/head -> origin/gh/anijain2305/872/head 2025-10-10T00:44:16.3078225Z * [new branch] 
gh/anijain2305/872/orig -> origin/gh/anijain2305/872/orig 2025-10-10T00:44:16.3080843Z * [new branch] gh/anijain2305/873/base -> origin/gh/anijain2305/873/base 2025-10-10T00:44:16.3082670Z * [new branch] gh/anijain2305/873/head -> origin/gh/anijain2305/873/head 2025-10-10T00:44:16.3084711Z * [new branch] gh/anijain2305/873/orig -> origin/gh/anijain2305/873/orig 2025-10-10T00:44:16.3087847Z * [new branch] gh/anijain2305/874/base -> origin/gh/anijain2305/874/base 2025-10-10T00:44:16.3089859Z * [new branch] gh/anijain2305/874/head -> origin/gh/anijain2305/874/head 2025-10-10T00:44:16.3091751Z * [new branch] gh/anijain2305/874/orig -> origin/gh/anijain2305/874/orig 2025-10-10T00:44:16.3094488Z * [new branch] gh/anijain2305/875/base -> origin/gh/anijain2305/875/base 2025-10-10T00:44:16.3096479Z * [new branch] gh/anijain2305/875/head -> origin/gh/anijain2305/875/head 2025-10-10T00:44:16.3098249Z * [new branch] gh/anijain2305/875/orig -> origin/gh/anijain2305/875/orig 2025-10-10T00:44:16.3102558Z * [new branch] gh/anijain2305/876/base -> origin/gh/anijain2305/876/base 2025-10-10T00:44:16.3104369Z * [new branch] gh/anijain2305/876/head -> origin/gh/anijain2305/876/head 2025-10-10T00:44:16.3107181Z * [new branch] gh/anijain2305/877/base -> origin/gh/anijain2305/877/base 2025-10-10T00:44:16.3108997Z * [new branch] gh/anijain2305/877/head -> origin/gh/anijain2305/877/head 2025-10-10T00:44:16.3110867Z * [new branch] gh/anijain2305/877/orig -> origin/gh/anijain2305/877/orig 2025-10-10T00:44:16.3113467Z * [new branch] gh/anijain2305/878/base -> origin/gh/anijain2305/878/base 2025-10-10T00:44:16.3115427Z * [new branch] gh/anijain2305/878/head -> origin/gh/anijain2305/878/head 2025-10-10T00:44:16.3117286Z * [new branch] gh/anijain2305/878/orig -> origin/gh/anijain2305/878/orig 2025-10-10T00:44:16.3119969Z * [new branch] gh/anijain2305/879/base -> origin/gh/anijain2305/879/base 2025-10-10T00:44:16.3121815Z * [new branch] gh/anijain2305/879/head -> origin/gh/anijain2305/879/head 
2025-10-10T00:44:16.3123870Z * [new branch] gh/anijain2305/879/orig -> origin/gh/anijain2305/879/orig 2025-10-10T00:44:16.3126481Z * [new branch] gh/anijain2305/880/base -> origin/gh/anijain2305/880/base 2025-10-10T00:44:16.3128602Z * [new branch] gh/anijain2305/880/head -> origin/gh/anijain2305/880/head 2025-10-10T00:44:16.3130437Z * [new branch] gh/anijain2305/880/orig -> origin/gh/anijain2305/880/orig 2025-10-10T00:44:16.3133120Z * [new branch] gh/anijain2305/881/base -> origin/gh/anijain2305/881/base 2025-10-10T00:44:16.3134938Z * [new branch] gh/anijain2305/881/head -> origin/gh/anijain2305/881/head 2025-10-10T00:44:16.3136795Z * [new branch] gh/anijain2305/881/orig -> origin/gh/anijain2305/881/orig 2025-10-10T00:44:16.3139540Z * [new branch] gh/anijain2305/882/base -> origin/gh/anijain2305/882/base 2025-10-10T00:44:16.3141478Z * [new branch] gh/anijain2305/882/head -> origin/gh/anijain2305/882/head 2025-10-10T00:44:16.3143352Z * [new branch] gh/anijain2305/882/orig -> origin/gh/anijain2305/882/orig 2025-10-10T00:44:16.3146159Z * [new branch] gh/anijain2305/883/base -> origin/gh/anijain2305/883/base 2025-10-10T00:44:16.3148144Z * [new branch] gh/anijain2305/883/head -> origin/gh/anijain2305/883/head 2025-10-10T00:44:16.3150117Z * [new branch] gh/anijain2305/883/orig -> origin/gh/anijain2305/883/orig 2025-10-10T00:44:16.3152775Z * [new branch] gh/anijain2305/884/base -> origin/gh/anijain2305/884/base 2025-10-10T00:44:16.3154839Z * [new branch] gh/anijain2305/884/head -> origin/gh/anijain2305/884/head 2025-10-10T00:44:16.3156660Z * [new branch] gh/anijain2305/884/orig -> origin/gh/anijain2305/884/orig 2025-10-10T00:44:16.3159307Z * [new branch] gh/anijain2305/885/base -> origin/gh/anijain2305/885/base 2025-10-10T00:44:16.3161395Z * [new branch] gh/anijain2305/885/head -> origin/gh/anijain2305/885/head 2025-10-10T00:44:16.3162992Z * [new branch] gh/anijain2305/885/orig -> origin/gh/anijain2305/885/orig 2025-10-10T00:44:16.3165656Z * [new branch] 
gh/anijain2305/886/base -> origin/gh/anijain2305/886/base 2025-10-10T00:44:16.3167869Z * [new branch] gh/anijain2305/886/head -> origin/gh/anijain2305/886/head 2025-10-10T00:44:16.3169741Z * [new branch] gh/anijain2305/886/orig -> origin/gh/anijain2305/886/orig 2025-10-10T00:44:16.3172585Z * [new branch] gh/anijain2305/887/base -> origin/gh/anijain2305/887/base 2025-10-10T00:44:16.3174526Z * [new branch] gh/anijain2305/887/head -> origin/gh/anijain2305/887/head 2025-10-10T00:44:16.3176517Z * [new branch] gh/anijain2305/887/orig -> origin/gh/anijain2305/887/orig 2025-10-10T00:44:16.3179217Z * [new branch] gh/anijain2305/888/base -> origin/gh/anijain2305/888/base 2025-10-10T00:44:16.3181183Z * [new branch] gh/anijain2305/888/head -> origin/gh/anijain2305/888/head 2025-10-10T00:44:16.3183575Z * [new branch] gh/anijain2305/888/orig -> origin/gh/anijain2305/888/orig 2025-10-10T00:44:16.3186047Z * [new branch] gh/anijain2305/889/base -> origin/gh/anijain2305/889/base 2025-10-10T00:44:16.3187982Z * [new branch] gh/anijain2305/889/head -> origin/gh/anijain2305/889/head 2025-10-10T00:44:16.3189757Z * [new branch] gh/anijain2305/889/orig -> origin/gh/anijain2305/889/orig 2025-10-10T00:44:16.3192598Z * [new branch] gh/anijain2305/890/base -> origin/gh/anijain2305/890/base 2025-10-10T00:44:16.3194767Z * [new branch] gh/anijain2305/890/head -> origin/gh/anijain2305/890/head 2025-10-10T00:44:16.3196391Z * [new branch] gh/anijain2305/890/orig -> origin/gh/anijain2305/890/orig 2025-10-10T00:44:16.3199220Z * [new branch] gh/anijain2305/891/base -> origin/gh/anijain2305/891/base 2025-10-10T00:44:16.3201216Z * [new branch] gh/anijain2305/891/head -> origin/gh/anijain2305/891/head 2025-10-10T00:44:16.3203159Z * [new branch] gh/anijain2305/891/orig -> origin/gh/anijain2305/891/orig 2025-10-10T00:44:16.3206002Z * [new branch] gh/anijain2305/892/base -> origin/gh/anijain2305/892/base 2025-10-10T00:44:16.3208066Z * [new branch] gh/anijain2305/892/head -> origin/gh/anijain2305/892/head 
2025-10-10T00:44:16.3209922Z * [new branch] gh/anijain2305/892/orig -> origin/gh/anijain2305/892/orig 2025-10-10T00:44:16.3212673Z * [new branch] gh/anijain2305/893/base -> origin/gh/anijain2305/893/base 2025-10-10T00:44:16.3214718Z * [new branch] gh/anijain2305/893/head -> origin/gh/anijain2305/893/head 2025-10-10T00:44:16.3216636Z * [new branch] gh/anijain2305/893/orig -> origin/gh/anijain2305/893/orig 2025-10-10T00:44:16.3219175Z * [new branch] gh/anijain2305/894/base -> origin/gh/anijain2305/894/base 2025-10-10T00:44:16.3220902Z * [new branch] gh/anijain2305/894/head -> origin/gh/anijain2305/894/head 2025-10-10T00:44:16.3223161Z * [new branch] gh/anijain2305/894/orig -> origin/gh/anijain2305/894/orig 2025-10-10T00:44:16.3225558Z * [new branch] gh/anijain2305/895/base -> origin/gh/anijain2305/895/base 2025-10-10T00:44:16.3227499Z * [new branch] gh/anijain2305/895/head -> origin/gh/anijain2305/895/head 2025-10-10T00:44:16.3229485Z * [new branch] gh/anijain2305/895/orig -> origin/gh/anijain2305/895/orig 2025-10-10T00:44:16.3232620Z * [new branch] gh/anijain2305/896/base -> origin/gh/anijain2305/896/base 2025-10-10T00:44:16.3234281Z * [new branch] gh/anijain2305/896/head -> origin/gh/anijain2305/896/head 2025-10-10T00:44:16.3236102Z * [new branch] gh/anijain2305/896/orig -> origin/gh/anijain2305/896/orig 2025-10-10T00:44:16.3239019Z * [new branch] gh/anijain2305/897/base -> origin/gh/anijain2305/897/base 2025-10-10T00:44:16.3240816Z * [new branch] gh/anijain2305/897/head -> origin/gh/anijain2305/897/head 2025-10-10T00:44:16.3242688Z * [new branch] gh/anijain2305/897/orig -> origin/gh/anijain2305/897/orig 2025-10-10T00:44:16.3245524Z * [new branch] gh/anijain2305/898/base -> origin/gh/anijain2305/898/base 2025-10-10T00:44:16.3247874Z * [new branch] gh/anijain2305/898/head -> origin/gh/anijain2305/898/head 2025-10-10T00:44:16.3249710Z * [new branch] gh/anijain2305/898/orig -> origin/gh/anijain2305/898/orig 2025-10-10T00:44:16.3252472Z * [new branch] 
gh/anijain2305/899/base -> origin/gh/anijain2305/899/base 2025-10-10T00:44:16.3254408Z * [new branch] gh/anijain2305/899/head -> origin/gh/anijain2305/899/head 2025-10-10T00:44:16.3256385Z * [new branch] gh/anijain2305/899/orig -> origin/gh/anijain2305/899/orig 2025-10-10T00:44:16.3259130Z * [new branch] gh/anijain2305/900/base -> origin/gh/anijain2305/900/base 2025-10-10T00:44:16.3260914Z * [new branch] gh/anijain2305/900/head -> origin/gh/anijain2305/900/head 2025-10-10T00:44:16.3262644Z * [new branch] gh/anijain2305/900/orig -> origin/gh/anijain2305/900/orig 2025-10-10T00:44:16.3265215Z * [new branch] gh/anijain2305/901/base -> origin/gh/anijain2305/901/base 2025-10-10T00:44:16.3267092Z * [new branch] gh/anijain2305/901/head -> origin/gh/anijain2305/901/head 2025-10-10T00:44:16.3268926Z * [new branch] gh/anijain2305/901/orig -> origin/gh/anijain2305/901/orig 2025-10-10T00:44:16.3271672Z * [new branch] gh/anijain2305/902/base -> origin/gh/anijain2305/902/base 2025-10-10T00:44:16.3273561Z * [new branch] gh/anijain2305/902/head -> origin/gh/anijain2305/902/head 2025-10-10T00:44:16.3275463Z * [new branch] gh/anijain2305/902/orig -> origin/gh/anijain2305/902/orig 2025-10-10T00:44:16.3278237Z * [new branch] gh/anijain2305/903/base -> origin/gh/anijain2305/903/base 2025-10-10T00:44:16.3280098Z * [new branch] gh/anijain2305/903/head -> origin/gh/anijain2305/903/head 2025-10-10T00:44:16.3281913Z * [new branch] gh/anijain2305/903/orig -> origin/gh/anijain2305/903/orig 2025-10-10T00:44:16.3284698Z * [new branch] gh/anijain2305/904/base -> origin/gh/anijain2305/904/base 2025-10-10T00:44:16.3286557Z * [new branch] gh/anijain2305/904/head -> origin/gh/anijain2305/904/head 2025-10-10T00:44:16.3288734Z * [new branch] gh/anijain2305/904/orig -> origin/gh/anijain2305/904/orig 2025-10-10T00:44:16.3291974Z * [new branch] gh/anjali411/216/base -> origin/gh/anjali411/216/base 2025-10-10T00:44:16.3293967Z * [new branch] gh/anjali411/216/head -> origin/gh/anjali411/216/head 
2025-10-10T00:44:16.3295826Z * [new branch] gh/anjali411/216/orig -> origin/gh/anjali411/216/orig 2025-10-10T00:44:16.3299206Z * [new branch] gh/ankitageorge/17/base -> origin/gh/ankitageorge/17/base 2025-10-10T00:44:16.3301626Z * [new branch] gh/ankitageorge/17/head -> origin/gh/ankitageorge/17/head 2025-10-10T00:44:16.3303468Z * [new branch] gh/ankitageorge/17/orig -> origin/gh/ankitageorge/17/orig 2025-10-10T00:44:16.3306763Z * [new branch] gh/anshul-si/1/base -> origin/gh/anshul-si/1/base 2025-10-10T00:44:16.3308731Z * [new branch] gh/anshul-si/1/head -> origin/gh/anshul-si/1/head 2025-10-10T00:44:16.3311194Z * [new branch] gh/anshul-si/2/base -> origin/gh/anshul-si/2/base 2025-10-10T00:44:16.3312913Z * [new branch] gh/anshul-si/2/head -> origin/gh/anshul-si/2/head 2025-10-10T00:44:16.3315841Z * [new branch] gh/anshul-si/29/base -> origin/gh/anshul-si/29/base 2025-10-10T00:44:16.3317762Z * [new branch] gh/anshul-si/29/head -> origin/gh/anshul-si/29/head 2025-10-10T00:44:16.3319556Z * [new branch] gh/anshul-si/29/orig -> origin/gh/anshul-si/29/orig 2025-10-10T00:44:16.3321933Z * [new branch] gh/anshul-si/3/base -> origin/gh/anshul-si/3/base 2025-10-10T00:44:16.3323744Z * [new branch] gh/anshul-si/3/head -> origin/gh/anshul-si/3/head 2025-10-10T00:44:16.3326325Z * [new branch] gh/anshul-si/30/base -> origin/gh/anshul-si/30/base 2025-10-10T00:44:16.3328516Z * [new branch] gh/anshul-si/30/head -> origin/gh/anshul-si/30/head 2025-10-10T00:44:16.3330418Z * [new branch] gh/anshul-si/30/orig -> origin/gh/anshul-si/30/orig 2025-10-10T00:44:16.3332856Z * [new branch] gh/anshul-si/31/base -> origin/gh/anshul-si/31/base 2025-10-10T00:44:16.3334724Z * [new branch] gh/anshul-si/31/head -> origin/gh/anshul-si/31/head 2025-10-10T00:44:16.3336620Z * [new branch] gh/anshul-si/31/orig -> origin/gh/anshul-si/31/orig 2025-10-10T00:44:16.3338968Z * [new branch] gh/anshul-si/32/base -> origin/gh/anshul-si/32/base 2025-10-10T00:44:16.3340852Z * [new branch] gh/anshul-si/32/head -> 
origin/gh/anshul-si/32/head 2025-10-10T00:44:16.3342721Z * [new branch] gh/anshul-si/32/orig -> origin/gh/anshul-si/32/orig 2025-10-10T00:44:16.3345397Z * [new branch] gh/anshul-si/33/base -> origin/gh/anshul-si/33/base 2025-10-10T00:44:16.3347281Z * [new branch] gh/anshul-si/33/head -> origin/gh/anshul-si/33/head 2025-10-10T00:44:16.3349183Z * [new branch] gh/anshul-si/33/orig -> origin/gh/anshul-si/33/orig 2025-10-10T00:44:16.3352016Z * [new branch] gh/anshul-si/34/base -> origin/gh/anshul-si/34/base 2025-10-10T00:44:16.3353864Z * [new branch] gh/anshul-si/34/head -> origin/gh/anshul-si/34/head 2025-10-10T00:44:16.3355857Z * [new branch] gh/anshul-si/34/orig -> origin/gh/anshul-si/34/orig 2025-10-10T00:44:16.3358504Z * [new branch] gh/anshul-si/35/base -> origin/gh/anshul-si/35/base 2025-10-10T00:44:16.3360527Z * [new branch] gh/anshul-si/35/head -> origin/gh/anshul-si/35/head 2025-10-10T00:44:16.3362372Z * [new branch] gh/anshul-si/35/orig -> origin/gh/anshul-si/35/orig 2025-10-10T00:44:16.3365169Z * [new branch] gh/anshul-si/36/base -> origin/gh/anshul-si/36/base 2025-10-10T00:44:16.3366958Z * [new branch] gh/anshul-si/36/head -> origin/gh/anshul-si/36/head 2025-10-10T00:44:16.3369029Z * [new branch] gh/anshul-si/36/orig -> origin/gh/anshul-si/36/orig 2025-10-10T00:44:16.3371822Z * [new branch] gh/anshul-si/37/base -> origin/gh/anshul-si/37/base 2025-10-10T00:44:16.3373672Z * [new branch] gh/anshul-si/37/head -> origin/gh/anshul-si/37/head 2025-10-10T00:44:16.3375485Z * [new branch] gh/anshul-si/37/orig -> origin/gh/anshul-si/37/orig 2025-10-10T00:44:16.3378109Z * [new branch] gh/anshul-si/38/base -> origin/gh/anshul-si/38/base 2025-10-10T00:44:16.3380085Z * [new branch] gh/anshul-si/38/head -> origin/gh/anshul-si/38/head 2025-10-10T00:44:16.3382139Z * [new branch] gh/anshul-si/38/orig -> origin/gh/anshul-si/38/orig 2025-10-10T00:44:16.3384566Z * [new branch] gh/anshul-si/39/base -> origin/gh/anshul-si/39/base 2025-10-10T00:44:16.3386497Z * [new branch] 
gh/anshul-si/39/head -> origin/gh/anshul-si/39/head 2025-10-10T00:44:16.3388351Z * [new branch] gh/anshul-si/39/orig -> origin/gh/anshul-si/39/orig 2025-10-10T00:44:16.3390851Z * [new branch] gh/anshul-si/4/base -> origin/gh/anshul-si/4/base 2025-10-10T00:44:16.3392646Z * [new branch] gh/anshul-si/4/head -> origin/gh/anshul-si/4/head 2025-10-10T00:44:16.3395327Z * [new branch] gh/anshul-si/40/base -> origin/gh/anshul-si/40/base 2025-10-10T00:44:16.3397183Z * [new branch] gh/anshul-si/40/head -> origin/gh/anshul-si/40/head 2025-10-10T00:44:16.3399244Z * [new branch] gh/anshul-si/40/orig -> origin/gh/anshul-si/40/orig 2025-10-10T00:44:16.3402017Z * [new branch] gh/anshul-si/41/base -> origin/gh/anshul-si/41/base 2025-10-10T00:44:16.3403919Z * [new branch] gh/anshul-si/41/head -> origin/gh/anshul-si/41/head 2025-10-10T00:44:16.3405775Z * [new branch] gh/anshul-si/41/orig -> origin/gh/anshul-si/41/orig 2025-10-10T00:44:16.3408804Z * [new branch] gh/anshul-si/42/base -> origin/gh/anshul-si/42/base 2025-10-10T00:44:16.3410643Z * [new branch] gh/anshul-si/42/head -> origin/gh/anshul-si/42/head 2025-10-10T00:44:16.3412481Z * [new branch] gh/anshul-si/42/orig -> origin/gh/anshul-si/42/orig 2025-10-10T00:44:16.3415069Z * [new branch] gh/anshul-si/43/base -> origin/gh/anshul-si/43/base 2025-10-10T00:44:16.3416795Z * [new branch] gh/anshul-si/43/head -> origin/gh/anshul-si/43/head 2025-10-10T00:44:16.3418619Z * [new branch] gh/anshul-si/43/orig -> origin/gh/anshul-si/43/orig 2025-10-10T00:44:16.3421837Z * [new branch] gh/anshul-si/44/base -> origin/gh/anshul-si/44/base 2025-10-10T00:44:16.3423879Z * [new branch] gh/anshul-si/44/head -> origin/gh/anshul-si/44/head 2025-10-10T00:44:16.3425761Z * [new branch] gh/anshul-si/44/orig -> origin/gh/anshul-si/44/orig 2025-10-10T00:44:16.3428462Z * [new branch] gh/anshul-si/45/base -> origin/gh/anshul-si/45/base 2025-10-10T00:44:16.3430529Z * [new branch] gh/anshul-si/45/head -> origin/gh/anshul-si/45/head 2025-10-10T00:44:16.3432697Z * 
[new branch] gh/anshul-si/45/orig -> origin/gh/anshul-si/45/orig 2025-10-10T00:44:16.3435067Z * [new branch] gh/anshul-si/46/base -> origin/gh/anshul-si/46/base 2025-10-10T00:44:16.3436988Z * [new branch] gh/anshul-si/46/head -> origin/gh/anshul-si/46/head 2025-10-10T00:44:16.3438895Z * [new branch] gh/anshul-si/46/orig -> origin/gh/anshul-si/46/orig 2025-10-10T00:44:16.3441855Z * [new branch] gh/anshul-si/47/base -> origin/gh/anshul-si/47/base 2025-10-10T00:44:16.3443733Z * [new branch] gh/anshul-si/47/head -> origin/gh/anshul-si/47/head 2025-10-10T00:44:16.3445631Z * [new branch] gh/anshul-si/47/orig -> origin/gh/anshul-si/47/orig 2025-10-10T00:44:16.3448263Z * [new branch] gh/anshul-si/48/base -> origin/gh/anshul-si/48/base 2025-10-10T00:44:16.3450113Z * [new branch] gh/anshul-si/48/head -> origin/gh/anshul-si/48/head 2025-10-10T00:44:16.3451958Z * [new branch] gh/anshul-si/48/orig -> origin/gh/anshul-si/48/orig 2025-10-10T00:44:16.3454649Z * [new branch] gh/anshul-si/49/base -> origin/gh/anshul-si/49/base 2025-10-10T00:44:16.3456744Z * [new branch] gh/anshul-si/49/head -> origin/gh/anshul-si/49/head 2025-10-10T00:44:16.3458531Z * [new branch] gh/anshul-si/49/orig -> origin/gh/anshul-si/49/orig 2025-10-10T00:44:16.3461138Z * [new branch] gh/anshul-si/5/base -> origin/gh/anshul-si/5/base 2025-10-10T00:44:16.3463015Z * [new branch] gh/anshul-si/5/head -> origin/gh/anshul-si/5/head 2025-10-10T00:44:16.3465710Z * [new branch] gh/anshul-si/50/base -> origin/gh/anshul-si/50/base 2025-10-10T00:44:16.3467817Z * [new branch] gh/anshul-si/50/head -> origin/gh/anshul-si/50/head 2025-10-10T00:44:16.3469642Z * [new branch] gh/anshul-si/50/orig -> origin/gh/anshul-si/50/orig 2025-10-10T00:44:16.3472210Z * [new branch] gh/anshul-si/51/base -> origin/gh/anshul-si/51/base 2025-10-10T00:44:16.3473981Z * [new branch] gh/anshul-si/51/head -> origin/gh/anshul-si/51/head 2025-10-10T00:44:16.3475852Z * [new branch] gh/anshul-si/51/orig -> origin/gh/anshul-si/51/orig 
2025-10-10T00:44:16.3478370Z * [new branch] gh/anshul-si/52/base -> origin/gh/anshul-si/52/base 2025-10-10T00:44:16.3480372Z * [new branch] gh/anshul-si/52/head -> origin/gh/anshul-si/52/head 2025-10-10T00:44:16.3482238Z * [new branch] gh/anshul-si/52/orig -> origin/gh/anshul-si/52/orig 2025-10-10T00:44:16.3485624Z * [new branch] gh/aorenste/132/base -> origin/gh/aorenste/132/base 2025-10-10T00:44:16.3487786Z * [new branch] gh/aorenste/132/head -> origin/gh/aorenste/132/head 2025-10-10T00:44:16.3490378Z * [new branch] gh/aorenste/133/base -> origin/gh/aorenste/133/base 2025-10-10T00:44:16.3492289Z * [new branch] gh/aorenste/133/head -> origin/gh/aorenste/133/head 2025-10-10T00:44:16.3494149Z * [new branch] gh/aorenste/133/orig -> origin/gh/aorenste/133/orig 2025-10-10T00:44:16.3496670Z * [new branch] gh/aorenste/134/base -> origin/gh/aorenste/134/base 2025-10-10T00:44:16.3498961Z * [new branch] gh/aorenste/134/head -> origin/gh/aorenste/134/head 2025-10-10T00:44:16.3503626Z * [new branch] gh/aorenste/134/orig -> origin/gh/aorenste/134/orig 2025-10-10T00:44:16.3505919Z * [new branch] gh/aorenste/135/base -> origin/gh/aorenste/135/base 2025-10-10T00:44:16.3507821Z * [new branch] gh/aorenste/135/head -> origin/gh/aorenste/135/head 2025-10-10T00:44:16.3509713Z * [new branch] gh/aorenste/135/orig -> origin/gh/aorenste/135/orig 2025-10-10T00:44:16.3512494Z * [new branch] gh/aorenste/136/base -> origin/gh/aorenste/136/base 2025-10-10T00:44:16.3514579Z * [new branch] gh/aorenste/136/head -> origin/gh/aorenste/136/head 2025-10-10T00:44:16.3516448Z * [new branch] gh/aorenste/136/orig -> origin/gh/aorenste/136/orig 2025-10-10T00:44:16.3519174Z * [new branch] gh/aorenste/137/base -> origin/gh/aorenste/137/base 2025-10-10T00:44:16.3521073Z * [new branch] gh/aorenste/137/head -> origin/gh/aorenste/137/head 2025-10-10T00:44:16.3523023Z * [new branch] gh/aorenste/137/orig -> origin/gh/aorenste/137/orig 2025-10-10T00:44:16.3525618Z * [new branch] gh/aorenste/138/base -> 
origin/gh/aorenste/138/base 2025-10-10T00:44:16.3527486Z * [new branch] gh/aorenste/138/head -> origin/gh/aorenste/138/head 2025-10-10T00:44:16.3529495Z * [new branch] gh/aorenste/138/orig -> origin/gh/aorenste/138/orig 2025-10-10T00:44:16.3532049Z * [new branch] gh/aorenste/139/base -> origin/gh/aorenste/139/base 2025-10-10T00:44:16.3534092Z * [new branch] gh/aorenste/139/head -> origin/gh/aorenste/139/head 2025-10-10T00:44:16.3535849Z * [new branch] gh/aorenste/139/orig -> origin/gh/aorenste/139/orig 2025-10-10T00:44:16.3539076Z * [new branch] gh/avikchaudhuri/1/base -> origin/gh/avikchaudhuri/1/base 2025-10-10T00:44:16.3541142Z * [new branch] gh/avikchaudhuri/1/head -> origin/gh/avikchaudhuri/1/head 2025-10-10T00:44:16.3543520Z * [new branch] gh/avikchaudhuri/2/base -> origin/gh/avikchaudhuri/2/base 2025-10-10T00:44:16.3545282Z * [new branch] gh/avikchaudhuri/2/head -> origin/gh/avikchaudhuri/2/head 2025-10-10T00:44:16.3547289Z * [new branch] gh/avikchaudhuri/2/orig -> origin/gh/avikchaudhuri/2/orig 2025-10-10T00:44:16.3550318Z * [new branch] gh/bdhirsh/650/base -> origin/gh/bdhirsh/650/base 2025-10-10T00:44:16.3552308Z * [new branch] gh/bdhirsh/650/head -> origin/gh/bdhirsh/650/head 2025-10-10T00:44:16.3554192Z * [new branch] gh/bdhirsh/650/orig -> origin/gh/bdhirsh/650/orig 2025-10-10T00:44:16.3556866Z * [new branch] gh/bdhirsh/665/base -> origin/gh/bdhirsh/665/base 2025-10-10T00:44:16.3558746Z * [new branch] gh/bdhirsh/665/head -> origin/gh/bdhirsh/665/head 2025-10-10T00:44:16.3560598Z * [new branch] gh/bdhirsh/665/orig -> origin/gh/bdhirsh/665/orig 2025-10-10T00:44:16.3563548Z * [new branch] gh/bdhirsh/666/base -> origin/gh/bdhirsh/666/base 2025-10-10T00:44:16.3565413Z * [new branch] gh/bdhirsh/666/head -> origin/gh/bdhirsh/666/head 2025-10-10T00:44:16.3567618Z * [new branch] gh/bdhirsh/666/orig -> origin/gh/bdhirsh/666/orig 2025-10-10T00:44:16.3570132Z * [new branch] gh/bdhirsh/668/base -> origin/gh/bdhirsh/668/base 2025-10-10T00:44:16.3572137Z * [new 
branch] gh/bdhirsh/668/head -> origin/gh/bdhirsh/668/head 2025-10-10T00:44:16.3573975Z * [new branch] gh/bdhirsh/668/orig -> origin/gh/bdhirsh/668/orig 2025-10-10T00:44:16.3576563Z * [new branch] gh/bdhirsh/669/base -> origin/gh/bdhirsh/669/base 2025-10-10T00:44:16.3578485Z * [new branch] gh/bdhirsh/669/head -> origin/gh/bdhirsh/669/head 2025-10-10T00:44:16.3580278Z * [new branch] gh/bdhirsh/669/orig -> origin/gh/bdhirsh/669/orig 2025-10-10T00:44:16.3583064Z * [new branch] gh/bdhirsh/670/base -> origin/gh/bdhirsh/670/base 2025-10-10T00:44:16.3585138Z * [new branch] gh/bdhirsh/670/head -> origin/gh/bdhirsh/670/head 2025-10-10T00:44:16.3587032Z * [new branch] gh/bdhirsh/670/orig -> origin/gh/bdhirsh/670/orig 2025-10-10T00:44:16.3589728Z * [new branch] gh/bdhirsh/671/base -> origin/gh/bdhirsh/671/base 2025-10-10T00:44:16.3591782Z * [new branch] gh/bdhirsh/671/head -> origin/gh/bdhirsh/671/head 2025-10-10T00:44:16.3593630Z * [new branch] gh/bdhirsh/671/orig -> origin/gh/bdhirsh/671/orig 2025-10-10T00:44:16.3596234Z * [new branch] gh/bdhirsh/672/base -> origin/gh/bdhirsh/672/base 2025-10-10T00:44:16.3598218Z * [new branch] gh/bdhirsh/672/head -> origin/gh/bdhirsh/672/head 2025-10-10T00:44:16.3600322Z * [new branch] gh/bdhirsh/672/orig -> origin/gh/bdhirsh/672/orig 2025-10-10T00:44:16.3603874Z * [new branch] gh/benjaminglass1/101/base -> origin/gh/benjaminglass1/101/base 2025-10-10T00:44:16.3605813Z * [new branch] gh/benjaminglass1/101/head -> origin/gh/benjaminglass1/101/head 2025-10-10T00:44:16.3607934Z * [new branch] gh/benjaminglass1/101/orig -> origin/gh/benjaminglass1/101/orig 2025-10-10T00:44:16.3610414Z * [new branch] gh/benjaminglass1/102/base -> origin/gh/benjaminglass1/102/base 2025-10-10T00:44:16.3612251Z * [new branch] gh/benjaminglass1/102/head -> origin/gh/benjaminglass1/102/head 2025-10-10T00:44:16.3614170Z * [new branch] gh/benjaminglass1/102/orig -> origin/gh/benjaminglass1/102/orig 2025-10-10T00:44:16.3616806Z * [new branch] gh/benjaminglass1/106/base 
-> origin/gh/benjaminglass1/106/base 2025-10-10T00:44:16.3619039Z * [new branch] gh/benjaminglass1/106/head -> origin/gh/benjaminglass1/106/head 2025-10-10T00:44:16.3620466Z * [new branch] gh/benjaminglass1/106/orig -> origin/gh/benjaminglass1/106/orig 2025-10-10T00:44:16.3623066Z * [new branch] gh/benjaminglass1/107/base -> origin/gh/benjaminglass1/107/base 2025-10-10T00:44:16.3625078Z * [new branch] gh/benjaminglass1/107/head -> origin/gh/benjaminglass1/107/head 2025-10-10T00:44:16.3626737Z * [new branch] gh/benjaminglass1/107/orig -> origin/gh/benjaminglass1/107/orig 2025-10-10T00:44:16.3629295Z * [new branch] gh/benjaminglass1/108/base -> origin/gh/benjaminglass1/108/base 2025-10-10T00:44:16.3631154Z * [new branch] gh/benjaminglass1/108/head -> origin/gh/benjaminglass1/108/head 2025-10-10T00:44:16.3633028Z * [new branch] gh/benjaminglass1/108/orig -> origin/gh/benjaminglass1/108/orig 2025-10-10T00:44:16.3635629Z * [new branch] gh/benjaminglass1/79/base -> origin/gh/benjaminglass1/79/base 2025-10-10T00:44:16.3637579Z * [new branch] gh/benjaminglass1/79/head -> origin/gh/benjaminglass1/79/head 2025-10-10T00:44:16.3639848Z * [new branch] gh/benjaminglass1/79/orig -> origin/gh/benjaminglass1/79/orig 2025-10-10T00:44:16.3642331Z * [new branch] gh/benjaminglass1/86/base -> origin/gh/benjaminglass1/86/base 2025-10-10T00:44:16.3644205Z * [new branch] gh/benjaminglass1/86/head -> origin/gh/benjaminglass1/86/head 2025-10-10T00:44:16.3646195Z * [new branch] gh/benjaminglass1/86/orig -> origin/gh/benjaminglass1/86/orig 2025-10-10T00:44:16.3649142Z * [new branch] gh/benjaminglass1/95/base -> origin/gh/benjaminglass1/95/base 2025-10-10T00:44:16.3650825Z * [new branch] gh/benjaminglass1/95/head -> origin/gh/benjaminglass1/95/head 2025-10-10T00:44:16.3652670Z * [new branch] gh/benjaminglass1/95/orig -> origin/gh/benjaminglass1/95/orig 2025-10-10T00:44:16.3655241Z * [new branch] gh/benjaminglass1/97/base -> origin/gh/benjaminglass1/97/base 2025-10-10T00:44:16.3657091Z * [new 
branch] gh/benjaminglass1/97/head -> origin/gh/benjaminglass1/97/head 2025-10-10T00:44:16.3659467Z * [new branch] gh/benjaminglass1/97/orig -> origin/gh/benjaminglass1/97/orig 2025-10-10T00:44:16.3662688Z * [new branch] gh/bobrenjc93/542/base -> origin/gh/bobrenjc93/542/base 2025-10-10T00:44:16.3664621Z * [new branch] gh/bobrenjc93/542/head -> origin/gh/bobrenjc93/542/head 2025-10-10T00:44:16.3666461Z * [new branch] gh/bobrenjc93/542/orig -> origin/gh/bobrenjc93/542/orig 2025-10-10T00:44:16.3669033Z * [new branch] gh/bobrenjc93/543/base -> origin/gh/bobrenjc93/543/base 2025-10-10T00:44:16.3670963Z * [new branch] gh/bobrenjc93/543/head -> origin/gh/bobrenjc93/543/head 2025-10-10T00:44:16.3672919Z * [new branch] gh/bobrenjc93/543/orig -> origin/gh/bobrenjc93/543/orig 2025-10-10T00:44:16.3675294Z * [new branch] gh/bobrenjc93/545/base -> origin/gh/bobrenjc93/545/base 2025-10-10T00:44:16.3677323Z * [new branch] gh/bobrenjc93/545/head -> origin/gh/bobrenjc93/545/head 2025-10-10T00:44:16.3679251Z * [new branch] gh/bobrenjc93/545/orig -> origin/gh/bobrenjc93/545/orig 2025-10-10T00:44:16.3682272Z * [new branch] gh/bobrenjc93/547/base -> origin/gh/bobrenjc93/547/base 2025-10-10T00:44:16.3684172Z * [new branch] gh/bobrenjc93/547/head -> origin/gh/bobrenjc93/547/head 2025-10-10T00:44:16.3686065Z * [new branch] gh/bobrenjc93/547/orig -> origin/gh/bobrenjc93/547/orig 2025-10-10T00:44:16.3688689Z * [new branch] gh/bobrenjc93/548/base -> origin/gh/bobrenjc93/548/base 2025-10-10T00:44:16.3690588Z * [new branch] gh/bobrenjc93/548/head -> origin/gh/bobrenjc93/548/head 2025-10-10T00:44:16.3692424Z * [new branch] gh/bobrenjc93/548/orig -> origin/gh/bobrenjc93/548/orig 2025-10-10T00:44:16.3694857Z * [new branch] gh/bobrenjc93/553/base -> origin/gh/bobrenjc93/553/base 2025-10-10T00:44:16.3696741Z * [new branch] gh/bobrenjc93/553/head -> origin/gh/bobrenjc93/553/head 2025-10-10T00:44:16.3699109Z * [new branch] gh/bobrenjc93/553/orig -> origin/gh/bobrenjc93/553/orig 
2025-10-10T00:44:16.3701275Z * [new branch] gh/bobrenjc93/554/base -> origin/gh/bobrenjc93/554/base 2025-10-10T00:44:16.3703036Z * [new branch] gh/bobrenjc93/554/head -> origin/gh/bobrenjc93/554/head 2025-10-10T00:44:16.3704980Z * [new branch] gh/bobrenjc93/554/orig -> origin/gh/bobrenjc93/554/orig 2025-10-10T00:44:16.3707713Z * [new branch] gh/bobrenjc93/555/base -> origin/gh/bobrenjc93/555/base 2025-10-10T00:44:16.3709506Z * [new branch] gh/bobrenjc93/555/head -> origin/gh/bobrenjc93/555/head 2025-10-10T00:44:16.3711696Z * [new branch] gh/bobrenjc93/555/orig -> origin/gh/bobrenjc93/555/orig 2025-10-10T00:44:16.3714037Z * [new branch] gh/bobrenjc93/557/base -> origin/gh/bobrenjc93/557/base 2025-10-10T00:44:16.3715980Z * [new branch] gh/bobrenjc93/557/head -> origin/gh/bobrenjc93/557/head 2025-10-10T00:44:16.3717847Z * [new branch] gh/bobrenjc93/557/orig -> origin/gh/bobrenjc93/557/orig 2025-10-10T00:44:16.3720446Z * [new branch] gh/bobrenjc93/558/base -> origin/gh/bobrenjc93/558/base 2025-10-10T00:44:16.3722391Z * [new branch] gh/bobrenjc93/558/head -> origin/gh/bobrenjc93/558/head 2025-10-10T00:44:16.3724259Z * [new branch] gh/bobrenjc93/558/orig -> origin/gh/bobrenjc93/558/orig 2025-10-10T00:44:16.3726946Z * [new branch] gh/bobrenjc93/559/base -> origin/gh/bobrenjc93/559/base 2025-10-10T00:44:16.3728959Z * [new branch] gh/bobrenjc93/559/head -> origin/gh/bobrenjc93/559/head 2025-10-10T00:44:16.3730769Z * [new branch] gh/bobrenjc93/559/orig -> origin/gh/bobrenjc93/559/orig 2025-10-10T00:44:16.3733209Z * [new branch] gh/bobrenjc93/560/base -> origin/gh/bobrenjc93/560/base 2025-10-10T00:44:16.3735141Z * [new branch] gh/bobrenjc93/560/head -> origin/gh/bobrenjc93/560/head 2025-10-10T00:44:16.3737000Z * [new branch] gh/bobrenjc93/560/orig -> origin/gh/bobrenjc93/560/orig 2025-10-10T00:44:16.3739784Z * [new branch] gh/bobrenjc93/561/base -> origin/gh/bobrenjc93/561/base 2025-10-10T00:44:16.3741758Z * [new branch] gh/bobrenjc93/561/head -> origin/gh/bobrenjc93/561/head 
2025-10-10T00:44:16.3743583Z * [new branch] gh/bobrenjc93/561/orig -> origin/gh/bobrenjc93/561/orig 2025-10-10T00:44:16.3746207Z * [new branch] gh/bobrenjc93/562/base -> origin/gh/bobrenjc93/562/base 2025-10-10T00:44:16.3748083Z * [new branch] gh/bobrenjc93/562/head -> origin/gh/bobrenjc93/562/head 2025-10-10T00:44:16.3749919Z * [new branch] gh/bobrenjc93/562/orig -> origin/gh/bobrenjc93/562/orig 2025-10-10T00:44:16.3752577Z * [new branch] gh/bobrenjc93/563/base -> origin/gh/bobrenjc93/563/base 2025-10-10T00:44:16.3754431Z * [new branch] gh/bobrenjc93/563/head -> origin/gh/bobrenjc93/563/head 2025-10-10T00:44:16.3756329Z * [new branch] gh/bobrenjc93/563/orig -> origin/gh/bobrenjc93/563/orig 2025-10-10T00:44:16.3758975Z * [new branch] gh/bobrenjc93/564/base -> origin/gh/bobrenjc93/564/base 2025-10-10T00:44:16.3760794Z * [new branch] gh/bobrenjc93/564/head -> origin/gh/bobrenjc93/564/head 2025-10-10T00:44:16.3762639Z * [new branch] gh/bobrenjc93/564/orig -> origin/gh/bobrenjc93/564/orig 2025-10-10T00:44:16.3765663Z * [new branch] gh/bobrenjc93/565/base -> origin/gh/bobrenjc93/565/base 2025-10-10T00:44:16.3767935Z * [new branch] gh/bobrenjc93/565/head -> origin/gh/bobrenjc93/565/head 2025-10-10T00:44:16.3769707Z * [new branch] gh/bobrenjc93/565/orig -> origin/gh/bobrenjc93/565/orig 2025-10-10T00:44:16.3772350Z * [new branch] gh/bobrenjc93/566/base -> origin/gh/bobrenjc93/566/base 2025-10-10T00:44:16.3774092Z * [new branch] gh/bobrenjc93/566/head -> origin/gh/bobrenjc93/566/head 2025-10-10T00:44:16.3775939Z * [new branch] gh/bobrenjc93/566/orig -> origin/gh/bobrenjc93/566/orig 2025-10-10T00:44:16.3778396Z * [new branch] gh/bobrenjc93/567/base -> origin/gh/bobrenjc93/567/base 2025-10-10T00:44:16.3780330Z * [new branch] gh/bobrenjc93/567/head -> origin/gh/bobrenjc93/567/head 2025-10-10T00:44:16.3782200Z * [new branch] gh/bobrenjc93/567/orig -> origin/gh/bobrenjc93/567/orig 2025-10-10T00:44:16.3784729Z * [new branch] gh/bobrenjc93/568/base -> origin/gh/bobrenjc93/568/base 
2025-10-10T00:44:16.3786556Z * [new branch] gh/bobrenjc93/568/head -> origin/gh/bobrenjc93/568/head 2025-10-10T00:44:16.3788375Z * [new branch] gh/bobrenjc93/568/orig -> origin/gh/bobrenjc93/568/orig 2025-10-10T00:44:16.3790891Z * [new branch] gh/bobrenjc93/569/base -> origin/gh/bobrenjc93/569/base 2025-10-10T00:44:16.3792946Z * [new branch] gh/bobrenjc93/569/head -> origin/gh/bobrenjc93/569/head 2025-10-10T00:44:16.3795339Z * [new branch] gh/bobrenjc93/569/orig -> origin/gh/bobrenjc93/569/orig 2025-10-10T00:44:16.3797986Z * [new branch] gh/bobrenjc93/570/base -> origin/gh/bobrenjc93/570/base 2025-10-10T00:44:16.3800224Z * [new branch] gh/bobrenjc93/570/head -> origin/gh/bobrenjc93/570/head 2025-10-10T00:44:16.3801954Z * [new branch] gh/bobrenjc93/570/orig -> origin/gh/bobrenjc93/570/orig 2025-10-10T00:44:16.3804466Z * [new branch] gh/bobrenjc93/571/base -> origin/gh/bobrenjc93/571/base 2025-10-10T00:44:16.3806333Z * [new branch] gh/bobrenjc93/571/head -> origin/gh/bobrenjc93/571/head 2025-10-10T00:44:16.3808434Z * [new branch] gh/bobrenjc93/571/orig -> origin/gh/bobrenjc93/571/orig 2025-10-10T00:44:16.3811027Z * [new branch] gh/bobrenjc93/572/base -> origin/gh/bobrenjc93/572/base 2025-10-10T00:44:16.3812948Z * [new branch] gh/bobrenjc93/572/head -> origin/gh/bobrenjc93/572/head 2025-10-10T00:44:16.3814811Z * [new branch] gh/bobrenjc93/572/orig -> origin/gh/bobrenjc93/572/orig 2025-10-10T00:44:16.3817412Z * [new branch] gh/bobrenjc93/573/base -> origin/gh/bobrenjc93/573/base 2025-10-10T00:44:16.3819550Z * [new branch] gh/bobrenjc93/573/head -> origin/gh/bobrenjc93/573/head 2025-10-10T00:44:16.3821394Z * [new branch] gh/bobrenjc93/573/orig -> origin/gh/bobrenjc93/573/orig 2025-10-10T00:44:16.3824021Z * [new branch] gh/bobrenjc93/574/base -> origin/gh/bobrenjc93/574/base 2025-10-10T00:44:16.3825866Z * [new branch] gh/bobrenjc93/574/head -> origin/gh/bobrenjc93/574/head 2025-10-10T00:44:16.3827734Z * [new branch] gh/bobrenjc93/574/orig -> origin/gh/bobrenjc93/574/orig 
2025-10-10T00:44:16.3830448Z * [new branch] gh/bobrenjc93/575/base -> origin/gh/bobrenjc93/575/base 2025-10-10T00:44:16.3832355Z * [new branch] gh/bobrenjc93/575/head -> origin/gh/bobrenjc93/575/head 2025-10-10T00:44:16.3834241Z * [new branch] gh/bobrenjc93/575/orig -> origin/gh/bobrenjc93/575/orig 2025-10-10T00:44:16.3836827Z * [new branch] gh/bobrenjc93/576/base -> origin/gh/bobrenjc93/576/base 2025-10-10T00:44:16.3838756Z * [new branch] gh/bobrenjc93/576/head -> origin/gh/bobrenjc93/576/head 2025-10-10T00:44:16.3840597Z * [new branch] gh/bobrenjc93/576/orig -> origin/gh/bobrenjc93/576/orig 2025-10-10T00:44:16.3843260Z * [new branch] gh/bobrenjc93/577/base -> origin/gh/bobrenjc93/577/base 2025-10-10T00:44:16.3845441Z * [new branch] gh/bobrenjc93/577/head -> origin/gh/bobrenjc93/577/head 2025-10-10T00:44:16.3847201Z * [new branch] gh/bobrenjc93/577/orig -> origin/gh/bobrenjc93/577/orig 2025-10-10T00:44:16.3849911Z * [new branch] gh/bobrenjc93/578/base -> origin/gh/bobrenjc93/578/base 2025-10-10T00:44:16.3851826Z * [new branch] gh/bobrenjc93/578/head -> origin/gh/bobrenjc93/578/head 2025-10-10T00:44:16.3853698Z * [new branch] gh/bobrenjc93/578/orig -> origin/gh/bobrenjc93/578/orig 2025-10-10T00:44:16.3856288Z * [new branch] gh/bobrenjc93/579/base -> origin/gh/bobrenjc93/579/base 2025-10-10T00:44:16.3858224Z * [new branch] gh/bobrenjc93/579/head -> origin/gh/bobrenjc93/579/head 2025-10-10T00:44:16.3860065Z * [new branch] gh/bobrenjc93/579/orig -> origin/gh/bobrenjc93/579/orig 2025-10-10T00:44:16.3862646Z * [new branch] gh/bobrenjc93/580/base -> origin/gh/bobrenjc93/580/base 2025-10-10T00:44:16.3864535Z * [new branch] gh/bobrenjc93/580/head -> origin/gh/bobrenjc93/580/head 2025-10-10T00:44:16.3866442Z * [new branch] gh/bobrenjc93/580/orig -> origin/gh/bobrenjc93/580/orig 2025-10-10T00:44:16.3869028Z * [new branch] gh/bobrenjc93/581/base -> origin/gh/bobrenjc93/581/base 2025-10-10T00:44:16.3870966Z * [new branch] gh/bobrenjc93/581/head -> origin/gh/bobrenjc93/581/head 
2025-10-10T00:44:16.3872852Z * [new branch] gh/bobrenjc93/581/orig -> origin/gh/bobrenjc93/581/orig 2025-10-10T00:44:16.3875482Z * [new branch] gh/bobrenjc93/582/base -> origin/gh/bobrenjc93/582/base 2025-10-10T00:44:16.3877392Z * [new branch] gh/bobrenjc93/582/head -> origin/gh/bobrenjc93/582/head 2025-10-10T00:44:16.3879296Z * [new branch] gh/bobrenjc93/582/orig -> origin/gh/bobrenjc93/582/orig 2025-10-10T00:44:16.3881954Z * [new branch] gh/bobrenjc93/583/base -> origin/gh/bobrenjc93/583/base 2025-10-10T00:44:16.3883733Z * [new branch] gh/bobrenjc93/583/head -> origin/gh/bobrenjc93/583/head 2025-10-10T00:44:16.3885573Z * [new branch] gh/bobrenjc93/583/orig -> origin/gh/bobrenjc93/583/orig 2025-10-10T00:44:16.3888510Z * [new branch] gh/bobrenjc93/584/base -> origin/gh/bobrenjc93/584/base 2025-10-10T00:44:16.3890352Z * [new branch] gh/bobrenjc93/584/head -> origin/gh/bobrenjc93/584/head 2025-10-10T00:44:16.3892199Z * [new branch] gh/bobrenjc93/584/orig -> origin/gh/bobrenjc93/584/orig 2025-10-10T00:44:16.3894712Z * [new branch] gh/bobrenjc93/585/base -> origin/gh/bobrenjc93/585/base 2025-10-10T00:44:16.3896805Z * [new branch] gh/bobrenjc93/585/head -> origin/gh/bobrenjc93/585/head 2025-10-10T00:44:16.3898869Z * [new branch] gh/bobrenjc93/585/orig -> origin/gh/bobrenjc93/585/orig 2025-10-10T00:44:16.3904448Z * [new branch] gh/bobrenjc93/586/base -> origin/gh/bobrenjc93/586/base 2025-10-10T00:44:16.3906019Z * [new branch] gh/bobrenjc93/586/head -> origin/gh/bobrenjc93/586/head 2025-10-10T00:44:16.3907886Z * [new branch] gh/bobrenjc93/586/orig -> origin/gh/bobrenjc93/586/orig 2025-10-10T00:44:16.3910535Z * [new branch] gh/bobrenjc93/587/base -> origin/gh/bobrenjc93/587/base 2025-10-10T00:44:16.3912488Z * [new branch] gh/bobrenjc93/587/head -> origin/gh/bobrenjc93/587/head 2025-10-10T00:44:16.3914814Z * [new branch] gh/bobrenjc93/587/orig -> origin/gh/bobrenjc93/587/orig 2025-10-10T00:44:16.3917447Z * [new branch] gh/bobrenjc93/588/base -> origin/gh/bobrenjc93/588/base 
2025-10-10T00:44:16.3919305Z * [new branch] gh/bobrenjc93/588/head -> origin/gh/bobrenjc93/588/head 2025-10-10T00:44:16.3921131Z * [new branch] gh/bobrenjc93/588/orig -> origin/gh/bobrenjc93/588/orig 2025-10-10T00:44:16.3923870Z * [new branch] gh/bobrenjc93/589/base -> origin/gh/bobrenjc93/589/base 2025-10-10T00:44:16.3925712Z * [new branch] gh/bobrenjc93/589/head -> origin/gh/bobrenjc93/589/head 2025-10-10T00:44:16.3927817Z * [new branch] gh/bobrenjc93/589/orig -> origin/gh/bobrenjc93/589/orig 2025-10-10T00:44:16.3930395Z * [new branch] gh/bobrenjc93/590/base -> origin/gh/bobrenjc93/590/base 2025-10-10T00:44:16.3932254Z * [new branch] gh/bobrenjc93/590/head -> origin/gh/bobrenjc93/590/head 2025-10-10T00:44:16.3934159Z * [new branch] gh/bobrenjc93/590/orig -> origin/gh/bobrenjc93/590/orig 2025-10-10T00:44:16.3936839Z * [new branch] gh/bobrenjc93/591/base -> origin/gh/bobrenjc93/591/base 2025-10-10T00:44:16.3938600Z * [new branch] gh/bobrenjc93/591/head -> origin/gh/bobrenjc93/591/head 2025-10-10T00:44:16.3940480Z * [new branch] gh/bobrenjc93/591/orig -> origin/gh/bobrenjc93/591/orig 2025-10-10T00:44:16.3943376Z * [new branch] gh/bobrenjc93/592/base -> origin/gh/bobrenjc93/592/base 2025-10-10T00:44:16.3945293Z * [new branch] gh/bobrenjc93/592/head -> origin/gh/bobrenjc93/592/head 2025-10-10T00:44:16.3947152Z * [new branch] gh/bobrenjc93/592/orig -> origin/gh/bobrenjc93/592/orig 2025-10-10T00:44:16.3949890Z * [new branch] gh/bobrenjc93/593/base -> origin/gh/bobrenjc93/593/base 2025-10-10T00:44:16.3963646Z * [new branch] gh/bobrenjc93/593/head -> origin/gh/bobrenjc93/593/head 2025-10-10T00:44:16.3963966Z * [new branch] gh/bobrenjc93/593/orig -> origin/gh/bobrenjc93/593/orig 2025-10-10T00:44:16.3964207Z * [new branch] gh/bobrenjc93/594/base -> origin/gh/bobrenjc93/594/base 2025-10-10T00:44:16.3964420Z * [new branch] gh/bobrenjc93/594/head -> origin/gh/bobrenjc93/594/head 2025-10-10T00:44:16.3964640Z * [new branch] gh/bobrenjc93/594/orig -> origin/gh/bobrenjc93/594/orig 
2025-10-10T00:44:16.3964858Z * [new branch] gh/bobrenjc93/595/base -> origin/gh/bobrenjc93/595/base 2025-10-10T00:44:16.3965063Z * [new branch] gh/bobrenjc93/595/head -> origin/gh/bobrenjc93/595/head 2025-10-10T00:44:16.3966041Z * [new branch] gh/bobrenjc93/595/orig -> origin/gh/bobrenjc93/595/orig 2025-10-10T00:44:16.3969018Z * [new branch] gh/bobrenjc93/596/base -> origin/gh/bobrenjc93/596/base 2025-10-10T00:44:16.3970797Z * [new branch] gh/bobrenjc93/596/head -> origin/gh/bobrenjc93/596/head 2025-10-10T00:44:16.3973102Z * [new branch] gh/bobrenjc93/596/orig -> origin/gh/bobrenjc93/596/orig 2025-10-10T00:44:16.3975645Z * [new branch] gh/bobrenjc93/597/base -> origin/gh/bobrenjc93/597/base 2025-10-10T00:44:16.3977672Z * [new branch] gh/bobrenjc93/597/head -> origin/gh/bobrenjc93/597/head 2025-10-10T00:44:16.3979620Z * [new branch] gh/bobrenjc93/597/orig -> origin/gh/bobrenjc93/597/orig 2025-10-10T00:44:16.3982081Z * [new branch] gh/bobrenjc93/598/base -> origin/gh/bobrenjc93/598/base 2025-10-10T00:44:16.3983902Z * [new branch] gh/bobrenjc93/598/head -> origin/gh/bobrenjc93/598/head 2025-10-10T00:44:16.3985792Z * [new branch] gh/bobrenjc93/598/orig -> origin/gh/bobrenjc93/598/orig 2025-10-10T00:44:16.3988496Z * [new branch] gh/bobrenjc93/599/base -> origin/gh/bobrenjc93/599/base 2025-10-10T00:44:16.3990315Z * [new branch] gh/bobrenjc93/599/head -> origin/gh/bobrenjc93/599/head 2025-10-10T00:44:16.3992209Z * [new branch] gh/bobrenjc93/599/orig -> origin/gh/bobrenjc93/599/orig 2025-10-10T00:44:16.3994672Z * [new branch] gh/bobrenjc93/600/base -> origin/gh/bobrenjc93/600/base 2025-10-10T00:44:16.3997248Z * [new branch] gh/bobrenjc93/600/head -> origin/gh/bobrenjc93/600/head 2025-10-10T00:44:16.3999546Z * [new branch] gh/bobrenjc93/600/orig -> origin/gh/bobrenjc93/600/orig 2025-10-10T00:44:16.4002883Z * [new branch] gh/bobrenjc93/601/base -> origin/gh/bobrenjc93/601/base 2025-10-10T00:44:16.4005736Z * [new branch] gh/bobrenjc93/601/head -> origin/gh/bobrenjc93/601/head 
2025-10-10T00:44:16.4008589Z * [new branch] gh/bobrenjc93/601/orig -> origin/gh/bobrenjc93/601/orig 2025-10-10T00:44:16.4011522Z * [new branch] gh/bobrenjc93/602/base -> origin/gh/bobrenjc93/602/base 2025-10-10T00:44:16.4013452Z * [new branch] gh/bobrenjc93/602/head -> origin/gh/bobrenjc93/602/head 2025-10-10T00:44:16.4015648Z * [new branch] gh/bobrenjc93/602/orig -> origin/gh/bobrenjc93/602/orig 2025-10-10T00:44:16.4018144Z * [new branch] gh/bobrenjc93/603/base -> origin/gh/bobrenjc93/603/base 2025-10-10T00:44:16.4020032Z * [new branch] gh/bobrenjc93/603/head -> origin/gh/bobrenjc93/603/head 2025-10-10T00:44:16.4021954Z * [new branch] gh/bobrenjc93/603/orig -> origin/gh/bobrenjc93/603/orig 2025-10-10T00:44:16.4024393Z * [new branch] gh/bobrenjc93/604/base -> origin/gh/bobrenjc93/604/base 2025-10-10T00:44:16.4026229Z * [new branch] gh/bobrenjc93/604/head -> origin/gh/bobrenjc93/604/head 2025-10-10T00:44:16.4028103Z * [new branch] gh/bobrenjc93/604/orig -> origin/gh/bobrenjc93/604/orig 2025-10-10T00:44:16.4030634Z * [new branch] gh/bobrenjc93/605/base -> origin/gh/bobrenjc93/605/base 2025-10-10T00:44:16.4032720Z * [new branch] gh/bobrenjc93/605/head -> origin/gh/bobrenjc93/605/head 2025-10-10T00:44:16.4034821Z * [new branch] gh/bobrenjc93/605/orig -> origin/gh/bobrenjc93/605/orig 2025-10-10T00:44:16.4037631Z * [new branch] gh/bobrenjc93/606/base -> origin/gh/bobrenjc93/606/base 2025-10-10T00:44:16.4039413Z * [new branch] gh/bobrenjc93/606/head -> origin/gh/bobrenjc93/606/head 2025-10-10T00:44:16.4041310Z * [new branch] gh/bobrenjc93/606/orig -> origin/gh/bobrenjc93/606/orig 2025-10-10T00:44:16.4044021Z * [new branch] gh/bobrenjc93/607/base -> origin/gh/bobrenjc93/607/base 2025-10-10T00:44:16.4046050Z * [new branch] gh/bobrenjc93/607/head -> origin/gh/bobrenjc93/607/head 2025-10-10T00:44:16.4047983Z * [new branch] gh/bobrenjc93/607/orig -> origin/gh/bobrenjc93/607/orig 2025-10-10T00:44:16.4050766Z * [new branch] gh/bobrenjc93/608/base -> origin/gh/bobrenjc93/608/base 
2025-10-10T00:44:16.4053041Z * [new branch] gh/bobrenjc93/608/head -> origin/gh/bobrenjc93/608/head 2025-10-10T00:44:16.4054913Z * [new branch] gh/bobrenjc93/608/orig -> origin/gh/bobrenjc93/608/orig 2025-10-10T00:44:16.4057587Z * [new branch] gh/bobrenjc93/609/base -> origin/gh/bobrenjc93/609/base 2025-10-10T00:44:16.4059507Z * [new branch] gh/bobrenjc93/609/head -> origin/gh/bobrenjc93/609/head 2025-10-10T00:44:16.4061357Z * [new branch] gh/bobrenjc93/609/orig -> origin/gh/bobrenjc93/609/orig 2025-10-10T00:44:16.4064227Z * [new branch] gh/bobrenjc93/610/base -> origin/gh/bobrenjc93/610/base 2025-10-10T00:44:16.4066101Z * [new branch] gh/bobrenjc93/610/head -> origin/gh/bobrenjc93/610/head 2025-10-10T00:44:16.4067915Z * [new branch] gh/bobrenjc93/610/orig -> origin/gh/bobrenjc93/610/orig 2025-10-10T00:44:16.4070683Z * [new branch] gh/bobrenjc93/611/base -> origin/gh/bobrenjc93/611/base 2025-10-10T00:44:16.4072528Z * [new branch] gh/bobrenjc93/611/head -> origin/gh/bobrenjc93/611/head 2025-10-10T00:44:16.4074360Z * [new branch] gh/bobrenjc93/611/orig -> origin/gh/bobrenjc93/611/orig 2025-10-10T00:44:16.4077100Z * [new branch] gh/bobrenjc93/612/base -> origin/gh/bobrenjc93/612/base 2025-10-10T00:44:16.4078957Z * [new branch] gh/bobrenjc93/612/head -> origin/gh/bobrenjc93/612/head 2025-10-10T00:44:16.4080748Z * [new branch] gh/bobrenjc93/612/orig -> origin/gh/bobrenjc93/612/orig 2025-10-10T00:44:16.4083539Z * [new branch] gh/bobrenjc93/613/base -> origin/gh/bobrenjc93/613/base 2025-10-10T00:44:16.4085454Z * [new branch] gh/bobrenjc93/613/head -> origin/gh/bobrenjc93/613/head 2025-10-10T00:44:16.4087355Z * [new branch] gh/bobrenjc93/613/orig -> origin/gh/bobrenjc93/613/orig 2025-10-10T00:44:16.4090119Z * [new branch] gh/bobrenjc93/614/base -> origin/gh/bobrenjc93/614/base 2025-10-10T00:44:16.4092265Z * [new branch] gh/bobrenjc93/614/head -> origin/gh/bobrenjc93/614/head 2025-10-10T00:44:16.4093990Z * [new branch] gh/bobrenjc93/614/orig -> origin/gh/bobrenjc93/614/orig 
2025-10-10T00:44:16.4096607Z * [new branch] gh/bobrenjc93/615/base -> origin/gh/bobrenjc93/615/base 2025-10-10T00:44:16.4098653Z * [new branch] gh/bobrenjc93/615/head -> origin/gh/bobrenjc93/615/head 2025-10-10T00:44:16.4100555Z * [new branch] gh/bobrenjc93/615/orig -> origin/gh/bobrenjc93/615/orig 2025-10-10T00:44:16.4103176Z * [new branch] gh/bobrenjc93/616/base -> origin/gh/bobrenjc93/616/base 2025-10-10T00:44:16.4104990Z * [new branch] gh/bobrenjc93/616/head -> origin/gh/bobrenjc93/616/head 2025-10-10T00:44:16.4106899Z * [new branch] gh/bobrenjc93/616/orig -> origin/gh/bobrenjc93/616/orig 2025-10-10T00:44:16.4109532Z * [new branch] gh/bobrenjc93/617/base -> origin/gh/bobrenjc93/617/base 2025-10-10T00:44:16.4112085Z * [new branch] gh/bobrenjc93/617/head -> origin/gh/bobrenjc93/617/head 2025-10-10T00:44:16.4114024Z * [new branch] gh/bobrenjc93/617/orig -> origin/gh/bobrenjc93/617/orig 2025-10-10T00:44:16.4116622Z * [new branch] gh/bobrenjc93/618/base -> origin/gh/bobrenjc93/618/base 2025-10-10T00:44:16.4118465Z * [new branch] gh/bobrenjc93/618/head -> origin/gh/bobrenjc93/618/head 2025-10-10T00:44:16.4120321Z * [new branch] gh/bobrenjc93/618/orig -> origin/gh/bobrenjc93/618/orig 2025-10-10T00:44:16.4122883Z * [new branch] gh/bobrenjc93/619/base -> origin/gh/bobrenjc93/619/base 2025-10-10T00:44:16.4124753Z * [new branch] gh/bobrenjc93/619/head -> origin/gh/bobrenjc93/619/head 2025-10-10T00:44:16.4126636Z * [new branch] gh/bobrenjc93/619/orig -> origin/gh/bobrenjc93/619/orig 2025-10-10T00:44:16.4129402Z * [new branch] gh/bobrenjc93/620/base -> origin/gh/bobrenjc93/620/base 2025-10-10T00:44:16.4131232Z * [new branch] gh/bobrenjc93/620/head -> origin/gh/bobrenjc93/620/head 2025-10-10T00:44:16.4133129Z * [new branch] gh/bobrenjc93/620/orig -> origin/gh/bobrenjc93/620/orig 2025-10-10T00:44:16.4135654Z * [new branch] gh/bobrenjc93/621/base -> origin/gh/bobrenjc93/621/base 2025-10-10T00:44:16.4137601Z * [new branch] gh/bobrenjc93/621/head -> origin/gh/bobrenjc93/621/head 
2025-10-10T00:44:16.4139484Z * [new branch] gh/bobrenjc93/621/orig -> origin/gh/bobrenjc93/621/orig 2025-10-10T00:44:16.4142223Z * [new branch] gh/bobrenjc93/622/base -> origin/gh/bobrenjc93/622/base 2025-10-10T00:44:16.4144083Z * [new branch] gh/bobrenjc93/622/head -> origin/gh/bobrenjc93/622/head 2025-10-10T00:44:16.4145895Z * [new branch] gh/bobrenjc93/622/orig -> origin/gh/bobrenjc93/622/orig 2025-10-10T00:44:16.4148840Z * [new branch] gh/bobrenjc93/623/base -> origin/gh/bobrenjc93/623/base 2025-10-10T00:44:16.4150780Z * [new branch] gh/bobrenjc93/623/head -> origin/gh/bobrenjc93/623/head 2025-10-10T00:44:16.4152482Z * [new branch] gh/bobrenjc93/623/orig -> origin/gh/bobrenjc93/623/orig 2025-10-10T00:44:16.4154952Z * [new branch] gh/bobrenjc93/624/base -> origin/gh/bobrenjc93/624/base 2025-10-10T00:44:16.4156902Z * [new branch] gh/bobrenjc93/624/head -> origin/gh/bobrenjc93/624/head 2025-10-10T00:44:16.4158744Z * [new branch] gh/bobrenjc93/624/orig -> origin/gh/bobrenjc93/624/orig 2025-10-10T00:44:16.4162007Z * [new branch] gh/bobrenjc93/625/base -> origin/gh/bobrenjc93/625/base 2025-10-10T00:44:16.4163897Z * [new branch] gh/bobrenjc93/625/head -> origin/gh/bobrenjc93/625/head 2025-10-10T00:44:16.4165597Z * [new branch] gh/bobrenjc93/625/orig -> origin/gh/bobrenjc93/625/orig 2025-10-10T00:44:16.4168434Z * [new branch] gh/bobrenjc93/626/base -> origin/gh/bobrenjc93/626/base 2025-10-10T00:44:16.4170374Z * [new branch] gh/bobrenjc93/626/head -> origin/gh/bobrenjc93/626/head 2025-10-10T00:44:16.4172159Z * [new branch] gh/bobrenjc93/626/orig -> origin/gh/bobrenjc93/626/orig 2025-10-10T00:44:16.4174813Z * [new branch] gh/bobrenjc93/627/base -> origin/gh/bobrenjc93/627/base 2025-10-10T00:44:16.4176724Z * [new branch] gh/bobrenjc93/627/head -> origin/gh/bobrenjc93/627/head 2025-10-10T00:44:16.4178509Z * [new branch] gh/bobrenjc93/627/orig -> origin/gh/bobrenjc93/627/orig 2025-10-10T00:44:16.4181244Z * [new branch] gh/bobrenjc93/628/base -> origin/gh/bobrenjc93/628/base 
2025-10-10T00:44:16.4183113Z * [new branch] gh/bobrenjc93/628/head -> origin/gh/bobrenjc93/628/head 2025-10-10T00:44:16.4184924Z * [new branch] gh/bobrenjc93/628/orig -> origin/gh/bobrenjc93/628/orig 2025-10-10T00:44:16.4187410Z * [new branch] gh/bobrenjc93/629/base -> origin/gh/bobrenjc93/629/base 2025-10-10T00:44:16.4189456Z * [new branch] gh/bobrenjc93/629/head -> origin/gh/bobrenjc93/629/head 2025-10-10T00:44:16.4191292Z * [new branch] gh/bobrenjc93/629/orig -> origin/gh/bobrenjc93/629/orig 2025-10-10T00:44:16.4193983Z * [new branch] gh/bobrenjc93/630/base -> origin/gh/bobrenjc93/630/base 2025-10-10T00:44:16.4195867Z * [new branch] gh/bobrenjc93/630/head -> origin/gh/bobrenjc93/630/head 2025-10-10T00:44:16.4198235Z * [new branch] gh/bobrenjc93/630/orig -> origin/gh/bobrenjc93/630/orig 2025-10-10T00:44:16.4201228Z * [new branch] gh/bobrenjc93/631/base -> origin/gh/bobrenjc93/631/base 2025-10-10T00:44:16.4202924Z * [new branch] gh/bobrenjc93/631/head -> origin/gh/bobrenjc93/631/head 2025-10-10T00:44:16.4204799Z * [new branch] gh/bobrenjc93/631/orig -> origin/gh/bobrenjc93/631/orig 2025-10-10T00:44:16.4208011Z * [new branch] gh/bobrenjc93/632/base -> origin/gh/bobrenjc93/632/base 2025-10-10T00:44:16.4209623Z * [new branch] gh/bobrenjc93/632/head -> origin/gh/bobrenjc93/632/head 2025-10-10T00:44:16.4211494Z * [new branch] gh/bobrenjc93/632/orig -> origin/gh/bobrenjc93/632/orig 2025-10-10T00:44:16.4214627Z * [new branch] gh/bobrenjc93/633/base -> origin/gh/bobrenjc93/633/base 2025-10-10T00:44:16.4216654Z * [new branch] gh/bobrenjc93/633/head -> origin/gh/bobrenjc93/633/head 2025-10-10T00:44:16.4218474Z * [new branch] gh/bobrenjc93/633/orig -> origin/gh/bobrenjc93/633/orig 2025-10-10T00:44:16.4221045Z * [new branch] gh/bobrenjc93/634/base -> origin/gh/bobrenjc93/634/base 2025-10-10T00:44:16.4222998Z * [new branch] gh/bobrenjc93/634/head -> origin/gh/bobrenjc93/634/head 2025-10-10T00:44:16.4225002Z * [new branch] gh/bobrenjc93/634/orig -> origin/gh/bobrenjc93/634/orig 
2025-10-10T00:44:16.4227357Z * [new branch] gh/bobrenjc93/635/base -> origin/gh/bobrenjc93/635/base 2025-10-10T00:44:16.4229212Z * [new branch] gh/bobrenjc93/635/head -> origin/gh/bobrenjc93/635/head 2025-10-10T00:44:16.4231128Z * [new branch] gh/bobrenjc93/635/orig -> origin/gh/bobrenjc93/635/orig 2025-10-10T00:44:16.4233747Z * [new branch] gh/bobrenjc93/636/base -> origin/gh/bobrenjc93/636/base 2025-10-10T00:44:16.4235659Z * [new branch] gh/bobrenjc93/636/head -> origin/gh/bobrenjc93/636/head 2025-10-10T00:44:16.4237552Z * [new branch] gh/bobrenjc93/636/orig -> origin/gh/bobrenjc93/636/orig 2025-10-10T00:44:16.4240178Z * [new branch] gh/bobrenjc93/637/base -> origin/gh/bobrenjc93/637/base 2025-10-10T00:44:16.4242080Z * [new branch] gh/bobrenjc93/637/head -> origin/gh/bobrenjc93/637/head 2025-10-10T00:44:16.4243982Z * [new branch] gh/bobrenjc93/637/orig -> origin/gh/bobrenjc93/637/orig 2025-10-10T00:44:16.4246648Z * [new branch] gh/bobrenjc93/638/base -> origin/gh/bobrenjc93/638/base 2025-10-10T00:44:16.4248973Z * [new branch] gh/bobrenjc93/638/head -> origin/gh/bobrenjc93/638/head 2025-10-10T00:44:16.4250800Z * [new branch] gh/bobrenjc93/638/orig -> origin/gh/bobrenjc93/638/orig 2025-10-10T00:44:16.4253518Z * [new branch] gh/bobrenjc93/639/base -> origin/gh/bobrenjc93/639/base 2025-10-10T00:44:16.4255330Z * [new branch] gh/bobrenjc93/639/head -> origin/gh/bobrenjc93/639/head 2025-10-10T00:44:16.4257172Z * [new branch] gh/bobrenjc93/639/orig -> origin/gh/bobrenjc93/639/orig 2025-10-10T00:44:16.4259716Z * [new branch] gh/bobrenjc93/640/base -> origin/gh/bobrenjc93/640/base 2025-10-10T00:44:16.4261574Z * [new branch] gh/bobrenjc93/640/head -> origin/gh/bobrenjc93/640/head 2025-10-10T00:44:16.4263387Z * [new branch] gh/bobrenjc93/640/orig -> origin/gh/bobrenjc93/640/orig 2025-10-10T00:44:16.4266094Z * [new branch] gh/bobrenjc93/641/base -> origin/gh/bobrenjc93/641/base 2025-10-10T00:44:16.4268096Z * [new branch] gh/bobrenjc93/641/head -> origin/gh/bobrenjc93/641/head 
2025-10-10T00:44:16.4269946Z * [new branch] gh/bobrenjc93/641/orig -> origin/gh/bobrenjc93/641/orig 2025-10-10T00:44:16.4272589Z * [new branch] gh/bobrenjc93/642/base -> origin/gh/bobrenjc93/642/base 2025-10-10T00:44:16.4274486Z * [new branch] gh/bobrenjc93/642/head -> origin/gh/bobrenjc93/642/head 2025-10-10T00:44:16.4276324Z * [new branch] gh/bobrenjc93/642/orig -> origin/gh/bobrenjc93/642/orig 2025-10-10T00:44:16.4279021Z * [new branch] gh/bobrenjc93/643/base -> origin/gh/bobrenjc93/643/base 2025-10-10T00:44:16.4280952Z * [new branch] gh/bobrenjc93/643/head -> origin/gh/bobrenjc93/643/head 2025-10-10T00:44:16.4282747Z * [new branch] gh/bobrenjc93/643/orig -> origin/gh/bobrenjc93/643/orig 2025-10-10T00:44:16.4285369Z * [new branch] gh/bobrenjc93/644/base -> origin/gh/bobrenjc93/644/base 2025-10-10T00:44:16.4287246Z * [new branch] gh/bobrenjc93/644/head -> origin/gh/bobrenjc93/644/head 2025-10-10T00:44:16.4289235Z * [new branch] gh/bobrenjc93/644/orig -> origin/gh/bobrenjc93/644/orig 2025-10-10T00:44:16.4291838Z * [new branch] gh/bobrenjc93/645/base -> origin/gh/bobrenjc93/645/base 2025-10-10T00:44:16.4293858Z * [new branch] gh/bobrenjc93/645/head -> origin/gh/bobrenjc93/645/head 2025-10-10T00:44:16.4295703Z * [new branch] gh/bobrenjc93/645/orig -> origin/gh/bobrenjc93/645/orig 2025-10-10T00:44:16.4298575Z * [new branch] gh/bobrenjc93/646/base -> origin/gh/bobrenjc93/646/base 2025-10-10T00:44:16.4302806Z * [new branch] gh/bobrenjc93/646/head -> origin/gh/bobrenjc93/646/head 2025-10-10T00:44:16.4304537Z * [new branch] gh/bobrenjc93/646/orig -> origin/gh/bobrenjc93/646/orig 2025-10-10T00:44:16.4307333Z * [new branch] gh/bobrenjc93/647/base -> origin/gh/bobrenjc93/647/base 2025-10-10T00:44:16.4309261Z * [new branch] gh/bobrenjc93/647/head -> origin/gh/bobrenjc93/647/head 2025-10-10T00:44:16.4311614Z * [new branch] gh/bobrenjc93/647/orig -> origin/gh/bobrenjc93/647/orig 2025-10-10T00:44:16.4313977Z * [new branch] gh/bobrenjc93/648/base -> origin/gh/bobrenjc93/648/base 
2025-10-10T00:44:16.4315879Z * [new branch] gh/bobrenjc93/648/head -> origin/gh/bobrenjc93/648/head 2025-10-10T00:44:16.4317674Z * [new branch] gh/bobrenjc93/648/orig -> origin/gh/bobrenjc93/648/orig 2025-10-10T00:44:16.4320214Z * [new branch] gh/bobrenjc93/649/base -> origin/gh/bobrenjc93/649/base 2025-10-10T00:44:16.4322271Z * [new branch] gh/bobrenjc93/649/head -> origin/gh/bobrenjc93/649/head 2025-10-10T00:44:16.4324032Z * [new branch] gh/bobrenjc93/649/orig -> origin/gh/bobrenjc93/649/orig 2025-10-10T00:44:16.4326683Z * [new branch] gh/bobrenjc93/650/base -> origin/gh/bobrenjc93/650/base 2025-10-10T00:44:16.4328728Z * [new branch] gh/bobrenjc93/650/head -> origin/gh/bobrenjc93/650/head 2025-10-10T00:44:16.4330542Z * [new branch] gh/bobrenjc93/650/orig -> origin/gh/bobrenjc93/650/orig 2025-10-10T00:44:16.4333873Z * [new branch] gh/briancoutinho/2/base -> origin/gh/briancoutinho/2/base 2025-10-10T00:44:16.4335827Z * [new branch] gh/briancoutinho/2/head -> origin/gh/briancoutinho/2/head 2025-10-10T00:44:16.4339455Z * [new branch] gh/c00w/23/base -> origin/gh/c00w/23/base 2025-10-10T00:44:16.4340883Z * [new branch] gh/c00w/23/head -> origin/gh/c00w/23/head 2025-10-10T00:44:16.4343541Z * [new branch] gh/c00w/53/base -> origin/gh/c00w/53/base 2025-10-10T00:44:16.4345524Z * [new branch] gh/c00w/53/head -> origin/gh/c00w/53/head 2025-10-10T00:44:16.4347318Z * [new branch] gh/c00w/53/orig -> origin/gh/c00w/53/orig 2025-10-10T00:44:16.4349749Z * [new branch] gh/c00w/54/base -> origin/gh/c00w/54/base 2025-10-10T00:44:16.4351606Z * [new branch] gh/c00w/54/head -> origin/gh/c00w/54/head 2025-10-10T00:44:16.4353526Z * [new branch] gh/c00w/54/orig -> origin/gh/c00w/54/orig 2025-10-10T00:44:16.4356450Z * [new branch] gh/c00w/57/base -> origin/gh/c00w/57/base 2025-10-10T00:44:16.4358876Z * [new branch] gh/c00w/57/head -> origin/gh/c00w/57/head 2025-10-10T00:44:16.4360584Z * [new branch] gh/c00w/57/orig -> origin/gh/c00w/57/orig 2025-10-10T00:44:16.4363652Z * [new branch] 
gh/clee2000/1/base -> origin/gh/clee2000/1/base 2025-10-10T00:44:16.4365605Z * [new branch] gh/clee2000/1/head -> origin/gh/clee2000/1/head 2025-10-10T00:44:16.4367582Z * [new branch] gh/clee2000/1/orig -> origin/gh/clee2000/1/orig 2025-10-10T00:44:16.4370983Z * [new branch] gh/coconutruben/1/base -> origin/gh/coconutruben/1/base 2025-10-10T00:44:16.4372926Z * [new branch] gh/coconutruben/1/head -> origin/gh/coconutruben/1/head 2025-10-10T00:44:16.4375702Z * [new branch] gh/coconutruben/20/base -> origin/gh/coconutruben/20/base 2025-10-10T00:44:16.4377644Z * [new branch] gh/coconutruben/20/head -> origin/gh/coconutruben/20/head 2025-10-10T00:44:16.4379589Z * [new branch] gh/coconutruben/20/orig -> origin/gh/coconutruben/20/orig 2025-10-10T00:44:16.4382289Z * [new branch] gh/coconutruben/22/base -> origin/gh/coconutruben/22/base 2025-10-10T00:44:16.4384005Z * [new branch] gh/coconutruben/22/head -> origin/gh/coconutruben/22/head 2025-10-10T00:44:16.4385983Z * [new branch] gh/coconutruben/22/orig -> origin/gh/coconutruben/22/orig 2025-10-10T00:44:16.4388841Z * [new branch] gh/coconutruben/24/base -> origin/gh/coconutruben/24/base 2025-10-10T00:44:16.4390886Z * [new branch] gh/coconutruben/24/head -> origin/gh/coconutruben/24/head 2025-10-10T00:44:16.4392846Z * [new branch] gh/coconutruben/24/orig -> origin/gh/coconutruben/24/orig 2025-10-10T00:44:16.4395901Z * [new branch] gh/coconutruben/25/base -> origin/gh/coconutruben/25/base 2025-10-10T00:44:16.4398169Z * [new branch] gh/coconutruben/25/head -> origin/gh/coconutruben/25/head 2025-10-10T00:44:16.4400673Z * [new branch] gh/coconutruben/25/orig -> origin/gh/coconutruben/25/orig 2025-10-10T00:44:16.4404489Z * [new branch] gh/coconutruben/36/base -> origin/gh/coconutruben/36/base 2025-10-10T00:44:16.4406833Z * [new branch] gh/coconutruben/36/head -> origin/gh/coconutruben/36/head 2025-10-10T00:44:16.4409702Z * [new branch] gh/coconutruben/36/orig -> origin/gh/coconutruben/36/orig 2025-10-10T00:44:16.4412379Z * [new 
branch] gh/coconutruben/48/base -> origin/gh/coconutruben/48/base 2025-10-10T00:44:16.4414372Z * [new branch] gh/coconutruben/48/head -> origin/gh/coconutruben/48/head 2025-10-10T00:44:16.4416372Z * [new branch] gh/coconutruben/48/orig -> origin/gh/coconutruben/48/orig 2025-10-10T00:44:16.4418982Z * [new branch] gh/coconutruben/49/base -> origin/gh/coconutruben/49/base 2025-10-10T00:44:16.4420941Z * [new branch] gh/coconutruben/49/head -> origin/gh/coconutruben/49/head 2025-10-10T00:44:16.4422854Z * [new branch] gh/coconutruben/49/orig -> origin/gh/coconutruben/49/orig 2025-10-10T00:44:16.4426233Z * [new branch] gh/coconutruben/50/base -> origin/gh/coconutruben/50/base 2025-10-10T00:44:16.4428229Z * [new branch] gh/coconutruben/50/head -> origin/gh/coconutruben/50/head 2025-10-10T00:44:16.4430169Z * [new branch] gh/coconutruben/50/orig -> origin/gh/coconutruben/50/orig 2025-10-10T00:44:16.4433053Z * [new branch] gh/coconutruben/51/base -> origin/gh/coconutruben/51/base 2025-10-10T00:44:16.4434940Z * [new branch] gh/coconutruben/51/head -> origin/gh/coconutruben/51/head 2025-10-10T00:44:16.4436905Z * [new branch] gh/coconutruben/51/orig -> origin/gh/coconutruben/51/orig 2025-10-10T00:44:16.4439682Z * [new branch] gh/coconutruben/52/base -> origin/gh/coconutruben/52/base 2025-10-10T00:44:16.4441669Z * [new branch] gh/coconutruben/52/head -> origin/gh/coconutruben/52/head 2025-10-10T00:44:16.4443704Z * [new branch] gh/coconutruben/52/orig -> origin/gh/coconutruben/52/orig 2025-10-10T00:44:16.4446362Z * [new branch] gh/coconutruben/53/base -> origin/gh/coconutruben/53/base 2025-10-10T00:44:16.4448559Z * [new branch] gh/coconutruben/53/head -> origin/gh/coconutruben/53/head 2025-10-10T00:44:16.4450652Z * [new branch] gh/coconutruben/53/orig -> origin/gh/coconutruben/53/orig 2025-10-10T00:44:16.4453513Z * [new branch] gh/coconutruben/54/base -> origin/gh/coconutruben/54/base 2025-10-10T00:44:16.4455502Z * [new branch] gh/coconutruben/54/head -> 
origin/gh/coconutruben/54/head 2025-10-10T00:44:16.4457460Z * [new branch] gh/coconutruben/54/orig -> origin/gh/coconutruben/54/orig 2025-10-10T00:44:16.4460252Z * [new branch] gh/coconutruben/55/base -> origin/gh/coconutruben/55/base 2025-10-10T00:44:16.4462305Z * [new branch] gh/coconutruben/55/head -> origin/gh/coconutruben/55/head 2025-10-10T00:44:16.4464212Z * [new branch] gh/coconutruben/55/orig -> origin/gh/coconutruben/55/orig 2025-10-10T00:44:16.4466815Z * [new branch] gh/coconutruben/56/base -> origin/gh/coconutruben/56/base 2025-10-10T00:44:16.4468780Z * [new branch] gh/coconutruben/56/head -> origin/gh/coconutruben/56/head 2025-10-10T00:44:16.4470724Z * [new branch] gh/coconutruben/56/orig -> origin/gh/coconutruben/56/orig 2025-10-10T00:44:16.4473498Z * [new branch] gh/coconutruben/57/base -> origin/gh/coconutruben/57/base 2025-10-10T00:44:16.4475529Z * [new branch] gh/coconutruben/57/head -> origin/gh/coconutruben/57/head 2025-10-10T00:44:16.4477463Z * [new branch] gh/coconutruben/57/orig -> origin/gh/coconutruben/57/orig 2025-10-10T00:44:16.4480500Z * [new branch] gh/coconutruben/58/base -> origin/gh/coconutruben/58/base 2025-10-10T00:44:16.4482498Z * [new branch] gh/coconutruben/58/head -> origin/gh/coconutruben/58/head 2025-10-10T00:44:16.4484412Z * [new branch] gh/coconutruben/58/orig -> origin/gh/coconutruben/58/orig 2025-10-10T00:44:16.4487129Z * [new branch] gh/coconutruben/59/base -> origin/gh/coconutruben/59/base 2025-10-10T00:44:16.4489000Z * [new branch] gh/coconutruben/59/head -> origin/gh/coconutruben/59/head 2025-10-10T00:44:16.4490850Z * [new branch] gh/coconutruben/59/orig -> origin/gh/coconutruben/59/orig 2025-10-10T00:44:16.4494257Z * [new branch] gh/coconutruben/62/base -> origin/gh/coconutruben/62/base 2025-10-10T00:44:16.4496303Z * [new branch] gh/coconutruben/62/head -> origin/gh/coconutruben/62/head 2025-10-10T00:44:16.4498307Z * [new branch] gh/coconutruben/62/orig -> origin/gh/coconutruben/62/orig 2025-10-10T00:44:16.4503825Z * 
[new branch] gh/coconutruben/64/base -> origin/gh/coconutruben/64/base 2025-10-10T00:44:16.4505746Z * [new branch] gh/coconutruben/64/head -> origin/gh/coconutruben/64/head 2025-10-10T00:44:16.4507550Z * [new branch] gh/coconutruben/64/orig -> origin/gh/coconutruben/64/orig 2025-10-10T00:44:16.4510425Z * [new branch] gh/coconutruben/65/base -> origin/gh/coconutruben/65/base 2025-10-10T00:44:16.4512364Z * [new branch] gh/coconutruben/65/head -> origin/gh/coconutruben/65/head 2025-10-10T00:44:16.4514288Z * [new branch] gh/coconutruben/65/orig -> origin/gh/coconutruben/65/orig 2025-10-10T00:44:16.4516996Z * [new branch] gh/coconutruben/66/base -> origin/gh/coconutruben/66/base 2025-10-10T00:44:16.4518940Z * [new branch] gh/coconutruben/66/head -> origin/gh/coconutruben/66/head 2025-10-10T00:44:16.4520864Z * [new branch] gh/coconutruben/66/orig -> origin/gh/coconutruben/66/orig 2025-10-10T00:44:16.4523590Z * [new branch] gh/coconutruben/67/base -> origin/gh/coconutruben/67/base 2025-10-10T00:44:16.4525531Z * [new branch] gh/coconutruben/67/head -> origin/gh/coconutruben/67/head 2025-10-10T00:44:16.4527481Z * [new branch] gh/coconutruben/67/orig -> origin/gh/coconutruben/67/orig 2025-10-10T00:44:16.4530318Z * [new branch] gh/coconutruben/68/base -> origin/gh/coconutruben/68/base 2025-10-10T00:44:16.4532271Z * [new branch] gh/coconutruben/68/head -> origin/gh/coconutruben/68/head 2025-10-10T00:44:16.4534145Z * [new branch] gh/coconutruben/68/orig -> origin/gh/coconutruben/68/orig 2025-10-10T00:44:16.4536992Z * [new branch] gh/coconutruben/69/base -> origin/gh/coconutruben/69/base 2025-10-10T00:44:16.4538930Z * [new branch] gh/coconutruben/69/head -> origin/gh/coconutruben/69/head 2025-10-10T00:44:16.4540975Z * [new branch] gh/coconutruben/69/orig -> origin/gh/coconutruben/69/orig 2025-10-10T00:44:16.4543654Z * [new branch] gh/coconutruben/70/base -> origin/gh/coconutruben/70/base 2025-10-10T00:44:16.4545485Z * [new branch] gh/coconutruben/70/head -> 
origin/gh/coconutruben/70/head 2025-10-10T00:44:16.4547415Z * [new branch] gh/coconutruben/70/orig -> origin/gh/coconutruben/70/orig 2025-10-10T00:44:16.4549893Z * [new branch] gh/coconutruben/71/base -> origin/gh/coconutruben/71/base 2025-10-10T00:44:16.4551876Z * [new branch] gh/coconutruben/71/head -> origin/gh/coconutruben/71/head 2025-10-10T00:44:16.4553793Z * [new branch] gh/coconutruben/71/orig -> origin/gh/coconutruben/71/orig 2025-10-10T00:44:16.4556313Z * [new branch] gh/coconutruben/72/base -> origin/gh/coconutruben/72/base 2025-10-10T00:44:16.4558335Z * [new branch] gh/coconutruben/72/head -> origin/gh/coconutruben/72/head 2025-10-10T00:44:16.4560227Z * [new branch] gh/coconutruben/72/orig -> origin/gh/coconutruben/72/orig 2025-10-10T00:44:16.4562876Z * [new branch] gh/coconutruben/73/base -> origin/gh/coconutruben/73/base 2025-10-10T00:44:16.4564785Z * [new branch] gh/coconutruben/73/head -> origin/gh/coconutruben/73/head 2025-10-10T00:44:16.4566712Z * [new branch] gh/coconutruben/73/orig -> origin/gh/coconutruben/73/orig 2025-10-10T00:44:16.4569727Z * [new branch] gh/coconutruben/74/base -> origin/gh/coconutruben/74/base 2025-10-10T00:44:16.4571711Z * [new branch] gh/coconutruben/74/head -> origin/gh/coconutruben/74/head 2025-10-10T00:44:16.4573641Z * [new branch] gh/coconutruben/74/orig -> origin/gh/coconutruben/74/orig 2025-10-10T00:44:16.4576348Z * [new branch] gh/coconutruben/75/base -> origin/gh/coconutruben/75/base 2025-10-10T00:44:16.4578357Z * [new branch] gh/coconutruben/75/head -> origin/gh/coconutruben/75/head 2025-10-10T00:44:16.4580270Z * [new branch] gh/coconutruben/75/orig -> origin/gh/coconutruben/75/orig 2025-10-10T00:44:16.4583058Z * [new branch] gh/coconutruben/76/base -> origin/gh/coconutruben/76/base 2025-10-10T00:44:16.4585076Z * [new branch] gh/coconutruben/76/head -> origin/gh/coconutruben/76/head 2025-10-10T00:44:16.4587002Z * [new branch] gh/coconutruben/76/orig -> origin/gh/coconutruben/76/orig 2025-10-10T00:44:16.4589802Z * 
[new branch] gh/coconutruben/77/base -> origin/gh/coconutruben/77/base 2025-10-10T00:44:16.4591801Z * [new branch] gh/coconutruben/77/head -> origin/gh/coconutruben/77/head 2025-10-10T00:44:16.4593685Z * [new branch] gh/coconutruben/77/orig -> origin/gh/coconutruben/77/orig 2025-10-10T00:44:16.4596328Z * [new branch] gh/coconutruben/78/base -> origin/gh/coconutruben/78/base 2025-10-10T00:44:16.4598501Z * [new branch] gh/coconutruben/78/head -> origin/gh/coconutruben/78/head 2025-10-10T00:44:16.4601087Z * [new branch] gh/coconutruben/78/orig -> origin/gh/coconutruben/78/orig 2025-10-10T00:44:16.4603678Z * [new branch] gh/coconutruben/79/base -> origin/gh/coconutruben/79/base 2025-10-10T00:44:16.4605759Z * [new branch] gh/coconutruben/79/head -> origin/gh/coconutruben/79/head 2025-10-10T00:44:16.4607761Z * [new branch] gh/coconutruben/79/orig -> origin/gh/coconutruben/79/orig 2025-10-10T00:44:16.4610538Z * [new branch] gh/coconutruben/80/base -> origin/gh/coconutruben/80/base 2025-10-10T00:44:16.4612513Z * [new branch] gh/coconutruben/80/head -> origin/gh/coconutruben/80/head 2025-10-10T00:44:16.4614422Z * [new branch] gh/coconutruben/80/orig -> origin/gh/coconutruben/80/orig 2025-10-10T00:44:16.4617357Z * [new branch] gh/coconutruben/81/base -> origin/gh/coconutruben/81/base 2025-10-10T00:44:16.4619167Z * [new branch] gh/coconutruben/81/head -> origin/gh/coconutruben/81/head 2025-10-10T00:44:16.4621092Z * [new branch] gh/coconutruben/81/orig -> origin/gh/coconutruben/81/orig 2025-10-10T00:44:16.4623607Z * [new branch] gh/coconutruben/82/base -> origin/gh/coconutruben/82/base 2025-10-10T00:44:16.4625502Z * [new branch] gh/coconutruben/82/head -> origin/gh/coconutruben/82/head 2025-10-10T00:44:16.4627440Z * [new branch] gh/coconutruben/82/orig -> origin/gh/coconutruben/82/orig 2025-10-10T00:44:16.4629916Z * [new branch] gh/coconutruben/83/base -> origin/gh/coconutruben/83/base 2025-10-10T00:44:16.4631844Z * [new branch] gh/coconutruben/83/head -> 
origin/gh/coconutruben/83/head 2025-10-10T00:44:16.4633757Z * [new branch] gh/coconutruben/83/orig -> origin/gh/coconutruben/83/orig 2025-10-10T00:44:16.4636978Z * [new branch] gh/colinchan15/1/base -> origin/gh/colinchan15/1/base 2025-10-10T00:44:16.4638860Z * [new branch] gh/colinchan15/1/head -> origin/gh/colinchan15/1/head 2025-10-10T00:44:16.4641377Z * [new branch] gh/colinchan15/2/base -> origin/gh/colinchan15/2/base 2025-10-10T00:44:16.4643321Z * [new branch] gh/colinchan15/2/head -> origin/gh/colinchan15/2/head 2025-10-10T00:44:16.4645619Z * [new branch] gh/colinchan15/3/base -> origin/gh/colinchan15/3/base 2025-10-10T00:44:16.4647474Z * [new branch] gh/colinchan15/3/head -> origin/gh/colinchan15/3/head 2025-10-10T00:44:16.4650097Z * [new branch] gh/colinchan15/6/base -> origin/gh/colinchan15/6/base 2025-10-10T00:44:16.4652003Z * [new branch] gh/colinchan15/6/head -> origin/gh/colinchan15/6/head 2025-10-10T00:44:16.4655273Z * [new branch] gh/davidberard98/382/base -> origin/gh/davidberard98/382/base 2025-10-10T00:44:16.4657332Z * [new branch] gh/davidberard98/382/head -> origin/gh/davidberard98/382/head 2025-10-10T00:44:16.4659228Z * [new branch] gh/davidberard98/382/orig -> origin/gh/davidberard98/382/orig 2025-10-10T00:44:16.4662146Z * [new branch] gh/davidberard98/386/base -> origin/gh/davidberard98/386/base 2025-10-10T00:44:16.4664137Z * [new branch] gh/davidberard98/386/head -> origin/gh/davidberard98/386/head 2025-10-10T00:44:16.4666096Z * [new branch] gh/davidberard98/386/orig -> origin/gh/davidberard98/386/orig 2025-10-10T00:44:16.4668804Z * [new branch] gh/davidberard98/391/base -> origin/gh/davidberard98/391/base 2025-10-10T00:44:16.4670961Z * [new branch] gh/davidberard98/391/head -> origin/gh/davidberard98/391/head 2025-10-10T00:44:16.4672704Z * [new branch] gh/davidberard98/391/orig -> origin/gh/davidberard98/391/orig 2025-10-10T00:44:16.4675497Z * [new branch] gh/davidberard98/392/base -> origin/gh/davidberard98/392/base 
2025-10-10T00:44:16.4677862Z * [new branch] gh/davidberard98/392/head -> origin/gh/davidberard98/392/head 2025-10-10T00:44:16.4679935Z * [new branch] gh/davidberard98/392/orig -> origin/gh/davidberard98/392/orig 2025-10-10T00:44:16.4683510Z * [new branch] gh/davidberard98/399/base -> origin/gh/davidberard98/399/base 2025-10-10T00:44:16.4685433Z * [new branch] gh/davidberard98/399/head -> origin/gh/davidberard98/399/head 2025-10-10T00:44:16.4687372Z * [new branch] gh/davidberard98/399/orig -> origin/gh/davidberard98/399/orig 2025-10-10T00:44:16.4690077Z * [new branch] gh/davidberard98/401/base -> origin/gh/davidberard98/401/base 2025-10-10T00:44:16.4691996Z * [new branch] gh/davidberard98/401/head -> origin/gh/davidberard98/401/head 2025-10-10T00:44:16.4694073Z * [new branch] gh/davidberard98/401/orig -> origin/gh/davidberard98/401/orig 2025-10-10T00:44:16.4697497Z * [new branch] gh/davidberard98/405/base -> origin/gh/davidberard98/405/base 2025-10-10T00:44:16.4698162Z * [new branch] gh/davidberard98/405/head -> origin/gh/davidberard98/405/head 2025-10-10T00:44:16.4700975Z * [new branch] gh/davidberard98/405/orig -> origin/gh/davidberard98/405/orig 2025-10-10T00:44:16.4703270Z * [new branch] gh/davidberard98/410/base -> origin/gh/davidberard98/410/base 2025-10-10T00:44:16.4705107Z * [new branch] gh/davidberard98/410/head -> origin/gh/davidberard98/410/head 2025-10-10T00:44:16.4706970Z * [new branch] gh/davidberard98/410/orig -> origin/gh/davidberard98/410/orig 2025-10-10T00:44:16.4709625Z * [new branch] gh/davidberard98/411/base -> origin/gh/davidberard98/411/base 2025-10-10T00:44:16.4711476Z * [new branch] gh/davidberard98/411/head -> origin/gh/davidberard98/411/head 2025-10-10T00:44:16.4713355Z * [new branch] gh/davidberard98/411/orig -> origin/gh/davidberard98/411/orig 2025-10-10T00:44:16.4715919Z * [new branch] gh/davidberard98/412/base -> origin/gh/davidberard98/412/base 2025-10-10T00:44:16.4717865Z * [new branch] gh/davidberard98/412/head -> 
origin/gh/davidberard98/412/head 2025-10-10T00:44:16.4719863Z * [new branch] gh/davidberard98/412/orig -> origin/gh/davidberard98/412/orig 2025-10-10T00:44:16.4722977Z * [new branch] gh/desertfire/594/base -> origin/gh/desertfire/594/base 2025-10-10T00:44:16.4724863Z * [new branch] gh/desertfire/594/head -> origin/gh/desertfire/594/head 2025-10-10T00:44:16.4726808Z * [new branch] gh/desertfire/594/orig -> origin/gh/desertfire/594/orig 2025-10-10T00:44:16.4729667Z * [new branch] gh/desertfire/595/base -> origin/gh/desertfire/595/base 2025-10-10T00:44:16.4731557Z * [new branch] gh/desertfire/595/head -> origin/gh/desertfire/595/head 2025-10-10T00:44:16.4733425Z * [new branch] gh/desertfire/595/orig -> origin/gh/desertfire/595/orig 2025-10-10T00:44:16.4736009Z * [new branch] gh/desertfire/597/base -> origin/gh/desertfire/597/base 2025-10-10T00:44:16.4737916Z * [new branch] gh/desertfire/597/head -> origin/gh/desertfire/597/head 2025-10-10T00:44:16.4740177Z * [new branch] gh/desertfire/597/orig -> origin/gh/desertfire/597/orig 2025-10-10T00:44:16.4742745Z * [new branch] gh/desertfire/598/base -> origin/gh/desertfire/598/base 2025-10-10T00:44:16.4744866Z * [new branch] gh/desertfire/598/head -> origin/gh/desertfire/598/head 2025-10-10T00:44:16.4746766Z * [new branch] gh/desertfire/598/orig -> origin/gh/desertfire/598/orig 2025-10-10T00:44:16.4749133Z * [new branch] gh/desertfire/599/base -> origin/gh/desertfire/599/base 2025-10-10T00:44:16.4751068Z * [new branch] gh/desertfire/599/head -> origin/gh/desertfire/599/head 2025-10-10T00:44:16.4752909Z * [new branch] gh/desertfire/599/orig -> origin/gh/desertfire/599/orig 2025-10-10T00:44:16.4755540Z * [new branch] gh/desertfire/600/base -> origin/gh/desertfire/600/base 2025-10-10T00:44:16.4757565Z * [new branch] gh/desertfire/600/head -> origin/gh/desertfire/600/head 2025-10-10T00:44:16.4759903Z * [new branch] gh/desertfire/600/orig -> origin/gh/desertfire/600/orig 2025-10-10T00:44:16.4762523Z * [new branch] 
gh/desertfire/601/base -> origin/gh/desertfire/601/base 2025-10-10T00:44:16.4764425Z * [new branch] gh/desertfire/601/head -> origin/gh/desertfire/601/head 2025-10-10T00:44:16.4766273Z * [new branch] gh/desertfire/601/orig -> origin/gh/desertfire/601/orig 2025-10-10T00:44:16.4769754Z * [new branch] gh/dharakk/1/base -> origin/gh/dharakk/1/base 2025-10-10T00:44:16.4771856Z * [new branch] gh/dharakk/1/head -> origin/gh/dharakk/1/head 2025-10-10T00:44:16.4774873Z * [new branch] gh/drisspg/159/base -> origin/gh/drisspg/159/base 2025-10-10T00:44:16.4776754Z * [new branch] gh/drisspg/159/head -> origin/gh/drisspg/159/head 2025-10-10T00:44:16.4778610Z * [new branch] gh/drisspg/159/orig -> origin/gh/drisspg/159/orig 2025-10-10T00:44:16.4781217Z * [new branch] gh/drisspg/166/base -> origin/gh/drisspg/166/base 2025-10-10T00:44:16.4783075Z * [new branch] gh/drisspg/166/head -> origin/gh/drisspg/166/head 2025-10-10T00:44:16.4784918Z * [new branch] gh/drisspg/166/orig -> origin/gh/drisspg/166/orig 2025-10-10T00:44:16.4787462Z * [new branch] gh/drisspg/170/base -> origin/gh/drisspg/170/base 2025-10-10T00:44:16.4789552Z * [new branch] gh/drisspg/170/head -> origin/gh/drisspg/170/head 2025-10-10T00:44:16.4791299Z * [new branch] gh/drisspg/170/orig -> origin/gh/drisspg/170/orig 2025-10-10T00:44:16.4794040Z * [new branch] gh/drisspg/177/base -> origin/gh/drisspg/177/base 2025-10-10T00:44:16.4798175Z * [new branch] gh/drisspg/177/head -> origin/gh/drisspg/177/head 2025-10-10T00:44:16.4798765Z * [new branch] gh/drisspg/177/orig -> origin/gh/drisspg/177/orig 2025-10-10T00:44:16.4800754Z * [new branch] gh/drisspg/178/base -> origin/gh/drisspg/178/base 2025-10-10T00:44:16.4802333Z * [new branch] gh/drisspg/178/head -> origin/gh/drisspg/178/head 2025-10-10T00:44:16.4804016Z * [new branch] gh/drisspg/178/orig -> origin/gh/drisspg/178/orig 2025-10-10T00:44:16.4806620Z * [new branch] gh/drisspg/182/base -> origin/gh/drisspg/182/base 2025-10-10T00:44:16.4808674Z * [new branch] 
gh/drisspg/182/head -> origin/gh/drisspg/182/head 2025-10-10T00:44:16.4811219Z * [new branch] gh/drisspg/183/base -> origin/gh/drisspg/183/base 2025-10-10T00:44:16.4813084Z * [new branch] gh/drisspg/183/head -> origin/gh/drisspg/183/head 2025-10-10T00:44:16.4815528Z * [new branch] gh/drisspg/184/base -> origin/gh/drisspg/184/base 2025-10-10T00:44:16.4817340Z * [new branch] gh/drisspg/184/head -> origin/gh/drisspg/184/head 2025-10-10T00:44:16.4820037Z * [new branch] gh/drisspg/185/base -> origin/gh/drisspg/185/base 2025-10-10T00:44:16.4821907Z * [new branch] gh/drisspg/185/head -> origin/gh/drisspg/185/head 2025-10-10T00:44:16.4824475Z * [new branch] gh/drisspg/187/base -> origin/gh/drisspg/187/base 2025-10-10T00:44:16.4826493Z * [new branch] gh/drisspg/187/head -> origin/gh/drisspg/187/head 2025-10-10T00:44:16.4828311Z * [new branch] gh/drisspg/187/orig -> origin/gh/drisspg/187/orig 2025-10-10T00:44:16.4830823Z * [new branch] gh/drisspg/188/base -> origin/gh/drisspg/188/base 2025-10-10T00:44:16.4833006Z * [new branch] gh/drisspg/188/head -> origin/gh/drisspg/188/head 2025-10-10T00:44:16.4834262Z * [new branch] gh/drisspg/188/orig -> origin/gh/drisspg/188/orig 2025-10-10T00:44:16.4837284Z * [new branch] gh/drisspg/189/base -> origin/gh/drisspg/189/base 2025-10-10T00:44:16.4838979Z * [new branch] gh/drisspg/189/head -> origin/gh/drisspg/189/head 2025-10-10T00:44:16.4840867Z * [new branch] gh/drisspg/189/orig -> origin/gh/drisspg/189/orig 2025-10-10T00:44:16.4843518Z * [new branch] gh/drisspg/193/base -> origin/gh/drisspg/193/base 2025-10-10T00:44:16.4845350Z * [new branch] gh/drisspg/193/head -> origin/gh/drisspg/193/head 2025-10-10T00:44:16.4847386Z * [new branch] gh/drisspg/193/orig -> origin/gh/drisspg/193/orig 2025-10-10T00:44:16.4849888Z * [new branch] gh/drisspg/194/base -> origin/gh/drisspg/194/base 2025-10-10T00:44:16.4851649Z * [new branch] gh/drisspg/194/head -> origin/gh/drisspg/194/head 2025-10-10T00:44:16.4853572Z * [new branch] gh/drisspg/194/orig -> 
origin/gh/drisspg/194/orig 2025-10-10T00:44:16.4856156Z * [new branch] gh/drisspg/196/base -> origin/gh/drisspg/196/base 2025-10-10T00:44:16.4858057Z * [new branch] gh/drisspg/196/head -> origin/gh/drisspg/196/head 2025-10-10T00:44:16.4859877Z * [new branch] gh/drisspg/196/orig -> origin/gh/drisspg/196/orig 2025-10-10T00:44:16.4862910Z * [new branch] gh/drisspg/197/base -> origin/gh/drisspg/197/base 2025-10-10T00:44:16.4865376Z * [new branch] gh/drisspg/197/head -> origin/gh/drisspg/197/head 2025-10-10T00:44:16.4867210Z * [new branch] gh/drisspg/197/orig -> origin/gh/drisspg/197/orig 2025-10-10T00:44:16.4870471Z * [new branch] gh/drisspg/198/base -> origin/gh/drisspg/198/base 2025-10-10T00:44:16.4872440Z * [new branch] gh/drisspg/198/head -> origin/gh/drisspg/198/head 2025-10-10T00:44:16.4874334Z * [new branch] gh/drisspg/198/orig -> origin/gh/drisspg/198/orig 2025-10-10T00:44:16.4876947Z * [new branch] gh/drisspg/199/base -> origin/gh/drisspg/199/base 2025-10-10T00:44:16.4878845Z * [new branch] gh/drisspg/199/head -> origin/gh/drisspg/199/head 2025-10-10T00:44:16.4880773Z * [new branch] gh/drisspg/199/orig -> origin/gh/drisspg/199/orig 2025-10-10T00:44:16.4883498Z * [new branch] gh/drisspg/200/base -> origin/gh/drisspg/200/base 2025-10-10T00:44:16.4885342Z * [new branch] gh/drisspg/200/head -> origin/gh/drisspg/200/head 2025-10-10T00:44:16.4887245Z * [new branch] gh/drisspg/200/orig -> origin/gh/drisspg/200/orig 2025-10-10T00:44:16.4890082Z * [new branch] gh/drisspg/201/base -> origin/gh/drisspg/201/base 2025-10-10T00:44:16.4892003Z * [new branch] gh/drisspg/201/head -> origin/gh/drisspg/201/head 2025-10-10T00:44:16.4893839Z * [new branch] gh/drisspg/201/orig -> origin/gh/drisspg/201/orig 2025-10-10T00:44:16.4896587Z * [new branch] gh/drisspg/202/base -> origin/gh/drisspg/202/base 2025-10-10T00:44:16.4898602Z * [new branch] gh/drisspg/202/head -> origin/gh/drisspg/202/head 2025-10-10T00:44:16.4902939Z * [new branch] gh/drisspg/202/orig -> 
origin/gh/drisspg/202/orig 2025-10-10T00:44:16.4905575Z * [new branch] gh/drisspg/203/base -> origin/gh/drisspg/203/base 2025-10-10T00:44:16.4907455Z * [new branch] gh/drisspg/203/head -> origin/gh/drisspg/203/head 2025-10-10T00:44:16.4909339Z * [new branch] gh/drisspg/203/orig -> origin/gh/drisspg/203/orig 2025-10-10T00:44:16.4911963Z * [new branch] gh/drisspg/204/base -> origin/gh/drisspg/204/base 2025-10-10T00:44:16.4913800Z * [new branch] gh/drisspg/204/head -> origin/gh/drisspg/204/head 2025-10-10T00:44:16.4915689Z * [new branch] gh/drisspg/204/orig -> origin/gh/drisspg/204/orig 2025-10-10T00:44:16.4919009Z * [new branch] gh/drisspg/205/base -> origin/gh/drisspg/205/base 2025-10-10T00:44:16.4920844Z * [new branch] gh/drisspg/205/head -> origin/gh/drisspg/205/head 2025-10-10T00:44:16.4922681Z * [new branch] gh/drisspg/205/orig -> origin/gh/drisspg/205/orig 2025-10-10T00:44:16.4925396Z * [new branch] gh/drisspg/206/base -> origin/gh/drisspg/206/base 2025-10-10T00:44:16.4927443Z * [new branch] gh/drisspg/206/head -> origin/gh/drisspg/206/head 2025-10-10T00:44:16.4929317Z * [new branch] gh/drisspg/206/orig -> origin/gh/drisspg/206/orig 2025-10-10T00:44:16.4931972Z * [new branch] gh/drisspg/207/base -> origin/gh/drisspg/207/base 2025-10-10T00:44:16.4933801Z * [new branch] gh/drisspg/207/head -> origin/gh/drisspg/207/head 2025-10-10T00:44:16.4935691Z * [new branch] gh/drisspg/207/orig -> origin/gh/drisspg/207/orig 2025-10-10T00:44:16.4938353Z * [new branch] gh/drisspg/208/base -> origin/gh/drisspg/208/base 2025-10-10T00:44:16.4940234Z * [new branch] gh/drisspg/208/head -> origin/gh/drisspg/208/head 2025-10-10T00:44:16.4942151Z * [new branch] gh/drisspg/208/orig -> origin/gh/drisspg/208/orig 2025-10-10T00:44:16.4944730Z * [new branch] gh/drisspg/209/base -> origin/gh/drisspg/209/base 2025-10-10T00:44:16.4946611Z * [new branch] gh/drisspg/209/head -> origin/gh/drisspg/209/head 2025-10-10T00:44:16.4948454Z * [new branch] gh/drisspg/209/orig -> 
origin/gh/drisspg/209/orig 2025-10-10T00:44:16.4952284Z * [new branch] gh/dsjohns2/1/base -> origin/gh/dsjohns2/1/base 2025-10-10T00:44:16.4954124Z * [new branch] gh/dsjohns2/1/head -> origin/gh/dsjohns2/1/head 2025-10-10T00:44:16.4957367Z * [new branch] gh/eellison/808/base -> origin/gh/eellison/808/base 2025-10-10T00:44:16.4959281Z * [new branch] gh/eellison/808/head -> origin/gh/eellison/808/head 2025-10-10T00:44:16.4961154Z * [new branch] gh/eellison/808/orig -> origin/gh/eellison/808/orig 2025-10-10T00:44:16.4963762Z * [new branch] gh/eellison/809/base -> origin/gh/eellison/809/base 2025-10-10T00:44:16.4965590Z * [new branch] gh/eellison/809/head -> origin/gh/eellison/809/head 2025-10-10T00:44:16.4967803Z * [new branch] gh/eellison/809/orig -> origin/gh/eellison/809/orig 2025-10-10T00:44:16.4970263Z * [new branch] gh/eellison/822/base -> origin/gh/eellison/822/base 2025-10-10T00:44:16.4972131Z * [new branch] gh/eellison/822/head -> origin/gh/eellison/822/head 2025-10-10T00:44:16.4974634Z * [new branch] gh/eellison/822/orig -> origin/gh/eellison/822/orig 2025-10-10T00:44:16.4977416Z * [new branch] gh/eellison/823/base -> origin/gh/eellison/823/base 2025-10-10T00:44:16.4979267Z * [new branch] gh/eellison/823/head -> origin/gh/eellison/823/head 2025-10-10T00:44:16.4981017Z * [new branch] gh/eellison/823/orig -> origin/gh/eellison/823/orig 2025-10-10T00:44:16.4983613Z * [new branch] gh/eellison/824/base -> origin/gh/eellison/824/base 2025-10-10T00:44:16.4985523Z * [new branch] gh/eellison/824/head -> origin/gh/eellison/824/head 2025-10-10T00:44:16.4987384Z * [new branch] gh/eellison/824/orig -> origin/gh/eellison/824/orig 2025-10-10T00:44:16.4990126Z * [new branch] gh/eellison/825/base -> origin/gh/eellison/825/base 2025-10-10T00:44:16.4991987Z * [new branch] gh/eellison/825/head -> origin/gh/eellison/825/head 2025-10-10T00:44:16.4993843Z * [new branch] gh/eellison/825/orig -> origin/gh/eellison/825/orig 2025-10-10T00:44:16.4996415Z * [new branch] 
gh/eellison/826/base -> origin/gh/eellison/826/base 2025-10-10T00:44:16.4998495Z * [new branch] gh/eellison/826/head -> origin/gh/eellison/826/head 2025-10-10T00:44:16.5000477Z * [new branch] gh/eellison/826/orig -> origin/gh/eellison/826/orig 2025-10-10T00:44:16.5003667Z * [new branch] gh/eellison/827/base -> origin/gh/eellison/827/base 2025-10-10T00:44:16.5005739Z * [new branch] gh/eellison/827/head -> origin/gh/eellison/827/head 2025-10-10T00:44:16.5007541Z * [new branch] gh/eellison/827/orig -> origin/gh/eellison/827/orig 2025-10-10T00:44:16.5010285Z * [new branch] gh/eellison/828/base -> origin/gh/eellison/828/base 2025-10-10T00:44:16.5012080Z * [new branch] gh/eellison/828/head -> origin/gh/eellison/828/head 2025-10-10T00:44:16.5013865Z * [new branch] gh/eellison/828/orig -> origin/gh/eellison/828/orig 2025-10-10T00:44:16.5016711Z * [new branch] gh/eellison/829/base -> origin/gh/eellison/829/base 2025-10-10T00:44:16.5018622Z * [new branch] gh/eellison/829/head -> origin/gh/eellison/829/head 2025-10-10T00:44:16.5020532Z * [new branch] gh/eellison/829/orig -> origin/gh/eellison/829/orig 2025-10-10T00:44:16.5023172Z * [new branch] gh/eellison/830/base -> origin/gh/eellison/830/base 2025-10-10T00:44:16.5025098Z * [new branch] gh/eellison/830/head -> origin/gh/eellison/830/head 2025-10-10T00:44:16.5027545Z * [new branch] gh/eellison/830/orig -> origin/gh/eellison/830/orig 2025-10-10T00:44:16.5030131Z * [new branch] gh/eellison/831/base -> origin/gh/eellison/831/base 2025-10-10T00:44:16.5032088Z * [new branch] gh/eellison/831/head -> origin/gh/eellison/831/head 2025-10-10T00:44:16.5034176Z * [new branch] gh/eellison/831/orig -> origin/gh/eellison/831/orig 2025-10-10T00:44:16.5036765Z * [new branch] gh/eellison/832/base -> origin/gh/eellison/832/base 2025-10-10T00:44:16.5038678Z * [new branch] gh/eellison/832/head -> origin/gh/eellison/832/head 2025-10-10T00:44:16.5040518Z * [new branch] gh/eellison/832/orig -> origin/gh/eellison/832/orig 
2025-10-10T00:44:16.5043058Z * [new branch] gh/eellison/833/base -> origin/gh/eellison/833/base 2025-10-10T00:44:16.5044883Z * [new branch] gh/eellison/833/head -> origin/gh/eellison/833/head 2025-10-10T00:44:16.5046716Z * [new branch] gh/eellison/833/orig -> origin/gh/eellison/833/orig 2025-10-10T00:44:16.5049574Z * [new branch] gh/eellison/834/base -> origin/gh/eellison/834/base 2025-10-10T00:44:16.5051477Z * [new branch] gh/eellison/834/head -> origin/gh/eellison/834/head 2025-10-10T00:44:16.5053318Z * [new branch] gh/eellison/834/orig -> origin/gh/eellison/834/orig 2025-10-10T00:44:16.5055936Z * [new branch] gh/eellison/835/base -> origin/gh/eellison/835/base 2025-10-10T00:44:16.5057799Z * [new branch] gh/eellison/835/head -> origin/gh/eellison/835/head 2025-10-10T00:44:16.5059657Z * [new branch] gh/eellison/835/orig -> origin/gh/eellison/835/orig 2025-10-10T00:44:16.5062972Z * [new branch] gh/eellison/836/base -> origin/gh/eellison/836/base 2025-10-10T00:44:16.5064871Z * [new branch] gh/eellison/836/head -> origin/gh/eellison/836/head 2025-10-10T00:44:16.5066753Z * [new branch] gh/eellison/836/orig -> origin/gh/eellison/836/orig 2025-10-10T00:44:16.5069551Z * [new branch] gh/eellison/837/base -> origin/gh/eellison/837/base 2025-10-10T00:44:16.5071376Z * [new branch] gh/eellison/837/head -> origin/gh/eellison/837/head 2025-10-10T00:44:16.5073257Z * [new branch] gh/eellison/837/orig -> origin/gh/eellison/837/orig 2025-10-10T00:44:16.5075945Z * [new branch] gh/eellison/838/base -> origin/gh/eellison/838/base 2025-10-10T00:44:16.5077827Z * [new branch] gh/eellison/838/head -> origin/gh/eellison/838/head 2025-10-10T00:44:16.5079662Z * [new branch] gh/eellison/838/orig -> origin/gh/eellison/838/orig 2025-10-10T00:44:16.5082531Z * [new branch] gh/eellison/839/base -> origin/gh/eellison/839/base 2025-10-10T00:44:16.5084386Z * [new branch] gh/eellison/839/head -> origin/gh/eellison/839/head 2025-10-10T00:44:16.5086378Z * [new branch] gh/eellison/839/orig -> 
origin/gh/eellison/839/orig 2025-10-10T00:44:16.5089290Z * [new branch] gh/eellison/840/base -> origin/gh/eellison/840/base 2025-10-10T00:44:16.5091150Z * [new branch] gh/eellison/840/head -> origin/gh/eellison/840/head 2025-10-10T00:44:16.5093001Z * [new branch] gh/eellison/840/orig -> origin/gh/eellison/840/orig 2025-10-10T00:44:16.5095602Z * [new branch] gh/eellison/841/base -> origin/gh/eellison/841/base 2025-10-10T00:44:16.5097491Z * [new branch] gh/eellison/841/head -> origin/gh/eellison/841/head 2025-10-10T00:44:16.5099468Z * [new branch] gh/eellison/841/orig -> origin/gh/eellison/841/orig 2025-10-10T00:44:16.5103222Z * [new branch] gh/eellison/842/base -> origin/gh/eellison/842/base 2025-10-10T00:44:16.5105057Z * [new branch] gh/eellison/842/head -> origin/gh/eellison/842/head 2025-10-10T00:44:16.5107041Z * [new branch] gh/eellison/842/orig -> origin/gh/eellison/842/orig 2025-10-10T00:44:16.5110332Z * [new branch] gh/eellison/843/base -> origin/gh/eellison/843/base 2025-10-10T00:44:16.5112102Z * [new branch] gh/eellison/843/head -> origin/gh/eellison/843/head 2025-10-10T00:44:16.5125234Z * [new branch] gh/eellison/843/orig -> origin/gh/eellison/843/orig 2025-10-10T00:44:16.5125748Z * [new branch] gh/eellison/844/base -> origin/gh/eellison/844/base 2025-10-10T00:44:16.5126253Z * [new branch] gh/eellison/844/head -> origin/gh/eellison/844/head 2025-10-10T00:44:16.5126764Z * [new branch] gh/eellison/844/orig -> origin/gh/eellison/844/orig 2025-10-10T00:44:16.5127332Z * [new branch] gh/eellison/845/base -> origin/gh/eellison/845/base 2025-10-10T00:44:16.5127831Z * [new branch] gh/eellison/845/head -> origin/gh/eellison/845/head 2025-10-10T00:44:16.5128334Z * [new branch] gh/eellison/845/orig -> origin/gh/eellison/845/orig 2025-10-10T00:44:16.5129220Z * [new branch] gh/eellison/846/base -> origin/gh/eellison/846/base 2025-10-10T00:44:16.5131341Z * [new branch] gh/eellison/846/head -> origin/gh/eellison/846/head 2025-10-10T00:44:16.5133244Z * [new branch] 
gh/eellison/846/orig -> origin/gh/eellison/846/orig 2025-10-10T00:44:16.5136438Z * [new branch] gh/etaf/147/base -> origin/gh/etaf/147/base 2025-10-10T00:44:16.5138360Z * [new branch] gh/etaf/147/head -> origin/gh/etaf/147/head 2025-10-10T00:44:16.5141128Z * [new branch] gh/etaf/154/base -> origin/gh/etaf/154/base 2025-10-10T00:44:16.5143169Z * [new branch] gh/etaf/154/head -> origin/gh/etaf/154/head 2025-10-10T00:44:16.5144915Z * [new branch] gh/etaf/154/orig -> origin/gh/etaf/154/orig 2025-10-10T00:44:16.5147410Z * [new branch] gh/etaf/156/base -> origin/gh/etaf/156/base 2025-10-10T00:44:16.5149703Z * [new branch] gh/etaf/156/head -> origin/gh/etaf/156/head 2025-10-10T00:44:16.5151265Z * [new branch] gh/etaf/156/orig -> origin/gh/etaf/156/orig 2025-10-10T00:44:16.5154013Z * [new branch] gh/etaf/157/base -> origin/gh/etaf/157/base 2025-10-10T00:44:16.5155934Z * [new branch] gh/etaf/157/head -> origin/gh/etaf/157/head 2025-10-10T00:44:16.5158321Z * [new branch] gh/etaf/157/orig -> origin/gh/etaf/157/orig 2025-10-10T00:44:16.5160881Z * [new branch] gh/etaf/158/base -> origin/gh/etaf/158/base 2025-10-10T00:44:16.5163674Z * [new branch] gh/etaf/158/head -> origin/gh/etaf/158/head 2025-10-10T00:44:16.5165278Z * [new branch] gh/etaf/158/orig -> origin/gh/etaf/158/orig 2025-10-10T00:44:16.5168085Z * [new branch] gh/etaf/159/base -> origin/gh/etaf/159/base 2025-10-10T00:44:16.5170009Z * [new branch] gh/etaf/159/head -> origin/gh/etaf/159/head 2025-10-10T00:44:16.5171887Z * [new branch] gh/etaf/159/orig -> origin/gh/etaf/159/orig 2025-10-10T00:44:16.5174582Z * [new branch] gh/etaf/160/base -> origin/gh/etaf/160/base 2025-10-10T00:44:16.5176461Z * [new branch] gh/etaf/160/head -> origin/gh/etaf/160/head 2025-10-10T00:44:16.5178828Z * [new branch] gh/etaf/160/orig -> origin/gh/etaf/160/orig 2025-10-10T00:44:16.5181535Z * [new branch] gh/etaf/161/base -> origin/gh/etaf/161/base 2025-10-10T00:44:16.5183440Z * [new branch] gh/etaf/161/head -> origin/gh/etaf/161/head 
2025-10-10T00:44:16.5185244Z * [new branch] gh/etaf/161/orig -> origin/gh/etaf/161/orig 2025-10-10T00:44:16.5187944Z * [new branch] gh/etaf/162/base -> origin/gh/etaf/162/base 2025-10-10T00:44:16.5189805Z * [new branch] gh/etaf/162/head -> origin/gh/etaf/162/head 2025-10-10T00:44:16.5191709Z * [new branch] gh/etaf/162/orig -> origin/gh/etaf/162/orig 2025-10-10T00:44:16.5194454Z * [new branch] gh/etaf/166/base -> origin/gh/etaf/166/base 2025-10-10T00:44:16.5196353Z * [new branch] gh/etaf/166/head -> origin/gh/etaf/166/head 2025-10-10T00:44:16.5198205Z * [new branch] gh/etaf/166/orig -> origin/gh/etaf/166/orig 2025-10-10T00:44:16.5201170Z * [new branch] gh/etaf/167/base -> origin/gh/etaf/167/base 2025-10-10T00:44:16.5202952Z * [new branch] gh/etaf/167/head -> origin/gh/etaf/167/head 2025-10-10T00:44:16.5204800Z * [new branch] gh/etaf/167/orig -> origin/gh/etaf/167/orig 2025-10-10T00:44:16.5207543Z * [new branch] gh/etaf/168/base -> origin/gh/etaf/168/base 2025-10-10T00:44:16.5209624Z * [new branch] gh/etaf/168/head -> origin/gh/etaf/168/head 2025-10-10T00:44:16.5211499Z * [new branch] gh/etaf/168/orig -> origin/gh/etaf/168/orig 2025-10-10T00:44:16.5214140Z * [new branch] gh/etaf/170/base -> origin/gh/etaf/170/base 2025-10-10T00:44:16.5216085Z * [new branch] gh/etaf/170/head -> origin/gh/etaf/170/head 2025-10-10T00:44:16.5218062Z * [new branch] gh/etaf/170/orig -> origin/gh/etaf/170/orig 2025-10-10T00:44:16.5220615Z * [new branch] gh/etaf/171/base -> origin/gh/etaf/171/base 2025-10-10T00:44:16.5222549Z * [new branch] gh/etaf/171/head -> origin/gh/etaf/171/head 2025-10-10T00:44:16.5224286Z * [new branch] gh/etaf/171/orig -> origin/gh/etaf/171/orig 2025-10-10T00:44:16.5226810Z * [new branch] gh/etaf/172/base -> origin/gh/etaf/172/base 2025-10-10T00:44:16.5228702Z * [new branch] gh/etaf/172/head -> origin/gh/etaf/172/head 2025-10-10T00:44:16.5230617Z * [new branch] gh/etaf/172/orig -> origin/gh/etaf/172/orig 2025-10-10T00:44:16.5234013Z * [new branch] 
gh/exclamaforte/1/base -> origin/gh/exclamaforte/1/base 2025-10-10T00:44:16.5235717Z * [new branch] gh/exclamaforte/1/head -> origin/gh/exclamaforte/1/head 2025-10-10T00:44:16.5238335Z * [new branch] gh/exclamaforte/2/base -> origin/gh/exclamaforte/2/base 2025-10-10T00:44:16.5240230Z * [new branch] gh/exclamaforte/2/head -> origin/gh/exclamaforte/2/head 2025-10-10T00:44:16.5242892Z * [new branch] gh/exclamaforte/3/base -> origin/gh/exclamaforte/3/base 2025-10-10T00:44:16.5244630Z * [new branch] gh/exclamaforte/3/head -> origin/gh/exclamaforte/3/head 2025-10-10T00:44:16.5247181Z * [new branch] gh/exclamaforte/4/base -> origin/gh/exclamaforte/4/base 2025-10-10T00:44:16.5249773Z * [new branch] gh/exclamaforte/4/head -> origin/gh/exclamaforte/4/head 2025-10-10T00:44:16.5253044Z * [new branch] gh/ezyang/2374/base -> origin/gh/ezyang/2374/base 2025-10-10T00:44:16.5254983Z * [new branch] gh/ezyang/2374/head -> origin/gh/ezyang/2374/head 2025-10-10T00:44:16.5257088Z * [new branch] gh/ezyang/2374/orig -> origin/gh/ezyang/2374/orig 2025-10-10T00:44:16.5259407Z * [new branch] gh/ezyang/2973/base -> origin/gh/ezyang/2973/base 2025-10-10T00:44:16.5261349Z * [new branch] gh/ezyang/2973/head -> origin/gh/ezyang/2973/head 2025-10-10T00:44:16.5263179Z * [new branch] gh/ezyang/2973/orig -> origin/gh/ezyang/2973/orig 2025-10-10T00:44:16.5265676Z * [new branch] gh/ezyang/2974/base -> origin/gh/ezyang/2974/base 2025-10-10T00:44:16.5267564Z * [new branch] gh/ezyang/2974/head -> origin/gh/ezyang/2974/head 2025-10-10T00:44:16.5269458Z * [new branch] gh/ezyang/2974/orig -> origin/gh/ezyang/2974/orig 2025-10-10T00:44:16.5271973Z * [new branch] gh/ezyang/3120/base -> origin/gh/ezyang/3120/base 2025-10-10T00:44:16.5273828Z * [new branch] gh/ezyang/3120/head -> origin/gh/ezyang/3120/head 2025-10-10T00:44:16.5275727Z * [new branch] gh/ezyang/3120/orig -> origin/gh/ezyang/3120/orig 2025-10-10T00:44:16.5278273Z * [new branch] gh/ezyang/3122/base -> origin/gh/ezyang/3122/base 
2025-10-10T00:44:16.5280137Z * [new branch] gh/ezyang/3122/head -> origin/gh/ezyang/3122/head 2025-10-10T00:44:16.5281956Z * [new branch] gh/ezyang/3122/orig -> origin/gh/ezyang/3122/orig 2025-10-10T00:44:16.5284564Z * [new branch] gh/ezyang/3127/base -> origin/gh/ezyang/3127/base 2025-10-10T00:44:16.5286537Z * [new branch] gh/ezyang/3127/head -> origin/gh/ezyang/3127/head 2025-10-10T00:44:16.5288608Z * [new branch] gh/ezyang/3127/orig -> origin/gh/ezyang/3127/orig 2025-10-10T00:44:16.5291159Z * [new branch] gh/ezyang/3131/base -> origin/gh/ezyang/3131/base 2025-10-10T00:44:16.5293048Z * [new branch] gh/ezyang/3131/head -> origin/gh/ezyang/3131/head 2025-10-10T00:44:16.5294921Z * [new branch] gh/ezyang/3131/orig -> origin/gh/ezyang/3131/orig 2025-10-10T00:44:16.5297575Z * [new branch] gh/ezyang/3134/base -> origin/gh/ezyang/3134/base 2025-10-10T00:44:16.5299710Z * [new branch] gh/ezyang/3134/head -> origin/gh/ezyang/3134/head 2025-10-10T00:44:16.5301657Z * [new branch] gh/ezyang/3134/orig -> origin/gh/ezyang/3134/orig 2025-10-10T00:44:16.5304111Z * [new branch] gh/ezyang/3135/base -> origin/gh/ezyang/3135/base 2025-10-10T00:44:16.5305906Z * [new branch] gh/ezyang/3135/head -> origin/gh/ezyang/3135/head 2025-10-10T00:44:16.5307742Z * [new branch] gh/ezyang/3135/orig -> origin/gh/ezyang/3135/orig 2025-10-10T00:44:16.5310375Z * [new branch] gh/ezyang/3138/base -> origin/gh/ezyang/3138/base 2025-10-10T00:44:16.5312334Z * [new branch] gh/ezyang/3138/head -> origin/gh/ezyang/3138/head 2025-10-10T00:44:16.5314183Z * [new branch] gh/ezyang/3138/orig -> origin/gh/ezyang/3138/orig 2025-10-10T00:44:16.5316736Z * [new branch] gh/ezyang/3139/base -> origin/gh/ezyang/3139/base 2025-10-10T00:44:16.5318751Z * [new branch] gh/ezyang/3139/head -> origin/gh/ezyang/3139/head 2025-10-10T00:44:16.5320440Z * [new branch] gh/ezyang/3139/orig -> origin/gh/ezyang/3139/orig 2025-10-10T00:44:16.5323003Z * [new branch] gh/ezyang/3140/base -> origin/gh/ezyang/3140/base 
2025-10-10T00:44:16.5324768Z * [new branch] gh/ezyang/3140/head -> origin/gh/ezyang/3140/head 2025-10-10T00:44:16.5326719Z * [new branch] gh/ezyang/3140/orig -> origin/gh/ezyang/3140/orig 2025-10-10T00:44:16.5329467Z * [new branch] gh/ezyang/3143/base -> origin/gh/ezyang/3143/base 2025-10-10T00:44:16.5331307Z * [new branch] gh/ezyang/3143/head -> origin/gh/ezyang/3143/head 2025-10-10T00:44:16.5333147Z * [new branch] gh/ezyang/3143/orig -> origin/gh/ezyang/3143/orig 2025-10-10T00:44:16.5335734Z * [new branch] gh/ezyang/3144/base -> origin/gh/ezyang/3144/base 2025-10-10T00:44:16.5337688Z * [new branch] gh/ezyang/3144/head -> origin/gh/ezyang/3144/head 2025-10-10T00:44:16.5339954Z * [new branch] gh/ezyang/3144/orig -> origin/gh/ezyang/3144/orig 2025-10-10T00:44:16.5342470Z * [new branch] gh/ezyang/3145/base -> origin/gh/ezyang/3145/base 2025-10-10T00:44:16.5344369Z * [new branch] gh/ezyang/3145/head -> origin/gh/ezyang/3145/head 2025-10-10T00:44:16.5346216Z * [new branch] gh/ezyang/3145/orig -> origin/gh/ezyang/3145/orig 2025-10-10T00:44:16.5349958Z * [new branch] gh/ezyang/3146/base -> origin/gh/ezyang/3146/base 2025-10-10T00:44:16.5351840Z * [new branch] gh/ezyang/3146/head -> origin/gh/ezyang/3146/head 2025-10-10T00:44:16.5353715Z * [new branch] gh/ezyang/3146/orig -> origin/gh/ezyang/3146/orig 2025-10-10T00:44:16.5356369Z * [new branch] gh/ezyang/3147/base -> origin/gh/ezyang/3147/base 2025-10-10T00:44:16.5358205Z * [new branch] gh/ezyang/3147/head -> origin/gh/ezyang/3147/head 2025-10-10T00:44:16.5360056Z * [new branch] gh/ezyang/3147/orig -> origin/gh/ezyang/3147/orig 2025-10-10T00:44:16.5362689Z * [new branch] gh/ezyang/3148/base -> origin/gh/ezyang/3148/base 2025-10-10T00:44:16.5364750Z * [new branch] gh/ezyang/3148/head -> origin/gh/ezyang/3148/head 2025-10-10T00:44:16.5366608Z * [new branch] gh/ezyang/3148/orig -> origin/gh/ezyang/3148/orig 2025-10-10T00:44:16.5369414Z * [new branch] gh/ezyang/3149/base -> origin/gh/ezyang/3149/base 
2025-10-10T00:44:16.5371253Z * [new branch] gh/ezyang/3149/head -> origin/gh/ezyang/3149/head 2025-10-10T00:44:16.5373639Z * [new branch] gh/ezyang/3149/orig -> origin/gh/ezyang/3149/orig 2025-10-10T00:44:16.5376281Z * [new branch] gh/ezyang/3150/base -> origin/gh/ezyang/3150/base 2025-10-10T00:44:16.5378078Z * [new branch] gh/ezyang/3150/head -> origin/gh/ezyang/3150/head 2025-10-10T00:44:16.5379959Z * [new branch] gh/ezyang/3150/orig -> origin/gh/ezyang/3150/orig 2025-10-10T00:44:16.5382573Z * [new branch] gh/ezyang/3151/base -> origin/gh/ezyang/3151/base 2025-10-10T00:44:16.5384440Z * [new branch] gh/ezyang/3151/head -> origin/gh/ezyang/3151/head 2025-10-10T00:44:16.5386365Z * [new branch] gh/ezyang/3151/orig -> origin/gh/ezyang/3151/orig 2025-10-10T00:44:16.5389060Z * [new branch] gh/ezyang/3152/base -> origin/gh/ezyang/3152/base 2025-10-10T00:44:16.5391531Z * [new branch] gh/ezyang/3152/head -> origin/gh/ezyang/3152/head 2025-10-10T00:44:16.5393398Z * [new branch] gh/ezyang/3152/orig -> origin/gh/ezyang/3152/orig 2025-10-10T00:44:16.5396140Z * [new branch] gh/ezyang/3153/base -> origin/gh/ezyang/3153/base 2025-10-10T00:44:16.5397874Z * [new branch] gh/ezyang/3153/head -> origin/gh/ezyang/3153/head 2025-10-10T00:44:16.5399938Z * [new branch] gh/ezyang/3153/orig -> origin/gh/ezyang/3153/orig 2025-10-10T00:44:16.5403292Z * [new branch] gh/ezyang/3154/base -> origin/gh/ezyang/3154/base 2025-10-10T00:44:16.5404983Z * [new branch] gh/ezyang/3154/head -> origin/gh/ezyang/3154/head 2025-10-10T00:44:16.5406863Z * [new branch] gh/ezyang/3154/orig -> origin/gh/ezyang/3154/orig 2025-10-10T00:44:16.5409630Z * [new branch] gh/ezyang/3155/base -> origin/gh/ezyang/3155/base 2025-10-10T00:44:16.5411519Z * [new branch] gh/ezyang/3155/head -> origin/gh/ezyang/3155/head 2025-10-10T00:44:16.5413598Z * [new branch] gh/ezyang/3155/orig -> origin/gh/ezyang/3155/orig 2025-10-10T00:44:16.5416051Z * [new branch] gh/ezyang/3156/base -> origin/gh/ezyang/3156/base 
2025-10-10T00:44:16.5417982Z * [new branch] gh/ezyang/3156/head -> origin/gh/ezyang/3156/head 2025-10-10T00:44:16.5419968Z * [new branch] gh/ezyang/3156/orig -> origin/gh/ezyang/3156/orig 2025-10-10T00:44:16.5422649Z * [new branch] gh/ezyang/3157/base -> origin/gh/ezyang/3157/base 2025-10-10T00:44:16.5424516Z * [new branch] gh/ezyang/3157/head -> origin/gh/ezyang/3157/head 2025-10-10T00:44:16.5426388Z * [new branch] gh/ezyang/3157/orig -> origin/gh/ezyang/3157/orig 2025-10-10T00:44:16.5428959Z * [new branch] gh/ezyang/3158/base -> origin/gh/ezyang/3158/base 2025-10-10T00:44:16.5430840Z * [new branch] gh/ezyang/3158/head -> origin/gh/ezyang/3158/head 2025-10-10T00:44:16.5432788Z * [new branch] gh/ezyang/3158/orig -> origin/gh/ezyang/3158/orig 2025-10-10T00:44:16.5435465Z * [new branch] gh/ezyang/3159/base -> origin/gh/ezyang/3159/base 2025-10-10T00:44:16.5437365Z * [new branch] gh/ezyang/3159/head -> origin/gh/ezyang/3159/head 2025-10-10T00:44:16.5439304Z * [new branch] gh/ezyang/3159/orig -> origin/gh/ezyang/3159/orig 2025-10-10T00:44:16.5441907Z * [new branch] gh/ezyang/3160/base -> origin/gh/ezyang/3160/base 2025-10-10T00:44:16.5443820Z * [new branch] gh/ezyang/3160/head -> origin/gh/ezyang/3160/head 2025-10-10T00:44:16.5445715Z * [new branch] gh/ezyang/3160/orig -> origin/gh/ezyang/3160/orig 2025-10-10T00:44:16.5448523Z * [new branch] gh/ezyang/3161/base -> origin/gh/ezyang/3161/base 2025-10-10T00:44:16.5450409Z * [new branch] gh/ezyang/3161/head -> origin/gh/ezyang/3161/head 2025-10-10T00:44:16.5452322Z * [new branch] gh/ezyang/3161/orig -> origin/gh/ezyang/3161/orig 2025-10-10T00:44:16.5454695Z * [new branch] gh/ezyang/3162/base -> origin/gh/ezyang/3162/base 2025-10-10T00:44:16.5456716Z * [new branch] gh/ezyang/3162/head -> origin/gh/ezyang/3162/head 2025-10-10T00:44:16.5458562Z * [new branch] gh/ezyang/3162/orig -> origin/gh/ezyang/3162/orig 2025-10-10T00:44:16.5461306Z * [new branch] gh/ezyang/3163/base -> origin/gh/ezyang/3163/base 
2025-10-10T00:44:16.5463059Z * [new branch] gh/ezyang/3163/head -> origin/gh/ezyang/3163/head 2025-10-10T00:44:16.5465068Z * [new branch] gh/ezyang/3163/orig -> origin/gh/ezyang/3163/orig 2025-10-10T00:44:16.5467677Z * [new branch] gh/ezyang/3164/base -> origin/gh/ezyang/3164/base 2025-10-10T00:44:16.5469640Z * [new branch] gh/ezyang/3164/head -> origin/gh/ezyang/3164/head 2025-10-10T00:44:16.5471509Z * [new branch] gh/ezyang/3164/orig -> origin/gh/ezyang/3164/orig 2025-10-10T00:44:16.5474209Z * [new branch] gh/ezyang/3165/base -> origin/gh/ezyang/3165/base 2025-10-10T00:44:16.5475986Z * [new branch] gh/ezyang/3165/head -> origin/gh/ezyang/3165/head 2025-10-10T00:44:16.5477940Z * [new branch] gh/ezyang/3165/orig -> origin/gh/ezyang/3165/orig 2025-10-10T00:44:16.5480503Z * [new branch] gh/ezyang/3166/base -> origin/gh/ezyang/3166/base 2025-10-10T00:44:16.5482346Z * [new branch] gh/ezyang/3166/head -> origin/gh/ezyang/3166/head 2025-10-10T00:44:16.5484234Z * [new branch] gh/ezyang/3166/orig -> origin/gh/ezyang/3166/orig 2025-10-10T00:44:16.5486971Z * [new branch] gh/ezyang/3167/base -> origin/gh/ezyang/3167/base 2025-10-10T00:44:16.5489047Z * [new branch] gh/ezyang/3167/head -> origin/gh/ezyang/3167/head 2025-10-10T00:44:16.5490935Z * [new branch] gh/ezyang/3167/orig -> origin/gh/ezyang/3167/orig 2025-10-10T00:44:16.5493520Z * [new branch] gh/ezyang/3168/base -> origin/gh/ezyang/3168/base 2025-10-10T00:44:16.5495483Z * [new branch] gh/ezyang/3168/head -> origin/gh/ezyang/3168/head 2025-10-10T00:44:16.5497346Z * [new branch] gh/ezyang/3168/orig -> origin/gh/ezyang/3168/orig 2025-10-10T00:44:16.5502527Z * [new branch] gh/ezyang/3169/base -> origin/gh/ezyang/3169/base 2025-10-10T00:44:16.5504383Z * [new branch] gh/ezyang/3169/head -> origin/gh/ezyang/3169/head 2025-10-10T00:44:16.5506176Z * [new branch] gh/ezyang/3169/orig -> origin/gh/ezyang/3169/orig 2025-10-10T00:44:16.5508814Z * [new branch] gh/ezyang/3170/base -> origin/gh/ezyang/3170/base 
2025-10-10T00:44:16.5510679Z * [new branch] gh/ezyang/3170/head -> origin/gh/ezyang/3170/head 2025-10-10T00:44:16.5512637Z * [new branch] gh/ezyang/3170/orig -> origin/gh/ezyang/3170/orig 2025-10-10T00:44:16.5515227Z * [new branch] gh/ezyang/3171/base -> origin/gh/ezyang/3171/base 2025-10-10T00:44:16.5517184Z * [new branch] gh/ezyang/3171/head -> origin/gh/ezyang/3171/head 2025-10-10T00:44:16.5519098Z * [new branch] gh/ezyang/3171/orig -> origin/gh/ezyang/3171/orig 2025-10-10T00:44:16.5521705Z * [new branch] gh/ezyang/3172/base -> origin/gh/ezyang/3172/base 2025-10-10T00:44:16.5523617Z * [new branch] gh/ezyang/3172/head -> origin/gh/ezyang/3172/head 2025-10-10T00:44:16.5525563Z * [new branch] gh/ezyang/3172/orig -> origin/gh/ezyang/3172/orig 2025-10-10T00:44:16.5528394Z * [new branch] gh/ezyang/3173/base -> origin/gh/ezyang/3173/base 2025-10-10T00:44:16.5530250Z * [new branch] gh/ezyang/3173/head -> origin/gh/ezyang/3173/head 2025-10-10T00:44:16.5532042Z * [new branch] gh/ezyang/3173/orig -> origin/gh/ezyang/3173/orig 2025-10-10T00:44:16.5535134Z * [new branch] gh/fadara01/1/base -> origin/gh/fadara01/1/base 2025-10-10T00:44:16.5537020Z * [new branch] gh/fadara01/1/head -> origin/gh/fadara01/1/head 2025-10-10T00:44:16.5538858Z * [new branch] gh/fadara01/1/orig -> origin/gh/fadara01/1/orig 2025-10-10T00:44:16.5542067Z * [new branch] gh/fduwjj/175/base -> origin/gh/fduwjj/175/base 2025-10-10T00:44:16.5544169Z * [new branch] gh/fduwjj/175/head -> origin/gh/fduwjj/175/head 2025-10-10T00:44:16.5546076Z * [new branch] gh/fduwjj/175/orig -> origin/gh/fduwjj/175/orig 2025-10-10T00:44:16.5548774Z * [new branch] gh/fduwjj/176/base -> origin/gh/fduwjj/176/base 2025-10-10T00:44:16.5550715Z * [new branch] gh/fduwjj/176/head -> origin/gh/fduwjj/176/head 2025-10-10T00:44:16.5552672Z * [new branch] gh/fduwjj/176/orig -> origin/gh/fduwjj/176/orig 2025-10-10T00:44:16.5555042Z * [new branch] gh/fduwjj/177/base -> origin/gh/fduwjj/177/base 2025-10-10T00:44:16.5556961Z * [new branch] 
gh/fduwjj/177/head -> origin/gh/fduwjj/177/head 2025-10-10T00:44:16.5558738Z * [new branch] gh/fduwjj/177/orig -> origin/gh/fduwjj/177/orig 2025-10-10T00:44:16.5561337Z * [new branch] gh/fduwjj/182/base -> origin/gh/fduwjj/182/base 2025-10-10T00:44:16.5563126Z * [new branch] gh/fduwjj/182/head -> origin/gh/fduwjj/182/head 2025-10-10T00:44:16.5564995Z * [new branch] gh/fduwjj/182/orig -> origin/gh/fduwjj/182/orig 2025-10-10T00:44:16.5567745Z * [new branch] gh/fduwjj/183/base -> origin/gh/fduwjj/183/base 2025-10-10T00:44:16.5569966Z * [new branch] gh/fduwjj/183/head -> origin/gh/fduwjj/183/head 2025-10-10T00:44:16.5571817Z * [new branch] gh/fduwjj/183/orig -> origin/gh/fduwjj/183/orig 2025-10-10T00:44:16.5574762Z * [new branch] gh/fduwjj/184/base -> origin/gh/fduwjj/184/base 2025-10-10T00:44:16.5576603Z * [new branch] gh/fduwjj/184/head -> origin/gh/fduwjj/184/head 2025-10-10T00:44:16.5578416Z * [new branch] gh/fduwjj/184/orig -> origin/gh/fduwjj/184/orig 2025-10-10T00:44:16.5581036Z * [new branch] gh/fduwjj/185/base -> origin/gh/fduwjj/185/base 2025-10-10T00:44:16.5582963Z * [new branch] gh/fduwjj/185/head -> origin/gh/fduwjj/185/head 2025-10-10T00:44:16.5585328Z * [new branch] gh/fduwjj/185/orig -> origin/gh/fduwjj/185/orig 2025-10-10T00:44:16.5588204Z * [new branch] gh/fduwjj/191/base -> origin/gh/fduwjj/191/base 2025-10-10T00:44:16.5590180Z * [new branch] gh/fduwjj/191/head -> origin/gh/fduwjj/191/head 2025-10-10T00:44:16.5592038Z * [new branch] gh/fduwjj/191/orig -> origin/gh/fduwjj/191/orig 2025-10-10T00:44:16.5594674Z * [new branch] gh/fduwjj/192/base -> origin/gh/fduwjj/192/base 2025-10-10T00:44:16.5596756Z * [new branch] gh/fduwjj/192/head -> origin/gh/fduwjj/192/head 2025-10-10T00:44:16.5598647Z * [new branch] gh/fduwjj/192/orig -> origin/gh/fduwjj/192/orig 2025-10-10T00:44:16.5601587Z * [new branch] gh/fduwjj/193/base -> origin/gh/fduwjj/193/base 2025-10-10T00:44:16.5603335Z * [new branch] gh/fduwjj/193/head -> origin/gh/fduwjj/193/head 
2025-10-10T00:44:16.5605215Z * [new branch] gh/fduwjj/193/orig -> origin/gh/fduwjj/193/orig 2025-10-10T00:44:16.5607918Z * [new branch] gh/fduwjj/194/base -> origin/gh/fduwjj/194/base 2025-10-10T00:44:16.5609724Z * [new branch] gh/fduwjj/194/head -> origin/gh/fduwjj/194/head 2025-10-10T00:44:16.5611549Z * [new branch] gh/fduwjj/194/orig -> origin/gh/fduwjj/194/orig 2025-10-10T00:44:16.5614298Z * [new branch] gh/fduwjj/195/base -> origin/gh/fduwjj/195/base 2025-10-10T00:44:16.5616277Z * [new branch] gh/fduwjj/195/head -> origin/gh/fduwjj/195/head 2025-10-10T00:44:16.5618122Z * [new branch] gh/fduwjj/195/orig -> origin/gh/fduwjj/195/orig 2025-10-10T00:44:16.5620539Z * [new branch] gh/fduwjj/196/base -> origin/gh/fduwjj/196/base 2025-10-10T00:44:16.5622398Z * [new branch] gh/fduwjj/196/head -> origin/gh/fduwjj/196/head 2025-10-10T00:44:16.5624317Z * [new branch] gh/fduwjj/196/orig -> origin/gh/fduwjj/196/orig 2025-10-10T00:44:16.5626757Z * [new branch] gh/fduwjj/197/base -> origin/gh/fduwjj/197/base 2025-10-10T00:44:16.5628555Z * [new branch] gh/fduwjj/197/head -> origin/gh/fduwjj/197/head 2025-10-10T00:44:16.5630604Z * [new branch] gh/fduwjj/197/orig -> origin/gh/fduwjj/197/orig 2025-10-10T00:44:16.5632890Z * [new branch] gh/fduwjj/198/base -> origin/gh/fduwjj/198/base 2025-10-10T00:44:16.5634781Z * [new branch] gh/fduwjj/198/head -> origin/gh/fduwjj/198/head 2025-10-10T00:44:16.5636680Z * [new branch] gh/fduwjj/198/orig -> origin/gh/fduwjj/198/orig 2025-10-10T00:44:16.5639088Z * [new branch] gh/fduwjj/199/base -> origin/gh/fduwjj/199/base 2025-10-10T00:44:16.5640956Z * [new branch] gh/fduwjj/199/head -> origin/gh/fduwjj/199/head 2025-10-10T00:44:16.5642785Z * [new branch] gh/fduwjj/199/orig -> origin/gh/fduwjj/199/orig 2025-10-10T00:44:16.5645717Z * [new branch] gh/fduwjj/200/base -> origin/gh/fduwjj/200/base 2025-10-10T00:44:16.5647734Z * [new branch] gh/fduwjj/200/head -> origin/gh/fduwjj/200/head 2025-10-10T00:44:16.5649639Z * [new branch] gh/fduwjj/200/orig -> 
origin/gh/fduwjj/200/orig 2025-10-10T00:44:16.5652243Z * [new branch] gh/fduwjj/201/base -> origin/gh/fduwjj/201/base 2025-10-10T00:44:16.5654164Z * [new branch] gh/fduwjj/201/head -> origin/gh/fduwjj/201/head 2025-10-10T00:44:16.5656028Z * [new branch] gh/fduwjj/201/orig -> origin/gh/fduwjj/201/orig 2025-10-10T00:44:16.5658674Z * [new branch] gh/fduwjj/202/base -> origin/gh/fduwjj/202/base 2025-10-10T00:44:16.5660563Z * [new branch] gh/fduwjj/202/head -> origin/gh/fduwjj/202/head 2025-10-10T00:44:16.5662355Z * [new branch] gh/fduwjj/202/orig -> origin/gh/fduwjj/202/orig 2025-10-10T00:44:16.5665163Z * [new branch] gh/fduwjj/203/base -> origin/gh/fduwjj/203/base 2025-10-10T00:44:16.5667082Z * [new branch] gh/fduwjj/203/head -> origin/gh/fduwjj/203/head 2025-10-10T00:44:16.5669020Z * [new branch] gh/fduwjj/203/orig -> origin/gh/fduwjj/203/orig 2025-10-10T00:44:16.5672023Z * [new branch] gh/fduwjj/204/base -> origin/gh/fduwjj/204/base 2025-10-10T00:44:16.5673827Z * [new branch] gh/fduwjj/204/head -> origin/gh/fduwjj/204/head 2025-10-10T00:44:16.5676080Z * [new branch] gh/fduwjj/204/orig -> origin/gh/fduwjj/204/orig 2025-10-10T00:44:16.5678843Z * [new branch] gh/fduwjj/205/base -> origin/gh/fduwjj/205/base 2025-10-10T00:44:16.5680756Z * [new branch] gh/fduwjj/205/head -> origin/gh/fduwjj/205/head 2025-10-10T00:44:16.5682617Z * [new branch] gh/fduwjj/205/orig -> origin/gh/fduwjj/205/orig 2025-10-10T00:44:16.5685266Z * [new branch] gh/fduwjj/206/base -> origin/gh/fduwjj/206/base 2025-10-10T00:44:16.5687377Z * [new branch] gh/fduwjj/206/head -> origin/gh/fduwjj/206/head 2025-10-10T00:44:16.5689430Z * [new branch] gh/fduwjj/206/orig -> origin/gh/fduwjj/206/orig 2025-10-10T00:44:16.5692149Z * [new branch] gh/fduwjj/207/base -> origin/gh/fduwjj/207/base 2025-10-10T00:44:16.5694028Z * [new branch] gh/fduwjj/207/head -> origin/gh/fduwjj/207/head 2025-10-10T00:44:16.5695898Z * [new branch] gh/fduwjj/207/orig -> origin/gh/fduwjj/207/orig 2025-10-10T00:44:16.5698724Z * [new 
branch] gh/fduwjj/208/base -> origin/gh/fduwjj/208/base 2025-10-10T00:44:16.5700807Z * [new branch] gh/fduwjj/208/head -> origin/gh/fduwjj/208/head 2025-10-10T00:44:16.5702584Z * [new branch] gh/fduwjj/208/orig -> origin/gh/fduwjj/208/orig 2025-10-10T00:44:16.5705176Z * [new branch] gh/fduwjj/209/base -> origin/gh/fduwjj/209/base 2025-10-10T00:44:16.5707256Z * [new branch] gh/fduwjj/209/head -> origin/gh/fduwjj/209/head 2025-10-10T00:44:16.5709022Z * [new branch] gh/fduwjj/209/orig -> origin/gh/fduwjj/209/orig 2025-10-10T00:44:16.5711715Z * [new branch] gh/fduwjj/210/base -> origin/gh/fduwjj/210/base 2025-10-10T00:44:16.5713471Z * [new branch] gh/fduwjj/210/head -> origin/gh/fduwjj/210/head 2025-10-10T00:44:16.5715434Z * [new branch] gh/fduwjj/210/orig -> origin/gh/fduwjj/210/orig 2025-10-10T00:44:16.5718103Z * [new branch] gh/fduwjj/211/base -> origin/gh/fduwjj/211/base 2025-10-10T00:44:16.5720000Z * [new branch] gh/fduwjj/211/head -> origin/gh/fduwjj/211/head 2025-10-10T00:44:16.5721848Z * [new branch] gh/fduwjj/211/orig -> origin/gh/fduwjj/211/orig 2025-10-10T00:44:16.5724479Z * [new branch] gh/fduwjj/212/base -> origin/gh/fduwjj/212/base 2025-10-10T00:44:16.5726814Z * [new branch] gh/fduwjj/212/head -> origin/gh/fduwjj/212/head 2025-10-10T00:44:16.5728855Z * [new branch] gh/fduwjj/212/orig -> origin/gh/fduwjj/212/orig 2025-10-10T00:44:16.5731581Z * [new branch] gh/fduwjj/213/base -> origin/gh/fduwjj/213/base 2025-10-10T00:44:16.5733744Z * [new branch] gh/fduwjj/213/head -> origin/gh/fduwjj/213/head 2025-10-10T00:44:16.5734989Z * [new branch] gh/fduwjj/213/orig -> origin/gh/fduwjj/213/orig 2025-10-10T00:44:16.5738186Z * [new branch] gh/fduwjj/214/base -> origin/gh/fduwjj/214/base 2025-10-10T00:44:16.5740001Z * [new branch] gh/fduwjj/214/head -> origin/gh/fduwjj/214/head 2025-10-10T00:44:16.5741790Z * [new branch] gh/fduwjj/214/orig -> origin/gh/fduwjj/214/orig 2025-10-10T00:44:16.5744432Z * [new branch] gh/fduwjj/215/base -> origin/gh/fduwjj/215/base 
2025-10-10T00:44:16.5746250Z * [new branch] gh/fduwjj/215/head -> origin/gh/fduwjj/215/head 2025-10-10T00:44:16.5748071Z * [new branch] gh/fduwjj/215/orig -> origin/gh/fduwjj/215/orig 2025-10-10T00:44:16.5750749Z * [new branch] gh/fduwjj/216/base -> origin/gh/fduwjj/216/base 2025-10-10T00:44:16.5752494Z * [new branch] gh/fduwjj/216/head -> origin/gh/fduwjj/216/head 2025-10-10T00:44:16.5754271Z * [new branch] gh/fduwjj/216/orig -> origin/gh/fduwjj/216/orig 2025-10-10T00:44:16.5757216Z * [new branch] gh/fduwjj/217/base -> origin/gh/fduwjj/217/base 2025-10-10T00:44:16.5759064Z * [new branch] gh/fduwjj/217/head -> origin/gh/fduwjj/217/head 2025-10-10T00:44:16.5760830Z * [new branch] gh/fduwjj/217/orig -> origin/gh/fduwjj/217/orig 2025-10-10T00:44:16.5763473Z * [new branch] gh/fduwjj/218/base -> origin/gh/fduwjj/218/base 2025-10-10T00:44:16.5765384Z * [new branch] gh/fduwjj/218/head -> origin/gh/fduwjj/218/head 2025-10-10T00:44:16.5767267Z * [new branch] gh/fduwjj/218/orig -> origin/gh/fduwjj/218/orig 2025-10-10T00:44:16.5770075Z * [new branch] gh/fduwjj/219/base -> origin/gh/fduwjj/219/base 2025-10-10T00:44:16.5772177Z * [new branch] gh/fduwjj/219/head -> origin/gh/fduwjj/219/head 2025-10-10T00:44:16.5774034Z * [new branch] gh/fduwjj/219/orig -> origin/gh/fduwjj/219/orig 2025-10-10T00:44:16.5776651Z * [new branch] gh/fduwjj/220/base -> origin/gh/fduwjj/220/base 2025-10-10T00:44:16.5778601Z * [new branch] gh/fduwjj/220/head -> origin/gh/fduwjj/220/head 2025-10-10T00:44:16.5780462Z * [new branch] gh/fduwjj/220/orig -> origin/gh/fduwjj/220/orig 2025-10-10T00:44:16.5783154Z * [new branch] gh/fduwjj/221/base -> origin/gh/fduwjj/221/base 2025-10-10T00:44:16.5785468Z * [new branch] gh/fduwjj/221/head -> origin/gh/fduwjj/221/head 2025-10-10T00:44:16.5786623Z * [new branch] gh/fduwjj/221/orig -> origin/gh/fduwjj/221/orig 2025-10-10T00:44:16.5789854Z * [new branch] gh/fduwjj/222/base -> origin/gh/fduwjj/222/base 2025-10-10T00:44:16.5790982Z * [new branch] gh/fduwjj/222/head -> 
origin/gh/fduwjj/222/head 2025-10-10T00:44:16.5793305Z * [new branch] gh/fduwjj/222/orig -> origin/gh/fduwjj/222/orig 2025-10-10T00:44:16.5796350Z * [new branch] gh/fduwjj/223/base -> origin/gh/fduwjj/223/base 2025-10-10T00:44:16.5798184Z * [new branch] gh/fduwjj/223/head -> origin/gh/fduwjj/223/head 2025-10-10T00:44:16.5801536Z * [new branch] gh/fduwjj/223/orig -> origin/gh/fduwjj/223/orig 2025-10-10T00:44:16.5804520Z * [new branch] gh/fegin/313/base -> origin/gh/fegin/313/base 2025-10-10T00:44:16.5806367Z * [new branch] gh/fegin/313/head -> origin/gh/fegin/313/head 2025-10-10T00:44:16.5808500Z * [new branch] gh/fegin/313/orig -> origin/gh/fegin/313/orig 2025-10-10T00:44:16.5811079Z * [new branch] gh/fegin/314/base -> origin/gh/fegin/314/base 2025-10-10T00:44:16.5812952Z * [new branch] gh/fegin/314/head -> origin/gh/fegin/314/head 2025-10-10T00:44:16.5814791Z * [new branch] gh/fegin/314/orig -> origin/gh/fegin/314/orig 2025-10-10T00:44:16.5817250Z * [new branch] gh/fegin/315/base -> origin/gh/fegin/315/base 2025-10-10T00:44:16.5819146Z * [new branch] gh/fegin/315/head -> origin/gh/fegin/315/head 2025-10-10T00:44:16.5821017Z * [new branch] gh/fegin/315/orig -> origin/gh/fegin/315/orig 2025-10-10T00:44:16.5823562Z * [new branch] gh/fegin/316/base -> origin/gh/fegin/316/base 2025-10-10T00:44:16.5825436Z * [new branch] gh/fegin/316/head -> origin/gh/fegin/316/head 2025-10-10T00:44:16.5827270Z * [new branch] gh/fegin/316/orig -> origin/gh/fegin/316/orig 2025-10-10T00:44:16.5829904Z * [new branch] gh/fegin/317/base -> origin/gh/fegin/317/base 2025-10-10T00:44:16.5831740Z * [new branch] gh/fegin/317/head -> origin/gh/fegin/317/head 2025-10-10T00:44:16.5833761Z * [new branch] gh/fegin/317/orig -> origin/gh/fegin/317/orig 2025-10-10T00:44:16.5836345Z * [new branch] gh/fegin/318/base -> origin/gh/fegin/318/base 2025-10-10T00:44:16.5838177Z * [new branch] gh/fegin/318/head -> origin/gh/fegin/318/head 2025-10-10T00:44:16.5840261Z * [new branch] gh/fegin/318/orig -> 
origin/gh/fegin/318/orig 2025-10-10T00:44:16.5842809Z * [new branch] gh/fegin/319/base -> origin/gh/fegin/319/base 2025-10-10T00:44:16.5844773Z * [new branch] gh/fegin/319/head -> origin/gh/fegin/319/head 2025-10-10T00:44:16.5846627Z * [new branch] gh/fegin/319/orig -> origin/gh/fegin/319/orig 2025-10-10T00:44:16.5849360Z * [new branch] gh/fegin/320/base -> origin/gh/fegin/320/base 2025-10-10T00:44:16.5851198Z * [new branch] gh/fegin/320/head -> origin/gh/fegin/320/head 2025-10-10T00:44:16.5852959Z * [new branch] gh/fegin/320/orig -> origin/gh/fegin/320/orig 2025-10-10T00:44:16.5855574Z * [new branch] gh/fegin/321/base -> origin/gh/fegin/321/base 2025-10-10T00:44:16.5857415Z * [new branch] gh/fegin/321/head -> origin/gh/fegin/321/head 2025-10-10T00:44:16.5859392Z * [new branch] gh/fegin/321/orig -> origin/gh/fegin/321/orig 2025-10-10T00:44:16.5861795Z * [new branch] gh/fegin/322/base -> origin/gh/fegin/322/base 2025-10-10T00:44:16.5863834Z * [new branch] gh/fegin/322/head -> origin/gh/fegin/322/head 2025-10-10T00:44:16.5865565Z * [new branch] gh/fegin/322/orig -> origin/gh/fegin/322/orig 2025-10-10T00:44:16.5868104Z * [new branch] gh/fegin/323/base -> origin/gh/fegin/323/base 2025-10-10T00:44:16.5869984Z * [new branch] gh/fegin/323/head -> origin/gh/fegin/323/head 2025-10-10T00:44:16.5872642Z * [new branch] gh/fegin/324/base -> origin/gh/fegin/324/base 2025-10-10T00:44:16.5874681Z * [new branch] gh/fegin/324/head -> origin/gh/fegin/324/head 2025-10-10T00:44:16.5876588Z * [new branch] gh/fegin/324/orig -> origin/gh/fegin/324/orig 2025-10-10T00:44:16.5879183Z * [new branch] gh/fegin/325/base -> origin/gh/fegin/325/base 2025-10-10T00:44:16.5881096Z * [new branch] gh/fegin/325/head -> origin/gh/fegin/325/head 2025-10-10T00:44:16.5882949Z * [new branch] gh/fegin/325/orig -> origin/gh/fegin/325/orig 2025-10-10T00:44:16.5885561Z * [new branch] gh/fegin/326/base -> origin/gh/fegin/326/base 2025-10-10T00:44:16.5917581Z * [new branch] gh/fegin/326/head -> 
origin/gh/fegin/326/head 2025-10-10T00:44:16.5918171Z * [new branch] gh/fegin/326/orig -> origin/gh/fegin/326/orig 2025-10-10T00:44:16.5918662Z * [new branch] gh/fegin/327/base -> origin/gh/fegin/327/base 2025-10-10T00:44:16.5919150Z * [new branch] gh/fegin/327/head -> origin/gh/fegin/327/head 2025-10-10T00:44:16.5919643Z * [new branch] gh/fegin/327/orig -> origin/gh/fegin/327/orig 2025-10-10T00:44:16.5920139Z * [new branch] gh/fffrog/133/base -> origin/gh/fffrog/133/base 2025-10-10T00:44:16.5920636Z * [new branch] gh/fffrog/133/head -> origin/gh/fffrog/133/head 2025-10-10T00:44:16.5921134Z * [new branch] gh/fffrog/133/orig -> origin/gh/fffrog/133/orig 2025-10-10T00:44:16.5921627Z * [new branch] gh/fffrog/137/base -> origin/gh/fffrog/137/base 2025-10-10T00:44:16.5922118Z * [new branch] gh/fffrog/137/head -> origin/gh/fffrog/137/head 2025-10-10T00:44:16.5922608Z * [new branch] gh/fffrog/137/orig -> origin/gh/fffrog/137/orig 2025-10-10T00:44:16.5923087Z * [new branch] gh/fffrog/147/base -> origin/gh/fffrog/147/base 2025-10-10T00:44:16.5923286Z * [new branch] gh/fffrog/147/head -> origin/gh/fffrog/147/head 2025-10-10T00:44:16.5923478Z * [new branch] gh/fffrog/147/orig -> origin/gh/fffrog/147/orig 2025-10-10T00:44:16.5923682Z * [new branch] gh/fffrog/149/base -> origin/gh/fffrog/149/base 2025-10-10T00:44:16.5923878Z * [new branch] gh/fffrog/149/head -> origin/gh/fffrog/149/head 2025-10-10T00:44:16.5924264Z * [new branch] gh/fffrog/149/orig -> origin/gh/fffrog/149/orig 2025-10-10T00:44:16.5925126Z * [new branch] gh/fffrog/150/base -> origin/gh/fffrog/150/base 2025-10-10T00:44:16.5927311Z * [new branch] gh/fffrog/150/head -> origin/gh/fffrog/150/head 2025-10-10T00:44:16.5929295Z * [new branch] gh/fffrog/150/orig -> origin/gh/fffrog/150/orig 2025-10-10T00:44:16.5931856Z * [new branch] gh/fffrog/153/base -> origin/gh/fffrog/153/base 2025-10-10T00:44:16.5933700Z * [new branch] gh/fffrog/153/head -> origin/gh/fffrog/153/head 2025-10-10T00:44:16.5935533Z * [new branch] 
gh/fffrog/153/orig -> origin/gh/fffrog/153/orig 2025-10-10T00:44:16.5938216Z * [new branch] gh/fffrog/154/base -> origin/gh/fffrog/154/base 2025-10-10T00:44:16.5940147Z * [new branch] gh/fffrog/154/head -> origin/gh/fffrog/154/head 2025-10-10T00:44:16.5941892Z * [new branch] gh/fffrog/154/orig -> origin/gh/fffrog/154/orig 2025-10-10T00:44:16.5944633Z * [new branch] gh/fffrog/155/base -> origin/gh/fffrog/155/base 2025-10-10T00:44:16.5946404Z * [new branch] gh/fffrog/155/head -> origin/gh/fffrog/155/head 2025-10-10T00:44:16.5948247Z * [new branch] gh/fffrog/155/orig -> origin/gh/fffrog/155/orig 2025-10-10T00:44:16.5950726Z * [new branch] gh/fffrog/156/base -> origin/gh/fffrog/156/base 2025-10-10T00:44:16.5952542Z * [new branch] gh/fffrog/156/head -> origin/gh/fffrog/156/head 2025-10-10T00:44:16.5954420Z * [new branch] gh/fffrog/156/orig -> origin/gh/fffrog/156/orig 2025-10-10T00:44:16.5956906Z * [new branch] gh/fffrog/157/base -> origin/gh/fffrog/157/base 2025-10-10T00:44:16.5958844Z * [new branch] gh/fffrog/157/head -> origin/gh/fffrog/157/head 2025-10-10T00:44:16.5960685Z * [new branch] gh/fffrog/157/orig -> origin/gh/fffrog/157/orig 2025-10-10T00:44:16.5963245Z * [new branch] gh/fffrog/158/base -> origin/gh/fffrog/158/base 2025-10-10T00:44:16.5965096Z * [new branch] gh/fffrog/158/head -> origin/gh/fffrog/158/head 2025-10-10T00:44:16.5967041Z * [new branch] gh/fffrog/158/orig -> origin/gh/fffrog/158/orig 2025-10-10T00:44:16.5969809Z * [new branch] gh/fffrog/159/base -> origin/gh/fffrog/159/base 2025-10-10T00:44:16.5972143Z * [new branch] gh/fffrog/159/head -> origin/gh/fffrog/159/head 2025-10-10T00:44:16.5974171Z * [new branch] gh/fffrog/159/orig -> origin/gh/fffrog/159/orig 2025-10-10T00:44:16.5976876Z * [new branch] gh/fffrog/160/base -> origin/gh/fffrog/160/base 2025-10-10T00:44:16.5978627Z * [new branch] gh/fffrog/160/head -> origin/gh/fffrog/160/head 2025-10-10T00:44:16.5981184Z * [new branch] gh/fffrog/161/base -> origin/gh/fffrog/161/base 
2025-10-10T00:44:16.5983030Z * [new branch] gh/fffrog/161/head -> origin/gh/fffrog/161/head 2025-10-10T00:44:16.5984839Z * [new branch] gh/fffrog/161/orig -> origin/gh/fffrog/161/orig 2025-10-10T00:44:16.5987391Z * [new branch] gh/fffrog/162/base -> origin/gh/fffrog/162/base 2025-10-10T00:44:16.5989345Z * [new branch] gh/fffrog/162/head -> origin/gh/fffrog/162/head 2025-10-10T00:44:16.5991095Z * [new branch] gh/fffrog/162/orig -> origin/gh/fffrog/162/orig 2025-10-10T00:44:16.5993743Z * [new branch] gh/fffrog/163/base -> origin/gh/fffrog/163/base 2025-10-10T00:44:16.5995630Z * [new branch] gh/fffrog/163/head -> origin/gh/fffrog/163/head 2025-10-10T00:44:16.5997690Z * [new branch] gh/fffrog/163/orig -> origin/gh/fffrog/163/orig 2025-10-10T00:44:16.6000819Z * [new branch] gh/fffrog/164/base -> origin/gh/fffrog/164/base 2025-10-10T00:44:16.6002698Z * [new branch] gh/fffrog/164/head -> origin/gh/fffrog/164/head 2025-10-10T00:44:16.6004569Z * [new branch] gh/fffrog/164/orig -> origin/gh/fffrog/164/orig 2025-10-10T00:44:16.6007251Z * [new branch] gh/fffrog/165/base -> origin/gh/fffrog/165/base 2025-10-10T00:44:16.6009568Z * [new branch] gh/fffrog/165/head -> origin/gh/fffrog/165/head 2025-10-10T00:44:16.6011374Z * [new branch] gh/fffrog/165/orig -> origin/gh/fffrog/165/orig 2025-10-10T00:44:16.6014093Z * [new branch] gh/fffrog/166/base -> origin/gh/fffrog/166/base 2025-10-10T00:44:16.6015918Z * [new branch] gh/fffrog/166/head -> origin/gh/fffrog/166/head 2025-10-10T00:44:16.6017967Z * [new branch] gh/fffrog/166/orig -> origin/gh/fffrog/166/orig 2025-10-10T00:44:16.6020340Z * [new branch] gh/fffrog/167/base -> origin/gh/fffrog/167/base 2025-10-10T00:44:16.6022224Z * [new branch] gh/fffrog/167/head -> origin/gh/fffrog/167/head 2025-10-10T00:44:16.6024093Z * [new branch] gh/fffrog/167/orig -> origin/gh/fffrog/167/orig 2025-10-10T00:44:16.6026690Z * [new branch] gh/fffrog/168/base -> origin/gh/fffrog/168/base 2025-10-10T00:44:16.6028555Z * [new branch] gh/fffrog/168/head -> 
origin/gh/fffrog/168/head 2025-10-10T00:44:16.6030372Z * [new branch] gh/fffrog/168/orig -> origin/gh/fffrog/168/orig 2025-10-10T00:44:16.6033089Z * [new branch] gh/fffrog/169/base -> origin/gh/fffrog/169/base 2025-10-10T00:44:16.6035127Z * [new branch] gh/fffrog/169/head -> origin/gh/fffrog/169/head 2025-10-10T00:44:16.6036956Z * [new branch] gh/fffrog/169/orig -> origin/gh/fffrog/169/orig 2025-10-10T00:44:16.6039744Z * [new branch] gh/fffrog/170/base -> origin/gh/fffrog/170/base 2025-10-10T00:44:16.6041528Z * [new branch] gh/fffrog/170/head -> origin/gh/fffrog/170/head 2025-10-10T00:44:16.6043507Z * [new branch] gh/fffrog/170/orig -> origin/gh/fffrog/170/orig 2025-10-10T00:44:16.6046365Z * [new branch] gh/fffrog/171/base -> origin/gh/fffrog/171/base 2025-10-10T00:44:16.6048419Z * [new branch] gh/fffrog/171/head -> origin/gh/fffrog/171/head 2025-10-10T00:44:16.6050262Z * [new branch] gh/fffrog/171/orig -> origin/gh/fffrog/171/orig 2025-10-10T00:44:16.6052890Z * [new branch] gh/fffrog/172/base -> origin/gh/fffrog/172/base 2025-10-10T00:44:16.6054719Z * [new branch] gh/fffrog/172/head -> origin/gh/fffrog/172/head 2025-10-10T00:44:16.6056663Z * [new branch] gh/fffrog/172/orig -> origin/gh/fffrog/172/orig 2025-10-10T00:44:16.6059250Z * [new branch] gh/fffrog/173/base -> origin/gh/fffrog/173/base 2025-10-10T00:44:16.6061130Z * [new branch] gh/fffrog/173/head -> origin/gh/fffrog/173/head 2025-10-10T00:44:16.6062763Z * [new branch] gh/fffrog/173/orig -> origin/gh/fffrog/173/orig 2025-10-10T00:44:16.6065623Z * [new branch] gh/fffrog/174/base -> origin/gh/fffrog/174/base 2025-10-10T00:44:16.6067501Z * [new branch] gh/fffrog/174/head -> origin/gh/fffrog/174/head 2025-10-10T00:44:16.6069359Z * [new branch] gh/fffrog/174/orig -> origin/gh/fffrog/174/orig 2025-10-10T00:44:16.6072037Z * [new branch] gh/fffrog/175/base -> origin/gh/fffrog/175/base 2025-10-10T00:44:16.6073851Z * [new branch] gh/fffrog/175/head -> origin/gh/fffrog/175/head 2025-10-10T00:44:16.6075756Z * [new 
branch] gh/fffrog/175/orig -> origin/gh/fffrog/175/orig 2025-10-10T00:44:16.6078499Z * [new branch] gh/fffrog/176/base -> origin/gh/fffrog/176/base 2025-10-10T00:44:16.6080327Z * [new branch] gh/fffrog/176/head -> origin/gh/fffrog/176/head 2025-10-10T00:44:16.6082335Z * [new branch] gh/fffrog/176/orig -> origin/gh/fffrog/176/orig 2025-10-10T00:44:16.6085306Z * [new branch] gh/fxdawnn/1/base -> origin/gh/fxdawnn/1/base 2025-10-10T00:44:16.6087285Z * [new branch] gh/fxdawnn/1/head -> origin/gh/fxdawnn/1/head 2025-10-10T00:44:16.6089387Z * [new branch] gh/fxdawnn/1/orig -> origin/gh/fxdawnn/1/orig 2025-10-10T00:44:16.6091832Z * [new branch] gh/fxdawnn/2/base -> origin/gh/fxdawnn/2/base 2025-10-10T00:44:16.6093720Z * [new branch] gh/fxdawnn/2/head -> origin/gh/fxdawnn/2/head 2025-10-10T00:44:16.6095725Z * [new branch] gh/fxdawnn/2/orig -> origin/gh/fxdawnn/2/orig 2025-10-10T00:44:16.6098518Z * [new branch] gh/fxdawnn/3/base -> origin/gh/fxdawnn/3/base 2025-10-10T00:44:16.6102456Z * [new branch] gh/fxdawnn/3/head -> origin/gh/fxdawnn/3/head 2025-10-10T00:44:16.6104045Z * [new branch] gh/fxdawnn/3/orig -> origin/gh/fxdawnn/3/orig 2025-10-10T00:44:16.6106679Z * [new branch] gh/fxdawnn/4/base -> origin/gh/fxdawnn/4/base 2025-10-10T00:44:16.6108522Z * [new branch] gh/fxdawnn/4/orig -> origin/gh/fxdawnn/4/orig 2025-10-10T00:44:16.6111592Z * [new branch] gh/gmagogsfm/1/base -> origin/gh/gmagogsfm/1/base 2025-10-10T00:44:16.6114084Z * [new branch] gh/gmagogsfm/1/head -> origin/gh/gmagogsfm/1/head 2025-10-10T00:44:16.6116048Z * [new branch] gh/gmagogsfm/1/orig -> origin/gh/gmagogsfm/1/orig 2025-10-10T00:44:16.6118537Z * [new branch] gh/gmagogsfm/2/base -> origin/gh/gmagogsfm/2/base 2025-10-10T00:44:16.6120350Z * [new branch] gh/gmagogsfm/2/head -> origin/gh/gmagogsfm/2/head 2025-10-10T00:44:16.6122136Z * [new branch] gh/gmagogsfm/2/orig -> origin/gh/gmagogsfm/2/orig 2025-10-10T00:44:16.6124609Z * [new branch] gh/gmagogsfm/3/base -> origin/gh/gmagogsfm/3/base 
2025-10-10T00:44:16.6126975Z * [new branch] gh/gmagogsfm/3/head -> origin/gh/gmagogsfm/3/head 2025-10-10T00:44:16.6129029Z * [new branch] gh/gmagogsfm/3/orig -> origin/gh/gmagogsfm/3/orig 2025-10-10T00:44:16.6132419Z * [new branch] gh/guangyey/134/base -> origin/gh/guangyey/134/base 2025-10-10T00:44:16.6134603Z * [new branch] gh/guangyey/134/head -> origin/gh/guangyey/134/head 2025-10-10T00:44:16.6136528Z * [new branch] gh/guangyey/134/orig -> origin/gh/guangyey/134/orig 2025-10-10T00:44:16.6139042Z * [new branch] gh/guangyey/135/base -> origin/gh/guangyey/135/base 2025-10-10T00:44:16.6141008Z * [new branch] gh/guangyey/135/head -> origin/gh/guangyey/135/head 2025-10-10T00:44:16.6142905Z * [new branch] gh/guangyey/135/orig -> origin/gh/guangyey/135/orig 2025-10-10T00:44:16.6145407Z * [new branch] gh/guangyey/139/base -> origin/gh/guangyey/139/base 2025-10-10T00:44:16.6147308Z * [new branch] gh/guangyey/139/head -> origin/gh/guangyey/139/head 2025-10-10T00:44:16.6149149Z * [new branch] gh/guangyey/139/orig -> origin/gh/guangyey/139/orig 2025-10-10T00:44:16.6151653Z * [new branch] gh/guangyey/140/base -> origin/gh/guangyey/140/base 2025-10-10T00:44:16.6153554Z * [new branch] gh/guangyey/140/head -> origin/gh/guangyey/140/head 2025-10-10T00:44:16.6155364Z * [new branch] gh/guangyey/140/orig -> origin/gh/guangyey/140/orig 2025-10-10T00:44:16.6157907Z * [new branch] gh/guangyey/142/base -> origin/gh/guangyey/142/base 2025-10-10T00:44:16.6159845Z * [new branch] gh/guangyey/142/head -> origin/gh/guangyey/142/head 2025-10-10T00:44:16.6161644Z * [new branch] gh/guangyey/142/orig -> origin/gh/guangyey/142/orig 2025-10-10T00:44:16.6164188Z * [new branch] gh/guangyey/163/base -> origin/gh/guangyey/163/base 2025-10-10T00:44:16.6166124Z * [new branch] gh/guangyey/163/head -> origin/gh/guangyey/163/head 2025-10-10T00:44:16.6168163Z * [new branch] gh/guangyey/163/orig -> origin/gh/guangyey/163/orig 2025-10-10T00:44:16.6170793Z * [new branch] gh/guangyey/168/base -> 
origin/gh/guangyey/168/base 2025-10-10T00:44:16.6172675Z * [new branch] gh/guangyey/168/head -> origin/gh/guangyey/168/head 2025-10-10T00:44:16.6174724Z * [new branch] gh/guangyey/168/orig -> origin/gh/guangyey/168/orig 2025-10-10T00:44:16.6177573Z * [new branch] gh/guangyey/169/base -> origin/gh/guangyey/169/base 2025-10-10T00:44:16.6179474Z * [new branch] gh/guangyey/169/head -> origin/gh/guangyey/169/head 2025-10-10T00:44:16.6181371Z * [new branch] gh/guangyey/169/orig -> origin/gh/guangyey/169/orig 2025-10-10T00:44:16.6183910Z * [new branch] gh/guangyey/170/base -> origin/gh/guangyey/170/base 2025-10-10T00:44:16.6185764Z * [new branch] gh/guangyey/170/head -> origin/gh/guangyey/170/head 2025-10-10T00:44:16.6187629Z * [new branch] gh/guangyey/170/orig -> origin/gh/guangyey/170/orig 2025-10-10T00:44:16.6190198Z * [new branch] gh/guangyey/171/base -> origin/gh/guangyey/171/base 2025-10-10T00:44:16.6192250Z * [new branch] gh/guangyey/171/head -> origin/gh/guangyey/171/head 2025-10-10T00:44:16.6194037Z * [new branch] gh/guangyey/171/orig -> origin/gh/guangyey/171/orig 2025-10-10T00:44:16.6196622Z * [new branch] gh/guangyey/176/base -> origin/gh/guangyey/176/base 2025-10-10T00:44:16.6198625Z * [new branch] gh/guangyey/176/head -> origin/gh/guangyey/176/head 2025-10-10T00:44:16.6200625Z * [new branch] gh/guangyey/176/orig -> origin/gh/guangyey/176/orig 2025-10-10T00:44:16.6203080Z * [new branch] gh/guangyey/178/base -> origin/gh/guangyey/178/base 2025-10-10T00:44:16.6204934Z * [new branch] gh/guangyey/178/head -> origin/gh/guangyey/178/head 2025-10-10T00:44:16.6206806Z * [new branch] gh/guangyey/178/orig -> origin/gh/guangyey/178/orig 2025-10-10T00:44:16.6209513Z * [new branch] gh/guangyey/181/base -> origin/gh/guangyey/181/base 2025-10-10T00:44:16.6211608Z * [new branch] gh/guangyey/181/head -> origin/gh/guangyey/181/head 2025-10-10T00:44:16.6213342Z * [new branch] gh/guangyey/181/orig -> origin/gh/guangyey/181/orig 2025-10-10T00:44:16.6215816Z * [new branch] 
gh/guangyey/182/base -> origin/gh/guangyey/182/base 2025-10-10T00:44:16.6217777Z * [new branch] gh/guangyey/182/head -> origin/gh/guangyey/182/head 2025-10-10T00:44:16.6219624Z * [new branch] gh/guangyey/182/orig -> origin/gh/guangyey/182/orig 2025-10-10T00:44:16.6222173Z * [new branch] gh/guangyey/183/base -> origin/gh/guangyey/183/base 2025-10-10T00:44:16.6223989Z * [new branch] gh/guangyey/183/head -> origin/gh/guangyey/183/head 2025-10-10T00:44:16.6225880Z * [new branch] gh/guangyey/183/orig -> origin/gh/guangyey/183/orig 2025-10-10T00:44:16.6228440Z * [new branch] gh/guangyey/185/base -> origin/gh/guangyey/185/base 2025-10-10T00:44:16.6230849Z * [new branch] gh/guangyey/185/head -> origin/gh/guangyey/185/head 2025-10-10T00:44:16.6232203Z * [new branch] gh/guangyey/185/orig -> origin/gh/guangyey/185/orig 2025-10-10T00:44:16.6235230Z * [new branch] gh/guangyey/186/base -> origin/gh/guangyey/186/base 2025-10-10T00:44:16.6237107Z * [new branch] gh/guangyey/186/head -> origin/gh/guangyey/186/head 2025-10-10T00:44:16.6239393Z * [new branch] gh/guangyey/186/orig -> origin/gh/guangyey/186/orig 2025-10-10T00:44:16.6242068Z * [new branch] gh/guangyey/187/base -> origin/gh/guangyey/187/base 2025-10-10T00:44:16.6244187Z * [new branch] gh/guangyey/187/head -> origin/gh/guangyey/187/head 2025-10-10T00:44:16.6245865Z * [new branch] gh/guangyey/187/orig -> origin/gh/guangyey/187/orig 2025-10-10T00:44:16.6248727Z * [new branch] gh/guangyey/188/base -> origin/gh/guangyey/188/base 2025-10-10T00:44:16.6250733Z * [new branch] gh/guangyey/188/head -> origin/gh/guangyey/188/head 2025-10-10T00:44:16.6252407Z * [new branch] gh/guangyey/188/orig -> origin/gh/guangyey/188/orig 2025-10-10T00:44:16.6255060Z * [new branch] gh/guangyey/190/base -> origin/gh/guangyey/190/base 2025-10-10T00:44:16.6256921Z * [new branch] gh/guangyey/190/head -> origin/gh/guangyey/190/head 2025-10-10T00:44:16.6258751Z * [new branch] gh/guangyey/190/orig -> origin/gh/guangyey/190/orig 
2025-10-10T00:44:16.6261385Z * [new branch] gh/guangyey/194/base -> origin/gh/guangyey/194/base 2025-10-10T00:44:16.6263174Z * [new branch] gh/guangyey/194/head -> origin/gh/guangyey/194/head 2025-10-10T00:44:16.6265083Z * [new branch] gh/guangyey/194/orig -> origin/gh/guangyey/194/orig 2025-10-10T00:44:16.6267731Z * [new branch] gh/guangyey/195/base -> origin/gh/guangyey/195/base 2025-10-10T00:44:16.6269832Z * [new branch] gh/guangyey/195/head -> origin/gh/guangyey/195/head 2025-10-10T00:44:16.6271678Z * [new branch] gh/guangyey/195/orig -> origin/gh/guangyey/195/orig 2025-10-10T00:44:16.6274292Z * [new branch] gh/guangyey/201/base -> origin/gh/guangyey/201/base 2025-10-10T00:44:16.6276165Z * [new branch] gh/guangyey/201/head -> origin/gh/guangyey/201/head 2025-10-10T00:44:16.6278016Z * [new branch] gh/guangyey/201/orig -> origin/gh/guangyey/201/orig 2025-10-10T00:44:16.6280619Z * [new branch] gh/guangyey/202/base -> origin/gh/guangyey/202/base 2025-10-10T00:44:16.6282374Z * [new branch] gh/guangyey/202/head -> origin/gh/guangyey/202/head 2025-10-10T00:44:16.6284334Z * [new branch] gh/guangyey/202/orig -> origin/gh/guangyey/202/orig 2025-10-10T00:44:16.6286903Z * [new branch] gh/guangyey/203/base -> origin/gh/guangyey/203/base 2025-10-10T00:44:16.6288883Z * [new branch] gh/guangyey/203/head -> origin/gh/guangyey/203/head 2025-10-10T00:44:16.6290894Z * [new branch] gh/guangyey/203/orig -> origin/gh/guangyey/203/orig 2025-10-10T00:44:16.6293533Z * [new branch] gh/guangyey/205/base -> origin/gh/guangyey/205/base 2025-10-10T00:44:16.6295517Z * [new branch] gh/guangyey/205/head -> origin/gh/guangyey/205/head 2025-10-10T00:44:16.6297404Z * [new branch] gh/guangyey/205/orig -> origin/gh/guangyey/205/orig 2025-10-10T00:44:16.6301903Z * [new branch] gh/guangyey/208/base -> origin/gh/guangyey/208/base 2025-10-10T00:44:16.6303722Z * [new branch] gh/guangyey/208/head -> origin/gh/guangyey/208/head 2025-10-10T00:44:16.6305494Z * [new branch] gh/guangyey/208/orig -> 
origin/gh/guangyey/208/orig 2025-10-10T00:44:16.6308189Z * [new branch] gh/guangyey/209/base -> origin/gh/guangyey/209/base 2025-10-10T00:44:16.6310046Z * [new branch] gh/guangyey/209/head -> origin/gh/guangyey/209/head 2025-10-10T00:44:16.6311969Z * [new branch] gh/guangyey/209/orig -> origin/gh/guangyey/209/orig 2025-10-10T00:44:16.6314511Z * [new branch] gh/guangyey/210/base -> origin/gh/guangyey/210/base 2025-10-10T00:44:16.6316678Z * [new branch] gh/guangyey/210/head -> origin/gh/guangyey/210/head 2025-10-10T00:44:16.6318387Z * [new branch] gh/guangyey/210/orig -> origin/gh/guangyey/210/orig 2025-10-10T00:44:16.6321037Z * [new branch] gh/guangyey/211/base -> origin/gh/guangyey/211/base 2025-10-10T00:44:16.6322894Z * [new branch] gh/guangyey/211/head -> origin/gh/guangyey/211/head 2025-10-10T00:44:16.6324813Z * [new branch] gh/guangyey/211/orig -> origin/gh/guangyey/211/orig 2025-10-10T00:44:16.6327641Z * [new branch] gh/guangyey/89/base -> origin/gh/guangyey/89/base 2025-10-10T00:44:16.6329591Z * [new branch] gh/guangyey/89/head -> origin/gh/guangyey/89/head 2025-10-10T00:44:16.6331321Z * [new branch] gh/guangyey/89/orig -> origin/gh/guangyey/89/orig 2025-10-10T00:44:16.6334582Z * [new branch] gh/guilhermeleobas/107/base -> origin/gh/guilhermeleobas/107/base 2025-10-10T00:44:16.6336462Z * [new branch] gh/guilhermeleobas/107/head -> origin/gh/guilhermeleobas/107/head 2025-10-10T00:44:16.6338298Z * [new branch] gh/guilhermeleobas/107/orig -> origin/gh/guilhermeleobas/107/orig 2025-10-10T00:44:16.6340773Z * [new branch] gh/guilhermeleobas/108/base -> origin/gh/guilhermeleobas/108/base 2025-10-10T00:44:16.6342644Z * [new branch] gh/guilhermeleobas/108/head -> origin/gh/guilhermeleobas/108/head 2025-10-10T00:44:16.6344777Z * [new branch] gh/guilhermeleobas/108/orig -> origin/gh/guilhermeleobas/108/orig 2025-10-10T00:44:16.6347541Z * [new branch] gh/guilhermeleobas/124/base -> origin/gh/guilhermeleobas/124/base 2025-10-10T00:44:16.6349399Z * [new branch] 
gh/guilhermeleobas/124/head -> origin/gh/guilhermeleobas/124/head 2025-10-10T00:44:16.6351309Z * [new branch] gh/guilhermeleobas/124/orig -> origin/gh/guilhermeleobas/124/orig 2025-10-10T00:44:16.6353911Z * [new branch] gh/guilhermeleobas/147/base -> origin/gh/guilhermeleobas/147/base 2025-10-10T00:44:16.6355745Z * [new branch] gh/guilhermeleobas/147/head -> origin/gh/guilhermeleobas/147/head 2025-10-10T00:44:16.6357513Z * [new branch] gh/guilhermeleobas/147/orig -> origin/gh/guilhermeleobas/147/orig 2025-10-10T00:44:16.6360231Z * [new branch] gh/guilhermeleobas/150/base -> origin/gh/guilhermeleobas/150/base 2025-10-10T00:44:16.6361977Z * [new branch] gh/guilhermeleobas/150/head -> origin/gh/guilhermeleobas/150/head 2025-10-10T00:44:16.6363994Z * [new branch] gh/guilhermeleobas/150/orig -> origin/gh/guilhermeleobas/150/orig 2025-10-10T00:44:16.6366612Z * [new branch] gh/guilhermeleobas/166/base -> origin/gh/guilhermeleobas/166/base 2025-10-10T00:44:16.6368872Z * [new branch] gh/guilhermeleobas/166/head -> origin/gh/guilhermeleobas/166/head 2025-10-10T00:44:16.6370520Z * [new branch] gh/guilhermeleobas/166/orig -> origin/gh/guilhermeleobas/166/orig 2025-10-10T00:44:16.6373362Z * [new branch] gh/guilhermeleobas/168/base -> origin/gh/guilhermeleobas/168/base 2025-10-10T00:44:16.6375260Z * [new branch] gh/guilhermeleobas/168/head -> origin/gh/guilhermeleobas/168/head 2025-10-10T00:44:16.6377112Z * [new branch] gh/guilhermeleobas/168/orig -> origin/gh/guilhermeleobas/168/orig 2025-10-10T00:44:16.6379650Z * [new branch] gh/guilhermeleobas/169/base -> origin/gh/guilhermeleobas/169/base 2025-10-10T00:44:16.6381475Z * [new branch] gh/guilhermeleobas/169/head -> origin/gh/guilhermeleobas/169/head 2025-10-10T00:44:16.6383374Z * [new branch] gh/guilhermeleobas/169/orig -> origin/gh/guilhermeleobas/169/orig 2025-10-10T00:44:16.6385753Z * [new branch] gh/guilhermeleobas/170/base -> origin/gh/guilhermeleobas/170/base 2025-10-10T00:44:16.6387658Z * [new branch] 
gh/guilhermeleobas/170/head -> origin/gh/guilhermeleobas/170/head 2025-10-10T00:44:16.6389543Z * [new branch] gh/guilhermeleobas/170/orig -> origin/gh/guilhermeleobas/170/orig 2025-10-10T00:44:16.6392146Z * [new branch] gh/guilhermeleobas/171/base -> origin/gh/guilhermeleobas/171/base 2025-10-10T00:44:16.6393986Z * [new branch] gh/guilhermeleobas/171/head -> origin/gh/guilhermeleobas/171/head 2025-10-10T00:44:16.6395804Z * [new branch] gh/guilhermeleobas/171/orig -> origin/gh/guilhermeleobas/171/orig 2025-10-10T00:44:16.6399046Z * [new branch] gh/guilhermeleobas/173/base -> origin/gh/guilhermeleobas/173/base 2025-10-10T00:44:16.6401203Z * [new branch] gh/guilhermeleobas/173/head -> origin/gh/guilhermeleobas/173/head 2025-10-10T00:44:16.6402979Z * [new branch] gh/guilhermeleobas/173/orig -> origin/gh/guilhermeleobas/173/orig 2025-10-10T00:44:16.6405503Z * [new branch] gh/guilhermeleobas/193/base -> origin/gh/guilhermeleobas/193/base 2025-10-10T00:44:16.6407316Z * [new branch] gh/guilhermeleobas/193/head -> origin/gh/guilhermeleobas/193/head 2025-10-10T00:44:16.6410044Z * [new branch] gh/guilhermeleobas/193/orig -> origin/gh/guilhermeleobas/193/orig 2025-10-10T00:44:16.6412582Z * [new branch] gh/guilhermeleobas/204/base -> origin/gh/guilhermeleobas/204/base 2025-10-10T00:44:16.6414381Z * [new branch] gh/guilhermeleobas/204/head -> origin/gh/guilhermeleobas/204/head 2025-10-10T00:44:16.6416324Z * [new branch] gh/guilhermeleobas/204/orig -> origin/gh/guilhermeleobas/204/orig 2025-10-10T00:44:16.6418891Z * [new branch] gh/guilhermeleobas/211/base -> origin/gh/guilhermeleobas/211/base 2025-10-10T00:44:16.6420766Z * [new branch] gh/guilhermeleobas/211/head -> origin/gh/guilhermeleobas/211/head 2025-10-10T00:44:16.6422610Z * [new branch] gh/guilhermeleobas/211/orig -> origin/gh/guilhermeleobas/211/orig 2025-10-10T00:44:16.6425175Z * [new branch] gh/guilhermeleobas/226/base -> origin/gh/guilhermeleobas/226/base 2025-10-10T00:44:16.6427053Z * [new branch] 
gh/guilhermeleobas/226/head -> origin/gh/guilhermeleobas/226/head 2025-10-10T00:44:16.6428947Z * [new branch] gh/guilhermeleobas/226/orig -> origin/gh/guilhermeleobas/226/orig 2025-10-10T00:44:16.6431439Z * [new branch] gh/guilhermeleobas/236/base -> origin/gh/guilhermeleobas/236/base 2025-10-10T00:44:16.6433285Z * [new branch] gh/guilhermeleobas/236/head -> origin/gh/guilhermeleobas/236/head 2025-10-10T00:44:16.6435152Z * [new branch] gh/guilhermeleobas/236/orig -> origin/gh/guilhermeleobas/236/orig 2025-10-10T00:44:16.6438264Z * [new branch] gh/guilhermeleobas/237/base -> origin/gh/guilhermeleobas/237/base 2025-10-10T00:44:16.6440095Z * [new branch] gh/guilhermeleobas/237/head -> origin/gh/guilhermeleobas/237/head 2025-10-10T00:44:16.6441980Z * [new branch] gh/guilhermeleobas/237/orig -> origin/gh/guilhermeleobas/237/orig 2025-10-10T00:44:16.6444590Z * [new branch] gh/guilhermeleobas/239/base -> origin/gh/guilhermeleobas/239/base 2025-10-10T00:44:16.6446493Z * [new branch] gh/guilhermeleobas/239/head -> origin/gh/guilhermeleobas/239/head 2025-10-10T00:44:16.6448605Z * [new branch] gh/guilhermeleobas/239/orig -> origin/gh/guilhermeleobas/239/orig 2025-10-10T00:44:16.6451314Z * [new branch] gh/guilhermeleobas/246/base -> origin/gh/guilhermeleobas/246/base 2025-10-10T00:44:16.6453237Z * [new branch] gh/guilhermeleobas/246/head -> origin/gh/guilhermeleobas/246/head 2025-10-10T00:44:16.6455115Z * [new branch] gh/guilhermeleobas/246/orig -> origin/gh/guilhermeleobas/246/orig 2025-10-10T00:44:16.6457692Z * [new branch] gh/guilhermeleobas/247/base -> origin/gh/guilhermeleobas/247/base 2025-10-10T00:44:16.6459771Z * [new branch] gh/guilhermeleobas/247/head -> origin/gh/guilhermeleobas/247/head 2025-10-10T00:44:16.6461675Z * [new branch] gh/guilhermeleobas/247/orig -> origin/gh/guilhermeleobas/247/orig 2025-10-10T00:44:16.6464342Z * [new branch] gh/guilhermeleobas/248/base -> origin/gh/guilhermeleobas/248/base 2025-10-10T00:44:16.6466036Z * [new branch] 
gh/guilhermeleobas/248/head -> origin/gh/guilhermeleobas/248/head 2025-10-10T00:44:16.6468209Z * [new branch] gh/guilhermeleobas/248/orig -> origin/gh/guilhermeleobas/248/orig 2025-10-10T00:44:16.6471092Z * [new branch] gh/guilhermeleobas/249/base -> origin/gh/guilhermeleobas/249/base 2025-10-10T00:44:16.6472708Z * [new branch] gh/guilhermeleobas/249/head -> origin/gh/guilhermeleobas/249/head 2025-10-10T00:44:16.6474598Z * [new branch] gh/guilhermeleobas/249/orig -> origin/gh/guilhermeleobas/249/orig 2025-10-10T00:44:16.6477390Z * [new branch] gh/guilhermeleobas/250/base -> origin/gh/guilhermeleobas/250/base 2025-10-10T00:44:16.6479226Z * [new branch] gh/guilhermeleobas/250/head -> origin/gh/guilhermeleobas/250/head 2025-10-10T00:44:16.6481172Z * [new branch] gh/guilhermeleobas/250/orig -> origin/gh/guilhermeleobas/250/orig 2025-10-10T00:44:16.6484820Z * [new branch] gh/henrylhtsang/150/base -> origin/gh/henrylhtsang/150/base 2025-10-10T00:44:16.6486773Z * [new branch] gh/henrylhtsang/150/head -> origin/gh/henrylhtsang/150/head 2025-10-10T00:44:16.6488758Z * [new branch] gh/henrylhtsang/150/orig -> origin/gh/henrylhtsang/150/orig 2025-10-10T00:44:16.6491446Z * [new branch] gh/henrylhtsang/151/base -> origin/gh/henrylhtsang/151/base 2025-10-10T00:44:16.6493278Z * [new branch] gh/henrylhtsang/151/head -> origin/gh/henrylhtsang/151/head 2025-10-10T00:44:16.6495212Z * [new branch] gh/henrylhtsang/151/orig -> origin/gh/henrylhtsang/151/orig 2025-10-10T00:44:16.6497785Z * [new branch] gh/henrylhtsang/152/base -> origin/gh/henrylhtsang/152/base 2025-10-10T00:44:16.6500004Z * [new branch] gh/henrylhtsang/152/head -> origin/gh/henrylhtsang/152/head 2025-10-10T00:44:16.6501911Z * [new branch] gh/henrylhtsang/152/orig -> origin/gh/henrylhtsang/152/orig 2025-10-10T00:44:16.6504463Z * [new branch] gh/henrylhtsang/153/base -> origin/gh/henrylhtsang/153/base 2025-10-10T00:44:16.6506302Z * [new branch] gh/henrylhtsang/153/head -> origin/gh/henrylhtsang/153/head 
2025-10-10T00:44:16.6508117Z * [new branch] gh/henrylhtsang/153/orig -> origin/gh/henrylhtsang/153/orig 2025-10-10T00:44:16.6511211Z * [new branch] gh/huydhn/1/next -> origin/gh/huydhn/1/next 2025-10-10T00:44:16.6513700Z * [new branch] gh/huydhn/2/next -> origin/gh/huydhn/2/next 2025-10-10T00:44:16.6516286Z * [new branch] gh/huydhn/3/next -> origin/gh/huydhn/3/next 2025-10-10T00:44:16.6518795Z * [new branch] gh/huydhn/4/next -> origin/gh/huydhn/4/next 2025-10-10T00:44:16.6521360Z * [new branch] gh/huydhn/5/next -> origin/gh/huydhn/5/next 2025-10-10T00:44:16.6523962Z * [new branch] gh/huydhn/6/next -> origin/gh/huydhn/6/next 2025-10-10T00:44:16.6527489Z * [new branch] gh/int3/97/base -> origin/gh/int3/97/base 2025-10-10T00:44:16.6529582Z * [new branch] gh/int3/97/head -> origin/gh/int3/97/head 2025-10-10T00:44:16.6532787Z * [new branch] gh/isuruf/101/base -> origin/gh/isuruf/101/base 2025-10-10T00:44:16.6534813Z * [new branch] gh/isuruf/101/head -> origin/gh/isuruf/101/head 2025-10-10T00:44:16.6537217Z * [new branch] gh/isuruf/146/base -> origin/gh/isuruf/146/base 2025-10-10T00:44:16.6539020Z * [new branch] gh/isuruf/146/head -> origin/gh/isuruf/146/head 2025-10-10T00:44:16.6540851Z * [new branch] gh/isuruf/146/orig -> origin/gh/isuruf/146/orig 2025-10-10T00:44:16.6543458Z * [new branch] gh/isuruf/147/base -> origin/gh/isuruf/147/base 2025-10-10T00:44:16.6545242Z * [new branch] gh/isuruf/147/head -> origin/gh/isuruf/147/head 2025-10-10T00:44:16.6547246Z * [new branch] gh/isuruf/147/orig -> origin/gh/isuruf/147/orig 2025-10-10T00:44:16.6549778Z * [new branch] gh/isuruf/148/base -> origin/gh/isuruf/148/base 2025-10-10T00:44:16.6551829Z * [new branch] gh/isuruf/148/head -> origin/gh/isuruf/148/head 2025-10-10T00:44:16.6553480Z * [new branch] gh/isuruf/148/orig -> origin/gh/isuruf/148/orig 2025-10-10T00:44:16.6555972Z * [new branch] gh/isuruf/149/base -> origin/gh/isuruf/149/base 2025-10-10T00:44:16.6557846Z * [new branch] gh/isuruf/149/head -> origin/gh/isuruf/149/head 
2025-10-10T00:44:16.6559729Z * [new branch] gh/isuruf/149/orig -> origin/gh/isuruf/149/orig 2025-10-10T00:44:16.6562222Z * [new branch] gh/isuruf/150/base -> origin/gh/isuruf/150/base 2025-10-10T00:44:16.6564180Z * [new branch] gh/isuruf/150/head -> origin/gh/isuruf/150/head 2025-10-10T00:44:16.6565943Z * [new branch] gh/isuruf/150/orig -> origin/gh/isuruf/150/orig 2025-10-10T00:44:16.6568646Z * [new branch] gh/isuruf/151/base -> origin/gh/isuruf/151/base 2025-10-10T00:44:16.6570597Z * [new branch] gh/isuruf/151/head -> origin/gh/isuruf/151/head 2025-10-10T00:44:16.6572508Z * [new branch] gh/isuruf/151/orig -> origin/gh/isuruf/151/orig 2025-10-10T00:44:16.6575108Z * [new branch] gh/isuruf/152/base -> origin/gh/isuruf/152/base 2025-10-10T00:44:16.6576823Z * [new branch] gh/isuruf/152/head -> origin/gh/isuruf/152/head 2025-10-10T00:44:16.6579206Z * [new branch] gh/isuruf/152/orig -> origin/gh/isuruf/152/orig 2025-10-10T00:44:16.6581754Z * [new branch] gh/isuruf/153/base -> origin/gh/isuruf/153/base 2025-10-10T00:44:16.6583571Z * [new branch] gh/isuruf/153/head -> origin/gh/isuruf/153/head 2025-10-10T00:44:16.6585438Z * [new branch] gh/isuruf/153/orig -> origin/gh/isuruf/153/orig 2025-10-10T00:44:16.6588057Z * [new branch] gh/isuruf/154/base -> origin/gh/isuruf/154/base 2025-10-10T00:44:16.6589965Z * [new branch] gh/isuruf/154/head -> origin/gh/isuruf/154/head 2025-10-10T00:44:16.6591801Z * [new branch] gh/isuruf/154/orig -> origin/gh/isuruf/154/orig 2025-10-10T00:44:16.6594859Z * [new branch] gh/isuruf/155/base -> origin/gh/isuruf/155/base 2025-10-10T00:44:16.6596839Z * [new branch] gh/isuruf/155/head -> origin/gh/isuruf/155/head 2025-10-10T00:44:16.6598832Z * [new branch] gh/isuruf/155/orig -> origin/gh/isuruf/155/orig 2025-10-10T00:44:16.6601471Z * [new branch] gh/isuruf/156/base -> origin/gh/isuruf/156/base 2025-10-10T00:44:16.6603224Z * [new branch] gh/isuruf/156/head -> origin/gh/isuruf/156/head 2025-10-10T00:44:16.6605112Z * [new branch] gh/isuruf/156/orig -> 
origin/gh/isuruf/156/orig 2025-10-10T00:44:16.6607720Z * [new branch] gh/isuruf/157/base -> origin/gh/isuruf/157/base 2025-10-10T00:44:16.6609723Z * [new branch] gh/isuruf/157/head -> origin/gh/isuruf/157/head 2025-10-10T00:44:16.6611646Z * [new branch] gh/isuruf/157/orig -> origin/gh/isuruf/157/orig 2025-10-10T00:44:16.6614293Z * [new branch] gh/isuruf/81/base -> origin/gh/isuruf/81/base 2025-10-10T00:44:16.6616181Z * [new branch] gh/isuruf/81/head -> origin/gh/isuruf/81/head 2025-10-10T00:44:16.6618062Z * [new branch] gh/isuruf/81/orig -> origin/gh/isuruf/81/orig 2025-10-10T00:44:16.6621078Z * [new branch] gh/jamesjwu/171/base -> origin/gh/jamesjwu/171/base 2025-10-10T00:44:16.6623115Z * [new branch] gh/jamesjwu/171/head -> origin/gh/jamesjwu/171/head 2025-10-10T00:44:16.6624990Z * [new branch] gh/jamesjwu/171/orig -> origin/gh/jamesjwu/171/orig 2025-10-10T00:44:16.6627489Z * [new branch] gh/jamesjwu/176/base -> origin/gh/jamesjwu/176/base 2025-10-10T00:44:16.6629570Z * [new branch] gh/jamesjwu/176/head -> origin/gh/jamesjwu/176/head 2025-10-10T00:44:16.6631241Z * [new branch] gh/jamesjwu/176/orig -> origin/gh/jamesjwu/176/orig 2025-10-10T00:44:16.6633818Z * [new branch] gh/jamesjwu/186/base -> origin/gh/jamesjwu/186/base 2025-10-10T00:44:16.6635709Z * [new branch] gh/jamesjwu/186/head -> origin/gh/jamesjwu/186/head 2025-10-10T00:44:16.6637574Z * [new branch] gh/jamesjwu/186/orig -> origin/gh/jamesjwu/186/orig 2025-10-10T00:44:16.6640027Z * [new branch] gh/jamesjwu/187/base -> origin/gh/jamesjwu/187/base 2025-10-10T00:44:16.6641889Z * [new branch] gh/jamesjwu/187/head -> origin/gh/jamesjwu/187/head 2025-10-10T00:44:16.6643722Z * [new branch] gh/jamesjwu/187/orig -> origin/gh/jamesjwu/187/orig 2025-10-10T00:44:16.6646867Z * [new branch] gh/jamesjwu/189/base -> origin/gh/jamesjwu/189/base 2025-10-10T00:44:16.6649037Z * [new branch] gh/jamesjwu/189/head -> origin/gh/jamesjwu/189/head 2025-10-10T00:44:16.6650899Z * [new branch] gh/jamesjwu/189/orig -> 
origin/gh/jamesjwu/189/orig 2025-10-10T00:44:16.6653485Z * [new branch] gh/jamesjwu/190/base -> origin/gh/jamesjwu/190/base 2025-10-10T00:44:16.6655347Z * [new branch] gh/jamesjwu/190/head -> origin/gh/jamesjwu/190/head 2025-10-10T00:44:16.6657212Z * [new branch] gh/jamesjwu/190/orig -> origin/gh/jamesjwu/190/orig 2025-10-10T00:44:16.6659639Z * [new branch] gh/jamesjwu/191/base -> origin/gh/jamesjwu/191/base 2025-10-10T00:44:16.6661595Z * [new branch] gh/jamesjwu/191/head -> origin/gh/jamesjwu/191/head 2025-10-10T00:44:16.6663341Z * [new branch] gh/jamesjwu/191/orig -> origin/gh/jamesjwu/191/orig 2025-10-10T00:44:16.6666027Z * [new branch] gh/jamesjwu/192/base -> origin/gh/jamesjwu/192/base 2025-10-10T00:44:16.6667970Z * [new branch] gh/jamesjwu/192/head -> origin/gh/jamesjwu/192/head 2025-10-10T00:44:16.6670536Z * [new branch] gh/jamesjwu/193/base -> origin/gh/jamesjwu/193/base 2025-10-10T00:44:16.6672374Z * [new branch] gh/jamesjwu/193/head -> origin/gh/jamesjwu/193/head 2025-10-10T00:44:16.6674315Z * [new branch] gh/jamesjwu/193/orig -> origin/gh/jamesjwu/193/orig 2025-10-10T00:44:16.6677008Z * [new branch] gh/jamesjwu/194/base -> origin/gh/jamesjwu/194/base 2025-10-10T00:44:16.6678904Z * [new branch] gh/jamesjwu/194/head -> origin/gh/jamesjwu/194/head 2025-10-10T00:44:16.6681005Z * [new branch] gh/jamesjwu/194/orig -> origin/gh/jamesjwu/194/orig 2025-10-10T00:44:16.6683442Z * [new branch] gh/jamesjwu/195/base -> origin/gh/jamesjwu/195/base 2025-10-10T00:44:16.6685253Z * [new branch] gh/jamesjwu/195/head -> origin/gh/jamesjwu/195/head 2025-10-10T00:44:16.6687140Z * [new branch] gh/jamesjwu/195/orig -> origin/gh/jamesjwu/195/orig 2025-10-10T00:44:16.6689838Z * [new branch] gh/jamesjwu/196/base -> origin/gh/jamesjwu/196/base 2025-10-10T00:44:16.6691846Z * [new branch] gh/jamesjwu/196/head -> origin/gh/jamesjwu/196/head 2025-10-10T00:44:16.6693804Z * [new branch] gh/jamesjwu/196/orig -> origin/gh/jamesjwu/196/orig 2025-10-10T00:44:16.6696419Z * [new branch] 
gh/jamesjwu/52/base -> origin/gh/jamesjwu/52/base 2025-10-10T00:44:16.6698928Z * [new branch] gh/jamesjwu/52/head -> origin/gh/jamesjwu/52/head 2025-10-10T00:44:16.6703068Z * [new branch] gh/jamesjwu/53/base -> origin/gh/jamesjwu/53/base 2025-10-10T00:44:16.6705429Z * [new branch] gh/jamesjwu/53/head -> origin/gh/jamesjwu/53/head 2025-10-10T00:44:16.6707986Z * [new branch] gh/jamesjwu/54/base -> origin/gh/jamesjwu/54/base 2025-10-10T00:44:16.6709674Z * [new branch] gh/jamesjwu/54/head -> origin/gh/jamesjwu/54/head 2025-10-10T00:44:16.6712071Z * [new branch] gh/jamesjwu/55/base -> origin/gh/jamesjwu/55/base 2025-10-10T00:44:16.6713845Z * [new branch] gh/jamesjwu/55/head -> origin/gh/jamesjwu/55/head 2025-10-10T00:44:16.6716304Z * [new branch] gh/jamesjwu/56/base -> origin/gh/jamesjwu/56/base 2025-10-10T00:44:16.6718106Z * [new branch] gh/jamesjwu/56/head -> origin/gh/jamesjwu/56/head 2025-10-10T00:44:16.6721510Z * [new branch] gh/jamesjwu/57/base -> origin/gh/jamesjwu/57/base 2025-10-10T00:44:16.6723254Z * [new branch] gh/jamesjwu/57/head -> origin/gh/jamesjwu/57/head 2025-10-10T00:44:16.6725867Z * [new branch] gh/jamesjwu/58/base -> origin/gh/jamesjwu/58/base 2025-10-10T00:44:16.6727843Z * [new branch] gh/jamesjwu/58/head -> origin/gh/jamesjwu/58/head 2025-10-10T00:44:16.6730296Z * [new branch] gh/jamesjwu/59/base -> origin/gh/jamesjwu/59/base 2025-10-10T00:44:16.6732076Z * [new branch] gh/jamesjwu/59/head -> origin/gh/jamesjwu/59/head 2025-10-10T00:44:16.6734581Z * [new branch] gh/jamesjwu/60/base -> origin/gh/jamesjwu/60/base 2025-10-10T00:44:16.6736393Z * [new branch] gh/jamesjwu/60/head -> origin/gh/jamesjwu/60/head 2025-10-10T00:44:16.6738838Z * [new branch] gh/jamesjwu/61/base -> origin/gh/jamesjwu/61/base 2025-10-10T00:44:16.6740687Z * [new branch] gh/jamesjwu/61/head -> origin/gh/jamesjwu/61/head 2025-10-10T00:44:16.6743146Z * [new branch] gh/jamesjwu/62/base -> origin/gh/jamesjwu/62/base 2025-10-10T00:44:16.6744942Z * [new branch] gh/jamesjwu/62/head -> 
origin/gh/jamesjwu/62/head 2025-10-10T00:44:16.6747414Z * [new branch] gh/jamesjwu/63/base -> origin/gh/jamesjwu/63/base 2025-10-10T00:44:16.6749586Z * [new branch] gh/jamesjwu/63/head -> origin/gh/jamesjwu/63/head 2025-10-10T00:44:16.6752344Z * [new branch] gh/jamesjwu/64/base -> origin/gh/jamesjwu/64/base 2025-10-10T00:44:16.6754141Z * [new branch] gh/jamesjwu/64/head -> origin/gh/jamesjwu/64/head 2025-10-10T00:44:16.6756642Z * [new branch] gh/jamesjwu/65/base -> origin/gh/jamesjwu/65/base 2025-10-10T00:44:16.6758356Z * [new branch] gh/jamesjwu/65/head -> origin/gh/jamesjwu/65/head 2025-10-10T00:44:16.6761805Z * [new branch] gh/janeyx99/165/base -> origin/gh/janeyx99/165/base 2025-10-10T00:44:16.6763979Z * [new branch] gh/janeyx99/165/head -> origin/gh/janeyx99/165/head 2025-10-10T00:44:16.6765574Z * [new branch] gh/janeyx99/165/orig -> origin/gh/janeyx99/165/orig 2025-10-10T00:44:16.6768116Z * [new branch] gh/janeyx99/201/base -> origin/gh/janeyx99/201/base 2025-10-10T00:44:16.6770267Z * [new branch] gh/janeyx99/201/head -> origin/gh/janeyx99/201/head 2025-10-10T00:44:16.6771804Z * [new branch] gh/janeyx99/201/orig -> origin/gh/janeyx99/201/orig 2025-10-10T00:44:16.6774772Z * [new branch] gh/janeyx99/225/base -> origin/gh/janeyx99/225/base 2025-10-10T00:44:16.6776613Z * [new branch] gh/janeyx99/225/head -> origin/gh/janeyx99/225/head 2025-10-10T00:44:16.6778516Z * [new branch] gh/janeyx99/225/orig -> origin/gh/janeyx99/225/orig 2025-10-10T00:44:16.6781020Z * [new branch] gh/janeyx99/299/base -> origin/gh/janeyx99/299/base 2025-10-10T00:44:16.6782918Z * [new branch] gh/janeyx99/299/head -> origin/gh/janeyx99/299/head 2025-10-10T00:44:16.6784828Z * [new branch] gh/janeyx99/299/orig -> origin/gh/janeyx99/299/orig 2025-10-10T00:44:16.6787665Z * [new branch] gh/janeyx99/302/base -> origin/gh/janeyx99/302/base 2025-10-10T00:44:16.6789629Z * [new branch] gh/janeyx99/302/head -> origin/gh/janeyx99/302/head 2025-10-10T00:44:16.6792013Z * [new branch] gh/janeyx99/303/base 
-> origin/gh/janeyx99/303/base 2025-10-10T00:44:16.6793791Z * [new branch] gh/janeyx99/303/head -> origin/gh/janeyx99/303/head 2025-10-10T00:44:16.6796327Z * [new branch] gh/janeyx99/304/base -> origin/gh/janeyx99/304/base 2025-10-10T00:44:16.6798492Z * [new branch] gh/janeyx99/304/head -> origin/gh/janeyx99/304/head 2025-10-10T00:44:16.6800399Z * [new branch] gh/janeyx99/304/orig -> origin/gh/janeyx99/304/orig 2025-10-10T00:44:16.6802787Z * [new branch] gh/janeyx99/305/base -> origin/gh/janeyx99/305/base 2025-10-10T00:44:16.6804675Z * [new branch] gh/janeyx99/305/head -> origin/gh/janeyx99/305/head 2025-10-10T00:44:16.6807195Z * [new branch] gh/janeyx99/306/base -> origin/gh/janeyx99/306/base 2025-10-10T00:44:16.6809158Z * [new branch] gh/janeyx99/306/head -> origin/gh/janeyx99/306/head 2025-10-10T00:44:16.6811792Z * [new branch] gh/janeyx99/307/base -> origin/gh/janeyx99/307/base 2025-10-10T00:44:16.6813571Z * [new branch] gh/janeyx99/307/head -> origin/gh/janeyx99/307/head 2025-10-10T00:44:16.6815399Z * [new branch] gh/janeyx99/307/orig -> origin/gh/janeyx99/307/orig 2025-10-10T00:44:16.6817813Z * [new branch] gh/janeyx99/308/base -> origin/gh/janeyx99/308/base 2025-10-10T00:44:16.6819715Z * [new branch] gh/janeyx99/308/head -> origin/gh/janeyx99/308/head 2025-10-10T00:44:16.6821641Z * [new branch] gh/janeyx99/308/orig -> origin/gh/janeyx99/308/orig 2025-10-10T00:44:16.6824285Z * [new branch] gh/janeyx99/309/base -> origin/gh/janeyx99/309/base 2025-10-10T00:44:16.6826188Z * [new branch] gh/janeyx99/309/head -> origin/gh/janeyx99/309/head 2025-10-10T00:44:16.6828037Z * [new branch] gh/janeyx99/309/orig -> origin/gh/janeyx99/309/orig 2025-10-10T00:44:16.6830628Z * [new branch] gh/janeyx99/310/base -> origin/gh/janeyx99/310/base 2025-10-10T00:44:16.6832507Z * [new branch] gh/janeyx99/310/head -> origin/gh/janeyx99/310/head 2025-10-10T00:44:16.6834369Z * [new branch] gh/janeyx99/310/orig -> origin/gh/janeyx99/310/orig 2025-10-10T00:44:16.6836759Z * [new branch] 
gh/janeyx99/311/base -> origin/gh/janeyx99/311/base 2025-10-10T00:44:16.6838600Z * [new branch] gh/janeyx99/311/head -> origin/gh/janeyx99/311/head 2025-10-10T00:44:16.6840458Z * [new branch] gh/janeyx99/311/orig -> origin/gh/janeyx99/311/orig 2025-10-10T00:44:16.6842806Z * [new branch] gh/janeyx99/312/base -> origin/gh/janeyx99/312/base 2025-10-10T00:44:16.6844665Z * [new branch] gh/janeyx99/312/head -> origin/gh/janeyx99/312/head 2025-10-10T00:44:16.6846544Z * [new branch] gh/janeyx99/312/orig -> origin/gh/janeyx99/312/orig 2025-10-10T00:44:16.6849354Z * [new branch] gh/janeyx99/313/base -> origin/gh/janeyx99/313/base 2025-10-10T00:44:16.6851082Z * [new branch] gh/janeyx99/313/head -> origin/gh/janeyx99/313/head 2025-10-10T00:44:16.6852945Z * [new branch] gh/janeyx99/313/orig -> origin/gh/janeyx99/313/orig 2025-10-10T00:44:16.6855991Z * [new branch] gh/janeyx99/314/base -> origin/gh/janeyx99/314/base 2025-10-10T00:44:16.6857936Z * [new branch] gh/janeyx99/314/head -> origin/gh/janeyx99/314/head 2025-10-10T00:44:16.6859957Z * [new branch] gh/janeyx99/314/orig -> origin/gh/janeyx99/314/orig 2025-10-10T00:44:16.6862576Z * [new branch] gh/janeyx99/88/base -> origin/gh/janeyx99/88/base 2025-10-10T00:44:16.6864423Z * [new branch] gh/janeyx99/88/head -> origin/gh/janeyx99/88/head 2025-10-10T00:44:16.6866294Z * [new branch] gh/janeyx99/88/orig -> origin/gh/janeyx99/88/orig 2025-10-10T00:44:16.6869441Z * [new branch] gh/jansel/360/base -> origin/gh/jansel/360/base 2025-10-10T00:44:16.6872739Z * [new branch] gh/jansel/360/head -> origin/gh/jansel/360/head 2025-10-10T00:44:16.6874385Z * [new branch] gh/jansel/451/base -> origin/gh/jansel/451/base 2025-10-10T00:44:16.6875354Z * [new branch] gh/jansel/451/head -> origin/gh/jansel/451/head 2025-10-10T00:44:16.6877491Z * [new branch] gh/jansel/451/orig -> origin/gh/jansel/451/orig 2025-10-10T00:44:16.6880039Z * [new branch] gh/jansel/462/base -> origin/gh/jansel/462/base 2025-10-10T00:44:16.6881879Z * [new branch] 
gh/jansel/462/head -> origin/gh/jansel/462/head 2025-10-10T00:44:16.6883701Z * [new branch] gh/jansel/462/orig -> origin/gh/jansel/462/orig 2025-10-10T00:44:16.6886250Z * [new branch] gh/jansel/531/base -> origin/gh/jansel/531/base 2025-10-10T00:44:16.6889001Z * [new branch] gh/jansel/531/head -> origin/gh/jansel/531/head 2025-10-10T00:44:16.6890820Z * [new branch] gh/jansel/531/orig -> origin/gh/jansel/531/orig 2025-10-10T00:44:16.6893468Z * [new branch] gh/jansel/532/base -> origin/gh/jansel/532/base 2025-10-10T00:44:16.6895246Z * [new branch] gh/jansel/532/head -> origin/gh/jansel/532/head 2025-10-10T00:44:16.6897067Z * [new branch] gh/jansel/532/orig -> origin/gh/jansel/532/orig 2025-10-10T00:44:16.6900072Z * [new branch] gh/jansel/533/base -> origin/gh/jansel/533/base 2025-10-10T00:44:16.6901954Z * [new branch] gh/jansel/533/head -> origin/gh/jansel/533/head 2025-10-10T00:44:16.6903703Z * [new branch] gh/jansel/533/orig -> origin/gh/jansel/533/orig 2025-10-10T00:44:16.6906230Z * [new branch] gh/jansel/534/base -> origin/gh/jansel/534/base 2025-10-10T00:44:16.6908092Z * [new branch] gh/jansel/534/head -> origin/gh/jansel/534/head 2025-10-10T00:44:16.6909994Z * [new branch] gh/jansel/534/orig -> origin/gh/jansel/534/orig 2025-10-10T00:44:16.6912499Z * [new branch] gh/jansel/535/base -> origin/gh/jansel/535/base 2025-10-10T00:44:16.6914331Z * [new branch] gh/jansel/535/head -> origin/gh/jansel/535/head 2025-10-10T00:44:16.6916217Z * [new branch] gh/jansel/535/orig -> origin/gh/jansel/535/orig 2025-10-10T00:44:16.6918778Z * [new branch] gh/jansel/536/base -> origin/gh/jansel/536/base 2025-10-10T00:44:16.6920660Z * [new branch] gh/jansel/536/head -> origin/gh/jansel/536/head 2025-10-10T00:44:16.6922541Z * [new branch] gh/jansel/536/orig -> origin/gh/jansel/536/orig 2025-10-10T00:44:16.6925191Z * [new branch] gh/jansel/537/base -> origin/gh/jansel/537/base 2025-10-10T00:44:16.6927134Z * [new branch] gh/jansel/537/head -> origin/gh/jansel/537/head 
2025-10-10T00:44:16.6929068Z * [new branch] gh/jansel/537/orig -> origin/gh/jansel/537/orig 2025-10-10T00:44:16.6931566Z * [new branch] gh/jansel/538/base -> origin/gh/jansel/538/base 2025-10-10T00:44:16.6933439Z * [new branch] gh/jansel/538/head -> origin/gh/jansel/538/head 2025-10-10T00:44:16.6935480Z * [new branch] gh/jansel/538/orig -> origin/gh/jansel/538/orig 2025-10-10T00:44:16.6937894Z * [new branch] gh/jansel/539/base -> origin/gh/jansel/539/base 2025-10-10T00:44:16.6939814Z * [new branch] gh/jansel/539/head -> origin/gh/jansel/539/head 2025-10-10T00:44:16.6941676Z * [new branch] gh/jansel/539/orig -> origin/gh/jansel/539/orig 2025-10-10T00:44:16.6944197Z * [new branch] gh/jansel/540/base -> origin/gh/jansel/540/base 2025-10-10T00:44:16.6946538Z * [new branch] gh/jansel/540/head -> origin/gh/jansel/540/head 2025-10-10T00:44:16.6948394Z * [new branch] gh/jansel/540/orig -> origin/gh/jansel/540/orig 2025-10-10T00:44:16.6951068Z * [new branch] gh/jansel/541/base -> origin/gh/jansel/541/base 2025-10-10T00:44:16.6952914Z * [new branch] gh/jansel/541/head -> origin/gh/jansel/541/head 2025-10-10T00:44:16.6954795Z * [new branch] gh/jansel/541/orig -> origin/gh/jansel/541/orig 2025-10-10T00:44:16.6957266Z * [new branch] gh/jansel/542/base -> origin/gh/jansel/542/base 2025-10-10T00:44:16.6959161Z * [new branch] gh/jansel/542/head -> origin/gh/jansel/542/head 2025-10-10T00:44:16.6960990Z * [new branch] gh/jansel/542/orig -> origin/gh/jansel/542/orig 2025-10-10T00:44:16.6963477Z * [new branch] gh/jansel/543/base -> origin/gh/jansel/543/base 2025-10-10T00:44:16.6965334Z * [new branch] gh/jansel/543/head -> origin/gh/jansel/543/head 2025-10-10T00:44:16.6967265Z * [new branch] gh/jansel/543/orig -> origin/gh/jansel/543/orig 2025-10-10T00:44:16.6970471Z * [new branch] gh/jansel/544/base -> origin/gh/jansel/544/base 2025-10-10T00:44:16.6972490Z * [new branch] gh/jansel/544/head -> origin/gh/jansel/544/head 2025-10-10T00:44:16.6974340Z * [new branch] gh/jansel/544/orig -> 
origin/gh/jansel/544/orig 2025-10-10T00:44:16.6977087Z * [new branch] gh/jansel/545/base -> origin/gh/jansel/545/base 2025-10-10T00:44:16.6978932Z * [new branch] gh/jansel/545/head -> origin/gh/jansel/545/head 2025-10-10T00:44:16.6980758Z * [new branch] gh/jansel/545/orig -> origin/gh/jansel/545/orig 2025-10-10T00:44:16.6983303Z * [new branch] gh/jansel/546/base -> origin/gh/jansel/546/base 2025-10-10T00:44:16.6985157Z * [new branch] gh/jansel/546/head -> origin/gh/jansel/546/head 2025-10-10T00:44:16.6987056Z * [new branch] gh/jansel/546/orig -> origin/gh/jansel/546/orig 2025-10-10T00:44:16.6989660Z * [new branch] gh/jansel/547/base -> origin/gh/jansel/547/base 2025-10-10T00:44:16.6991565Z * [new branch] gh/jansel/547/head -> origin/gh/jansel/547/head 2025-10-10T00:44:16.6993500Z * [new branch] gh/jansel/547/orig -> origin/gh/jansel/547/orig 2025-10-10T00:44:16.6996071Z * [new branch] gh/jansel/548/base -> origin/gh/jansel/548/base 2025-10-10T00:44:16.6997889Z * [new branch] gh/jansel/548/head -> origin/gh/jansel/548/head 2025-10-10T00:44:16.7000038Z * [new branch] gh/jansel/548/orig -> origin/gh/jansel/548/orig 2025-10-10T00:44:16.7003173Z * [new branch] gh/jbschlosser/247/base -> origin/gh/jbschlosser/247/base 2025-10-10T00:44:16.7005091Z * [new branch] gh/jbschlosser/247/head -> origin/gh/jbschlosser/247/head 2025-10-10T00:44:16.7006930Z * [new branch] gh/jbschlosser/247/orig -> origin/gh/jbschlosser/247/orig 2025-10-10T00:44:16.7009754Z * [new branch] gh/jbschlosser/250/base -> origin/gh/jbschlosser/250/base 2025-10-10T00:44:16.7011593Z * [new branch] gh/jbschlosser/250/head -> origin/gh/jbschlosser/250/head 2025-10-10T00:44:16.7013597Z * [new branch] gh/jbschlosser/250/orig -> origin/gh/jbschlosser/250/orig 2025-10-10T00:44:16.7016937Z * [new branch] gh/jbschlosser/251/base -> origin/gh/jbschlosser/251/base 2025-10-10T00:44:16.7018829Z * [new branch] gh/jbschlosser/251/head -> origin/gh/jbschlosser/251/head 2025-10-10T00:44:16.7020791Z * [new branch] 
gh/jbschlosser/251/orig -> origin/gh/jbschlosser/251/orig 2025-10-10T00:44:16.7023842Z * [new branch] gh/jiayisunx/59/base -> origin/gh/jiayisunx/59/base 2025-10-10T00:44:16.7025832Z * [new branch] gh/jiayisunx/59/head -> origin/gh/jiayisunx/59/head 2025-10-10T00:44:16.7027686Z * [new branch] gh/jiayisunx/59/orig -> origin/gh/jiayisunx/59/orig 2025-10-10T00:44:16.7030248Z * [new branch] gh/jiayisunx/61/base -> origin/gh/jiayisunx/61/base 2025-10-10T00:44:16.7032180Z * [new branch] gh/jiayisunx/61/head -> origin/gh/jiayisunx/61/head 2025-10-10T00:44:16.7034314Z * [new branch] gh/jiayisunx/61/orig -> origin/gh/jiayisunx/61/orig 2025-10-10T00:44:16.7036848Z * [new branch] gh/jiayisunx/65/base -> origin/gh/jiayisunx/65/base 2025-10-10T00:44:16.7038747Z * [new branch] gh/jiayisunx/65/head -> origin/gh/jiayisunx/65/head 2025-10-10T00:44:16.7040635Z * [new branch] gh/jiayisunx/65/orig -> origin/gh/jiayisunx/65/orig 2025-10-10T00:44:16.7043164Z * [new branch] gh/jiayisunx/67/base -> origin/gh/jiayisunx/67/base 2025-10-10T00:44:16.7045047Z * [new branch] gh/jiayisunx/67/head -> origin/gh/jiayisunx/67/head 2025-10-10T00:44:16.7046910Z * [new branch] gh/jiayisunx/67/orig -> origin/gh/jiayisunx/67/orig 2025-10-10T00:44:16.7049679Z * [new branch] gh/jiayisunx/68/base -> origin/gh/jiayisunx/68/base 2025-10-10T00:44:16.7051536Z * [new branch] gh/jiayisunx/68/head -> origin/gh/jiayisunx/68/head 2025-10-10T00:44:16.7053369Z * [new branch] gh/jiayisunx/68/orig -> origin/gh/jiayisunx/68/orig 2025-10-10T00:44:16.7055957Z * [new branch] gh/jiayisunx/71/base -> origin/gh/jiayisunx/71/base 2025-10-10T00:44:16.7057766Z * [new branch] gh/jiayisunx/71/head -> origin/gh/jiayisunx/71/head 2025-10-10T00:44:16.7059598Z * [new branch] gh/jiayisunx/71/orig -> origin/gh/jiayisunx/71/orig 2025-10-10T00:44:16.7062123Z * [new branch] gh/jiayisunx/72/base -> origin/gh/jiayisunx/72/base 2025-10-10T00:44:16.7063973Z * [new branch] gh/jiayisunx/72/head -> origin/gh/jiayisunx/72/head 
2025-10-10T00:44:16.7065857Z * [new branch] gh/jiayisunx/72/orig -> origin/gh/jiayisunx/72/orig 2025-10-10T00:44:16.7068558Z * [new branch] gh/jiayisunx/77/base -> origin/gh/jiayisunx/77/base 2025-10-10T00:44:16.7070348Z * [new branch] gh/jiayisunx/77/head -> origin/gh/jiayisunx/77/head 2025-10-10T00:44:16.7072188Z * [new branch] gh/jiayisunx/77/orig -> origin/gh/jiayisunx/77/orig 2025-10-10T00:44:16.7074657Z * [new branch] gh/jiayisunx/78/base -> origin/gh/jiayisunx/78/base 2025-10-10T00:44:16.7076734Z * [new branch] gh/jiayisunx/78/head -> origin/gh/jiayisunx/78/head 2025-10-10T00:44:16.7078567Z * [new branch] gh/jiayisunx/78/orig -> origin/gh/jiayisunx/78/orig 2025-10-10T00:44:16.7081120Z * [new branch] gh/jiayisunx/79/base -> origin/gh/jiayisunx/79/base 2025-10-10T00:44:16.7082961Z * [new branch] gh/jiayisunx/79/head -> origin/gh/jiayisunx/79/head 2025-10-10T00:44:16.7085331Z * [new branch] gh/jiayisunx/79/orig -> origin/gh/jiayisunx/79/orig 2025-10-10T00:44:16.7088689Z * [new branch] gh/jiayisunx/80/base -> origin/gh/jiayisunx/80/base 2025-10-10T00:44:16.7090634Z * [new branch] gh/jiayisunx/80/head -> origin/gh/jiayisunx/80/head 2025-10-10T00:44:16.7092550Z * [new branch] gh/jiayisunx/80/orig -> origin/gh/jiayisunx/80/orig 2025-10-10T00:44:16.7094951Z * [new branch] gh/jiayisunx/81/base -> origin/gh/jiayisunx/81/base 2025-10-10T00:44:16.7096825Z * [new branch] gh/jiayisunx/81/head -> origin/gh/jiayisunx/81/head 2025-10-10T00:44:16.7098805Z * [new branch] gh/jiayisunx/81/orig -> origin/gh/jiayisunx/81/orig 2025-10-10T00:44:16.7102451Z * [new branch] gh/jiayisunx/82/base -> origin/gh/jiayisunx/82/base 2025-10-10T00:44:16.7104638Z * [new branch] gh/jiayisunx/82/head -> origin/gh/jiayisunx/82/head 2025-10-10T00:44:16.7106405Z * [new branch] gh/jiayisunx/82/orig -> origin/gh/jiayisunx/82/orig 2025-10-10T00:44:16.7109279Z * [new branch] gh/jiayisunx/83/base -> origin/gh/jiayisunx/83/base 2025-10-10T00:44:16.7110873Z * [new branch] gh/jiayisunx/83/head -> 
origin/gh/jiayisunx/83/head 2025-10-10T00:44:16.7113044Z * [new branch] gh/jiayisunx/83/orig -> origin/gh/jiayisunx/83/orig 2025-10-10T00:44:16.7115234Z * [new branch] gh/jiayisunx/84/base -> origin/gh/jiayisunx/84/base 2025-10-10T00:44:16.7117261Z * [new branch] gh/jiayisunx/84/head -> origin/gh/jiayisunx/84/head 2025-10-10T00:44:16.7119121Z * [new branch] gh/jiayisunx/84/orig -> origin/gh/jiayisunx/84/orig 2025-10-10T00:44:16.7122087Z * [new branch] gh/jjwu@meta.com/1/base -> origin/gh/jjwu@meta.com/1/base 2025-10-10T00:44:16.7123851Z * [new branch] gh/jjwu@meta.com/1/head -> origin/gh/jjwu@meta.com/1/head 2025-10-10T00:44:16.7127374Z * [new branch] gh/karthickai/3/base -> origin/gh/karthickai/3/base 2025-10-10T00:44:16.7129080Z * [new branch] gh/karthickai/3/head -> origin/gh/karthickai/3/head 2025-10-10T00:44:16.7130954Z * [new branch] gh/karthickai/3/orig -> origin/gh/karthickai/3/orig 2025-10-10T00:44:16.7133490Z * [new branch] gh/karthickai/4/base -> origin/gh/karthickai/4/base 2025-10-10T00:44:16.7135521Z * [new branch] gh/karthickai/4/head -> origin/gh/karthickai/4/head 2025-10-10T00:44:16.7137365Z * [new branch] gh/karthickai/4/orig -> origin/gh/karthickai/4/orig 2025-10-10T00:44:16.7140665Z * [new branch] gh/karthickai/5/base -> origin/gh/karthickai/5/base 2025-10-10T00:44:16.7142675Z * [new branch] gh/karthickai/5/head -> origin/gh/karthickai/5/head 2025-10-10T00:44:16.7144541Z * [new branch] gh/karthickai/5/orig -> origin/gh/karthickai/5/orig 2025-10-10T00:44:16.7147170Z * [new branch] gh/karthickai/6/base -> origin/gh/karthickai/6/base 2025-10-10T00:44:16.7149209Z * [new branch] gh/karthickai/6/head -> origin/gh/karthickai/6/head 2025-10-10T00:44:16.7151041Z * [new branch] gh/karthickai/6/orig -> origin/gh/karthickai/6/orig 2025-10-10T00:44:16.7154273Z * [new branch] gh/kurtamohler/32/base -> origin/gh/kurtamohler/32/base 2025-10-10T00:44:16.7156705Z * [new branch] gh/kurtamohler/32/head -> origin/gh/kurtamohler/32/head 2025-10-10T00:44:16.7158551Z * 
[new branch] gh/kurtamohler/32/orig -> origin/gh/kurtamohler/32/orig 2025-10-10T00:44:16.7161076Z * [new branch] gh/kurtamohler/33/base -> origin/gh/kurtamohler/33/base 2025-10-10T00:44:16.7162925Z * [new branch] gh/kurtamohler/33/head -> origin/gh/kurtamohler/33/head 2025-10-10T00:44:16.7164751Z * [new branch] gh/kurtamohler/33/orig -> origin/gh/kurtamohler/33/orig 2025-10-10T00:44:16.7167881Z * [new branch] gh/kurtamohler/34/base -> origin/gh/kurtamohler/34/base 2025-10-10T00:44:16.7169974Z * [new branch] gh/kurtamohler/34/head -> origin/gh/kurtamohler/34/head 2025-10-10T00:44:16.7171704Z * [new branch] gh/kurtamohler/34/orig -> origin/gh/kurtamohler/34/orig 2025-10-10T00:44:16.7174156Z * [new branch] gh/kurtamohler/51/base -> origin/gh/kurtamohler/51/base 2025-10-10T00:44:16.7176610Z * [new branch] gh/kurtamohler/51/head -> origin/gh/kurtamohler/51/head 2025-10-10T00:44:16.7178443Z * [new branch] gh/kurtamohler/51/orig -> origin/gh/kurtamohler/51/orig 2025-10-10T00:44:16.7181119Z * [new branch] gh/kurtamohler/52/base -> origin/gh/kurtamohler/52/base 2025-10-10T00:44:16.7183117Z * [new branch] gh/kurtamohler/52/head -> origin/gh/kurtamohler/52/head 2025-10-10T00:44:16.7185026Z * [new branch] gh/kurtamohler/52/orig -> origin/gh/kurtamohler/52/orig 2025-10-10T00:44:16.7187634Z * [new branch] gh/kurtamohler/53/base -> origin/gh/kurtamohler/53/base 2025-10-10T00:44:16.7189995Z * [new branch] gh/kurtamohler/53/head -> origin/gh/kurtamohler/53/head 2025-10-10T00:44:16.7191823Z * [new branch] gh/kurtamohler/53/orig -> origin/gh/kurtamohler/53/orig 2025-10-10T00:44:16.7194734Z * [new branch] gh/kurtamohler/54/base -> origin/gh/kurtamohler/54/base 2025-10-10T00:44:16.7196582Z * [new branch] gh/kurtamohler/54/head -> origin/gh/kurtamohler/54/head 2025-10-10T00:44:16.7198593Z * [new branch] gh/kurtamohler/54/orig -> origin/gh/kurtamohler/54/orig 2025-10-10T00:44:16.7201342Z * [new branch] gh/kurtamohler/55/base -> origin/gh/kurtamohler/55/base 2025-10-10T00:44:16.7203184Z * 
[new branch] gh/kurtamohler/55/head -> origin/gh/kurtamohler/55/head 2025-10-10T00:44:16.7205010Z * [new branch] gh/kurtamohler/55/orig -> origin/gh/kurtamohler/55/orig 2025-10-10T00:44:16.7208762Z * [new branch] gh/kwen2501/130/base -> origin/gh/kwen2501/130/base 2025-10-10T00:44:16.7210612Z * [new branch] gh/kwen2501/130/head -> origin/gh/kwen2501/130/head 2025-10-10T00:44:16.7212448Z * [new branch] gh/kwen2501/130/orig -> origin/gh/kwen2501/130/orig 2025-10-10T00:44:16.7215026Z * [new branch] gh/kwen2501/15/base -> origin/gh/kwen2501/15/base 2025-10-10T00:44:16.7216903Z * [new branch] gh/kwen2501/15/head -> origin/gh/kwen2501/15/head 2025-10-10T00:44:16.7219443Z * [new branch] gh/kwen2501/170/base -> origin/gh/kwen2501/170/base 2025-10-10T00:44:16.7221335Z * [new branch] gh/kwen2501/170/head -> origin/gh/kwen2501/170/head 2025-10-10T00:44:16.7223922Z * [new branch] gh/kwen2501/187/base -> origin/gh/kwen2501/187/base 2025-10-10T00:44:16.7225837Z * [new branch] gh/kwen2501/187/head -> origin/gh/kwen2501/187/head 2025-10-10T00:44:16.7227706Z * [new branch] gh/kwen2501/187/orig -> origin/gh/kwen2501/187/orig 2025-10-10T00:44:16.7230422Z * [new branch] gh/kwen2501/188/base -> origin/gh/kwen2501/188/base 2025-10-10T00:44:16.7232443Z * [new branch] gh/kwen2501/188/head -> origin/gh/kwen2501/188/head 2025-10-10T00:44:16.7234152Z * [new branch] gh/kwen2501/188/orig -> origin/gh/kwen2501/188/orig 2025-10-10T00:44:16.7236708Z * [new branch] gh/kwen2501/211/base -> origin/gh/kwen2501/211/base 2025-10-10T00:44:16.7238526Z * [new branch] gh/kwen2501/211/head -> origin/gh/kwen2501/211/head 2025-10-10T00:44:16.7241069Z * [new branch] gh/kwen2501/222/base -> origin/gh/kwen2501/222/base 2025-10-10T00:44:16.7242939Z * [new branch] gh/kwen2501/222/head -> origin/gh/kwen2501/222/head 2025-10-10T00:44:16.7244991Z * [new branch] gh/kwen2501/222/orig -> origin/gh/kwen2501/222/orig 2025-10-10T00:44:16.7248032Z * [new branch] gh/kwen2501/224/base -> origin/gh/kwen2501/224/base 
2025-10-10T00:44:16.7249951Z * [new branch] gh/kwen2501/224/head -> origin/gh/kwen2501/224/head 2025-10-10T00:44:16.7251883Z * [new branch] gh/kwen2501/224/orig -> origin/gh/kwen2501/224/orig 2025-10-10T00:44:16.7254421Z * [new branch] gh/kwen2501/228/base -> origin/gh/kwen2501/228/base 2025-10-10T00:44:16.7256364Z * [new branch] gh/kwen2501/228/head -> origin/gh/kwen2501/228/head 2025-10-10T00:44:16.7258195Z * [new branch] gh/kwen2501/228/orig -> origin/gh/kwen2501/228/orig 2025-10-10T00:44:16.7260767Z * [new branch] gh/kwen2501/230/base -> origin/gh/kwen2501/230/base 2025-10-10T00:44:16.7262703Z * [new branch] gh/kwen2501/230/head -> origin/gh/kwen2501/230/head 2025-10-10T00:44:16.7264600Z * [new branch] gh/kwen2501/230/orig -> origin/gh/kwen2501/230/orig 2025-10-10T00:44:16.7267151Z * [new branch] gh/kwen2501/231/base -> origin/gh/kwen2501/231/base 2025-10-10T00:44:16.7270115Z * [new branch] gh/kwen2501/231/head -> origin/gh/kwen2501/231/head 2025-10-10T00:44:16.7272236Z * [new branch] gh/kwen2501/231/orig -> origin/gh/kwen2501/231/orig 2025-10-10T00:44:16.7274532Z * [new branch] gh/kwen2501/232/base -> origin/gh/kwen2501/232/base 2025-10-10T00:44:16.7276405Z * [new branch] gh/kwen2501/232/head -> origin/gh/kwen2501/232/head 2025-10-10T00:44:16.7278267Z * [new branch] gh/kwen2501/232/orig -> origin/gh/kwen2501/232/orig 2025-10-10T00:44:16.7280766Z * [new branch] gh/kwen2501/233/base -> origin/gh/kwen2501/233/base 2025-10-10T00:44:16.7282667Z * [new branch] gh/kwen2501/233/head -> origin/gh/kwen2501/233/head 2025-10-10T00:44:16.7284519Z * [new branch] gh/kwen2501/233/orig -> origin/gh/kwen2501/233/orig 2025-10-10T00:44:16.7287852Z * [new branch] gh/kwen2501/234/base -> origin/gh/kwen2501/234/base 2025-10-10T00:44:16.7289836Z * [new branch] gh/kwen2501/234/head -> origin/gh/kwen2501/234/head 2025-10-10T00:44:16.7291685Z * [new branch] gh/kwen2501/234/orig -> origin/gh/kwen2501/234/orig 2025-10-10T00:44:16.7294219Z * [new branch] gh/kwen2501/235/base -> 
origin/gh/kwen2501/235/base 2025-10-10T00:44:16.7296084Z * [new branch] gh/kwen2501/235/head -> origin/gh/kwen2501/235/head 2025-10-10T00:44:16.7297974Z * [new branch] gh/kwen2501/235/orig -> origin/gh/kwen2501/235/orig 2025-10-10T00:44:16.7300725Z * [new branch] gh/kwen2501/236/base -> origin/gh/kwen2501/236/base 2025-10-10T00:44:16.7302597Z * [new branch] gh/kwen2501/236/head -> origin/gh/kwen2501/236/head 2025-10-10T00:44:16.7304469Z * [new branch] gh/kwen2501/236/orig -> origin/gh/kwen2501/236/orig 2025-10-10T00:44:16.7307029Z * [new branch] gh/kwen2501/237/base -> origin/gh/kwen2501/237/base 2025-10-10T00:44:16.7308995Z * [new branch] gh/kwen2501/237/head -> origin/gh/kwen2501/237/head 2025-10-10T00:44:16.7310839Z * [new branch] gh/kwen2501/237/orig -> origin/gh/kwen2501/237/orig 2025-10-10T00:44:16.7313916Z * [new branch] gh/kwen2501/238/base -> origin/gh/kwen2501/238/base 2025-10-10T00:44:16.7315834Z * [new branch] gh/kwen2501/238/head -> origin/gh/kwen2501/238/head 2025-10-10T00:44:16.7317684Z * [new branch] gh/kwen2501/238/orig -> origin/gh/kwen2501/238/orig 2025-10-10T00:44:16.7320314Z * [new branch] gh/kwen2501/239/base -> origin/gh/kwen2501/239/base 2025-10-10T00:44:16.7322289Z * [new branch] gh/kwen2501/239/head -> origin/gh/kwen2501/239/head 2025-10-10T00:44:16.7324017Z * [new branch] gh/kwen2501/239/orig -> origin/gh/kwen2501/239/orig 2025-10-10T00:44:16.7326592Z * [new branch] gh/kwen2501/240/base -> origin/gh/kwen2501/240/base 2025-10-10T00:44:16.7328635Z * [new branch] gh/kwen2501/240/head -> origin/gh/kwen2501/240/head 2025-10-10T00:44:16.7330512Z * [new branch] gh/kwen2501/240/orig -> origin/gh/kwen2501/240/orig 2025-10-10T00:44:16.7333109Z * [new branch] gh/kwen2501/241/base -> origin/gh/kwen2501/241/base 2025-10-10T00:44:16.7335155Z * [new branch] gh/kwen2501/241/head -> origin/gh/kwen2501/241/head 2025-10-10T00:44:16.7337016Z * [new branch] gh/kwen2501/241/orig -> origin/gh/kwen2501/241/orig 2025-10-10T00:44:16.7339499Z * [new branch] 
gh/kwen2501/242/base -> origin/gh/kwen2501/242/base 2025-10-10T00:44:16.7341334Z * [new branch] gh/kwen2501/242/head -> origin/gh/kwen2501/242/head 2025-10-10T00:44:16.7343225Z * [new branch] gh/kwen2501/242/orig -> origin/gh/kwen2501/242/orig 2025-10-10T00:44:16.7345694Z * [new branch] gh/kwen2501/243/base -> origin/gh/kwen2501/243/base 2025-10-10T00:44:16.7347632Z * [new branch] gh/kwen2501/243/head -> origin/gh/kwen2501/243/head 2025-10-10T00:44:16.7356977Z * [new branch] gh/kwen2501/243/orig -> origin/gh/kwen2501/243/orig 2025-10-10T00:44:16.7357226Z * [new branch] gh/kwen2501/244/base -> origin/gh/kwen2501/244/base 2025-10-10T00:44:16.7357432Z * [new branch] gh/kwen2501/244/head -> origin/gh/kwen2501/244/head 2025-10-10T00:44:16.7357631Z * [new branch] gh/kwen2501/244/orig -> origin/gh/kwen2501/244/orig 2025-10-10T00:44:16.7358341Z * [new branch] gh/kwen2501/245/base -> origin/gh/kwen2501/245/base 2025-10-10T00:44:16.7360453Z * [new branch] gh/kwen2501/245/head -> origin/gh/kwen2501/245/head 2025-10-10T00:44:16.7362355Z * [new branch] gh/kwen2501/245/orig -> origin/gh/kwen2501/245/orig 2025-10-10T00:44:16.7364826Z * [new branch] gh/kwen2501/246/base -> origin/gh/kwen2501/246/base 2025-10-10T00:44:16.7366754Z * [new branch] gh/kwen2501/246/head -> origin/gh/kwen2501/246/head 2025-10-10T00:44:16.7368783Z * [new branch] gh/kwen2501/246/orig -> origin/gh/kwen2501/246/orig 2025-10-10T00:44:16.7371430Z * [new branch] gh/kwen2501/247/base -> origin/gh/kwen2501/247/base 2025-10-10T00:44:16.7373338Z * [new branch] gh/kwen2501/247/head -> origin/gh/kwen2501/247/head 2025-10-10T00:44:16.7375218Z * [new branch] gh/kwen2501/247/orig -> origin/gh/kwen2501/247/orig 2025-10-10T00:44:16.7377861Z * [new branch] gh/kwen2501/248/base -> origin/gh/kwen2501/248/base 2025-10-10T00:44:16.7379750Z * [new branch] gh/kwen2501/248/head -> origin/gh/kwen2501/248/head 2025-10-10T00:44:16.7381584Z * [new branch] gh/kwen2501/248/orig -> origin/gh/kwen2501/248/orig 
2025-10-10T00:44:16.7384087Z * [new branch] gh/kwen2501/249/base -> origin/gh/kwen2501/249/base 2025-10-10T00:44:16.7386027Z * [new branch] gh/kwen2501/249/head -> origin/gh/kwen2501/249/head 2025-10-10T00:44:16.7387963Z * [new branch] gh/kwen2501/249/orig -> origin/gh/kwen2501/249/orig 2025-10-10T00:44:16.7390601Z * [new branch] gh/kwen2501/250/base -> origin/gh/kwen2501/250/base 2025-10-10T00:44:16.7392537Z * [new branch] gh/kwen2501/250/head -> origin/gh/kwen2501/250/head 2025-10-10T00:44:16.7394351Z * [new branch] gh/kwen2501/250/orig -> origin/gh/kwen2501/250/orig 2025-10-10T00:44:16.7397094Z * [new branch] gh/kwen2501/251/base -> origin/gh/kwen2501/251/base 2025-10-10T00:44:16.7399083Z * [new branch] gh/kwen2501/251/head -> origin/gh/kwen2501/251/head 2025-10-10T00:44:16.7403129Z * [new branch] gh/kwen2501/251/orig -> origin/gh/kwen2501/251/orig 2025-10-10T00:44:16.7405734Z * [new branch] gh/kwen2501/252/base -> origin/gh/kwen2501/252/base 2025-10-10T00:44:16.7407695Z * [new branch] gh/kwen2501/252/head -> origin/gh/kwen2501/252/head 2025-10-10T00:44:16.7409594Z * [new branch] gh/kwen2501/252/orig -> origin/gh/kwen2501/252/orig 2025-10-10T00:44:16.7412237Z * [new branch] gh/kwen2501/253/base -> origin/gh/kwen2501/253/base 2025-10-10T00:44:16.7414071Z * [new branch] gh/kwen2501/253/head -> origin/gh/kwen2501/253/head 2025-10-10T00:44:16.7415889Z * [new branch] gh/kwen2501/253/orig -> origin/gh/kwen2501/253/orig 2025-10-10T00:44:16.7418509Z * [new branch] gh/kwen2501/254/base -> origin/gh/kwen2501/254/base 2025-10-10T00:44:16.7420377Z * [new branch] gh/kwen2501/254/head -> origin/gh/kwen2501/254/head 2025-10-10T00:44:16.7422224Z * [new branch] gh/kwen2501/254/orig -> origin/gh/kwen2501/254/orig 2025-10-10T00:44:16.7424869Z * [new branch] gh/kwen2501/255/base -> origin/gh/kwen2501/255/base 2025-10-10T00:44:16.7426773Z * [new branch] gh/kwen2501/255/head -> origin/gh/kwen2501/255/head 2025-10-10T00:44:16.7428675Z * [new branch] gh/kwen2501/255/orig -> 
origin/gh/kwen2501/255/orig 2025-10-10T00:44:16.7431364Z * [new branch] gh/kwen2501/256/base -> origin/gh/kwen2501/256/base 2025-10-10T00:44:16.7433214Z * [new branch] gh/kwen2501/256/head -> origin/gh/kwen2501/256/head 2025-10-10T00:44:16.7435077Z * [new branch] gh/kwen2501/256/orig -> origin/gh/kwen2501/256/orig 2025-10-10T00:44:16.7437693Z * [new branch] gh/kwen2501/257/base -> origin/gh/kwen2501/257/base 2025-10-10T00:44:16.7439628Z * [new branch] gh/kwen2501/257/head -> origin/gh/kwen2501/257/head 2025-10-10T00:44:16.7441481Z * [new branch] gh/kwen2501/257/orig -> origin/gh/kwen2501/257/orig 2025-10-10T00:44:16.7444121Z * [new branch] gh/kwen2501/258/base -> origin/gh/kwen2501/258/base 2025-10-10T00:44:16.7445943Z * [new branch] gh/kwen2501/258/head -> origin/gh/kwen2501/258/head 2025-10-10T00:44:16.7447918Z * [new branch] gh/kwen2501/258/orig -> origin/gh/kwen2501/258/orig 2025-10-10T00:44:16.7450565Z * [new branch] gh/kwen2501/259/base -> origin/gh/kwen2501/259/base 2025-10-10T00:44:16.7452430Z * [new branch] gh/kwen2501/259/head -> origin/gh/kwen2501/259/head 2025-10-10T00:44:16.7454227Z * [new branch] gh/kwen2501/259/orig -> origin/gh/kwen2501/259/orig 2025-10-10T00:44:16.7456894Z * [new branch] gh/kwen2501/260/base -> origin/gh/kwen2501/260/base 2025-10-10T00:44:16.7458776Z * [new branch] gh/kwen2501/260/head -> origin/gh/kwen2501/260/head 2025-10-10T00:44:16.7460685Z * [new branch] gh/kwen2501/260/orig -> origin/gh/kwen2501/260/orig 2025-10-10T00:44:16.7463276Z * [new branch] gh/kwen2501/261/base -> origin/gh/kwen2501/261/base 2025-10-10T00:44:16.7465213Z * [new branch] gh/kwen2501/261/head -> origin/gh/kwen2501/261/head 2025-10-10T00:44:16.7467057Z * [new branch] gh/kwen2501/261/orig -> origin/gh/kwen2501/261/orig 2025-10-10T00:44:16.7469653Z * [new branch] gh/kwen2501/262/base -> origin/gh/kwen2501/262/base 2025-10-10T00:44:16.7471566Z * [new branch] gh/kwen2501/262/head -> origin/gh/kwen2501/262/head 2025-10-10T00:44:16.7473596Z * [new branch] 
gh/kwen2501/262/orig -> origin/gh/kwen2501/262/orig 2025-10-10T00:44:16.7476069Z * [new branch] gh/kwen2501/263/base -> origin/gh/kwen2501/263/base 2025-10-10T00:44:16.7477874Z * [new branch] gh/kwen2501/263/head -> origin/gh/kwen2501/263/head 2025-10-10T00:44:16.7479710Z * [new branch] gh/kwen2501/263/orig -> origin/gh/kwen2501/263/orig 2025-10-10T00:44:16.7482318Z * [new branch] gh/kwen2501/264/base -> origin/gh/kwen2501/264/base 2025-10-10T00:44:16.7484152Z * [new branch] gh/kwen2501/264/head -> origin/gh/kwen2501/264/head 2025-10-10T00:44:16.7486050Z * [new branch] gh/kwen2501/264/orig -> origin/gh/kwen2501/264/orig 2025-10-10T00:44:16.7488855Z * [new branch] gh/kwen2501/265/base -> origin/gh/kwen2501/265/base 2025-10-10T00:44:16.7490757Z * [new branch] gh/kwen2501/265/head -> origin/gh/kwen2501/265/head 2025-10-10T00:44:16.7493147Z * [new branch] gh/kwen2501/265/orig -> origin/gh/kwen2501/265/orig 2025-10-10T00:44:16.7495709Z * [new branch] gh/kwen2501/266/base -> origin/gh/kwen2501/266/base 2025-10-10T00:44:16.7497554Z * [new branch] gh/kwen2501/266/head -> origin/gh/kwen2501/266/head 2025-10-10T00:44:16.7499761Z * [new branch] gh/kwen2501/266/orig -> origin/gh/kwen2501/266/orig 2025-10-10T00:44:16.7502393Z * [new branch] gh/kwen2501/267/base -> origin/gh/kwen2501/267/base 2025-10-10T00:44:16.7504171Z * [new branch] gh/kwen2501/267/head -> origin/gh/kwen2501/267/head 2025-10-10T00:44:16.7506002Z * [new branch] gh/kwen2501/267/orig -> origin/gh/kwen2501/267/orig 2025-10-10T00:44:16.7508641Z * [new branch] gh/kwen2501/268/base -> origin/gh/kwen2501/268/base 2025-10-10T00:44:16.7510531Z * [new branch] gh/kwen2501/268/head -> origin/gh/kwen2501/268/head 2025-10-10T00:44:16.7512516Z * [new branch] gh/kwen2501/268/orig -> origin/gh/kwen2501/268/orig 2025-10-10T00:44:16.7515059Z * [new branch] gh/kwen2501/269/base -> origin/gh/kwen2501/269/base 2025-10-10T00:44:16.7517118Z * [new branch] gh/kwen2501/269/head -> origin/gh/kwen2501/269/head 
2025-10-10T00:44:16.7518960Z * [new branch] gh/kwen2501/269/orig -> origin/gh/kwen2501/269/orig 2025-10-10T00:44:16.7521596Z * [new branch] gh/kwen2501/270/base -> origin/gh/kwen2501/270/base 2025-10-10T00:44:16.7523499Z * [new branch] gh/kwen2501/270/head -> origin/gh/kwen2501/270/head 2025-10-10T00:44:16.7525397Z * [new branch] gh/kwen2501/270/orig -> origin/gh/kwen2501/270/orig 2025-10-10T00:44:16.7528239Z * [new branch] gh/kwen2501/271/base -> origin/gh/kwen2501/271/base 2025-10-10T00:44:16.7530090Z * [new branch] gh/kwen2501/271/head -> origin/gh/kwen2501/271/head 2025-10-10T00:44:16.7531990Z * [new branch] gh/kwen2501/271/orig -> origin/gh/kwen2501/271/orig 2025-10-10T00:44:16.7534540Z * [new branch] gh/kwen2501/272/base -> origin/gh/kwen2501/272/base 2025-10-10T00:44:16.7536369Z * [new branch] gh/kwen2501/272/head -> origin/gh/kwen2501/272/head 2025-10-10T00:44:16.7538232Z * [new branch] gh/kwen2501/272/orig -> origin/gh/kwen2501/272/orig 2025-10-10T00:44:16.7540810Z * [new branch] gh/kwen2501/273/base -> origin/gh/kwen2501/273/base 2025-10-10T00:44:16.7542773Z * [new branch] gh/kwen2501/273/head -> origin/gh/kwen2501/273/head 2025-10-10T00:44:16.7544760Z * [new branch] gh/kwen2501/273/orig -> origin/gh/kwen2501/273/orig 2025-10-10T00:44:16.7547412Z * [new branch] gh/kwen2501/274/base -> origin/gh/kwen2501/274/base 2025-10-10T00:44:16.7549435Z * [new branch] gh/kwen2501/274/head -> origin/gh/kwen2501/274/head 2025-10-10T00:44:16.7551183Z * [new branch] gh/kwen2501/274/orig -> origin/gh/kwen2501/274/orig 2025-10-10T00:44:16.7554252Z * [new branch] gh/laithsakka/251/base -> origin/gh/laithsakka/251/base 2025-10-10T00:44:16.7556499Z * [new branch] gh/laithsakka/251/head -> origin/gh/laithsakka/251/head 2025-10-10T00:44:16.7558354Z * [new branch] gh/laithsakka/251/orig -> origin/gh/laithsakka/251/orig 2025-10-10T00:44:16.7561163Z * [new branch] gh/laithsakka/262/base -> origin/gh/laithsakka/262/base 2025-10-10T00:44:16.7562959Z * [new branch] 
gh/laithsakka/262/head -> origin/gh/laithsakka/262/head 2025-10-10T00:44:16.7565432Z * [new branch] gh/laithsakka/262/orig -> origin/gh/laithsakka/262/orig 2025-10-10T00:44:16.7568288Z * [new branch] gh/laithsakka/263/base -> origin/gh/laithsakka/263/base 2025-10-10T00:44:16.7570156Z * [new branch] gh/laithsakka/263/head -> origin/gh/laithsakka/263/head 2025-10-10T00:44:16.7571991Z * [new branch] gh/laithsakka/263/orig -> origin/gh/laithsakka/263/orig 2025-10-10T00:44:16.7574410Z * [new branch] gh/laithsakka/264/base -> origin/gh/laithsakka/264/base 2025-10-10T00:44:16.7576269Z * [new branch] gh/laithsakka/264/head -> origin/gh/laithsakka/264/head 2025-10-10T00:44:16.7578145Z * [new branch] gh/laithsakka/264/orig -> origin/gh/laithsakka/264/orig 2025-10-10T00:44:16.7580630Z * [new branch] gh/laithsakka/268/base -> origin/gh/laithsakka/268/base 2025-10-10T00:44:16.7582457Z * [new branch] gh/laithsakka/268/head -> origin/gh/laithsakka/268/head 2025-10-10T00:44:16.7584278Z * [new branch] gh/laithsakka/268/orig -> origin/gh/laithsakka/268/orig 2025-10-10T00:44:16.7586897Z * [new branch] gh/laithsakka/269/base -> origin/gh/laithsakka/269/base 2025-10-10T00:44:16.7588755Z * [new branch] gh/laithsakka/269/head -> origin/gh/laithsakka/269/head 2025-10-10T00:44:16.7590598Z * [new branch] gh/laithsakka/269/orig -> origin/gh/laithsakka/269/orig 2025-10-10T00:44:16.7593241Z * [new branch] gh/laithsakka/271/base -> origin/gh/laithsakka/271/base 2025-10-10T00:44:16.7595102Z * [new branch] gh/laithsakka/271/head -> origin/gh/laithsakka/271/head 2025-10-10T00:44:16.7596972Z * [new branch] gh/laithsakka/271/orig -> origin/gh/laithsakka/271/orig 2025-10-10T00:44:16.7599700Z * [new branch] gh/laithsakka/272/base -> origin/gh/laithsakka/272/base 2025-10-10T00:44:16.7601564Z * [new branch] gh/laithsakka/272/head -> origin/gh/laithsakka/272/head 2025-10-10T00:44:16.7603400Z * [new branch] gh/laithsakka/272/orig -> origin/gh/laithsakka/272/orig 2025-10-10T00:44:16.7605925Z * [new branch] 
gh/laithsakka/273/base -> origin/gh/laithsakka/273/base 2025-10-10T00:44:16.7607924Z * [new branch] gh/laithsakka/273/head -> origin/gh/laithsakka/273/head 2025-10-10T00:44:16.7609828Z * [new branch] gh/laithsakka/273/orig -> origin/gh/laithsakka/273/orig 2025-10-10T00:44:16.7612409Z * [new branch] gh/laithsakka/274/base -> origin/gh/laithsakka/274/base 2025-10-10T00:44:16.7614297Z * [new branch] gh/laithsakka/274/head -> origin/gh/laithsakka/274/head 2025-10-10T00:44:16.7616134Z * [new branch] gh/laithsakka/274/orig -> origin/gh/laithsakka/274/orig 2025-10-10T00:44:16.7618786Z * [new branch] gh/laithsakka/275/base -> origin/gh/laithsakka/275/base 2025-10-10T00:44:16.7620693Z * [new branch] gh/laithsakka/275/head -> origin/gh/laithsakka/275/head 2025-10-10T00:44:16.7622524Z * [new branch] gh/laithsakka/275/orig -> origin/gh/laithsakka/275/orig 2025-10-10T00:44:16.7625189Z * [new branch] gh/laithsakka/276/base -> origin/gh/laithsakka/276/base 2025-10-10T00:44:16.7626877Z * [new branch] gh/laithsakka/276/head -> origin/gh/laithsakka/276/head 2025-10-10T00:44:16.7628765Z * [new branch] gh/laithsakka/276/orig -> origin/gh/laithsakka/276/orig 2025-10-10T00:44:16.7631451Z * [new branch] gh/laithsakka/277/base -> origin/gh/laithsakka/277/base 2025-10-10T00:44:16.7633304Z * [new branch] gh/laithsakka/277/head -> origin/gh/laithsakka/277/head 2025-10-10T00:44:16.7635203Z * [new branch] gh/laithsakka/277/orig -> origin/gh/laithsakka/277/orig 2025-10-10T00:44:16.7637600Z * [new branch] gh/laithsakka/278/base -> origin/gh/laithsakka/278/base 2025-10-10T00:44:16.7639465Z * [new branch] gh/laithsakka/278/head -> origin/gh/laithsakka/278/head 2025-10-10T00:44:16.7641292Z * [new branch] gh/laithsakka/278/orig -> origin/gh/laithsakka/278/orig 2025-10-10T00:44:16.7644093Z * [new branch] gh/laithsakka/279/base -> origin/gh/laithsakka/279/base 2025-10-10T00:44:16.7645955Z * [new branch] gh/laithsakka/279/head -> origin/gh/laithsakka/279/head 2025-10-10T00:44:16.7647930Z * [new branch] 
gh/laithsakka/279/orig -> origin/gh/laithsakka/279/orig 2025-10-10T00:44:16.7650724Z * [new branch] gh/laithsakka/28/base -> origin/gh/laithsakka/28/base 2025-10-10T00:44:16.7653806Z * [new branch] gh/laithsakka/280/base -> origin/gh/laithsakka/280/base 2025-10-10T00:44:16.7656952Z * [new branch] gh/laithsakka/280/head -> origin/gh/laithsakka/280/head 2025-10-10T00:44:16.7658844Z * [new branch] gh/laithsakka/280/orig -> origin/gh/laithsakka/280/orig 2025-10-10T00:44:16.7661801Z * [new branch] gh/laithsakka/281/base -> origin/gh/laithsakka/281/base 2025-10-10T00:44:16.7663663Z * [new branch] gh/laithsakka/281/head -> origin/gh/laithsakka/281/head 2025-10-10T00:44:16.7665561Z * [new branch] gh/laithsakka/281/orig -> origin/gh/laithsakka/281/orig 2025-10-10T00:44:16.7668075Z * [new branch] gh/laithsakka/282/base -> origin/gh/laithsakka/282/base 2025-10-10T00:44:16.7670034Z * [new branch] gh/laithsakka/282/head -> origin/gh/laithsakka/282/head 2025-10-10T00:44:16.7672023Z * [new branch] gh/laithsakka/282/orig -> origin/gh/laithsakka/282/orig 2025-10-10T00:44:16.7674737Z * [new branch] gh/laithsakka/283/base -> origin/gh/laithsakka/283/base 2025-10-10T00:44:16.7676493Z * [new branch] gh/laithsakka/283/head -> origin/gh/laithsakka/283/head 2025-10-10T00:44:16.7678336Z * [new branch] gh/laithsakka/283/orig -> origin/gh/laithsakka/283/orig 2025-10-10T00:44:16.7681007Z * [new branch] gh/laithsakka/284/base -> origin/gh/laithsakka/284/base 2025-10-10T00:44:16.7682859Z * [new branch] gh/laithsakka/284/head -> origin/gh/laithsakka/284/head 2025-10-10T00:44:16.7685023Z * [new branch] gh/laithsakka/284/orig -> origin/gh/laithsakka/284/orig 2025-10-10T00:44:16.7688490Z * [new branch] gh/laithsakka/285/base -> origin/gh/laithsakka/285/base 2025-10-10T00:44:16.7690444Z * [new branch] gh/laithsakka/285/head -> origin/gh/laithsakka/285/head 2025-10-10T00:44:16.7692241Z * [new branch] gh/laithsakka/285/orig -> origin/gh/laithsakka/285/orig 2025-10-10T00:44:16.7694917Z * [new branch] 
gh/laithsakka/286/base -> origin/gh/laithsakka/286/base 2025-10-10T00:44:16.7696686Z * [new branch] gh/laithsakka/286/head -> origin/gh/laithsakka/286/head 2025-10-10T00:44:16.7699793Z * [new branch] gh/laithsakka/286/orig -> origin/gh/laithsakka/286/orig 2025-10-10T00:44:16.7702657Z * [new branch] gh/laithsakka/287/base -> origin/gh/laithsakka/287/base 2025-10-10T00:44:16.7704348Z * [new branch] gh/laithsakka/287/head -> origin/gh/laithsakka/287/head 2025-10-10T00:44:16.7706217Z * [new branch] gh/laithsakka/287/orig -> origin/gh/laithsakka/287/orig 2025-10-10T00:44:16.7708896Z * [new branch] gh/laithsakka/288/base -> origin/gh/laithsakka/288/base 2025-10-10T00:44:16.7710781Z * [new branch] gh/laithsakka/288/head -> origin/gh/laithsakka/288/head 2025-10-10T00:44:16.7712606Z * [new branch] gh/laithsakka/288/orig -> origin/gh/laithsakka/288/orig 2025-10-10T00:44:16.7715439Z * [new branch] gh/laithsakka/289/base -> origin/gh/laithsakka/289/base 2025-10-10T00:44:16.7717501Z * [new branch] gh/laithsakka/289/head -> origin/gh/laithsakka/289/head 2025-10-10T00:44:16.7719311Z * [new branch] gh/laithsakka/289/orig -> origin/gh/laithsakka/289/orig 2025-10-10T00:44:16.7721787Z * [new branch] gh/laithsakka/29/base -> origin/gh/laithsakka/29/base 2025-10-10T00:44:16.7724373Z * [new branch] gh/laithsakka/290/base -> origin/gh/laithsakka/290/base 2025-10-10T00:44:16.7726456Z * [new branch] gh/laithsakka/290/head -> origin/gh/laithsakka/290/head 2025-10-10T00:44:16.7728585Z * [new branch] gh/laithsakka/290/orig -> origin/gh/laithsakka/290/orig 2025-10-10T00:44:16.7731068Z * [new branch] gh/laithsakka/291/base -> origin/gh/laithsakka/291/base 2025-10-10T00:44:16.7732960Z * [new branch] gh/laithsakka/291/head -> origin/gh/laithsakka/291/head 2025-10-10T00:44:16.7734833Z * [new branch] gh/laithsakka/291/orig -> origin/gh/laithsakka/291/orig 2025-10-10T00:44:16.7737680Z * [new branch] gh/laithsakka/292/base -> origin/gh/laithsakka/292/base 2025-10-10T00:44:16.7739537Z * [new branch] 
gh/laithsakka/292/head -> origin/gh/laithsakka/292/head 2025-10-10T00:44:16.7741424Z * [new branch] gh/laithsakka/292/orig -> origin/gh/laithsakka/292/orig 2025-10-10T00:44:16.7743912Z * [new branch] gh/laithsakka/293/base -> origin/gh/laithsakka/293/base 2025-10-10T00:44:16.7745780Z * [new branch] gh/laithsakka/293/head -> origin/gh/laithsakka/293/head 2025-10-10T00:44:16.7748126Z * [new branch] gh/laithsakka/293/orig -> origin/gh/laithsakka/293/orig 2025-10-10T00:44:16.7750897Z * [new branch] gh/laithsakka/294/base -> origin/gh/laithsakka/294/base 2025-10-10T00:44:16.7752824Z * [new branch] gh/laithsakka/294/head -> origin/gh/laithsakka/294/head 2025-10-10T00:44:16.7754675Z * [new branch] gh/laithsakka/294/orig -> origin/gh/laithsakka/294/orig 2025-10-10T00:44:16.7757366Z * [new branch] gh/laithsakka/295/base -> origin/gh/laithsakka/295/base 2025-10-10T00:44:16.7759568Z * [new branch] gh/laithsakka/295/head -> origin/gh/laithsakka/295/head 2025-10-10T00:44:16.7760935Z * [new branch] gh/laithsakka/295/orig -> origin/gh/laithsakka/295/orig 2025-10-10T00:44:16.7763643Z * [new branch] gh/laithsakka/296/base -> origin/gh/laithsakka/296/base 2025-10-10T00:44:16.7765536Z * [new branch] gh/laithsakka/296/head -> origin/gh/laithsakka/296/head 2025-10-10T00:44:16.7767390Z * [new branch] gh/laithsakka/296/orig -> origin/gh/laithsakka/296/orig 2025-10-10T00:44:16.7770154Z * [new branch] gh/laithsakka/297/base -> origin/gh/laithsakka/297/base 2025-10-10T00:44:16.7771963Z * [new branch] gh/laithsakka/297/head -> origin/gh/laithsakka/297/head 2025-10-10T00:44:16.7773851Z * [new branch] gh/laithsakka/297/orig -> origin/gh/laithsakka/297/orig 2025-10-10T00:44:16.7776738Z * [new branch] gh/laithsakka/298/base -> origin/gh/laithsakka/298/base 2025-10-10T00:44:16.7778678Z * [new branch] gh/laithsakka/298/head -> origin/gh/laithsakka/298/head 2025-10-10T00:44:16.7780471Z * [new branch] gh/laithsakka/298/orig -> origin/gh/laithsakka/298/orig 2025-10-10T00:44:16.7783050Z * [new branch] 
gh/laithsakka/299/base -> origin/gh/laithsakka/299/base 2025-10-10T00:44:16.7784928Z * [new branch] gh/laithsakka/299/head -> origin/gh/laithsakka/299/head 2025-10-10T00:44:16.7786764Z * [new branch] gh/laithsakka/299/orig -> origin/gh/laithsakka/299/orig 2025-10-10T00:44:16.7789212Z * [new branch] gh/laithsakka/30/base -> origin/gh/laithsakka/30/base 2025-10-10T00:44:16.7791105Z * [new branch] gh/laithsakka/30/head -> origin/gh/laithsakka/30/head 2025-10-10T00:44:16.7793887Z * [new branch] gh/laithsakka/300/base -> origin/gh/laithsakka/300/base 2025-10-10T00:44:16.7795610Z * [new branch] gh/laithsakka/300/head -> origin/gh/laithsakka/300/head 2025-10-10T00:44:16.7797521Z * [new branch] gh/laithsakka/300/orig -> origin/gh/laithsakka/300/orig 2025-10-10T00:44:16.7802665Z * [new branch] gh/laithsakka/301/base -> origin/gh/laithsakka/301/base 2025-10-10T00:44:16.7804570Z * [new branch] gh/laithsakka/301/head -> origin/gh/laithsakka/301/head 2025-10-10T00:44:16.7806421Z * [new branch] gh/laithsakka/301/orig -> origin/gh/laithsakka/301/orig 2025-10-10T00:44:16.7809243Z * [new branch] gh/laithsakka/302/base -> origin/gh/laithsakka/302/base 2025-10-10T00:44:16.7811073Z * [new branch] gh/laithsakka/302/head -> origin/gh/laithsakka/302/head 2025-10-10T00:44:16.7812864Z * [new branch] gh/laithsakka/302/orig -> origin/gh/laithsakka/302/orig 2025-10-10T00:44:16.7815322Z * [new branch] gh/laithsakka/303/base -> origin/gh/laithsakka/303/base 2025-10-10T00:44:16.7817175Z * [new branch] gh/laithsakka/303/head -> origin/gh/laithsakka/303/head 2025-10-10T00:44:16.7819046Z * [new branch] gh/laithsakka/303/orig -> origin/gh/laithsakka/303/orig 2025-10-10T00:44:16.7821490Z * [new branch] gh/laithsakka/304/base -> origin/gh/laithsakka/304/base 2025-10-10T00:44:16.7823348Z * [new branch] gh/laithsakka/304/head -> origin/gh/laithsakka/304/head 2025-10-10T00:44:16.7825185Z * [new branch] gh/laithsakka/304/orig -> origin/gh/laithsakka/304/orig 2025-10-10T00:44:16.7827849Z * [new branch] 
gh/laithsakka/305/base -> origin/gh/laithsakka/305/base 2025-10-10T00:44:16.7829761Z * [new branch] gh/laithsakka/305/head -> origin/gh/laithsakka/305/head 2025-10-10T00:44:16.7831521Z * [new branch] gh/laithsakka/305/orig -> origin/gh/laithsakka/305/orig 2025-10-10T00:44:16.7834094Z * [new branch] gh/laithsakka/306/base -> origin/gh/laithsakka/306/base 2025-10-10T00:44:16.7835948Z * [new branch] gh/laithsakka/306/head -> origin/gh/laithsakka/306/head 2025-10-10T00:44:16.7837818Z * [new branch] gh/laithsakka/306/orig -> origin/gh/laithsakka/306/orig 2025-10-10T00:44:16.7840268Z * [new branch] gh/laithsakka/307/base -> origin/gh/laithsakka/307/base 2025-10-10T00:44:16.7842162Z * [new branch] gh/laithsakka/307/head -> origin/gh/laithsakka/307/head 2025-10-10T00:44:16.7844004Z * [new branch] gh/laithsakka/307/orig -> origin/gh/laithsakka/307/orig 2025-10-10T00:44:16.7846462Z * [new branch] gh/laithsakka/308/base -> origin/gh/laithsakka/308/base 2025-10-10T00:44:16.7848592Z * [new branch] gh/laithsakka/308/head -> origin/gh/laithsakka/308/head 2025-10-10T00:44:16.7850441Z * [new branch] gh/laithsakka/308/orig -> origin/gh/laithsakka/308/orig 2025-10-10T00:44:16.7853113Z * [new branch] gh/laithsakka/309/base -> origin/gh/laithsakka/309/base 2025-10-10T00:44:16.7855047Z * [new branch] gh/laithsakka/309/head -> origin/gh/laithsakka/309/head 2025-10-10T00:44:16.7856878Z * [new branch] gh/laithsakka/309/orig -> origin/gh/laithsakka/309/orig 2025-10-10T00:44:16.7859388Z * [new branch] gh/laithsakka/31/base -> origin/gh/laithsakka/31/base 2025-10-10T00:44:16.7861168Z * [new branch] gh/laithsakka/31/head -> origin/gh/laithsakka/31/head 2025-10-10T00:44:16.7863877Z * [new branch] gh/laithsakka/310/base -> origin/gh/laithsakka/310/base 2025-10-10T00:44:16.7865685Z * [new branch] gh/laithsakka/310/head -> origin/gh/laithsakka/310/head 2025-10-10T00:44:16.7867504Z * [new branch] gh/laithsakka/310/orig -> origin/gh/laithsakka/310/orig 2025-10-10T00:44:16.7870163Z * [new branch] 
gh/laithsakka/311/base -> origin/gh/laithsakka/311/base 2025-10-10T00:44:16.7872022Z * [new branch] gh/laithsakka/311/head -> origin/gh/laithsakka/311/head 2025-10-10T00:44:16.7873949Z * [new branch] gh/laithsakka/311/orig -> origin/gh/laithsakka/311/orig 2025-10-10T00:44:16.7876860Z * [new branch] gh/laithsakka/312/base -> origin/gh/laithsakka/312/base 2025-10-10T00:44:16.7878946Z * [new branch] gh/laithsakka/312/head -> origin/gh/laithsakka/312/head 2025-10-10T00:44:16.7880848Z * [new branch] gh/laithsakka/312/orig -> origin/gh/laithsakka/312/orig 2025-10-10T00:44:16.7883319Z * [new branch] gh/laithsakka/313/base -> origin/gh/laithsakka/313/base 2025-10-10T00:44:16.7885209Z * [new branch] gh/laithsakka/313/head -> origin/gh/laithsakka/313/head 2025-10-10T00:44:16.7887117Z * [new branch] gh/laithsakka/313/orig -> origin/gh/laithsakka/313/orig 2025-10-10T00:44:16.7890369Z * [new branch] gh/laithsakka/32/base -> origin/gh/laithsakka/32/base 2025-10-10T00:44:16.7891653Z * [new branch] gh/laithsakka/32/head -> origin/gh/laithsakka/32/head 2025-10-10T00:44:16.7894982Z * [new branch] gh/liangel/1/base -> origin/gh/liangel/1/base 2025-10-10T00:44:16.7896790Z * [new branch] gh/liangel/1/head -> origin/gh/liangel/1/head 2025-10-10T00:44:16.7898995Z * [new branch] gh/liangel/1/orig -> origin/gh/liangel/1/orig 2025-10-10T00:44:16.7901614Z * [new branch] gh/liangel/2/base -> origin/gh/liangel/2/base 2025-10-10T00:44:16.7903450Z * [new branch] gh/liangel/2/head -> origin/gh/liangel/2/head 2025-10-10T00:44:16.7905434Z * [new branch] gh/liangel/2/orig -> origin/gh/liangel/2/orig 2025-10-10T00:44:16.7908016Z * [new branch] gh/liangel/3/base -> origin/gh/liangel/3/base 2025-10-10T00:44:16.7910389Z * [new branch] gh/liangel/3/head -> origin/gh/liangel/3/head 2025-10-10T00:44:16.7912215Z * [new branch] gh/liangel/3/orig -> origin/gh/liangel/3/orig 2025-10-10T00:44:16.7914778Z * [new branch] gh/liangel/4/base -> origin/gh/liangel/4/base 2025-10-10T00:44:16.7916650Z * [new branch] 
gh/liangel/4/head -> origin/gh/liangel/4/head 2025-10-10T00:44:16.7918538Z * [new branch] gh/liangel/4/orig -> origin/gh/liangel/4/orig 2025-10-10T00:44:16.7923413Z * [new branch] gh/lucaskabela/1/base -> origin/gh/lucaskabela/1/base 2025-10-10T00:44:16.7925256Z * [new branch] gh/lucaskabela/1/head -> origin/gh/lucaskabela/1/head 2025-10-10T00:44:16.7928438Z * [new branch] gh/lw/3/base -> origin/gh/lw/3/base 2025-10-10T00:44:16.7930411Z * [new branch] gh/lw/3/head -> origin/gh/lw/3/head 2025-10-10T00:44:16.7932404Z * [new branch] gh/lw/3/orig -> origin/gh/lw/3/orig 2025-10-10T00:44:16.7934810Z * [new branch] gh/lw/4/base -> origin/gh/lw/4/base 2025-10-10T00:44:16.7936646Z * [new branch] gh/lw/4/head -> origin/gh/lw/4/head 2025-10-10T00:44:16.7938475Z * [new branch] gh/lw/4/orig -> origin/gh/lw/4/orig 2025-10-10T00:44:16.7940975Z * [new branch] gh/lw/5/base -> origin/gh/lw/5/base 2025-10-10T00:44:16.7942825Z * [new branch] gh/lw/5/head -> origin/gh/lw/5/head 2025-10-10T00:44:16.7944669Z * [new branch] gh/lw/5/orig -> origin/gh/lw/5/orig 2025-10-10T00:44:16.7947257Z * [new branch] gh/lw/6/base -> origin/gh/lw/6/base 2025-10-10T00:44:16.7949223Z * [new branch] gh/lw/6/head -> origin/gh/lw/6/head 2025-10-10T00:44:16.7951030Z * [new branch] gh/lw/6/orig -> origin/gh/lw/6/orig 2025-10-10T00:44:16.7954292Z * [new branch] gh/malfet/14/base -> origin/gh/malfet/14/base 2025-10-10T00:44:16.7956840Z * [new branch] gh/malfet/396/base -> origin/gh/malfet/396/base 2025-10-10T00:44:16.7958715Z * [new branch] gh/malfet/396/head -> origin/gh/malfet/396/head 2025-10-10T00:44:16.7960513Z * [new branch] gh/malfet/396/orig -> origin/gh/malfet/396/orig 2025-10-10T00:44:16.7963019Z * [new branch] gh/malfet/397/base -> origin/gh/malfet/397/base 2025-10-10T00:44:16.7964918Z * [new branch] gh/malfet/397/head -> origin/gh/malfet/397/head 2025-10-10T00:44:16.7966784Z * [new branch] gh/malfet/397/orig -> origin/gh/malfet/397/orig 2025-10-10T00:44:16.7969470Z * [new branch] gh/malfet/398/base -> 
origin/gh/malfet/398/base 2025-10-10T00:44:16.7971208Z * [new branch] gh/malfet/398/head -> origin/gh/malfet/398/head 2025-10-10T00:44:16.7973110Z * [new branch] gh/malfet/398/orig -> origin/gh/malfet/398/orig 2025-10-10T00:44:16.7975686Z * [new branch] gh/malfet/399/base -> origin/gh/malfet/399/base 2025-10-10T00:44:16.7977519Z * [new branch] gh/malfet/399/head -> origin/gh/malfet/399/head 2025-10-10T00:44:16.7979561Z * [new branch] gh/malfet/399/orig -> origin/gh/malfet/399/orig 2025-10-10T00:44:16.7982001Z * [new branch] gh/malfet/414/base -> origin/gh/malfet/414/base 2025-10-10T00:44:16.7983840Z * [new branch] gh/malfet/414/head -> origin/gh/malfet/414/head 2025-10-10T00:44:16.7985710Z * [new branch] gh/malfet/414/orig -> origin/gh/malfet/414/orig 2025-10-10T00:44:16.7988280Z * [new branch] gh/malfet/417/base -> origin/gh/malfet/417/base 2025-10-10T00:44:16.7990120Z * [new branch] gh/malfet/417/head -> origin/gh/malfet/417/head 2025-10-10T00:44:16.7991985Z * [new branch] gh/malfet/417/orig -> origin/gh/malfet/417/orig 2025-10-10T00:44:16.7994409Z * [new branch] gh/malfet/418/base -> origin/gh/malfet/418/base 2025-10-10T00:44:16.7996242Z * [new branch] gh/malfet/418/head -> origin/gh/malfet/418/head 2025-10-10T00:44:16.7998079Z * [new branch] gh/malfet/418/orig -> origin/gh/malfet/418/orig 2025-10-10T00:44:16.8001451Z * [new branch] gh/malfet/505/base -> origin/gh/malfet/505/base 2025-10-10T00:44:16.8003278Z * [new branch] gh/malfet/505/head -> origin/gh/malfet/505/head 2025-10-10T00:44:16.8005315Z * [new branch] gh/malfet/505/orig -> origin/gh/malfet/505/orig 2025-10-10T00:44:16.8007920Z * [new branch] gh/malfet/506/base -> origin/gh/malfet/506/base 2025-10-10T00:44:16.8010036Z * [new branch] gh/malfet/506/head -> origin/gh/malfet/506/head 2025-10-10T00:44:16.8011821Z * [new branch] gh/malfet/506/orig -> origin/gh/malfet/506/orig 2025-10-10T00:44:16.8014344Z * [new branch] gh/malfet/507/base -> origin/gh/malfet/507/base 2025-10-10T00:44:16.8016177Z * [new 
branch] gh/malfet/507/head -> origin/gh/malfet/507/head 2025-10-10T00:44:16.8018016Z * [new branch] gh/malfet/507/orig -> origin/gh/malfet/507/orig 2025-10-10T00:44:16.8020600Z * [new branch] gh/malfet/513/base -> origin/gh/malfet/513/base 2025-10-10T00:44:16.8022672Z * [new branch] gh/malfet/513/head -> origin/gh/malfet/513/head 2025-10-10T00:44:16.8024554Z * [new branch] gh/malfet/513/orig -> origin/gh/malfet/513/orig 2025-10-10T00:44:16.8027544Z * [new branch] gh/malfet/516/base -> origin/gh/malfet/516/base 2025-10-10T00:44:16.8029413Z * [new branch] gh/malfet/516/head -> origin/gh/malfet/516/head 2025-10-10T00:44:16.8031350Z * [new branch] gh/malfet/516/orig -> origin/gh/malfet/516/orig 2025-10-10T00:44:16.8034102Z * [new branch] gh/malfet/517/base -> origin/gh/malfet/517/base 2025-10-10T00:44:16.8036036Z * [new branch] gh/malfet/517/head -> origin/gh/malfet/517/head 2025-10-10T00:44:16.8038566Z * [new branch] gh/malfet/518/base -> origin/gh/malfet/518/base 2025-10-10T00:44:16.8040429Z * [new branch] gh/malfet/518/head -> origin/gh/malfet/518/head 2025-10-10T00:44:16.8042317Z * [new branch] gh/malfet/518/orig -> origin/gh/malfet/518/orig 2025-10-10T00:44:16.8044969Z * [new branch] gh/malfet/519/base -> origin/gh/malfet/519/base 2025-10-10T00:44:16.8046892Z * [new branch] gh/malfet/519/head -> origin/gh/malfet/519/head 2025-10-10T00:44:16.8048949Z * [new branch] gh/malfet/519/orig -> origin/gh/malfet/519/orig 2025-10-10T00:44:16.8051516Z * [new branch] gh/malfet/520/base -> origin/gh/malfet/520/base 2025-10-10T00:44:16.8053375Z * [new branch] gh/malfet/520/head -> origin/gh/malfet/520/head 2025-10-10T00:44:16.8055237Z * [new branch] gh/malfet/520/orig -> origin/gh/malfet/520/orig 2025-10-10T00:44:16.8058383Z * [new branch] gh/malfet/521/base -> origin/gh/malfet/521/base 2025-10-10T00:44:16.8060276Z * [new branch] gh/malfet/521/head -> origin/gh/malfet/521/head 2025-10-10T00:44:16.8062173Z * [new branch] gh/malfet/521/orig -> origin/gh/malfet/521/orig 
2025-10-10T00:44:16.8064817Z * [new branch] gh/malfet/522/base -> origin/gh/malfet/522/base 2025-10-10T00:44:16.8066725Z * [new branch] gh/malfet/522/head -> origin/gh/malfet/522/head 2025-10-10T00:44:16.8068679Z * [new branch] gh/malfet/522/orig -> origin/gh/malfet/522/orig 2025-10-10T00:44:16.8071454Z * [new branch] gh/malfet/523/base -> origin/gh/malfet/523/base 2025-10-10T00:44:16.8073856Z * [new branch] gh/malfet/523/head -> origin/gh/malfet/523/head 2025-10-10T00:44:16.8075798Z * [new branch] gh/malfet/523/orig -> origin/gh/malfet/523/orig 2025-10-10T00:44:16.8078503Z * [new branch] gh/malfet/524/base -> origin/gh/malfet/524/base 2025-10-10T00:44:16.8080458Z * [new branch] gh/malfet/524/head -> origin/gh/malfet/524/head 2025-10-10T00:44:16.8082248Z * [new branch] gh/malfet/524/orig -> origin/gh/malfet/524/orig 2025-10-10T00:44:16.8084995Z * [new branch] gh/malfet/525/base -> origin/gh/malfet/525/base 2025-10-10T00:44:16.8086843Z * [new branch] gh/malfet/525/head -> origin/gh/malfet/525/head 2025-10-10T00:44:16.8089018Z * [new branch] gh/malfet/525/orig -> origin/gh/malfet/525/orig 2025-10-10T00:44:16.8091499Z * [new branch] gh/malfet/526/base -> origin/gh/malfet/526/base 2025-10-10T00:44:16.8093377Z * [new branch] gh/malfet/526/head -> origin/gh/malfet/526/head 2025-10-10T00:44:16.8095287Z * [new branch] gh/malfet/526/orig -> origin/gh/malfet/526/orig 2025-10-10T00:44:16.8097785Z * [new branch] gh/malfet/527/base -> origin/gh/malfet/527/base 2025-10-10T00:44:16.8099919Z * [new branch] gh/malfet/527/head -> origin/gh/malfet/527/head 2025-10-10T00:44:16.8101676Z * [new branch] gh/malfet/527/orig -> origin/gh/malfet/527/orig 2025-10-10T00:44:16.8104307Z * [new branch] gh/malfet/528/base -> origin/gh/malfet/528/base 2025-10-10T00:44:16.8106191Z * [new branch] gh/malfet/528/head -> origin/gh/malfet/528/head 2025-10-10T00:44:16.8108054Z * [new branch] gh/malfet/528/orig -> origin/gh/malfet/528/orig 2025-10-10T00:44:16.8110747Z * [new branch] gh/malfet/529/base -> 
origin/gh/malfet/529/base 2025-10-10T00:44:16.8112633Z * [new branch] gh/malfet/529/head -> origin/gh/malfet/529/head 2025-10-10T00:44:16.8114493Z * [new branch] gh/malfet/529/orig -> origin/gh/malfet/529/orig 2025-10-10T00:44:16.8117105Z * [new branch] gh/malfet/530/base -> origin/gh/malfet/530/base 2025-10-10T00:44:16.8118950Z * [new branch] gh/malfet/530/head -> origin/gh/malfet/530/head 2025-10-10T00:44:16.8120799Z * [new branch] gh/malfet/530/orig -> origin/gh/malfet/530/orig 2025-10-10T00:44:16.8123367Z * [new branch] gh/malfet/531/base -> origin/gh/malfet/531/base 2025-10-10T00:44:16.8125225Z * [new branch] gh/malfet/531/head -> origin/gh/malfet/531/head 2025-10-10T00:44:16.8127214Z * [new branch] gh/malfet/531/orig -> origin/gh/malfet/531/orig 2025-10-10T00:44:16.8130560Z * [new branch] gh/malfet/532/base -> origin/gh/malfet/532/base 2025-10-10T00:44:16.8132373Z * [new branch] gh/malfet/532/head -> origin/gh/malfet/532/head 2025-10-10T00:44:16.8134236Z * [new branch] gh/malfet/532/orig -> origin/gh/malfet/532/orig 2025-10-10T00:44:16.8137047Z * [new branch] gh/malfet/533/base -> origin/gh/malfet/533/base 2025-10-10T00:44:16.8138917Z * [new branch] gh/malfet/533/head -> origin/gh/malfet/533/head 2025-10-10T00:44:16.8140758Z * [new branch] gh/malfet/533/orig -> origin/gh/malfet/533/orig 2025-10-10T00:44:16.8143344Z * [new branch] gh/malfet/534/base -> origin/gh/malfet/534/base 2025-10-10T00:44:16.8145246Z * [new branch] gh/malfet/534/head -> origin/gh/malfet/534/head 2025-10-10T00:44:16.8147101Z * [new branch] gh/malfet/534/orig -> origin/gh/malfet/534/orig 2025-10-10T00:44:16.8149720Z * [new branch] gh/malfet/535/base -> origin/gh/malfet/535/base 2025-10-10T00:44:16.8151606Z * [new branch] gh/malfet/535/head -> origin/gh/malfet/535/head 2025-10-10T00:44:16.8153425Z * [new branch] gh/malfet/535/orig -> origin/gh/malfet/535/orig 2025-10-10T00:44:16.8156039Z * [new branch] gh/malfet/536/base -> origin/gh/malfet/536/base 2025-10-10T00:44:16.8157863Z * [new 
branch] gh/malfet/536/head -> origin/gh/malfet/536/head 2025-10-10T00:44:16.8159746Z * [new branch] gh/malfet/536/orig -> origin/gh/malfet/536/orig 2025-10-10T00:44:16.8162391Z * [new branch] gh/malfet/537/base -> origin/gh/malfet/537/base 2025-10-10T00:44:16.8164260Z * [new branch] gh/malfet/537/head -> origin/gh/malfet/537/head 2025-10-10T00:44:16.8166329Z * [new branch] gh/malfet/537/orig -> origin/gh/malfet/537/orig 2025-10-10T00:44:16.8169069Z * [new branch] gh/malfet/538/base -> origin/gh/malfet/538/base 2025-10-10T00:44:16.8170924Z * [new branch] gh/malfet/538/head -> origin/gh/malfet/538/head 2025-10-10T00:44:16.8172825Z * [new branch] gh/malfet/538/orig -> origin/gh/malfet/538/orig 2025-10-10T00:44:16.8175316Z * [new branch] gh/malfet/539/base -> origin/gh/malfet/539/base 2025-10-10T00:44:16.8177209Z * [new branch] gh/malfet/539/head -> origin/gh/malfet/539/head 2025-10-10T00:44:16.8179079Z * [new branch] gh/malfet/539/orig -> origin/gh/malfet/539/orig 2025-10-10T00:44:16.8181644Z * [new branch] gh/malfet/540/base -> origin/gh/malfet/540/base 2025-10-10T00:44:16.8183465Z * [new branch] gh/malfet/540/head -> origin/gh/malfet/540/head 2025-10-10T00:44:16.8185478Z * [new branch] gh/malfet/540/orig -> origin/gh/malfet/540/orig 2025-10-10T00:44:16.8188302Z * [new branch] gh/malfet/541/base -> origin/gh/malfet/541/base 2025-10-10T00:44:16.8190214Z * [new branch] gh/malfet/541/head -> origin/gh/malfet/541/head 2025-10-10T00:44:16.8192110Z * [new branch] gh/malfet/541/orig -> origin/gh/malfet/541/orig 2025-10-10T00:44:16.8194749Z * [new branch] gh/malfet/542/base -> origin/gh/malfet/542/base 2025-10-10T00:44:16.8196616Z * [new branch] gh/malfet/542/head -> origin/gh/malfet/542/head 2025-10-10T00:44:16.8198771Z * [new branch] gh/malfet/542/orig -> origin/gh/malfet/542/orig 2025-10-10T00:44:16.8203089Z * [new branch] gh/malfet/543/base -> origin/gh/malfet/543/base 2025-10-10T00:44:16.8204961Z * [new branch] gh/malfet/543/head -> origin/gh/malfet/543/head 
2025-10-10T00:44:16.8206793Z * [new branch] gh/malfet/543/orig -> origin/gh/malfet/543/orig 2025-10-10T00:44:16.8209621Z * [new branch] gh/malfet/544/base -> origin/gh/malfet/544/base 2025-10-10T00:44:16.8211415Z * [new branch] gh/malfet/544/head -> origin/gh/malfet/544/head 2025-10-10T00:44:16.8213296Z * [new branch] gh/malfet/544/orig -> origin/gh/malfet/544/orig 2025-10-10T00:44:16.8215994Z * [new branch] gh/malfet/545/base -> origin/gh/malfet/545/base 2025-10-10T00:44:16.8217871Z * [new branch] gh/malfet/545/head -> origin/gh/malfet/545/head 2025-10-10T00:44:16.8219778Z * [new branch] gh/malfet/545/orig -> origin/gh/malfet/545/orig 2025-10-10T00:44:16.8222349Z * [new branch] gh/malfet/546/base -> origin/gh/malfet/546/base 2025-10-10T00:44:16.8224209Z * [new branch] gh/malfet/546/head -> origin/gh/malfet/546/head 2025-10-10T00:44:16.8226067Z * [new branch] gh/malfet/546/orig -> origin/gh/malfet/546/orig 2025-10-10T00:44:16.8228688Z * [new branch] gh/malfet/547/base -> origin/gh/malfet/547/base 2025-10-10T00:44:16.8230497Z * [new branch] gh/malfet/547/head -> origin/gh/malfet/547/head 2025-10-10T00:44:16.8232396Z * [new branch] gh/malfet/547/orig -> origin/gh/malfet/547/orig 2025-10-10T00:44:16.8234970Z * [new branch] gh/malfet/548/base -> origin/gh/malfet/548/base 2025-10-10T00:44:16.8236798Z * [new branch] gh/malfet/548/head -> origin/gh/malfet/548/head 2025-10-10T00:44:16.8238766Z * [new branch] gh/malfet/548/orig -> origin/gh/malfet/548/orig 2025-10-10T00:44:16.8241499Z * [new branch] gh/malfet/549/base -> origin/gh/malfet/549/base 2025-10-10T00:44:16.8243451Z * [new branch] gh/malfet/549/head -> origin/gh/malfet/549/head 2025-10-10T00:44:16.8245216Z * [new branch] gh/malfet/549/orig -> origin/gh/malfet/549/orig 2025-10-10T00:44:16.8247953Z * [new branch] gh/malfet/550/base -> origin/gh/malfet/550/base 2025-10-10T00:44:16.8249947Z * [new branch] gh/malfet/550/head -> origin/gh/malfet/550/head 2025-10-10T00:44:16.8251731Z * [new branch] gh/malfet/550/orig -> 
origin/gh/malfet/550/orig 2025-10-10T00:44:16.8254286Z * [new branch] gh/malfet/551/base -> origin/gh/malfet/551/base 2025-10-10T00:44:16.8256167Z * [new branch] gh/malfet/551/head -> origin/gh/malfet/551/head 2025-10-10T00:44:16.8258060Z * [new branch] gh/malfet/551/orig -> origin/gh/malfet/551/orig 2025-10-10T00:44:16.8260630Z * [new branch] gh/malfet/552/base -> origin/gh/malfet/552/base 2025-10-10T00:44:16.8263100Z * [new branch] gh/malfet/552/head -> origin/gh/malfet/552/head 2025-10-10T00:44:16.8264952Z * [new branch] gh/malfet/552/orig -> origin/gh/malfet/552/orig 2025-10-10T00:44:16.8267654Z * [new branch] gh/malfet/553/base -> origin/gh/malfet/553/base 2025-10-10T00:44:16.8269511Z * [new branch] gh/malfet/553/head -> origin/gh/malfet/553/head 2025-10-10T00:44:16.8271352Z * [new branch] gh/malfet/553/orig -> origin/gh/malfet/553/orig 2025-10-10T00:44:16.8274054Z * [new branch] gh/malfet/64/base -> origin/gh/malfet/64/base 2025-10-10T00:44:16.8276060Z * [new branch] gh/malfet/64/head -> origin/gh/malfet/64/head 2025-10-10T00:44:16.8279053Z * [new branch] gh/manuelcandales/10/base -> origin/gh/manuelcandales/10/base 2025-10-10T00:44:16.8280888Z * [new branch] gh/manuelcandales/10/head -> origin/gh/manuelcandales/10/head 2025-10-10T00:44:16.8282741Z * [new branch] gh/manuelcandales/10/orig -> origin/gh/manuelcandales/10/orig 2025-10-10T00:44:16.8285254Z * [new branch] gh/manuelcandales/11/base -> origin/gh/manuelcandales/11/base 2025-10-10T00:44:16.8287133Z * [new branch] gh/manuelcandales/11/head -> origin/gh/manuelcandales/11/head 2025-10-10T00:44:16.8289141Z * [new branch] gh/manuelcandales/11/orig -> origin/gh/manuelcandales/11/orig 2025-10-10T00:44:16.8291797Z * [new branch] gh/manuelcandales/9/base -> origin/gh/manuelcandales/9/base 2025-10-10T00:44:16.8293658Z * [new branch] gh/manuelcandales/9/head -> origin/gh/manuelcandales/9/head 2025-10-10T00:44:16.8295504Z * [new branch] gh/manuelcandales/9/orig -> origin/gh/manuelcandales/9/orig 
2025-10-10T00:44:16.8298781Z * [new branch] gh/markkm/1/base -> origin/gh/markkm/1/base 2025-10-10T00:44:16.8302163Z * [new branch] gh/masnesral/235/base -> origin/gh/masnesral/235/base 2025-10-10T00:44:16.8304117Z * [new branch] gh/masnesral/235/head -> origin/gh/masnesral/235/head 2025-10-10T00:44:16.8305997Z * [new branch] gh/masnesral/235/orig -> origin/gh/masnesral/235/orig 2025-10-10T00:44:16.8308527Z * [new branch] gh/masnesral/236/base -> origin/gh/masnesral/236/base 2025-10-10T00:44:16.8310437Z * [new branch] gh/masnesral/236/head -> origin/gh/masnesral/236/head 2025-10-10T00:44:16.8312370Z * [new branch] gh/masnesral/236/orig -> origin/gh/masnesral/236/orig 2025-10-10T00:44:16.8315047Z * [new branch] gh/masnesral/237/base -> origin/gh/masnesral/237/base 2025-10-10T00:44:16.8317176Z * [new branch] gh/masnesral/237/head -> origin/gh/masnesral/237/head 2025-10-10T00:44:16.8319230Z * [new branch] gh/masnesral/237/orig -> origin/gh/masnesral/237/orig 2025-10-10T00:44:16.8322007Z * [new branch] gh/masnesral/238/base -> origin/gh/masnesral/238/base 2025-10-10T00:44:16.8323783Z * [new branch] gh/masnesral/238/head -> origin/gh/masnesral/238/head 2025-10-10T00:44:16.8326071Z * [new branch] gh/masnesral/238/orig -> origin/gh/masnesral/238/orig 2025-10-10T00:44:16.8329577Z * [new branch] gh/mhorowitz/0/base -> origin/gh/mhorowitz/0/base 2025-10-10T00:44:16.8331438Z * [new branch] gh/mhorowitz/0/head -> origin/gh/mhorowitz/0/head 2025-10-10T00:44:16.8333903Z * [new branch] gh/mhorowitz/1/base -> origin/gh/mhorowitz/1/base 2025-10-10T00:44:16.8335788Z * [new branch] gh/mhorowitz/1/head -> origin/gh/mhorowitz/1/head 2025-10-10T00:44:16.8338334Z * [new branch] gh/mhorowitz/2/base -> origin/gh/mhorowitz/2/base 2025-10-10T00:44:16.8340217Z * [new branch] gh/mhorowitz/2/head -> origin/gh/mhorowitz/2/head 2025-10-10T00:44:16.8342572Z * [new branch] gh/mhorowitz/3/base -> origin/gh/mhorowitz/3/base 2025-10-10T00:44:16.8344307Z * [new branch] gh/mhorowitz/3/head -> 
origin/gh/mhorowitz/3/head 2025-10-10T00:44:16.8346718Z * [new branch] gh/mhorowitz/4/base -> origin/gh/mhorowitz/4/base 2025-10-10T00:44:16.8348569Z * [new branch] gh/mhorowitz/4/head -> origin/gh/mhorowitz/4/head 2025-10-10T00:44:16.8350939Z * [new branch] gh/mhorowitz/5/base -> origin/gh/mhorowitz/5/base 2025-10-10T00:44:16.8352750Z * [new branch] gh/mhorowitz/5/head -> origin/gh/mhorowitz/5/head 2025-10-10T00:44:16.8355220Z * [new branch] gh/mhorowitz/6/base -> origin/gh/mhorowitz/6/base 2025-10-10T00:44:16.8356951Z * [new branch] gh/mhorowitz/6/head -> origin/gh/mhorowitz/6/head 2025-10-10T00:44:16.8360427Z * [new branch] gh/mikaylagawarecki/234/base -> origin/gh/mikaylagawarecki/234/base 2025-10-10T00:44:16.8362288Z * [new branch] gh/mikaylagawarecki/234/head -> origin/gh/mikaylagawarecki/234/head 2025-10-10T00:44:16.8364772Z * [new branch] gh/mikaylagawarecki/235/base -> origin/gh/mikaylagawarecki/235/base 2025-10-10T00:44:16.8366547Z * [new branch] gh/mikaylagawarecki/235/head -> origin/gh/mikaylagawarecki/235/head 2025-10-10T00:44:16.8369240Z * [new branch] gh/mikaylagawarecki/236/base -> origin/gh/mikaylagawarecki/236/base 2025-10-10T00:44:16.8371054Z * [new branch] gh/mikaylagawarecki/236/head -> origin/gh/mikaylagawarecki/236/head 2025-10-10T00:44:16.8373563Z * [new branch] gh/mikaylagawarecki/237/base -> origin/gh/mikaylagawarecki/237/base 2025-10-10T00:44:16.8375352Z * [new branch] gh/mikaylagawarecki/237/head -> origin/gh/mikaylagawarecki/237/head 2025-10-10T00:44:16.8377926Z * [new branch] gh/mikaylagawarecki/238/base -> origin/gh/mikaylagawarecki/238/base 2025-10-10T00:44:16.8379822Z * [new branch] gh/mikaylagawarecki/238/head -> origin/gh/mikaylagawarecki/238/head 2025-10-10T00:44:16.8382410Z * [new branch] gh/mikaylagawarecki/317/base -> origin/gh/mikaylagawarecki/317/base 2025-10-10T00:44:16.8384382Z * [new branch] gh/mikaylagawarecki/317/head -> origin/gh/mikaylagawarecki/317/head 2025-10-10T00:44:16.8386257Z * [new branch] 
gh/mikaylagawarecki/317/orig -> origin/gh/mikaylagawarecki/317/orig 2025-10-10T00:44:16.8388899Z * [new branch] gh/mikaylagawarecki/336/base -> origin/gh/mikaylagawarecki/336/base 2025-10-10T00:44:16.8390824Z * [new branch] gh/mikaylagawarecki/336/head -> origin/gh/mikaylagawarecki/336/head 2025-10-10T00:44:16.8392661Z * [new branch] gh/mikaylagawarecki/336/orig -> origin/gh/mikaylagawarecki/336/orig 2025-10-10T00:44:16.8395701Z * [new branch] gh/mikaylagawarecki/337/base -> origin/gh/mikaylagawarecki/337/base 2025-10-10T00:44:16.8397788Z * [new branch] gh/mikaylagawarecki/337/head -> origin/gh/mikaylagawarecki/337/head 2025-10-10T00:44:16.8399824Z * [new branch] gh/mikaylagawarecki/337/orig -> origin/gh/mikaylagawarecki/337/orig 2025-10-10T00:44:16.8402463Z * [new branch] gh/mikaylagawarecki/340/base -> origin/gh/mikaylagawarecki/340/base 2025-10-10T00:44:16.8404433Z * [new branch] gh/mikaylagawarecki/340/head -> origin/gh/mikaylagawarecki/340/head 2025-10-10T00:44:16.8406308Z * [new branch] gh/mikaylagawarecki/340/orig -> origin/gh/mikaylagawarecki/340/orig 2025-10-10T00:44:16.8409221Z * [new branch] gh/mikaylagawarecki/341/base -> origin/gh/mikaylagawarecki/341/base 2025-10-10T00:44:16.8411180Z * [new branch] gh/mikaylagawarecki/341/head -> origin/gh/mikaylagawarecki/341/head 2025-10-10T00:44:16.8413082Z * [new branch] gh/mikaylagawarecki/341/orig -> origin/gh/mikaylagawarecki/341/orig 2025-10-10T00:44:16.8415813Z * [new branch] gh/mikaylagawarecki/342/base -> origin/gh/mikaylagawarecki/342/base 2025-10-10T00:44:16.8417624Z * [new branch] gh/mikaylagawarecki/342/head -> origin/gh/mikaylagawarecki/342/head 2025-10-10T00:44:16.8419513Z * [new branch] gh/mikaylagawarecki/342/orig -> origin/gh/mikaylagawarecki/342/orig 2025-10-10T00:44:16.8422048Z * [new branch] gh/mikaylagawarecki/343/base -> origin/gh/mikaylagawarecki/343/base 2025-10-10T00:44:16.8423910Z * [new branch] gh/mikaylagawarecki/343/head -> origin/gh/mikaylagawarecki/343/head 
2025-10-10T00:44:16.8425836Z * [new branch] gh/mikaylagawarecki/343/orig -> origin/gh/mikaylagawarecki/343/orig 2025-10-10T00:44:16.8428422Z * [new branch] gh/mikaylagawarecki/344/base -> origin/gh/mikaylagawarecki/344/base 2025-10-10T00:44:16.8430309Z * [new branch] gh/mikaylagawarecki/344/head -> origin/gh/mikaylagawarecki/344/head 2025-10-10T00:44:16.8432200Z * [new branch] gh/mikaylagawarecki/344/orig -> origin/gh/mikaylagawarecki/344/orig 2025-10-10T00:44:16.8434731Z * [new branch] gh/mikaylagawarecki/345/base -> origin/gh/mikaylagawarecki/345/base 2025-10-10T00:44:16.8436638Z * [new branch] gh/mikaylagawarecki/345/head -> origin/gh/mikaylagawarecki/345/head 2025-10-10T00:44:16.8438542Z * [new branch] gh/mikaylagawarecki/345/orig -> origin/gh/mikaylagawarecki/345/orig 2025-10-10T00:44:16.8441254Z * [new branch] gh/mikaylagawarecki/346/base -> origin/gh/mikaylagawarecki/346/base 2025-10-10T00:44:16.8443077Z * [new branch] gh/mikaylagawarecki/346/head -> origin/gh/mikaylagawarecki/346/head 2025-10-10T00:44:16.8444980Z * [new branch] gh/mikaylagawarecki/346/orig -> origin/gh/mikaylagawarecki/346/orig 2025-10-10T00:44:16.8447863Z * [new branch] gh/mikaylagawarecki/347/base -> origin/gh/mikaylagawarecki/347/base 2025-10-10T00:44:16.8449707Z * [new branch] gh/mikaylagawarecki/347/head -> origin/gh/mikaylagawarecki/347/head 2025-10-10T00:44:16.8451436Z * [new branch] gh/mikaylagawarecki/347/orig -> origin/gh/mikaylagawarecki/347/orig 2025-10-10T00:44:16.8454403Z * [new branch] gh/mikaylagawarecki/348/base -> origin/gh/mikaylagawarecki/348/base 2025-10-10T00:44:16.8456775Z * [new branch] gh/mikaylagawarecki/348/head -> origin/gh/mikaylagawarecki/348/head 2025-10-10T00:44:16.8458659Z * [new branch] gh/mikaylagawarecki/348/orig -> origin/gh/mikaylagawarecki/348/orig 2025-10-10T00:44:16.8461491Z * [new branch] gh/mikaylagawarecki/349/base -> origin/gh/mikaylagawarecki/349/base 2025-10-10T00:44:16.8463404Z * [new branch] gh/mikaylagawarecki/349/head -> 
origin/gh/mikaylagawarecki/349/head 2025-10-10T00:44:16.8465254Z * [new branch] gh/mikaylagawarecki/349/orig -> origin/gh/mikaylagawarecki/349/orig 2025-10-10T00:44:16.8468124Z * [new branch] gh/mikaylagawarecki/350/base -> origin/gh/mikaylagawarecki/350/base 2025-10-10T00:44:16.8469901Z * [new branch] gh/mikaylagawarecki/350/head -> origin/gh/mikaylagawarecki/350/head 2025-10-10T00:44:16.8472401Z * [new branch] gh/mikaylagawarecki/350/orig -> origin/gh/mikaylagawarecki/350/orig 2025-10-10T00:44:16.8475451Z * [new branch] gh/mlazos/18/base -> origin/gh/mlazos/18/base 2025-10-10T00:44:16.8477362Z * [new branch] gh/mlazos/18/head -> origin/gh/mlazos/18/head 2025-10-10T00:44:16.8479210Z * [new branch] gh/mlazos/18/orig -> origin/gh/mlazos/18/orig 2025-10-10T00:44:16.8481678Z * [new branch] gh/mlazos/19/base -> origin/gh/mlazos/19/base 2025-10-10T00:44:16.8484395Z * [new branch] gh/mlazos/19/head -> origin/gh/mlazos/19/head 2025-10-10T00:44:16.8486297Z * [new branch] gh/mlazos/19/orig -> origin/gh/mlazos/19/orig 2025-10-10T00:44:16.8489443Z * [new branch] gh/mlazos/20/base -> origin/gh/mlazos/20/base 2025-10-10T00:44:16.8491934Z * [new branch] gh/mlazos/20/head -> origin/gh/mlazos/20/head 2025-10-10T00:44:16.8493826Z * [new branch] gh/mlazos/20/orig -> origin/gh/mlazos/20/orig 2025-10-10T00:44:16.8496539Z * [new branch] gh/mlazos/21/base -> origin/gh/mlazos/21/base 2025-10-10T00:44:16.8498650Z * [new branch] gh/mlazos/21/head -> origin/gh/mlazos/21/head 2025-10-10T00:44:16.8500626Z * [new branch] gh/mlazos/21/orig -> origin/gh/mlazos/21/orig 2025-10-10T00:44:16.8503259Z * [new branch] gh/mlazos/22/base -> origin/gh/mlazos/22/base 2025-10-10T00:44:16.8505142Z * [new branch] gh/mlazos/22/head -> origin/gh/mlazos/22/head 2025-10-10T00:44:16.8506951Z * [new branch] gh/mlazos/22/orig -> origin/gh/mlazos/22/orig 2025-10-10T00:44:16.8509673Z * [new branch] gh/mlazos/23/base -> origin/gh/mlazos/23/base 2025-10-10T00:44:16.8511666Z * [new branch] gh/mlazos/23/head -> 
origin/gh/mlazos/23/head 2025-10-10T00:44:16.8513529Z * [new branch] gh/mlazos/23/orig -> origin/gh/mlazos/23/orig 2025-10-10T00:44:16.8516295Z * [new branch] gh/mlazos/24/base -> origin/gh/mlazos/24/base 2025-10-10T00:44:16.8518185Z * [new branch] gh/mlazos/24/head -> origin/gh/mlazos/24/head 2025-10-10T00:44:16.8520066Z * [new branch] gh/mlazos/24/orig -> origin/gh/mlazos/24/orig 2025-10-10T00:44:16.8522733Z * [new branch] gh/mlazos/25/base -> origin/gh/mlazos/25/base 2025-10-10T00:44:16.8524596Z * [new branch] gh/mlazos/25/head -> origin/gh/mlazos/25/head 2025-10-10T00:44:16.8526547Z * [new branch] gh/mlazos/25/orig -> origin/gh/mlazos/25/orig 2025-10-10T00:44:16.8529431Z * [new branch] gh/mlazos/26/base -> origin/gh/mlazos/26/base 2025-10-10T00:44:16.8531173Z * [new branch] gh/mlazos/26/head -> origin/gh/mlazos/26/head 2025-10-10T00:44:16.8533006Z * [new branch] gh/mlazos/26/orig -> origin/gh/mlazos/26/orig 2025-10-10T00:44:16.8535479Z * [new branch] gh/mlazos/27/base -> origin/gh/mlazos/27/base 2025-10-10T00:44:16.8537354Z * [new branch] gh/mlazos/27/head -> origin/gh/mlazos/27/head 2025-10-10T00:44:16.8539128Z * [new branch] gh/mlazos/27/orig -> origin/gh/mlazos/27/orig 2025-10-10T00:44:16.8541777Z * [new branch] gh/mlazos/28/base -> origin/gh/mlazos/28/base 2025-10-10T00:44:16.8543635Z * [new branch] gh/mlazos/28/head -> origin/gh/mlazos/28/head 2025-10-10T00:44:16.8545534Z * [new branch] gh/mlazos/28/orig -> origin/gh/mlazos/28/orig 2025-10-10T00:44:16.8548226Z * [new branch] gh/mlazos/29/base -> origin/gh/mlazos/29/base 2025-10-10T00:44:16.8549956Z * [new branch] gh/mlazos/29/head -> origin/gh/mlazos/29/head 2025-10-10T00:44:16.8551795Z * [new branch] gh/mlazos/29/orig -> origin/gh/mlazos/29/orig 2025-10-10T00:44:16.8554265Z * [new branch] gh/mlazos/30/base -> origin/gh/mlazos/30/base 2025-10-10T00:44:16.8556047Z * [new branch] gh/mlazos/30/head -> origin/gh/mlazos/30/head 2025-10-10T00:44:16.8557999Z * [new branch] gh/mlazos/30/orig -> 
origin/gh/mlazos/30/orig 2025-10-10T00:44:16.8560450Z * [new branch] gh/mlazos/31/base -> origin/gh/mlazos/31/base 2025-10-10T00:44:16.8562304Z * [new branch] gh/mlazos/31/head -> origin/gh/mlazos/31/head 2025-10-10T00:44:16.8564151Z * [new branch] gh/mlazos/31/orig -> origin/gh/mlazos/31/orig 2025-10-10T00:44:16.8566808Z * [new branch] gh/mlazos/32/base -> origin/gh/mlazos/32/base 2025-10-10T00:44:16.8568870Z * [new branch] gh/mlazos/32/head -> origin/gh/mlazos/32/head 2025-10-10T00:44:16.8570815Z * [new branch] gh/mlazos/32/orig -> origin/gh/mlazos/32/orig 2025-10-10T00:44:16.8573428Z * [new branch] gh/mlazos/33/base -> origin/gh/mlazos/33/base 2025-10-10T00:44:16.8575315Z * [new branch] gh/mlazos/33/head -> origin/gh/mlazos/33/head 2025-10-10T00:44:16.8577183Z * [new branch] gh/mlazos/33/orig -> origin/gh/mlazos/33/orig 2025-10-10T00:44:16.8579713Z * [new branch] gh/mlazos/34/base -> origin/gh/mlazos/34/base 2025-10-10T00:44:16.8581517Z * [new branch] gh/mlazos/34/head -> origin/gh/mlazos/34/head 2025-10-10T00:44:16.8583476Z * [new branch] gh/mlazos/34/orig -> origin/gh/mlazos/34/orig 2025-10-10T00:44:16.8585842Z * [new branch] gh/mlazos/35/base -> origin/gh/mlazos/35/base 2025-10-10T00:44:16.8587723Z * [new branch] gh/mlazos/35/head -> origin/gh/mlazos/35/head 2025-10-10T00:44:16.8589556Z * [new branch] gh/mlazos/35/orig -> origin/gh/mlazos/35/orig 2025-10-10T00:44:16.8592697Z * [new branch] gh/mlazos/36/base -> origin/gh/mlazos/36/base 2025-10-10T00:44:16.8595053Z * [new branch] gh/mlazos/36/head -> origin/gh/mlazos/36/head 2025-10-10T00:44:16.8596858Z * [new branch] gh/mlazos/36/orig -> origin/gh/mlazos/36/orig 2025-10-10T00:44:16.8602097Z * [new branch] gh/mlazos/37/base -> origin/gh/mlazos/37/base 2025-10-10T00:44:16.8604002Z * [new branch] gh/mlazos/37/head -> origin/gh/mlazos/37/head 2025-10-10T00:44:16.8605824Z * [new branch] gh/mlazos/37/orig -> origin/gh/mlazos/37/orig 2025-10-10T00:44:16.8609204Z * [new branch] gh/mrmiywj/1/base -> 
origin/gh/mrmiywj/1/base 2025-10-10T00:44:16.8611056Z * [new branch] gh/mrmiywj/1/head -> origin/gh/mrmiywj/1/head 2025-10-10T00:44:16.8614235Z * [new branch] gh/muchulee8/62/base -> origin/gh/muchulee8/62/base 2025-10-10T00:44:16.8616242Z * [new branch] gh/muchulee8/62/head -> origin/gh/muchulee8/62/head 2025-10-10T00:44:16.8618139Z * [new branch] gh/muchulee8/62/orig -> origin/gh/muchulee8/62/orig 2025-10-10T00:44:16.8620904Z * [new branch] gh/muchulee8/64/base -> origin/gh/muchulee8/64/base 2025-10-10T00:44:16.8622894Z * [new branch] gh/muchulee8/64/head -> origin/gh/muchulee8/64/head 2025-10-10T00:44:16.8624822Z * [new branch] gh/muchulee8/64/orig -> origin/gh/muchulee8/64/orig 2025-10-10T00:44:16.8627504Z * [new branch] gh/muchulee8/65/base -> origin/gh/muchulee8/65/base 2025-10-10T00:44:16.8629467Z * [new branch] gh/muchulee8/65/head -> origin/gh/muchulee8/65/head 2025-10-10T00:44:16.8631365Z * [new branch] gh/muchulee8/65/orig -> origin/gh/muchulee8/65/orig 2025-10-10T00:44:16.8633971Z * [new branch] gh/muchulee8/66/base -> origin/gh/muchulee8/66/base 2025-10-10T00:44:16.8635850Z * [new branch] gh/muchulee8/66/head -> origin/gh/muchulee8/66/head 2025-10-10T00:44:16.8637729Z * [new branch] gh/muchulee8/66/orig -> origin/gh/muchulee8/66/orig 2025-10-10T00:44:16.8640332Z * [new branch] gh/muchulee8/67/base -> origin/gh/muchulee8/67/base 2025-10-10T00:44:16.8642254Z * [new branch] gh/muchulee8/67/head -> origin/gh/muchulee8/67/head 2025-10-10T00:44:16.8644088Z * [new branch] gh/muchulee8/67/orig -> origin/gh/muchulee8/67/orig 2025-10-10T00:44:16.8648066Z * [new branch] gh/naveenthangudu/1/base -> origin/gh/naveenthangudu/1/base 2025-10-10T00:44:16.8649979Z * [new branch] gh/naveenthangudu/1/head -> origin/gh/naveenthangudu/1/head 2025-10-10T00:44:16.8651920Z * [new branch] gh/naveenthangudu/1/orig -> origin/gh/naveenthangudu/1/orig 2025-10-10T00:44:16.8654465Z * [new branch] gh/naveenthangudu/2/base -> origin/gh/naveenthangudu/2/base 2025-10-10T00:44:16.8656361Z 
* [new branch] gh/naveenthangudu/2/head -> origin/gh/naveenthangudu/2/head 2025-10-10T00:44:16.8658277Z * [new branch] gh/naveenthangudu/2/orig -> origin/gh/naveenthangudu/2/orig 2025-10-10T00:44:16.8660907Z * [new branch] gh/naveenthangudu/3/base -> origin/gh/naveenthangudu/3/base 2025-10-10T00:44:16.8662763Z * [new branch] gh/naveenthangudu/3/head -> origin/gh/naveenthangudu/3/head 2025-10-10T00:44:16.8664692Z * [new branch] gh/naveenthangudu/3/orig -> origin/gh/naveenthangudu/3/orig 2025-10-10T00:44:16.8667194Z * [new branch] gh/naveenthangudu/4/base -> origin/gh/naveenthangudu/4/base 2025-10-10T00:44:16.8669099Z * [new branch] gh/naveenthangudu/4/head -> origin/gh/naveenthangudu/4/head 2025-10-10T00:44:16.8671087Z * [new branch] gh/naveenthangudu/4/orig -> origin/gh/naveenthangudu/4/orig 2025-10-10T00:44:16.8673639Z * [new branch] gh/naveenthangudu/5/base -> origin/gh/naveenthangudu/5/base 2025-10-10T00:44:16.8675535Z * [new branch] gh/naveenthangudu/5/head -> origin/gh/naveenthangudu/5/head 2025-10-10T00:44:16.8677575Z * [new branch] gh/naveenthangudu/5/orig -> origin/gh/naveenthangudu/5/orig 2025-10-10T00:44:16.8680090Z * [new branch] gh/naveenthangudu/6/base -> origin/gh/naveenthangudu/6/base 2025-10-10T00:44:16.8681933Z * [new branch] gh/naveenthangudu/6/head -> origin/gh/naveenthangudu/6/head 2025-10-10T00:44:16.8683742Z * [new branch] gh/naveenthangudu/6/orig -> origin/gh/naveenthangudu/6/orig 2025-10-10T00:44:16.8686361Z * [new branch] gh/naveenthangudu/7/base -> origin/gh/naveenthangudu/7/base 2025-10-10T00:44:16.8688367Z * [new branch] gh/naveenthangudu/7/head -> origin/gh/naveenthangudu/7/head 2025-10-10T00:44:16.8690197Z * [new branch] gh/naveenthangudu/7/orig -> origin/gh/naveenthangudu/7/orig 2025-10-10T00:44:16.8692679Z * [new branch] gh/naveenthangudu/8/base -> origin/gh/naveenthangudu/8/base 2025-10-10T00:44:16.8694539Z * [new branch] gh/naveenthangudu/8/head -> origin/gh/naveenthangudu/8/head 2025-10-10T00:44:16.8696575Z * [new branch] 
gh/naveenthangudu/8/orig -> origin/gh/naveenthangudu/8/orig 2025-10-10T00:44:16.8700196Z * [new branch] gh/nikitaved/1/base -> origin/gh/nikitaved/1/base 2025-10-10T00:44:16.8702035Z * [new branch] gh/nikitaved/1/head -> origin/gh/nikitaved/1/head 2025-10-10T00:44:16.8703968Z * [new branch] gh/nikitaved/1/orig -> origin/gh/nikitaved/1/orig 2025-10-10T00:44:16.8706445Z * [new branch] gh/nikitaved/2/base -> origin/gh/nikitaved/2/base 2025-10-10T00:44:16.8708290Z * [new branch] gh/nikitaved/2/head -> origin/gh/nikitaved/2/head 2025-10-10T00:44:16.8710213Z * [new branch] gh/nikitaved/2/orig -> origin/gh/nikitaved/2/orig 2025-10-10T00:44:16.8712733Z * [new branch] gh/nikitaved/3/base -> origin/gh/nikitaved/3/base 2025-10-10T00:44:16.8714556Z * [new branch] gh/nikitaved/3/head -> origin/gh/nikitaved/3/head 2025-10-10T00:44:16.8716407Z * [new branch] gh/nikitaved/3/orig -> origin/gh/nikitaved/3/orig 2025-10-10T00:44:16.8719601Z * [new branch] gh/oulgen/35/base -> origin/gh/oulgen/35/base 2025-10-10T00:44:16.8721538Z * [new branch] gh/oulgen/35/head -> origin/gh/oulgen/35/head 2025-10-10T00:44:16.8723389Z * [new branch] gh/oulgen/35/orig -> origin/gh/oulgen/35/orig 2025-10-10T00:44:16.8726036Z * [new branch] gh/patvig/mtia-serialization -> origin/gh/patvig/mtia-serialization 2025-10-10T00:44:16.8729562Z * [new branch] gh/pearu/108/base -> origin/gh/pearu/108/base 2025-10-10T00:44:16.8731383Z * [new branch] gh/pearu/108/head -> origin/gh/pearu/108/head 2025-10-10T00:44:16.8733335Z * [new branch] gh/pearu/108/orig -> origin/gh/pearu/108/orig 2025-10-10T00:44:16.8735929Z * [new branch] gh/pearu/109/base -> origin/gh/pearu/109/base 2025-10-10T00:44:16.8737831Z * [new branch] gh/pearu/109/head -> origin/gh/pearu/109/head 2025-10-10T00:44:16.8739725Z * [new branch] gh/pearu/109/orig -> origin/gh/pearu/109/orig 2025-10-10T00:44:16.8742299Z * [new branch] gh/pearu/110/base -> origin/gh/pearu/110/base 2025-10-10T00:44:16.8744188Z * [new branch] gh/pearu/110/head -> 
origin/gh/pearu/110/head 2025-10-10T00:44:16.8746100Z * [new branch] gh/pearu/110/orig -> origin/gh/pearu/110/orig 2025-10-10T00:44:16.8748641Z * [new branch] gh/pearu/111/base -> origin/gh/pearu/111/base 2025-10-10T00:44:16.8750461Z * [new branch] gh/pearu/111/head -> origin/gh/pearu/111/head 2025-10-10T00:44:16.8752419Z * [new branch] gh/pearu/111/orig -> origin/gh/pearu/111/orig 2025-10-10T00:44:16.8754996Z * [new branch] gh/pearu/112/base -> origin/gh/pearu/112/base 2025-10-10T00:44:16.8756915Z * [new branch] gh/pearu/112/head -> origin/gh/pearu/112/head 2025-10-10T00:44:16.8758788Z * [new branch] gh/pearu/112/orig -> origin/gh/pearu/112/orig 2025-10-10T00:44:16.8761265Z * [new branch] gh/pearu/113/base -> origin/gh/pearu/113/base 2025-10-10T00:44:16.8763130Z * [new branch] gh/pearu/113/head -> origin/gh/pearu/113/head 2025-10-10T00:44:16.8764965Z * [new branch] gh/pearu/113/orig -> origin/gh/pearu/113/orig 2025-10-10T00:44:16.8767526Z * [new branch] gh/pearu/114/base -> origin/gh/pearu/114/base 2025-10-10T00:44:16.8769547Z * [new branch] gh/pearu/114/head -> origin/gh/pearu/114/head 2025-10-10T00:44:16.8771454Z * [new branch] gh/pearu/114/orig -> origin/gh/pearu/114/orig 2025-10-10T00:44:16.8773888Z * [new branch] gh/pearu/115/base -> origin/gh/pearu/115/base 2025-10-10T00:44:16.8775714Z * [new branch] gh/pearu/115/head -> origin/gh/pearu/115/head 2025-10-10T00:44:16.8777591Z * [new branch] gh/pearu/115/orig -> origin/gh/pearu/115/orig 2025-10-10T00:44:16.8780133Z * [new branch] gh/pearu/116/base -> origin/gh/pearu/116/base 2025-10-10T00:44:16.8782070Z * [new branch] gh/pearu/116/head -> origin/gh/pearu/116/head 2025-10-10T00:44:16.8783768Z * [new branch] gh/pearu/116/orig -> origin/gh/pearu/116/orig 2025-10-10T00:44:16.8786277Z * [new branch] gh/pearu/117/base -> origin/gh/pearu/117/base 2025-10-10T00:44:16.8788143Z * [new branch] gh/pearu/117/head -> origin/gh/pearu/117/head 2025-10-10T00:44:16.8790066Z * [new branch] gh/pearu/117/orig -> 
origin/gh/pearu/117/orig 2025-10-10T00:44:16.8792583Z * [new branch] gh/pearu/118/base -> origin/gh/pearu/118/base 2025-10-10T00:44:16.8794547Z * [new branch] gh/pearu/118/head -> origin/gh/pearu/118/head 2025-10-10T00:44:16.8796387Z * [new branch] gh/pearu/118/orig -> origin/gh/pearu/118/orig 2025-10-10T00:44:16.8799116Z * [new branch] gh/pearu/119/base -> origin/gh/pearu/119/base 2025-10-10T00:44:16.8800964Z * [new branch] gh/pearu/119/head -> origin/gh/pearu/119/head 2025-10-10T00:44:16.8802853Z * [new branch] gh/pearu/119/orig -> origin/gh/pearu/119/orig 2025-10-10T00:44:16.8805417Z * [new branch] gh/pearu/120/base -> origin/gh/pearu/120/base 2025-10-10T00:44:16.8808606Z * [new branch] gh/pearu/120/head -> origin/gh/pearu/120/head 2025-10-10T00:44:16.8810323Z * [new branch] gh/pearu/120/orig -> origin/gh/pearu/120/orig 2025-10-10T00:44:16.8813151Z * [new branch] gh/pearu/121/base -> origin/gh/pearu/121/base 2025-10-10T00:44:16.8814593Z * [new branch] gh/pearu/121/head -> origin/gh/pearu/121/head 2025-10-10T00:44:16.8816380Z * [new branch] gh/pearu/121/orig -> origin/gh/pearu/121/orig 2025-10-10T00:44:16.8818847Z * [new branch] gh/pearu/122/base -> origin/gh/pearu/122/base 2025-10-10T00:44:16.8820852Z * [new branch] gh/pearu/122/head -> origin/gh/pearu/122/head 2025-10-10T00:44:16.8822753Z * [new branch] gh/pearu/122/orig -> origin/gh/pearu/122/orig 2025-10-10T00:44:16.8825285Z * [new branch] gh/pearu/123/base -> origin/gh/pearu/123/base 2025-10-10T00:44:16.8827245Z * [new branch] gh/pearu/123/head -> origin/gh/pearu/123/head 2025-10-10T00:44:16.8829125Z * [new branch] gh/pearu/123/orig -> origin/gh/pearu/123/orig 2025-10-10T00:44:16.8832174Z * [new branch] gh/pearu/124/base -> origin/gh/pearu/124/base 2025-10-10T00:44:16.8834365Z * [new branch] gh/pearu/124/head -> origin/gh/pearu/124/head 2025-10-10T00:44:16.8836233Z * [new branch] gh/pearu/124/orig -> origin/gh/pearu/124/orig 2025-10-10T00:44:16.8843094Z * [new branch] gh/pearu/125/base -> 
origin/gh/pearu/125/base 2025-10-10T00:44:16.8843620Z * [new branch] gh/pearu/125/head -> origin/gh/pearu/125/head 2025-10-10T00:44:16.8843835Z * [new branch] gh/pearu/125/orig -> origin/gh/pearu/125/orig 2025-10-10T00:44:16.8844690Z * [new branch] gh/pearu/126/base -> origin/gh/pearu/126/base 2025-10-10T00:44:16.8846819Z * [new branch] gh/pearu/126/head -> origin/gh/pearu/126/head 2025-10-10T00:44:16.8848675Z * [new branch] gh/pearu/126/orig -> origin/gh/pearu/126/orig 2025-10-10T00:44:16.8851548Z * [new branch] gh/pearu/127/base -> origin/gh/pearu/127/base 2025-10-10T00:44:16.8853468Z * [new branch] gh/pearu/127/head -> origin/gh/pearu/127/head 2025-10-10T00:44:16.8855960Z * [new branch] gh/pearu/127/orig -> origin/gh/pearu/127/orig 2025-10-10T00:44:16.8858038Z * [new branch] gh/pearu/128/base -> origin/gh/pearu/128/base 2025-10-10T00:44:16.8859206Z * [new branch] gh/pearu/128/head -> origin/gh/pearu/128/head 2025-10-10T00:44:16.8860995Z * [new branch] gh/pearu/128/orig -> origin/gh/pearu/128/orig 2025-10-10T00:44:16.8863686Z * [new branch] gh/pearu/129/base -> origin/gh/pearu/129/base 2025-10-10T00:44:16.8865488Z * [new branch] gh/pearu/129/head -> origin/gh/pearu/129/head 2025-10-10T00:44:16.8867313Z * [new branch] gh/pearu/129/orig -> origin/gh/pearu/129/orig 2025-10-10T00:44:16.8869730Z * [new branch] gh/pearu/130/base -> origin/gh/pearu/130/base 2025-10-10T00:44:16.8871705Z * [new branch] gh/pearu/130/head -> origin/gh/pearu/130/head 2025-10-10T00:44:16.8873493Z * [new branch] gh/pearu/130/orig -> origin/gh/pearu/130/orig 2025-10-10T00:44:16.8875960Z * [new branch] gh/pearu/131/base -> origin/gh/pearu/131/base 2025-10-10T00:44:16.8877812Z * [new branch] gh/pearu/131/head -> origin/gh/pearu/131/head 2025-10-10T00:44:16.8879675Z * [new branch] gh/pearu/131/orig -> origin/gh/pearu/131/orig 2025-10-10T00:44:16.8882041Z * [new branch] gh/pearu/132/base -> origin/gh/pearu/132/base 2025-10-10T00:44:16.8883778Z * [new branch] gh/pearu/132/head -> 
origin/gh/pearu/132/head 2025-10-10T00:44:16.8885543Z * [new branch] gh/pearu/132/orig -> origin/gh/pearu/132/orig 2025-10-10T00:44:16.8888096Z * [new branch] gh/pearu/133/base -> origin/gh/pearu/133/base 2025-10-10T00:44:16.8889761Z * [new branch] gh/pearu/133/head -> origin/gh/pearu/133/head 2025-10-10T00:44:16.8891406Z * [new branch] gh/pearu/133/orig -> origin/gh/pearu/133/orig 2025-10-10T00:44:16.8893824Z * [new branch] gh/pearu/134/base -> origin/gh/pearu/134/base 2025-10-10T00:44:16.8895664Z * [new branch] gh/pearu/134/head -> origin/gh/pearu/134/head 2025-10-10T00:44:16.8897374Z * [new branch] gh/pearu/134/orig -> origin/gh/pearu/134/orig 2025-10-10T00:44:16.8900368Z * [new branch] gh/pearu/135/base -> origin/gh/pearu/135/base 2025-10-10T00:44:16.8902284Z * [new branch] gh/pearu/135/head -> origin/gh/pearu/135/head 2025-10-10T00:44:16.8903945Z * [new branch] gh/pearu/135/orig -> origin/gh/pearu/135/orig 2025-10-10T00:44:16.8906448Z * [new branch] gh/pearu/136/base -> origin/gh/pearu/136/base 2025-10-10T00:44:16.8908364Z * [new branch] gh/pearu/136/head -> origin/gh/pearu/136/head 2025-10-10T00:44:16.8910073Z * [new branch] gh/pearu/136/orig -> origin/gh/pearu/136/orig 2025-10-10T00:44:16.8912637Z * [new branch] gh/pearu/137/base -> origin/gh/pearu/137/base 2025-10-10T00:44:16.8914537Z * [new branch] gh/pearu/137/head -> origin/gh/pearu/137/head 2025-10-10T00:44:16.8916246Z * [new branch] gh/pearu/137/orig -> origin/gh/pearu/137/orig 2025-10-10T00:44:16.8918646Z * [new branch] gh/pearu/138/base -> origin/gh/pearu/138/base 2025-10-10T00:44:16.8920570Z * [new branch] gh/pearu/138/head -> origin/gh/pearu/138/head 2025-10-10T00:44:16.8922377Z * [new branch] gh/pearu/138/orig -> origin/gh/pearu/138/orig 2025-10-10T00:44:16.8924899Z * [new branch] gh/pearu/139/base -> origin/gh/pearu/139/base 2025-10-10T00:44:16.8926706Z * [new branch] gh/pearu/139/head -> origin/gh/pearu/139/head 2025-10-10T00:44:16.8928658Z * [new branch] gh/pearu/139/orig -> 
origin/gh/pearu/139/orig 2025-10-10T00:44:16.8931577Z * [new branch] gh/pearu/56/base -> origin/gh/pearu/56/base 2025-10-10T00:44:16.8933760Z * [new branch] gh/pearu/56/head -> origin/gh/pearu/56/head 2025-10-10T00:44:16.8935581Z * [new branch] gh/pearu/56/orig -> origin/gh/pearu/56/orig 2025-10-10T00:44:16.8938268Z * [new branch] gh/pearu/97/base -> origin/gh/pearu/97/base 2025-10-10T00:44:16.8940191Z * [new branch] gh/pearu/97/head -> origin/gh/pearu/97/head 2025-10-10T00:44:16.8941986Z * [new branch] gh/pearu/97/orig -> origin/gh/pearu/97/orig 2025-10-10T00:44:16.8945123Z * [new branch] gh/pianpwk/1/base -> origin/gh/pianpwk/1/base 2025-10-10T00:44:16.8946962Z * [new branch] gh/pianpwk/1/head -> origin/gh/pianpwk/1/head 2025-10-10T00:44:16.8948751Z * [new branch] gh/pianpwk/1/orig -> origin/gh/pianpwk/1/orig 2025-10-10T00:44:16.8951143Z * [new branch] gh/pianpwk/2/base -> origin/gh/pianpwk/2/base 2025-10-10T00:44:16.8953035Z * [new branch] gh/pianpwk/2/head -> origin/gh/pianpwk/2/head 2025-10-10T00:44:16.8954813Z * [new branch] gh/pianpwk/2/orig -> origin/gh/pianpwk/2/orig 2025-10-10T00:44:16.8957782Z * [new branch] gh/pianpwk/3/base -> origin/gh/pianpwk/3/base 2025-10-10T00:44:16.8959648Z * [new branch] gh/pianpwk/3/head -> origin/gh/pianpwk/3/head 2025-10-10T00:44:16.8961358Z * [new branch] gh/pianpwk/3/orig -> origin/gh/pianpwk/3/orig 2025-10-10T00:44:16.8963806Z * [new branch] gh/pianpwk/4/base -> origin/gh/pianpwk/4/base 2025-10-10T00:44:16.8965594Z * [new branch] gh/pianpwk/4/head -> origin/gh/pianpwk/4/head 2025-10-10T00:44:16.8967489Z * [new branch] gh/pianpwk/4/orig -> origin/gh/pianpwk/4/orig 2025-10-10T00:44:16.8970533Z * [new branch] gh/pianpwk/5/base -> origin/gh/pianpwk/5/base 2025-10-10T00:44:16.8972354Z * [new branch] gh/pianpwk/5/head -> origin/gh/pianpwk/5/head 2025-10-10T00:44:16.8974134Z * [new branch] gh/pianpwk/5/orig -> origin/gh/pianpwk/5/orig 2025-10-10T00:44:16.8976556Z * [new branch] gh/pianpwk/6/base -> origin/gh/pianpwk/6/base 
2025-10-10T00:44:16.8978380Z * [new branch] gh/pianpwk/6/head -> origin/gh/pianpwk/6/head 2025-10-10T00:44:16.8980072Z * [new branch] gh/pianpwk/6/orig -> origin/gh/pianpwk/6/orig 2025-10-10T00:44:16.8982513Z * [new branch] gh/pianpwk/7/base -> origin/gh/pianpwk/7/base 2025-10-10T00:44:16.8984297Z * [new branch] gh/pianpwk/7/head -> origin/gh/pianpwk/7/head 2025-10-10T00:44:16.8986152Z * [new branch] gh/pianpwk/7/orig -> origin/gh/pianpwk/7/orig 2025-10-10T00:44:16.8988682Z * [new branch] gh/pianpwk/8/base -> origin/gh/pianpwk/8/base 2025-10-10T00:44:16.8990768Z * [new branch] gh/pianpwk/8/head -> origin/gh/pianpwk/8/head 2025-10-10T00:44:16.8993573Z * [new branch] gh/pianpwk/8/orig -> origin/gh/pianpwk/8/orig 2025-10-10T00:44:16.8996188Z * [new branch] gh/raymo/refresh-script -> origin/gh/raymo/refresh-script 2025-10-10T00:44:16.8999326Z * [new branch] gh/rec/141/base -> origin/gh/rec/141/base 2025-10-10T00:44:16.9003484Z * [new branch] gh/rec/141/head -> origin/gh/rec/141/head 2025-10-10T00:44:16.9005996Z * [new branch] gh/rec/153/base -> origin/gh/rec/153/base 2025-10-10T00:44:16.9007803Z * [new branch] gh/rec/153/head -> origin/gh/rec/153/head 2025-10-10T00:44:16.9009625Z * [new branch] gh/rec/153/orig -> origin/gh/rec/153/orig 2025-10-10T00:44:16.9012078Z * [new branch] gh/rec/154/base -> origin/gh/rec/154/base 2025-10-10T00:44:16.9014045Z * [new branch] gh/rec/154/head -> origin/gh/rec/154/head 2025-10-10T00:44:16.9015658Z * [new branch] gh/rec/154/orig -> origin/gh/rec/154/orig 2025-10-10T00:44:16.9018084Z * [new branch] gh/rec/162/base -> origin/gh/rec/162/base 2025-10-10T00:44:16.9019856Z * [new branch] gh/rec/162/head -> origin/gh/rec/162/head 2025-10-10T00:44:16.9021759Z * [new branch] gh/rec/162/orig -> origin/gh/rec/162/orig 2025-10-10T00:44:16.9024134Z * [new branch] gh/rec/164/base -> origin/gh/rec/164/base 2025-10-10T00:44:16.9025954Z * [new branch] gh/rec/164/head -> origin/gh/rec/164/head 2025-10-10T00:44:16.9028153Z * [new branch] gh/rec/164/orig 
-> origin/gh/rec/164/orig 2025-10-10T00:44:16.9030594Z * [new branch] gh/rec/166/base -> origin/gh/rec/166/base 2025-10-10T00:44:16.9032778Z * [new branch] gh/rec/166/head -> origin/gh/rec/166/head 2025-10-10T00:44:16.9034676Z * [new branch] gh/rec/166/orig -> origin/gh/rec/166/orig 2025-10-10T00:44:16.9037776Z * [new branch] gh/robert-hardwick/1/base -> origin/gh/robert-hardwick/1/base 2025-10-10T00:44:16.9039739Z * [new branch] gh/robert-hardwick/1/head -> origin/gh/robert-hardwick/1/head 2025-10-10T00:44:16.9041418Z * [new branch] gh/robert-hardwick/1/orig -> origin/gh/robert-hardwick/1/orig 2025-10-10T00:44:16.9044150Z * [new branch] gh/robert-hardwick/2/base -> origin/gh/robert-hardwick/2/base 2025-10-10T00:44:16.9046262Z * [new branch] gh/robert-hardwick/2/head -> origin/gh/robert-hardwick/2/head 2025-10-10T00:44:16.9048453Z * [new branch] gh/robert-hardwick/2/orig -> origin/gh/robert-hardwick/2/orig 2025-10-10T00:44:16.9050957Z * [new branch] gh/robert-hardwick/3/base -> origin/gh/robert-hardwick/3/base 2025-10-10T00:44:16.9052790Z * [new branch] gh/robert-hardwick/3/head -> origin/gh/robert-hardwick/3/head 2025-10-10T00:44:16.9054534Z * [new branch] gh/robert-hardwick/3/orig -> origin/gh/robert-hardwick/3/orig 2025-10-10T00:44:16.9057564Z * [new branch] gh/robert-hardwick/4/base -> origin/gh/robert-hardwick/4/base 2025-10-10T00:44:16.9059556Z * [new branch] gh/robert-hardwick/4/head -> origin/gh/robert-hardwick/4/head 2025-10-10T00:44:16.9061396Z * [new branch] gh/robert-hardwick/4/orig -> origin/gh/robert-hardwick/4/orig 2025-10-10T00:44:16.9064440Z * [new branch] gh/rtimpe/1/base -> origin/gh/rtimpe/1/base 2025-10-10T00:44:16.9066761Z * [new branch] gh/rtimpe/1/head -> origin/gh/rtimpe/1/head 2025-10-10T00:44:16.9069447Z * [new branch] gh/rtimpe/11/base -> origin/gh/rtimpe/11/base 2025-10-10T00:44:16.9071424Z * [new branch] gh/rtimpe/11/head -> origin/gh/rtimpe/11/head 2025-10-10T00:44:16.9073246Z * [new branch] gh/rtimpe/11/orig -> 
origin/gh/rtimpe/11/orig 2025-10-10T00:44:16.9075777Z * [new branch] gh/rtimpe/15/base -> origin/gh/rtimpe/15/base 2025-10-10T00:44:16.9077559Z * [new branch] gh/rtimpe/15/head -> origin/gh/rtimpe/15/head 2025-10-10T00:44:16.9079417Z * [new branch] gh/rtimpe/15/orig -> origin/gh/rtimpe/15/orig 2025-10-10T00:44:16.9081889Z * [new branch] gh/rtimpe/16/base -> origin/gh/rtimpe/16/base 2025-10-10T00:44:16.9083720Z * [new branch] gh/rtimpe/16/head -> origin/gh/rtimpe/16/head 2025-10-10T00:44:16.9085698Z * [new branch] gh/rtimpe/16/orig -> origin/gh/rtimpe/16/orig 2025-10-10T00:44:16.9088498Z * [new branch] gh/rtimpe/17/base -> origin/gh/rtimpe/17/base 2025-10-10T00:44:16.9090509Z * [new branch] gh/rtimpe/17/head -> origin/gh/rtimpe/17/head 2025-10-10T00:44:16.9092280Z * [new branch] gh/rtimpe/17/orig -> origin/gh/rtimpe/17/orig 2025-10-10T00:44:16.9095255Z * [new branch] gh/rtimpe/18/base -> origin/gh/rtimpe/18/base 2025-10-10T00:44:16.9097243Z * [new branch] gh/rtimpe/18/head -> origin/gh/rtimpe/18/head 2025-10-10T00:44:16.9099405Z * [new branch] gh/rtimpe/18/orig -> origin/gh/rtimpe/18/orig 2025-10-10T00:44:16.9101734Z * [new branch] gh/rtimpe/2/base -> origin/gh/rtimpe/2/base 2025-10-10T00:44:16.9103747Z * [new branch] gh/rtimpe/2/head -> origin/gh/rtimpe/2/head 2025-10-10T00:44:16.9105981Z * [new branch] gh/rtimpe/3/base -> origin/gh/rtimpe/3/base 2025-10-10T00:44:16.9107830Z * [new branch] gh/rtimpe/3/head -> origin/gh/rtimpe/3/head 2025-10-10T00:44:16.9110369Z * [new branch] gh/rtimpe/4/base -> origin/gh/rtimpe/4/base 2025-10-10T00:44:16.9112094Z * [new branch] gh/rtimpe/4/head -> origin/gh/rtimpe/4/head 2025-10-10T00:44:16.9115254Z * [new branch] gh/ruisizhang123/1/base -> origin/gh/ruisizhang123/1/base 2025-10-10T00:44:16.9117221Z * [new branch] gh/ruisizhang123/1/head -> origin/gh/ruisizhang123/1/head 2025-10-10T00:44:16.9119072Z * [new branch] gh/ruisizhang123/1/orig -> origin/gh/ruisizhang123/1/orig 2025-10-10T00:44:16.9122232Z * [new branch] 
gh/ruisizhang123/4/base -> origin/gh/ruisizhang123/4/base 2025-10-10T00:44:16.9124057Z * [new branch] gh/ruisizhang123/4/head -> origin/gh/ruisizhang123/4/head 2025-10-10T00:44:16.9125956Z * [new branch] gh/ruisizhang123/4/orig -> origin/gh/ruisizhang123/4/orig 2025-10-10T00:44:16.9128790Z * [new branch] gh/ruisizhang123/5/base -> origin/gh/ruisizhang123/5/base 2025-10-10T00:44:16.9130635Z * [new branch] gh/ruisizhang123/5/head -> origin/gh/ruisizhang123/5/head 2025-10-10T00:44:16.9132362Z * [new branch] gh/ruisizhang123/5/orig -> origin/gh/ruisizhang123/5/orig 2025-10-10T00:44:16.9134920Z * [new branch] gh/ruisizhang123/6/base -> origin/gh/ruisizhang123/6/base 2025-10-10T00:44:16.9136860Z * [new branch] gh/ruisizhang123/6/head -> origin/gh/ruisizhang123/6/head 2025-10-10T00:44:16.9138639Z * [new branch] gh/ruisizhang123/6/orig -> origin/gh/ruisizhang123/6/orig 2025-10-10T00:44:16.9141170Z * [new branch] gh/ruisizhang123/7/base -> origin/gh/ruisizhang123/7/base 2025-10-10T00:44:16.9143138Z * [new branch] gh/ruisizhang123/7/head -> origin/gh/ruisizhang123/7/head 2025-10-10T00:44:16.9144899Z * [new branch] gh/ruisizhang123/7/orig -> origin/gh/ruisizhang123/7/orig 2025-10-10T00:44:16.9148017Z * [new branch] gh/ruisizhang123/8/base -> origin/gh/ruisizhang123/8/base 2025-10-10T00:44:16.9150083Z * [new branch] gh/ruisizhang123/8/head -> origin/gh/ruisizhang123/8/head 2025-10-10T00:44:16.9152010Z * [new branch] gh/ruisizhang123/8/orig -> origin/gh/ruisizhang123/8/orig 2025-10-10T00:44:16.9154455Z * [new branch] gh/ruisizhang123/9/base -> origin/gh/ruisizhang123/9/base 2025-10-10T00:44:16.9156403Z * [new branch] gh/ruisizhang123/9/head -> origin/gh/ruisizhang123/9/head 2025-10-10T00:44:16.9158263Z * [new branch] gh/ruisizhang123/9/orig -> origin/gh/ruisizhang123/9/orig 2025-10-10T00:44:16.9161242Z * [new branch] gh/sarckk/2/base -> origin/gh/sarckk/2/base 2025-10-10T00:44:16.9163089Z * [new branch] gh/sarckk/2/head -> origin/gh/sarckk/2/head 2025-10-10T00:44:16.9164936Z * 
[new branch] gh/sarckk/2/orig -> origin/gh/sarckk/2/orig 2025-10-10T00:44:16.9168390Z * [new branch] gh/seemethere/35/base -> origin/gh/seemethere/35/base 2025-10-10T00:44:16.9170271Z * [new branch] gh/seemethere/35/head -> origin/gh/seemethere/35/head 2025-10-10T00:44:16.9172070Z * [new branch] gh/seemethere/35/orig -> origin/gh/seemethere/35/orig 2025-10-10T00:44:16.9175078Z * [new branch] gh/seemethere/37/base -> origin/gh/seemethere/37/base 2025-10-10T00:44:16.9177038Z * [new branch] gh/seemethere/37/head -> origin/gh/seemethere/37/head 2025-10-10T00:44:16.9179379Z * [new branch] gh/seemethere/37/orig -> origin/gh/seemethere/37/orig 2025-10-10T00:44:16.9182004Z * [new branch] gh/seemethere/43/base -> origin/gh/seemethere/43/base 2025-10-10T00:44:16.9183872Z * [new branch] gh/seemethere/43/head -> origin/gh/seemethere/43/head 2025-10-10T00:44:16.9185713Z * [new branch] gh/seemethere/43/orig -> origin/gh/seemethere/43/orig 2025-10-10T00:44:16.9188153Z * [new branch] gh/seemethere/44/base -> origin/gh/seemethere/44/base 2025-10-10T00:44:16.9190148Z * [new branch] gh/seemethere/44/head -> origin/gh/seemethere/44/head 2025-10-10T00:44:16.9191981Z * [new branch] gh/seemethere/44/orig -> origin/gh/seemethere/44/orig 2025-10-10T00:44:16.9194421Z * [new branch] gh/seemethere/48/base -> origin/gh/seemethere/48/base 2025-10-10T00:44:16.9196491Z * [new branch] gh/seemethere/48/head -> origin/gh/seemethere/48/head 2025-10-10T00:44:16.9198210Z * [new branch] gh/seemethere/48/orig -> origin/gh/seemethere/48/orig 2025-10-10T00:44:16.9201177Z * [new branch] gh/seemethere/49/base -> origin/gh/seemethere/49/base 2025-10-10T00:44:16.9202950Z * [new branch] gh/seemethere/49/head -> origin/gh/seemethere/49/head 2025-10-10T00:44:16.9204810Z * [new branch] gh/seemethere/49/orig -> origin/gh/seemethere/49/orig 2025-10-10T00:44:16.9207356Z * [new branch] gh/seemethere/52/base -> origin/gh/seemethere/52/base 2025-10-10T00:44:16.9209413Z * [new branch] gh/seemethere/52/head -> 
origin/gh/seemethere/52/head 2025-10-10T00:44:16.9211385Z * [new branch] gh/seemethere/52/orig -> origin/gh/seemethere/52/orig 2025-10-10T00:44:16.9213808Z * [new branch] gh/seemethere/53/base -> origin/gh/seemethere/53/base 2025-10-10T00:44:16.9215778Z * [new branch] gh/seemethere/53/head -> origin/gh/seemethere/53/head 2025-10-10T00:44:16.9217652Z * [new branch] gh/seemethere/53/orig -> origin/gh/seemethere/53/orig 2025-10-10T00:44:16.9220171Z * [new branch] gh/seemethere/54/base -> origin/gh/seemethere/54/base 2025-10-10T00:44:16.9222217Z * [new branch] gh/seemethere/54/head -> origin/gh/seemethere/54/head 2025-10-10T00:44:16.9224152Z * [new branch] gh/seemethere/54/orig -> origin/gh/seemethere/54/orig 2025-10-10T00:44:16.9226507Z * [new branch] gh/seemethere/55/base -> origin/gh/seemethere/55/base 2025-10-10T00:44:16.9228905Z * [new branch] gh/seemethere/55/head -> origin/gh/seemethere/55/head 2025-10-10T00:44:16.9230874Z * [new branch] gh/seemethere/55/orig -> origin/gh/seemethere/55/orig 2025-10-10T00:44:16.9233216Z * [new branch] gh/seemethere/59/base -> origin/gh/seemethere/59/base 2025-10-10T00:44:16.9235204Z * [new branch] gh/seemethere/59/head -> origin/gh/seemethere/59/head 2025-10-10T00:44:16.9237020Z * [new branch] gh/seemethere/59/orig -> origin/gh/seemethere/59/orig 2025-10-10T00:44:16.9239424Z * [new branch] gh/seemethere/62/base -> origin/gh/seemethere/62/base 2025-10-10T00:44:16.9241263Z * [new branch] gh/seemethere/62/head -> origin/gh/seemethere/62/head 2025-10-10T00:44:16.9243239Z * [new branch] gh/seemethere/62/orig -> origin/gh/seemethere/62/orig 2025-10-10T00:44:16.9245756Z * [new branch] gh/seemethere/63/base -> origin/gh/seemethere/63/base 2025-10-10T00:44:16.9248010Z * [new branch] gh/seemethere/63/head -> origin/gh/seemethere/63/head 2025-10-10T00:44:16.9249827Z * [new branch] gh/seemethere/63/orig -> origin/gh/seemethere/63/orig 2025-10-10T00:44:16.9252414Z * [new branch] gh/seemethere/64/base -> origin/gh/seemethere/64/base 
2025-10-10T00:44:16.9254276Z * [new branch] gh/seemethere/64/head -> origin/gh/seemethere/64/head 2025-10-10T00:44:16.9256263Z * [new branch] gh/seemethere/64/orig -> origin/gh/seemethere/64/orig 2025-10-10T00:44:16.9258870Z * [new branch] gh/seemethere/65/base -> origin/gh/seemethere/65/base 2025-10-10T00:44:16.9260787Z * [new branch] gh/seemethere/65/head -> origin/gh/seemethere/65/head 2025-10-10T00:44:16.9262591Z * [new branch] gh/seemethere/65/orig -> origin/gh/seemethere/65/orig 2025-10-10T00:44:16.9265135Z * [new branch] gh/seemethere/66/base -> origin/gh/seemethere/66/base 2025-10-10T00:44:16.9266972Z * [new branch] gh/seemethere/66/head -> origin/gh/seemethere/66/head 2025-10-10T00:44:16.9268860Z * [new branch] gh/seemethere/66/orig -> origin/gh/seemethere/66/orig 2025-10-10T00:44:16.9271457Z * [new branch] gh/seemethere/67/base -> origin/gh/seemethere/67/base 2025-10-10T00:44:16.9273363Z * [new branch] gh/seemethere/67/head -> origin/gh/seemethere/67/head 2025-10-10T00:44:16.9275280Z * [new branch] gh/seemethere/67/orig -> origin/gh/seemethere/67/orig 2025-10-10T00:44:16.9277777Z * [new branch] gh/seemethere/68/base -> origin/gh/seemethere/68/base 2025-10-10T00:44:16.9279687Z * [new branch] gh/seemethere/68/head -> origin/gh/seemethere/68/head 2025-10-10T00:44:16.9281952Z * [new branch] gh/seemethere/68/orig -> origin/gh/seemethere/68/orig 2025-10-10T00:44:16.9285003Z * [new branch] gh/seemethere/69/base -> origin/gh/seemethere/69/base 2025-10-10T00:44:16.9286862Z * [new branch] gh/seemethere/69/head -> origin/gh/seemethere/69/head 2025-10-10T00:44:16.9289115Z * [new branch] gh/seemethere/69/orig -> origin/gh/seemethere/69/orig 2025-10-10T00:44:16.9291713Z * [new branch] gh/seemethere/70/base -> origin/gh/seemethere/70/base 2025-10-10T00:44:16.9293508Z * [new branch] gh/seemethere/70/head -> origin/gh/seemethere/70/head 2025-10-10T00:44:16.9295442Z * [new branch] gh/seemethere/70/orig -> origin/gh/seemethere/70/orig 2025-10-10T00:44:16.9297993Z * [new 
branch] gh/seemethere/71/base -> origin/gh/seemethere/71/base 2025-10-10T00:44:16.9300363Z * [new branch] gh/seemethere/71/head -> origin/gh/seemethere/71/head 2025-10-10T00:44:16.9302061Z * [new branch] gh/seemethere/71/orig -> origin/gh/seemethere/71/orig 2025-10-10T00:44:16.9305258Z * [new branch] gh/shunting314/145/base -> origin/gh/shunting314/145/base 2025-10-10T00:44:16.9307356Z * [new branch] gh/shunting314/145/head -> origin/gh/shunting314/145/head 2025-10-10T00:44:16.9312571Z * [new branch] gh/shunting314/145/orig -> origin/gh/shunting314/145/orig 2025-10-10T00:44:16.9315588Z * [new branch] gh/shunting314/176/base -> origin/gh/shunting314/176/base 2025-10-10T00:44:16.9317712Z * [new branch] gh/shunting314/176/head -> origin/gh/shunting314/176/head 2025-10-10T00:44:16.9319521Z * [new branch] gh/shunting314/176/orig -> origin/gh/shunting314/176/orig 2025-10-10T00:44:16.9322169Z * [new branch] gh/shunting314/211/base -> origin/gh/shunting314/211/base 2025-10-10T00:44:16.9323931Z * [new branch] gh/shunting314/211/head -> origin/gh/shunting314/211/head 2025-10-10T00:44:16.9325754Z * [new branch] gh/shunting314/211/orig -> origin/gh/shunting314/211/orig 2025-10-10T00:44:16.9328927Z * [new branch] gh/shunting314/212/base -> origin/gh/shunting314/212/base 2025-10-10T00:44:16.9330648Z * [new branch] gh/shunting314/212/head -> origin/gh/shunting314/212/head 2025-10-10T00:44:16.9332493Z * [new branch] gh/shunting314/212/orig -> origin/gh/shunting314/212/orig 2025-10-10T00:44:16.9335459Z * [new branch] gh/shunting314/213/base -> origin/gh/shunting314/213/base 2025-10-10T00:44:16.9337453Z * [new branch] gh/shunting314/213/head -> origin/gh/shunting314/213/head 2025-10-10T00:44:16.9339204Z * [new branch] gh/shunting314/213/orig -> origin/gh/shunting314/213/orig 2025-10-10T00:44:16.9341856Z * [new branch] gh/shunting314/215/base -> origin/gh/shunting314/215/base 2025-10-10T00:44:16.9343704Z * [new branch] gh/shunting314/215/head -> origin/gh/shunting314/215/head 
2025-10-10T00:44:16.9345418Z * [new branch] gh/shunting314/215/orig -> origin/gh/shunting314/215/orig 2025-10-10T00:44:16.9348140Z * [new branch] gh/shunting314/216/base -> origin/gh/shunting314/216/base 2025-10-10T00:44:16.9350171Z * [new branch] gh/shunting314/216/head -> origin/gh/shunting314/216/head 2025-10-10T00:44:16.9352094Z * [new branch] gh/shunting314/216/orig -> origin/gh/shunting314/216/orig 2025-10-10T00:44:16.9354652Z * [new branch] gh/shunting314/217/base -> origin/gh/shunting314/217/base 2025-10-10T00:44:16.9356474Z * [new branch] gh/shunting314/217/head -> origin/gh/shunting314/217/head 2025-10-10T00:44:16.9358373Z * [new branch] gh/shunting314/217/orig -> origin/gh/shunting314/217/orig 2025-10-10T00:44:16.9360968Z * [new branch] gh/shunting314/218/base -> origin/gh/shunting314/218/base 2025-10-10T00:44:16.9362850Z * [new branch] gh/shunting314/218/head -> origin/gh/shunting314/218/head 2025-10-10T00:44:16.9364859Z * [new branch] gh/shunting314/218/orig -> origin/gh/shunting314/218/orig 2025-10-10T00:44:16.9367668Z * [new branch] gh/shunting314/219/base -> origin/gh/shunting314/219/base 2025-10-10T00:44:16.9369631Z * [new branch] gh/shunting314/219/head -> origin/gh/shunting314/219/head 2025-10-10T00:44:16.9371518Z * [new branch] gh/shunting314/219/orig -> origin/gh/shunting314/219/orig 2025-10-10T00:44:16.9374052Z * [new branch] gh/shunting314/223/base -> origin/gh/shunting314/223/base 2025-10-10T00:44:16.9376067Z * [new branch] gh/shunting314/223/head -> origin/gh/shunting314/223/head 2025-10-10T00:44:16.9378391Z * [new branch] gh/shunting314/223/orig -> origin/gh/shunting314/223/orig 2025-10-10T00:44:16.9381666Z * [new branch] gh/shunting314/224/base -> origin/gh/shunting314/224/base 2025-10-10T00:44:16.9383647Z * [new branch] gh/shunting314/224/head -> origin/gh/shunting314/224/head 2025-10-10T00:44:16.9385544Z * [new branch] gh/shunting314/224/orig -> origin/gh/shunting314/224/orig 2025-10-10T00:44:16.9388067Z * [new branch] 
gh/shunting314/225/base -> origin/gh/shunting314/225/base 2025-10-10T00:44:16.9389978Z * [new branch] gh/shunting314/225/head -> origin/gh/shunting314/225/head 2025-10-10T00:44:16.9391820Z * [new branch] gh/shunting314/225/orig -> origin/gh/shunting314/225/orig 2025-10-10T00:44:16.9394554Z * [new branch] gh/shunting314/226/base -> origin/gh/shunting314/226/base 2025-10-10T00:44:16.9396528Z * [new branch] gh/shunting314/226/head -> origin/gh/shunting314/226/head 2025-10-10T00:44:16.9398283Z * [new branch] gh/shunting314/226/orig -> origin/gh/shunting314/226/orig 2025-10-10T00:44:16.9403291Z * [new branch] gh/shunting314/227/base -> origin/gh/shunting314/227/base 2025-10-10T00:44:16.9405096Z * [new branch] gh/shunting314/227/head -> origin/gh/shunting314/227/head 2025-10-10T00:44:16.9407173Z * [new branch] gh/shunting314/227/orig -> origin/gh/shunting314/227/orig 2025-10-10T00:44:16.9410013Z * [new branch] gh/shunting314/228/base -> origin/gh/shunting314/228/base 2025-10-10T00:44:16.9411756Z * [new branch] gh/shunting314/228/head -> origin/gh/shunting314/228/head 2025-10-10T00:44:16.9413667Z * [new branch] gh/shunting314/228/orig -> origin/gh/shunting314/228/orig 2025-10-10T00:44:16.9416703Z * [new branch] gh/shunting314/229/base -> origin/gh/shunting314/229/base 2025-10-10T00:44:16.9418665Z * [new branch] gh/shunting314/229/head -> origin/gh/shunting314/229/head 2025-10-10T00:44:16.9420487Z * [new branch] gh/shunting314/229/orig -> origin/gh/shunting314/229/orig 2025-10-10T00:44:16.9423094Z * [new branch] gh/shunting314/230/base -> origin/gh/shunting314/230/base 2025-10-10T00:44:16.9425063Z * [new branch] gh/shunting314/230/head -> origin/gh/shunting314/230/head 2025-10-10T00:44:16.9426907Z * [new branch] gh/shunting314/230/orig -> origin/gh/shunting314/230/orig 2025-10-10T00:44:16.9429976Z * [new branch] gh/shunting314/231/base -> origin/gh/shunting314/231/base 2025-10-10T00:44:16.9432069Z * [new branch] gh/shunting314/231/head -> origin/gh/shunting314/231/head 
2025-10-10T00:44:16.9433876Z * [new branch] gh/shunting314/231/orig -> origin/gh/shunting314/231/orig 2025-10-10T00:44:16.9436559Z * [new branch] gh/shunting314/232/base -> origin/gh/shunting314/232/base 2025-10-10T00:44:16.9438485Z * [new branch] gh/shunting314/232/head -> origin/gh/shunting314/232/head 2025-10-10T00:44:16.9440204Z * [new branch] gh/shunting314/232/orig -> origin/gh/shunting314/232/orig 2025-10-10T00:44:16.9442672Z * [new branch] gh/shunting314/233/base -> origin/gh/shunting314/233/base 2025-10-10T00:44:16.9444511Z * [new branch] gh/shunting314/233/head -> origin/gh/shunting314/233/head 2025-10-10T00:44:16.9446349Z * [new branch] gh/shunting314/233/orig -> origin/gh/shunting314/233/orig 2025-10-10T00:44:16.9449057Z * [new branch] gh/shunting314/234/base -> origin/gh/shunting314/234/base 2025-10-10T00:44:16.9451245Z * [new branch] gh/shunting314/234/head -> origin/gh/shunting314/234/head 2025-10-10T00:44:16.9453092Z * [new branch] gh/shunting314/234/orig -> origin/gh/shunting314/234/orig 2025-10-10T00:44:16.9455744Z * [new branch] gh/shunting314/235/base -> origin/gh/shunting314/235/base 2025-10-10T00:44:16.9458146Z * [new branch] gh/shunting314/235/head -> origin/gh/shunting314/235/head 2025-10-10T00:44:16.9459923Z * [new branch] gh/shunting314/235/orig -> origin/gh/shunting314/235/orig 2025-10-10T00:44:16.9463026Z * [new branch] gh/silverguo/1/base -> origin/gh/silverguo/1/base 2025-10-10T00:44:16.9464954Z * [new branch] gh/silverguo/1/head -> origin/gh/silverguo/1/head 2025-10-10T00:44:16.9467344Z * [new branch] gh/silverguo/2/base -> origin/gh/silverguo/2/base 2025-10-10T00:44:16.9469069Z * [new branch] gh/silverguo/2/head -> origin/gh/silverguo/2/head 2025-10-10T00:44:16.9471458Z * [new branch] gh/silverguo/3/base -> origin/gh/silverguo/3/base 2025-10-10T00:44:16.9473411Z * [new branch] gh/silverguo/3/head -> origin/gh/silverguo/3/head 2025-10-10T00:44:16.9475904Z * [new branch] gh/silverguo/4/base -> origin/gh/silverguo/4/base 
2025-10-10T00:44:16.9477694Z * [new branch] gh/silverguo/4/head -> origin/gh/silverguo/4/head 2025-10-10T00:44:16.9480759Z * [new branch] gh/sinhaanhsul/1/base -> origin/gh/sinhaanhsul/1/base 2025-10-10T00:44:16.9482569Z * [new branch] gh/sinhaanhsul/1/head -> origin/gh/sinhaanhsul/1/head 2025-10-10T00:44:16.9485803Z * [new branch] gh/slayton58/1/base -> origin/gh/slayton58/1/base 2025-10-10T00:44:16.9487858Z * [new branch] gh/slayton58/1/head -> origin/gh/slayton58/1/head 2025-10-10T00:44:16.9489818Z * [new branch] gh/slayton58/1/orig -> origin/gh/slayton58/1/orig 2025-10-10T00:44:16.9492498Z * [new branch] gh/slayton58/10/base -> origin/gh/slayton58/10/base 2025-10-10T00:44:16.9494275Z * [new branch] gh/slayton58/10/head -> origin/gh/slayton58/10/head 2025-10-10T00:44:16.9496186Z * [new branch] gh/slayton58/10/orig -> origin/gh/slayton58/10/orig 2025-10-10T00:44:16.9499062Z * [new branch] gh/slayton58/11/base -> origin/gh/slayton58/11/base 2025-10-10T00:44:16.9500884Z * [new branch] gh/slayton58/11/head -> origin/gh/slayton58/11/head 2025-10-10T00:44:16.9502533Z * [new branch] gh/slayton58/11/orig -> origin/gh/slayton58/11/orig 2025-10-10T00:44:16.9504944Z * [new branch] gh/slayton58/12/base -> origin/gh/slayton58/12/base 2025-10-10T00:44:16.9506888Z * [new branch] gh/slayton58/12/head -> origin/gh/slayton58/12/head 2025-10-10T00:44:16.9508920Z * [new branch] gh/slayton58/12/orig -> origin/gh/slayton58/12/orig 2025-10-10T00:44:16.9511429Z * [new branch] gh/slayton58/13/base -> origin/gh/slayton58/13/base 2025-10-10T00:44:16.9513401Z * [new branch] gh/slayton58/13/head -> origin/gh/slayton58/13/head 2025-10-10T00:44:16.9515314Z * [new branch] gh/slayton58/13/orig -> origin/gh/slayton58/13/orig 2025-10-10T00:44:16.9517937Z * [new branch] gh/slayton58/14/base -> origin/gh/slayton58/14/base 2025-10-10T00:44:16.9519671Z * [new branch] gh/slayton58/14/head -> origin/gh/slayton58/14/head 2025-10-10T00:44:16.9521489Z * [new branch] gh/slayton58/14/orig -> 
origin/gh/slayton58/14/orig 2025-10-10T00:44:16.9524010Z * [new branch] gh/slayton58/15/base -> origin/gh/slayton58/15/base 2025-10-10T00:44:16.9525835Z * [new branch] gh/slayton58/15/head -> origin/gh/slayton58/15/head 2025-10-10T00:44:16.9527892Z * [new branch] gh/slayton58/15/orig -> origin/gh/slayton58/15/orig 2025-10-10T00:44:16.9530592Z * [new branch] gh/slayton58/16/base -> origin/gh/slayton58/16/base 2025-10-10T00:44:16.9532484Z * [new branch] gh/slayton58/16/head -> origin/gh/slayton58/16/head 2025-10-10T00:44:16.9534421Z * [new branch] gh/slayton58/16/orig -> origin/gh/slayton58/16/orig 2025-10-10T00:44:16.9537038Z * [new branch] gh/slayton58/17/base -> origin/gh/slayton58/17/base 2025-10-10T00:44:16.9538870Z * [new branch] gh/slayton58/17/head -> origin/gh/slayton58/17/head 2025-10-10T00:44:16.9540767Z * [new branch] gh/slayton58/17/orig -> origin/gh/slayton58/17/orig 2025-10-10T00:44:16.9543183Z * [new branch] gh/slayton58/18/base -> origin/gh/slayton58/18/base 2025-10-10T00:44:16.9545023Z * [new branch] gh/slayton58/18/head -> origin/gh/slayton58/18/head 2025-10-10T00:44:16.9547714Z * [new branch] gh/slayton58/19/base -> origin/gh/slayton58/19/base 2025-10-10T00:44:16.9549553Z * [new branch] gh/slayton58/19/head -> origin/gh/slayton58/19/head 2025-10-10T00:44:16.9551497Z * [new branch] gh/slayton58/19/orig -> origin/gh/slayton58/19/orig 2025-10-10T00:44:16.9554719Z * [new branch] gh/slayton58/2/base -> origin/gh/slayton58/2/base 2025-10-10T00:44:16.9556516Z * [new branch] gh/slayton58/2/head -> origin/gh/slayton58/2/head 2025-10-10T00:44:16.9558524Z * [new branch] gh/slayton58/2/orig -> origin/gh/slayton58/2/orig 2025-10-10T00:44:16.9561097Z * [new branch] gh/slayton58/20/base -> origin/gh/slayton58/20/base 2025-10-10T00:44:16.9562931Z * [new branch] gh/slayton58/20/head -> origin/gh/slayton58/20/head 2025-10-10T00:44:16.9564798Z * [new branch] gh/slayton58/20/orig -> origin/gh/slayton58/20/orig 2025-10-10T00:44:16.9567645Z * [new branch] 
gh/slayton58/21/base -> origin/gh/slayton58/21/base 2025-10-10T00:44:16.9569502Z * [new branch] gh/slayton58/21/head -> origin/gh/slayton58/21/head 2025-10-10T00:44:16.9571353Z * [new branch] gh/slayton58/21/orig -> origin/gh/slayton58/21/orig 2025-10-10T00:44:16.9573803Z * [new branch] gh/slayton58/22/base -> origin/gh/slayton58/22/base 2025-10-10T00:44:16.9575759Z * [new branch] gh/slayton58/22/head -> origin/gh/slayton58/22/head 2025-10-10T00:44:16.9577702Z * [new branch] gh/slayton58/22/orig -> origin/gh/slayton58/22/orig 2025-10-10T00:44:16.9580113Z * [new branch] gh/slayton58/23/base -> origin/gh/slayton58/23/base 2025-10-10T00:44:16.9582025Z * [new branch] gh/slayton58/23/head -> origin/gh/slayton58/23/head 2025-10-10T00:44:16.9583933Z * [new branch] gh/slayton58/23/orig -> origin/gh/slayton58/23/orig 2025-10-10T00:44:16.9586529Z * [new branch] gh/slayton58/24/base -> origin/gh/slayton58/24/base 2025-10-10T00:44:16.9588380Z * [new branch] gh/slayton58/24/head -> origin/gh/slayton58/24/head 2025-10-10T00:44:16.9590263Z * [new branch] gh/slayton58/24/orig -> origin/gh/slayton58/24/orig 2025-10-10T00:44:16.9592785Z * [new branch] gh/slayton58/25/base -> origin/gh/slayton58/25/base 2025-10-10T00:44:16.9594686Z * [new branch] gh/slayton58/25/head -> origin/gh/slayton58/25/head 2025-10-10T00:44:16.9596596Z * [new branch] gh/slayton58/25/orig -> origin/gh/slayton58/25/orig 2025-10-10T00:44:16.9602525Z * [new branch] gh/slayton58/26/base -> origin/gh/slayton58/26/base 2025-10-10T00:44:16.9604300Z * [new branch] gh/slayton58/26/head -> origin/gh/slayton58/26/head 2025-10-10T00:44:16.9606103Z * [new branch] gh/slayton58/26/orig -> origin/gh/slayton58/26/orig 2025-10-10T00:44:16.9608867Z * [new branch] gh/slayton58/3/base -> origin/gh/slayton58/3/base 2025-10-10T00:44:16.9610780Z * [new branch] gh/slayton58/3/head -> origin/gh/slayton58/3/head 2025-10-10T00:44:16.9612693Z * [new branch] gh/slayton58/3/orig -> origin/gh/slayton58/3/orig 2025-10-10T00:44:16.9615296Z * 
[new branch] gh/slayton58/4/base -> origin/gh/slayton58/4/base 2025-10-10T00:44:16.9616988Z * [new branch] gh/slayton58/4/head -> origin/gh/slayton58/4/head 2025-10-10T00:44:16.9618871Z * [new branch] gh/slayton58/4/orig -> origin/gh/slayton58/4/orig 2025-10-10T00:44:16.9621442Z * [new branch] gh/slayton58/5/base -> origin/gh/slayton58/5/base 2025-10-10T00:44:16.9623251Z * [new branch] gh/slayton58/5/head -> origin/gh/slayton58/5/head 2025-10-10T00:44:16.9625165Z * [new branch] gh/slayton58/5/orig -> origin/gh/slayton58/5/orig 2025-10-10T00:44:16.9628173Z * [new branch] gh/slayton58/6/base -> origin/gh/slayton58/6/base 2025-10-10T00:44:16.9630272Z * [new branch] gh/slayton58/6/head -> origin/gh/slayton58/6/head 2025-10-10T00:44:16.9632639Z * [new branch] gh/slayton58/7/base -> origin/gh/slayton58/7/base 2025-10-10T00:44:16.9634387Z * [new branch] gh/slayton58/7/head -> origin/gh/slayton58/7/head 2025-10-10T00:44:16.9636836Z * [new branch] gh/slayton58/8/base -> origin/gh/slayton58/8/base 2025-10-10T00:44:16.9638951Z * [new branch] gh/slayton58/8/head -> origin/gh/slayton58/8/head 2025-10-10T00:44:16.9640918Z * [new branch] gh/slayton58/8/orig -> origin/gh/slayton58/8/orig 2025-10-10T00:44:16.9643424Z * [new branch] gh/slayton58/9/base -> origin/gh/slayton58/9/base 2025-10-10T00:44:16.9645122Z * [new branch] gh/slayton58/9/head -> origin/gh/slayton58/9/head 2025-10-10T00:44:16.9647168Z * [new branch] gh/slayton58/9/orig -> origin/gh/slayton58/9/orig 2025-10-10T00:44:16.9650728Z * [new branch] gh/soulitzer/269/base -> origin/gh/soulitzer/269/base 2025-10-10T00:44:16.9652904Z * [new branch] gh/soulitzer/269/head -> origin/gh/soulitzer/269/head 2025-10-10T00:44:16.9654772Z * [new branch] gh/soulitzer/269/orig -> origin/gh/soulitzer/269/orig 2025-10-10T00:44:16.9657348Z * [new branch] gh/soulitzer/276/base -> origin/gh/soulitzer/276/base 2025-10-10T00:44:16.9659293Z * [new branch] gh/soulitzer/276/head -> origin/gh/soulitzer/276/head 2025-10-10T00:44:16.9661148Z * [new 
branch] gh/soulitzer/276/orig -> origin/gh/soulitzer/276/orig 2025-10-10T00:44:16.9664018Z * [new branch] gh/soulitzer/287/base -> origin/gh/soulitzer/287/base 2025-10-10T00:44:16.9665922Z * [new branch] gh/soulitzer/287/head -> origin/gh/soulitzer/287/head 2025-10-10T00:44:16.9667842Z * [new branch] gh/soulitzer/287/orig -> origin/gh/soulitzer/287/orig 2025-10-10T00:44:16.9670424Z * [new branch] gh/soulitzer/296/base -> origin/gh/soulitzer/296/base 2025-10-10T00:44:16.9672413Z * [new branch] gh/soulitzer/296/head -> origin/gh/soulitzer/296/head 2025-10-10T00:44:16.9674334Z * [new branch] gh/soulitzer/296/orig -> origin/gh/soulitzer/296/orig 2025-10-10T00:44:16.9676800Z * [new branch] gh/soulitzer/299/base -> origin/gh/soulitzer/299/base 2025-10-10T00:44:16.9678810Z * [new branch] gh/soulitzer/299/head -> origin/gh/soulitzer/299/head 2025-10-10T00:44:16.9680559Z * [new branch] gh/soulitzer/299/orig -> origin/gh/soulitzer/299/orig 2025-10-10T00:44:16.9683065Z * [new branch] gh/soulitzer/300/base -> origin/gh/soulitzer/300/base 2025-10-10T00:44:16.9685025Z * [new branch] gh/soulitzer/300/head -> origin/gh/soulitzer/300/head 2025-10-10T00:44:16.9686943Z * [new branch] gh/soulitzer/300/orig -> origin/gh/soulitzer/300/orig 2025-10-10T00:44:16.9689940Z * [new branch] gh/soulitzer/301/base -> origin/gh/soulitzer/301/base 2025-10-10T00:44:16.9691760Z * [new branch] gh/soulitzer/301/head -> origin/gh/soulitzer/301/head 2025-10-10T00:44:16.9693548Z * [new branch] gh/soulitzer/301/orig -> origin/gh/soulitzer/301/orig 2025-10-10T00:44:16.9696049Z * [new branch] gh/soulitzer/313/base -> origin/gh/soulitzer/313/base 2025-10-10T00:44:16.9697865Z * [new branch] gh/soulitzer/313/head -> origin/gh/soulitzer/313/head 2025-10-10T00:44:16.9699977Z * [new branch] gh/soulitzer/313/orig -> origin/gh/soulitzer/313/orig 2025-10-10T00:44:16.9702292Z * [new branch] gh/soulitzer/319/base -> origin/gh/soulitzer/319/base 2025-10-10T00:44:16.9704165Z * [new branch] gh/soulitzer/319/head -> 
origin/gh/soulitzer/319/head 2025-10-10T00:44:16.9706396Z * [new branch] gh/soulitzer/319/orig -> origin/gh/soulitzer/319/orig 2025-10-10T00:44:16.9708836Z * [new branch] gh/soulitzer/320/base -> origin/gh/soulitzer/320/base 2025-10-10T00:44:16.9710651Z * [new branch] gh/soulitzer/320/head -> origin/gh/soulitzer/320/head 2025-10-10T00:44:16.9712412Z * [new branch] gh/soulitzer/320/orig -> origin/gh/soulitzer/320/orig 2025-10-10T00:44:16.9715117Z * [new branch] gh/soulitzer/336/base -> origin/gh/soulitzer/336/base 2025-10-10T00:44:16.9716970Z * [new branch] gh/soulitzer/336/head -> origin/gh/soulitzer/336/head 2025-10-10T00:44:16.9718789Z * [new branch] gh/soulitzer/336/orig -> origin/gh/soulitzer/336/orig 2025-10-10T00:44:16.9721299Z * [new branch] gh/soulitzer/347/base -> origin/gh/soulitzer/347/base 2025-10-10T00:44:16.9723031Z * [new branch] gh/soulitzer/347/head -> origin/gh/soulitzer/347/head 2025-10-10T00:44:16.9724808Z * [new branch] gh/soulitzer/347/orig -> origin/gh/soulitzer/347/orig 2025-10-10T00:44:16.9727642Z * [new branch] gh/soulitzer/349/base -> origin/gh/soulitzer/349/base 2025-10-10T00:44:16.9729524Z * [new branch] gh/soulitzer/349/head -> origin/gh/soulitzer/349/head 2025-10-10T00:44:16.9731858Z * [new branch] gh/soulitzer/349/orig -> origin/gh/soulitzer/349/orig 2025-10-10T00:44:16.9734159Z * [new branch] gh/soulitzer/350/base -> origin/gh/soulitzer/350/base 2025-10-10T00:44:16.9735975Z * [new branch] gh/soulitzer/350/head -> origin/gh/soulitzer/350/head 2025-10-10T00:44:16.9737761Z * [new branch] gh/soulitzer/350/orig -> origin/gh/soulitzer/350/orig 2025-10-10T00:44:16.9740459Z * [new branch] gh/soulitzer/351/base -> origin/gh/soulitzer/351/base 2025-10-10T00:44:16.9742337Z * [new branch] gh/soulitzer/351/head -> origin/gh/soulitzer/351/head 2025-10-10T00:44:16.9744085Z * [new branch] gh/soulitzer/351/orig -> origin/gh/soulitzer/351/orig 2025-10-10T00:44:16.9746515Z * [new branch] gh/soulitzer/353/base -> origin/gh/soulitzer/353/base 
2025-10-10T00:44:16.9748527Z * [new branch] gh/soulitzer/353/head -> origin/gh/soulitzer/353/head 2025-10-10T00:44:16.9750282Z * [new branch] gh/soulitzer/353/orig -> origin/gh/soulitzer/353/orig 2025-10-10T00:44:16.9753411Z * [new branch] gh/soulitzer/358/base -> origin/gh/soulitzer/358/base 2025-10-10T00:44:16.9755376Z * [new branch] gh/soulitzer/358/head -> origin/gh/soulitzer/358/head 2025-10-10T00:44:16.9757214Z * [new branch] gh/soulitzer/358/orig -> origin/gh/soulitzer/358/orig 2025-10-10T00:44:16.9760051Z * [new branch] gh/soulitzer/359/base -> origin/gh/soulitzer/359/base 2025-10-10T00:44:16.9761927Z * [new branch] gh/soulitzer/359/head -> origin/gh/soulitzer/359/head 2025-10-10T00:44:16.9763753Z * [new branch] gh/soulitzer/359/orig -> origin/gh/soulitzer/359/orig 2025-10-10T00:44:16.9766416Z * [new branch] gh/soulitzer/372/base -> origin/gh/soulitzer/372/base 2025-10-10T00:44:16.9768489Z * [new branch] gh/soulitzer/372/head -> origin/gh/soulitzer/372/head 2025-10-10T00:44:16.9770262Z * [new branch] gh/soulitzer/372/orig -> origin/gh/soulitzer/372/orig 2025-10-10T00:44:16.9772703Z * [new branch] gh/soulitzer/374/base -> origin/gh/soulitzer/374/base 2025-10-10T00:44:16.9774597Z * [new branch] gh/soulitzer/374/head -> origin/gh/soulitzer/374/head 2025-10-10T00:44:16.9776974Z * [new branch] gh/soulitzer/374/orig -> origin/gh/soulitzer/374/orig 2025-10-10T00:44:16.9779501Z * [new branch] gh/soulitzer/375/base -> origin/gh/soulitzer/375/base 2025-10-10T00:44:16.9781257Z * [new branch] gh/soulitzer/375/head -> origin/gh/soulitzer/375/head 2025-10-10T00:44:16.9783040Z * [new branch] gh/soulitzer/375/orig -> origin/gh/soulitzer/375/orig 2025-10-10T00:44:16.9785467Z * [new branch] gh/soulitzer/380/base -> origin/gh/soulitzer/380/base 2025-10-10T00:44:16.9787340Z * [new branch] gh/soulitzer/380/head -> origin/gh/soulitzer/380/head 2025-10-10T00:44:16.9789157Z * [new branch] gh/soulitzer/380/orig -> origin/gh/soulitzer/380/orig 2025-10-10T00:44:16.9791706Z * [new 
branch] gh/soulitzer/381/base -> origin/gh/soulitzer/381/base 2025-10-10T00:44:16.9793576Z * [new branch] gh/soulitzer/381/head -> origin/gh/soulitzer/381/head 2025-10-10T00:44:16.9795382Z * [new branch] gh/soulitzer/381/orig -> origin/gh/soulitzer/381/orig 2025-10-10T00:44:16.9797853Z * [new branch] gh/soulitzer/382/base -> origin/gh/soulitzer/382/base 2025-10-10T00:44:16.9800691Z * [new branch] gh/soulitzer/382/head -> origin/gh/soulitzer/382/head 2025-10-10T00:44:16.9802765Z * [new branch] gh/soulitzer/382/orig -> origin/gh/soulitzer/382/orig 2025-10-10T00:44:16.9805375Z * [new branch] gh/soulitzer/383/base -> origin/gh/soulitzer/383/base 2025-10-10T00:44:16.9807264Z * [new branch] gh/soulitzer/383/head -> origin/gh/soulitzer/383/head 2025-10-10T00:44:16.9809277Z * [new branch] gh/soulitzer/383/orig -> origin/gh/soulitzer/383/orig 2025-10-10T00:44:16.9811782Z * [new branch] gh/soulitzer/384/base -> origin/gh/soulitzer/384/base 2025-10-10T00:44:16.9813470Z * [new branch] gh/soulitzer/384/head -> origin/gh/soulitzer/384/head 2025-10-10T00:44:16.9815326Z * [new branch] gh/soulitzer/384/orig -> origin/gh/soulitzer/384/orig 2025-10-10T00:44:16.9818457Z * [new branch] gh/swolchok/728/next -> origin/gh/swolchok/728/next 2025-10-10T00:44:16.9820880Z * [new branch] gh/swolchok/786/base -> origin/gh/swolchok/786/base 2025-10-10T00:44:16.9822721Z * [new branch] gh/swolchok/786/head -> origin/gh/swolchok/786/head 2025-10-10T00:44:16.9824449Z * [new branch] gh/swolchok/786/orig -> origin/gh/swolchok/786/orig 2025-10-10T00:44:16.9826961Z * [new branch] gh/swolchok/787/base -> origin/gh/swolchok/787/base 2025-10-10T00:44:16.9828861Z * [new branch] gh/swolchok/787/head -> origin/gh/swolchok/787/head 2025-10-10T00:44:16.9830709Z * [new branch] gh/swolchok/787/orig -> origin/gh/swolchok/787/orig 2025-10-10T00:44:16.9833300Z * [new branch] gh/swolchok/809/base -> origin/gh/swolchok/809/base 2025-10-10T00:44:16.9835599Z * [new branch] gh/swolchok/809/head -> 
origin/gh/swolchok/809/head 2025-10-10T00:44:16.9837441Z * [new branch] gh/swolchok/809/orig -> origin/gh/swolchok/809/orig 2025-10-10T00:44:16.9840133Z * [new branch] gh/swolchok/815/base -> origin/gh/swolchok/815/base 2025-10-10T00:44:16.9841860Z * [new branch] gh/swolchok/815/head -> origin/gh/swolchok/815/head 2025-10-10T00:44:16.9844124Z * [new branch] gh/swolchok/815/orig -> origin/gh/swolchok/815/orig 2025-10-10T00:44:16.9846366Z * [new branch] gh/swolchok/819/base -> origin/gh/swolchok/819/base 2025-10-10T00:44:16.9848349Z * [new branch] gh/swolchok/819/head -> origin/gh/swolchok/819/head 2025-10-10T00:44:16.9850105Z * [new branch] gh/swolchok/819/orig -> origin/gh/swolchok/819/orig 2025-10-10T00:44:16.9852667Z * [new branch] gh/swolchok/821/base -> origin/gh/swolchok/821/base 2025-10-10T00:44:16.9854694Z * [new branch] gh/swolchok/821/head -> origin/gh/swolchok/821/head 2025-10-10T00:44:16.9856531Z * [new branch] gh/swolchok/821/orig -> origin/gh/swolchok/821/orig 2025-10-10T00:44:16.9859066Z * [new branch] gh/swolchok/823/base -> origin/gh/swolchok/823/base 2025-10-10T00:44:16.9860785Z * [new branch] gh/swolchok/823/head -> origin/gh/swolchok/823/head 2025-10-10T00:44:16.9862644Z * [new branch] gh/swolchok/823/orig -> origin/gh/swolchok/823/orig 2025-10-10T00:44:16.9865132Z * [new branch] gh/swolchok/824/base -> origin/gh/swolchok/824/base 2025-10-10T00:44:16.9867108Z * [new branch] gh/swolchok/824/head -> origin/gh/swolchok/824/head 2025-10-10T00:44:16.9868931Z * [new branch] gh/swolchok/824/orig -> origin/gh/swolchok/824/orig 2025-10-10T00:44:16.9871363Z * [new branch] gh/swolchok/826/base -> origin/gh/swolchok/826/base 2025-10-10T00:44:16.9873333Z * [new branch] gh/swolchok/826/head -> origin/gh/swolchok/826/head 2025-10-10T00:44:16.9875175Z * [new branch] gh/swolchok/826/orig -> origin/gh/swolchok/826/orig 2025-10-10T00:44:16.9877695Z * [new branch] gh/swolchok/829/base -> origin/gh/swolchok/829/base 2025-10-10T00:44:16.9879432Z * [new branch] 
gh/swolchok/829/head -> origin/gh/swolchok/829/head 2025-10-10T00:44:16.9881179Z * [new branch] gh/swolchok/829/orig -> origin/gh/swolchok/829/orig 2025-10-10T00:44:16.9883859Z * [new branch] gh/swolchok/830/base -> origin/gh/swolchok/830/base 2025-10-10T00:44:16.9885727Z * [new branch] gh/swolchok/830/head -> origin/gh/swolchok/830/head 2025-10-10T00:44:16.9887658Z * [new branch] gh/swolchok/830/orig -> origin/gh/swolchok/830/orig 2025-10-10T00:44:16.9890830Z * [new branch] gh/swolchok/831/base -> origin/gh/swolchok/831/base 2025-10-10T00:44:16.9892638Z * [new branch] gh/swolchok/831/head -> origin/gh/swolchok/831/head 2025-10-10T00:44:16.9894515Z * [new branch] gh/swolchok/831/orig -> origin/gh/swolchok/831/orig 2025-10-10T00:44:16.9897287Z * [new branch] gh/swolchok/832/base -> origin/gh/swolchok/832/base 2025-10-10T00:44:16.9898828Z * [new branch] gh/swolchok/832/head -> origin/gh/swolchok/832/head 2025-10-10T00:44:16.9903516Z * [new branch] gh/swolchok/832/orig -> origin/gh/swolchok/832/orig 2025-10-10T00:44:16.9905836Z * [new branch] gh/swolchok/833/base -> origin/gh/swolchok/833/base 2025-10-10T00:44:16.9907595Z * [new branch] gh/swolchok/833/head -> origin/gh/swolchok/833/head 2025-10-10T00:44:16.9909415Z * [new branch] gh/swolchok/833/orig -> origin/gh/swolchok/833/orig 2025-10-10T00:44:16.9912070Z * [new branch] gh/swolchok/834/base -> origin/gh/swolchok/834/base 2025-10-10T00:44:16.9913828Z * [new branch] gh/swolchok/834/head -> origin/gh/swolchok/834/head 2025-10-10T00:44:16.9915557Z * [new branch] gh/swolchok/834/orig -> origin/gh/swolchok/834/orig 2025-10-10T00:44:16.9918633Z * [new branch] gh/swolchok/835/base -> origin/gh/swolchok/835/base 2025-10-10T00:44:16.9920542Z * [new branch] gh/swolchok/835/head -> origin/gh/swolchok/835/head 2025-10-10T00:44:16.9922359Z * [new branch] gh/swolchok/835/orig -> origin/gh/swolchok/835/orig 2025-10-10T00:44:16.9925112Z * [new branch] gh/swolchok/836/base -> origin/gh/swolchok/836/base 
2025-10-10T00:44:16.9926972Z * [new branch] gh/swolchok/836/head -> origin/gh/swolchok/836/head 2025-10-10T00:44:16.9928976Z * [new branch] gh/swolchok/836/orig -> origin/gh/swolchok/836/orig 2025-10-10T00:44:16.9931710Z * [new branch] gh/swolchok/837/base -> origin/gh/swolchok/837/base 2025-10-10T00:44:16.9933457Z * [new branch] gh/swolchok/837/head -> origin/gh/swolchok/837/head 2025-10-10T00:44:16.9935245Z * [new branch] gh/swolchok/837/orig -> origin/gh/swolchok/837/orig 2025-10-10T00:44:16.9937955Z * [new branch] gh/swolchok/838/base -> origin/gh/swolchok/838/base 2025-10-10T00:44:16.9939696Z * [new branch] gh/swolchok/838/head -> origin/gh/swolchok/838/head 2025-10-10T00:44:16.9941461Z * [new branch] gh/swolchok/838/orig -> origin/gh/swolchok/838/orig 2025-10-10T00:44:16.9944170Z * [new branch] gh/swolchok/839/base -> origin/gh/swolchok/839/base 2025-10-10T00:44:16.9946157Z * [new branch] gh/swolchok/839/head -> origin/gh/swolchok/839/head 2025-10-10T00:44:16.9947951Z * [new branch] gh/swolchok/839/orig -> origin/gh/swolchok/839/orig 2025-10-10T00:44:16.9950700Z * [new branch] gh/swolchok/840/base -> origin/gh/swolchok/840/base 2025-10-10T00:44:16.9952610Z * [new branch] gh/swolchok/840/head -> origin/gh/swolchok/840/head 2025-10-10T00:44:16.9954483Z * [new branch] gh/swolchok/840/orig -> origin/gh/swolchok/840/orig 2025-10-10T00:44:16.9957045Z * [new branch] gh/swolchok/841/base -> origin/gh/swolchok/841/base 2025-10-10T00:44:16.9958893Z * [new branch] gh/swolchok/841/head -> origin/gh/swolchok/841/head 2025-10-10T00:44:16.9960617Z * [new branch] gh/swolchok/841/orig -> origin/gh/swolchok/841/orig 2025-10-10T00:44:16.9963365Z * [new branch] gh/swolchok/842/base -> origin/gh/swolchok/842/base 2025-10-10T00:44:16.9965215Z * [new branch] gh/swolchok/842/head -> origin/gh/swolchok/842/head 2025-10-10T00:44:16.9967021Z * [new branch] gh/swolchok/842/orig -> origin/gh/swolchok/842/orig 2025-10-10T00:44:16.9969922Z * [new branch] gh/swolchok/843/base -> 
origin/gh/swolchok/843/base 2025-10-10T00:44:16.9971735Z * [new branch] gh/swolchok/843/head -> origin/gh/swolchok/843/head 2025-10-10T00:44:16.9973472Z * [new branch] gh/swolchok/843/orig -> origin/gh/swolchok/843/orig 2025-10-10T00:44:16.9976142Z * [new branch] gh/swolchok/844/base -> origin/gh/swolchok/844/base 2025-10-10T00:44:16.9977913Z * [new branch] gh/swolchok/844/head -> origin/gh/swolchok/844/head 2025-10-10T00:44:16.9979770Z * [new branch] gh/swolchok/844/orig -> origin/gh/swolchok/844/orig 2025-10-10T00:44:16.9982238Z * [new branch] gh/swolchok/845/base -> origin/gh/swolchok/845/base 2025-10-10T00:44:16.9984085Z * [new branch] gh/swolchok/845/head -> origin/gh/swolchok/845/head 2025-10-10T00:44:16.9986009Z * [new branch] gh/swolchok/845/orig -> origin/gh/swolchok/845/orig 2025-10-10T00:44:16.9988578Z * [new branch] gh/swolchok/846/base -> origin/gh/swolchok/846/base 2025-10-10T00:44:16.9990553Z * [new branch] gh/swolchok/846/head -> origin/gh/swolchok/846/head 2025-10-10T00:44:16.9992429Z * [new branch] gh/swolchok/846/orig -> origin/gh/swolchok/846/orig 2025-10-10T00:44:16.9995137Z * [new branch] gh/swolchok/847/base -> origin/gh/swolchok/847/base 2025-10-10T00:44:16.9996974Z * [new branch] gh/swolchok/847/head -> origin/gh/swolchok/847/head 2025-10-10T00:44:16.9998959Z * [new branch] gh/swolchok/847/orig -> origin/gh/swolchok/847/orig 2025-10-10T00:44:17.0001894Z * [new branch] gh/swolchok/848/base -> origin/gh/swolchok/848/base 2025-10-10T00:44:17.0003656Z * [new branch] gh/swolchok/848/head -> origin/gh/swolchok/848/head 2025-10-10T00:44:17.0005746Z * [new branch] gh/swolchok/848/orig -> origin/gh/swolchok/848/orig 2025-10-10T00:44:17.0008535Z * [new branch] gh/swolchok/849/base -> origin/gh/swolchok/849/base 2025-10-10T00:44:17.0010329Z * [new branch] gh/swolchok/849/head -> origin/gh/swolchok/849/head 2025-10-10T00:44:17.0012063Z * [new branch] gh/swolchok/849/orig -> origin/gh/swolchok/849/orig 2025-10-10T00:44:17.0014979Z * [new branch] 
gh/swolchok/850/base -> origin/gh/swolchok/850/base 2025-10-10T00:44:17.0016966Z * [new branch] gh/swolchok/850/head -> origin/gh/swolchok/850/head 2025-10-10T00:44:17.0018783Z * [new branch] gh/swolchok/850/orig -> origin/gh/swolchok/850/orig 2025-10-10T00:44:17.0021202Z * [new branch] gh/swolchok/851/base -> origin/gh/swolchok/851/base 2025-10-10T00:44:17.0023236Z * [new branch] gh/swolchok/851/head -> origin/gh/swolchok/851/head 2025-10-10T00:44:17.0025038Z * [new branch] gh/swolchok/851/orig -> origin/gh/swolchok/851/orig 2025-10-10T00:44:17.0027684Z * [new branch] gh/swolchok/852/base -> origin/gh/swolchok/852/base 2025-10-10T00:44:17.0029601Z * [new branch] gh/swolchok/852/head -> origin/gh/swolchok/852/head 2025-10-10T00:44:17.0031523Z * [new branch] gh/swolchok/852/orig -> origin/gh/swolchok/852/orig 2025-10-10T00:44:17.0034618Z * [new branch] gh/syed-ahmed/5/base -> origin/gh/syed-ahmed/5/base 2025-10-10T00:44:17.0036951Z * [new branch] gh/syed-ahmed/5/head -> origin/gh/syed-ahmed/5/head 2025-10-10T00:44:17.0038727Z * [new branch] gh/syed-ahmed/5/orig -> origin/gh/syed-ahmed/5/orig 2025-10-10T00:44:17.0041132Z * [new branch] gh/syed-ahmed/6/base -> origin/gh/syed-ahmed/6/base 2025-10-10T00:44:17.0043195Z * [new branch] gh/syed-ahmed/6/head -> origin/gh/syed-ahmed/6/head 2025-10-10T00:44:17.0045097Z * [new branch] gh/syed-ahmed/6/orig -> origin/gh/syed-ahmed/6/orig 2025-10-10T00:44:17.0047723Z * [new branch] gh/syed-ahmed/7/base -> origin/gh/syed-ahmed/7/base 2025-10-10T00:44:17.0055530Z * [new branch] gh/syed-ahmed/7/head -> origin/gh/syed-ahmed/7/head 2025-10-10T00:44:17.0056279Z * [new branch] gh/syed-ahmed/7/orig -> origin/gh/syed-ahmed/7/orig 2025-10-10T00:44:17.0056787Z * [new branch] gh/teja-rao/4/base -> origin/gh/teja-rao/4/base 2025-10-10T00:44:17.0057269Z * [new branch] gh/teja-rao/4/head -> origin/gh/teja-rao/4/head 2025-10-10T00:44:17.0057788Z * [new branch] gh/teja-rao/4/orig -> origin/gh/teja-rao/4/orig 2025-10-10T00:44:17.0061272Z * [new 
branch] gh/tianyu-l/2/base -> origin/gh/tianyu-l/2/base 2025-10-10T00:44:17.0063085Z * [new branch] gh/tianyu-l/2/head -> origin/gh/tianyu-l/2/head 2025-10-10T00:44:17.0064974Z * [new branch] gh/tianyu-l/2/orig -> origin/gh/tianyu-l/2/orig 2025-10-10T00:44:17.0067566Z * [new branch] gh/tianyu-l/5/base -> origin/gh/tianyu-l/5/base 2025-10-10T00:44:17.0069352Z * [new branch] gh/tianyu-l/5/orig -> origin/gh/tianyu-l/5/orig 2025-10-10T00:44:17.0071918Z * [new branch] gh/tianyu-l/6/base -> origin/gh/tianyu-l/6/base 2025-10-10T00:44:17.0073723Z * [new branch] gh/tianyu-l/6/head -> origin/gh/tianyu-l/6/head 2025-10-10T00:44:17.0075535Z * [new branch] gh/tianyu-l/6/orig -> origin/gh/tianyu-l/6/orig 2025-10-10T00:44:17.0077978Z * [new branch] gh/tianyu-l/7/base -> origin/gh/tianyu-l/7/base 2025-10-10T00:44:17.0079709Z * [new branch] gh/tianyu-l/7/orig -> origin/gh/tianyu-l/7/orig 2025-10-10T00:44:17.0083270Z * [new branch] gh/tugsbayasgalan/10/base -> origin/gh/tugsbayasgalan/10/base 2025-10-10T00:44:17.0085025Z * [new branch] gh/tugsbayasgalan/10/head -> origin/gh/tugsbayasgalan/10/head 2025-10-10T00:44:17.0086913Z * [new branch] gh/tugsbayasgalan/10/orig -> origin/gh/tugsbayasgalan/10/orig 2025-10-10T00:44:17.0089670Z * [new branch] gh/tugsbayasgalan/11/base -> origin/gh/tugsbayasgalan/11/base 2025-10-10T00:44:17.0091565Z * [new branch] gh/tugsbayasgalan/11/head -> origin/gh/tugsbayasgalan/11/head 2025-10-10T00:44:17.0093197Z * [new branch] gh/tugsbayasgalan/11/orig -> origin/gh/tugsbayasgalan/11/orig 2025-10-10T00:44:17.0095781Z * [new branch] gh/tugsbayasgalan/12/base -> origin/gh/tugsbayasgalan/12/base 2025-10-10T00:44:17.0097657Z * [new branch] gh/tugsbayasgalan/12/head -> origin/gh/tugsbayasgalan/12/head 2025-10-10T00:44:17.0099698Z * [new branch] gh/tugsbayasgalan/12/orig -> origin/gh/tugsbayasgalan/12/orig 2025-10-10T00:44:17.0102165Z * [new branch] gh/tugsbayasgalan/13/base -> origin/gh/tugsbayasgalan/13/base 2025-10-10T00:44:17.0103983Z * [new branch] 
gh/tugsbayasgalan/13/head -> origin/gh/tugsbayasgalan/13/head 2025-10-10T00:44:17.0105812Z * [new branch] gh/tugsbayasgalan/13/orig -> origin/gh/tugsbayasgalan/13/orig 2025-10-10T00:44:17.0108305Z * [new branch] gh/tugsbayasgalan/14/base -> origin/gh/tugsbayasgalan/14/base 2025-10-10T00:44:17.0110263Z * [new branch] gh/tugsbayasgalan/14/head -> origin/gh/tugsbayasgalan/14/head 2025-10-10T00:44:17.0112129Z * [new branch] gh/tugsbayasgalan/14/orig -> origin/gh/tugsbayasgalan/14/orig 2025-10-10T00:44:17.0114888Z * [new branch] gh/tugsbayasgalan/15/base -> origin/gh/tugsbayasgalan/15/base 2025-10-10T00:44:17.0116691Z * [new branch] gh/tugsbayasgalan/15/head -> origin/gh/tugsbayasgalan/15/head 2025-10-10T00:44:17.0119023Z * [new branch] gh/tugsbayasgalan/15/orig -> origin/gh/tugsbayasgalan/15/orig 2025-10-10T00:44:17.0121659Z * [new branch] gh/tugsbayasgalan/16/base -> origin/gh/tugsbayasgalan/16/base 2025-10-10T00:44:17.0123479Z * [new branch] gh/tugsbayasgalan/16/head -> origin/gh/tugsbayasgalan/16/head 2025-10-10T00:44:17.0125396Z * [new branch] gh/tugsbayasgalan/16/orig -> origin/gh/tugsbayasgalan/16/orig 2025-10-10T00:44:17.0128013Z * [new branch] gh/tugsbayasgalan/17/base -> origin/gh/tugsbayasgalan/17/base 2025-10-10T00:44:17.0129675Z * [new branch] gh/tugsbayasgalan/17/head -> origin/gh/tugsbayasgalan/17/head 2025-10-10T00:44:17.0131555Z * [new branch] gh/tugsbayasgalan/17/orig -> origin/gh/tugsbayasgalan/17/orig 2025-10-10T00:44:17.0134036Z * [new branch] gh/tugsbayasgalan/18/base -> origin/gh/tugsbayasgalan/18/base 2025-10-10T00:44:17.0135809Z * [new branch] gh/tugsbayasgalan/18/head -> origin/gh/tugsbayasgalan/18/head 2025-10-10T00:44:17.0137678Z * [new branch] gh/tugsbayasgalan/18/orig -> origin/gh/tugsbayasgalan/18/orig 2025-10-10T00:44:17.0140251Z * [new branch] gh/tugsbayasgalan/19/base -> origin/gh/tugsbayasgalan/19/base 2025-10-10T00:44:17.0142113Z * [new branch] gh/tugsbayasgalan/19/head -> origin/gh/tugsbayasgalan/19/head 2025-10-10T00:44:17.0143886Z 
* [new branch] gh/tugsbayasgalan/19/orig -> origin/gh/tugsbayasgalan/19/orig 2025-10-10T00:44:17.0146332Z * [new branch] gh/tugsbayasgalan/2/base -> origin/gh/tugsbayasgalan/2/base 2025-10-10T00:44:17.0148137Z * [new branch] gh/tugsbayasgalan/2/head -> origin/gh/tugsbayasgalan/2/head 2025-10-10T00:44:17.0149929Z * [new branch] gh/tugsbayasgalan/2/orig -> origin/gh/tugsbayasgalan/2/orig 2025-10-10T00:44:17.0152431Z * [new branch] gh/tugsbayasgalan/20/base -> origin/gh/tugsbayasgalan/20/base 2025-10-10T00:44:17.0154634Z * [new branch] gh/tugsbayasgalan/20/head -> origin/gh/tugsbayasgalan/20/head 2025-10-10T00:44:17.0155971Z * [new branch] gh/tugsbayasgalan/20/orig -> origin/gh/tugsbayasgalan/20/orig 2025-10-10T00:44:17.0158857Z * [new branch] gh/tugsbayasgalan/21/base -> origin/gh/tugsbayasgalan/21/base 2025-10-10T00:44:17.0160760Z * [new branch] gh/tugsbayasgalan/21/head -> origin/gh/tugsbayasgalan/21/head 2025-10-10T00:44:17.0162595Z * [new branch] gh/tugsbayasgalan/21/orig -> origin/gh/tugsbayasgalan/21/orig 2025-10-10T00:44:17.0165171Z * [new branch] gh/tugsbayasgalan/22/base -> origin/gh/tugsbayasgalan/22/base 2025-10-10T00:44:17.0167097Z * [new branch] gh/tugsbayasgalan/22/head -> origin/gh/tugsbayasgalan/22/head 2025-10-10T00:44:17.0168985Z * [new branch] gh/tugsbayasgalan/22/orig -> origin/gh/tugsbayasgalan/22/orig 2025-10-10T00:44:17.0171624Z * [new branch] gh/tugsbayasgalan/23/base -> origin/gh/tugsbayasgalan/23/base 2025-10-10T00:44:17.0173439Z * [new branch] gh/tugsbayasgalan/23/head -> origin/gh/tugsbayasgalan/23/head 2025-10-10T00:44:17.0175234Z * [new branch] gh/tugsbayasgalan/23/orig -> origin/gh/tugsbayasgalan/23/orig 2025-10-10T00:44:17.0177767Z * [new branch] gh/tugsbayasgalan/24/base -> origin/gh/tugsbayasgalan/24/base 2025-10-10T00:44:17.0179574Z * [new branch] gh/tugsbayasgalan/24/head -> origin/gh/tugsbayasgalan/24/head 2025-10-10T00:44:17.0181451Z * [new branch] gh/tugsbayasgalan/24/orig -> origin/gh/tugsbayasgalan/24/orig 
2025-10-10T00:44:17.0184263Z * [new branch] gh/tugsbayasgalan/25/base -> origin/gh/tugsbayasgalan/25/base 2025-10-10T00:44:17.0186104Z * [new branch] gh/tugsbayasgalan/25/head -> origin/gh/tugsbayasgalan/25/head 2025-10-10T00:44:17.0187865Z * [new branch] gh/tugsbayasgalan/25/orig -> origin/gh/tugsbayasgalan/25/orig 2025-10-10T00:44:17.0190499Z * [new branch] gh/tugsbayasgalan/26/base -> origin/gh/tugsbayasgalan/26/base 2025-10-10T00:44:17.0192423Z * [new branch] gh/tugsbayasgalan/26/head -> origin/gh/tugsbayasgalan/26/head 2025-10-10T00:44:17.0194426Z * [new branch] gh/tugsbayasgalan/26/orig -> origin/gh/tugsbayasgalan/26/orig 2025-10-10T00:44:17.0198214Z * [new branch] gh/tugsbayasgalan/27/base -> origin/gh/tugsbayasgalan/27/base 2025-10-10T00:44:17.0200239Z * [new branch] gh/tugsbayasgalan/27/head -> origin/gh/tugsbayasgalan/27/head 2025-10-10T00:44:17.0202103Z * [new branch] gh/tugsbayasgalan/27/orig -> origin/gh/tugsbayasgalan/27/orig 2025-10-10T00:44:17.0204966Z * [new branch] gh/tugsbayasgalan/28/base -> origin/gh/tugsbayasgalan/28/base 2025-10-10T00:44:17.0206772Z * [new branch] gh/tugsbayasgalan/28/head -> origin/gh/tugsbayasgalan/28/head 2025-10-10T00:44:17.0208735Z * [new branch] gh/tugsbayasgalan/28/orig -> origin/gh/tugsbayasgalan/28/orig 2025-10-10T00:44:17.0211475Z * [new branch] gh/tugsbayasgalan/29/base -> origin/gh/tugsbayasgalan/29/base 2025-10-10T00:44:17.0213308Z * [new branch] gh/tugsbayasgalan/29/head -> origin/gh/tugsbayasgalan/29/head 2025-10-10T00:44:17.0215199Z * [new branch] gh/tugsbayasgalan/29/orig -> origin/gh/tugsbayasgalan/29/orig 2025-10-10T00:44:17.0217604Z * [new branch] gh/tugsbayasgalan/3/base -> origin/gh/tugsbayasgalan/3/base 2025-10-10T00:44:17.0219520Z * [new branch] gh/tugsbayasgalan/3/head -> origin/gh/tugsbayasgalan/3/head 2025-10-10T00:44:17.0221268Z * [new branch] gh/tugsbayasgalan/3/orig -> origin/gh/tugsbayasgalan/3/orig 2025-10-10T00:44:17.0223942Z * [new branch] gh/tugsbayasgalan/30/base -> 
origin/gh/tugsbayasgalan/30/base 2025-10-10T00:44:17.0225806Z * [new branch] gh/tugsbayasgalan/30/head -> origin/gh/tugsbayasgalan/30/head 2025-10-10T00:44:17.0227785Z * [new branch] gh/tugsbayasgalan/30/orig -> origin/gh/tugsbayasgalan/30/orig 2025-10-10T00:44:17.0230334Z * [new branch] gh/tugsbayasgalan/31/base -> origin/gh/tugsbayasgalan/31/base 2025-10-10T00:44:17.0232201Z * [new branch] gh/tugsbayasgalan/31/head -> origin/gh/tugsbayasgalan/31/head 2025-10-10T00:44:17.0234031Z * [new branch] gh/tugsbayasgalan/31/orig -> origin/gh/tugsbayasgalan/31/orig 2025-10-10T00:44:17.0236572Z * [new branch] gh/tugsbayasgalan/32/base -> origin/gh/tugsbayasgalan/32/base 2025-10-10T00:44:17.0238407Z * [new branch] gh/tugsbayasgalan/32/head -> origin/gh/tugsbayasgalan/32/head 2025-10-10T00:44:17.0240168Z * [new branch] gh/tugsbayasgalan/32/orig -> origin/gh/tugsbayasgalan/32/orig 2025-10-10T00:44:17.0242832Z * [new branch] gh/tugsbayasgalan/33/base -> origin/gh/tugsbayasgalan/33/base 2025-10-10T00:44:17.0244726Z * [new branch] gh/tugsbayasgalan/33/head -> origin/gh/tugsbayasgalan/33/head 2025-10-10T00:44:17.0246602Z * [new branch] gh/tugsbayasgalan/33/orig -> origin/gh/tugsbayasgalan/33/orig 2025-10-10T00:44:17.0249496Z * [new branch] gh/tugsbayasgalan/34/base -> origin/gh/tugsbayasgalan/34/base 2025-10-10T00:44:17.0251313Z * [new branch] gh/tugsbayasgalan/34/head -> origin/gh/tugsbayasgalan/34/head 2025-10-10T00:44:17.0253123Z * [new branch] gh/tugsbayasgalan/34/orig -> origin/gh/tugsbayasgalan/34/orig 2025-10-10T00:44:17.0255613Z * [new branch] gh/tugsbayasgalan/35/base -> origin/gh/tugsbayasgalan/35/base 2025-10-10T00:44:17.0257423Z * [new branch] gh/tugsbayasgalan/35/head -> origin/gh/tugsbayasgalan/35/head 2025-10-10T00:44:17.0259276Z * [new branch] gh/tugsbayasgalan/35/orig -> origin/gh/tugsbayasgalan/35/orig 2025-10-10T00:44:17.0261860Z * [new branch] gh/tugsbayasgalan/36/base -> origin/gh/tugsbayasgalan/36/base 2025-10-10T00:44:17.0263679Z * [new branch] 
gh/tugsbayasgalan/36/head -> origin/gh/tugsbayasgalan/36/head 2025-10-10T00:44:17.0265481Z * [new branch] gh/tugsbayasgalan/36/orig -> origin/gh/tugsbayasgalan/36/orig 2025-10-10T00:44:17.0268045Z * [new branch] gh/tugsbayasgalan/37/base -> origin/gh/tugsbayasgalan/37/base 2025-10-10T00:44:17.0269838Z * [new branch] gh/tugsbayasgalan/37/head -> origin/gh/tugsbayasgalan/37/head 2025-10-10T00:44:17.0271632Z * [new branch] gh/tugsbayasgalan/37/orig -> origin/gh/tugsbayasgalan/37/orig 2025-10-10T00:44:17.0274519Z * [new branch] gh/tugsbayasgalan/38/base -> origin/gh/tugsbayasgalan/38/base 2025-10-10T00:44:17.0276431Z * [new branch] gh/tugsbayasgalan/38/head -> origin/gh/tugsbayasgalan/38/head 2025-10-10T00:44:17.0278224Z * [new branch] gh/tugsbayasgalan/38/orig -> origin/gh/tugsbayasgalan/38/orig 2025-10-10T00:44:17.0280775Z * [new branch] gh/tugsbayasgalan/39/base -> origin/gh/tugsbayasgalan/39/base 2025-10-10T00:44:17.0282595Z * [new branch] gh/tugsbayasgalan/39/head -> origin/gh/tugsbayasgalan/39/head 2025-10-10T00:44:17.0284411Z * [new branch] gh/tugsbayasgalan/39/orig -> origin/gh/tugsbayasgalan/39/orig 2025-10-10T00:44:17.0287202Z * [new branch] gh/tugsbayasgalan/40/base -> origin/gh/tugsbayasgalan/40/base 2025-10-10T00:44:17.0289121Z * [new branch] gh/tugsbayasgalan/40/head -> origin/gh/tugsbayasgalan/40/head 2025-10-10T00:44:17.0290906Z * [new branch] gh/tugsbayasgalan/40/orig -> origin/gh/tugsbayasgalan/40/orig 2025-10-10T00:44:17.0293662Z * [new branch] gh/tugsbayasgalan/41/base -> origin/gh/tugsbayasgalan/41/base 2025-10-10T00:44:17.0295498Z * [new branch] gh/tugsbayasgalan/41/head -> origin/gh/tugsbayasgalan/41/head 2025-10-10T00:44:17.0297407Z * [new branch] gh/tugsbayasgalan/41/orig -> origin/gh/tugsbayasgalan/41/orig 2025-10-10T00:44:17.0301715Z * [new branch] gh/tugsbayasgalan/42/base -> origin/gh/tugsbayasgalan/42/base 2025-10-10T00:44:17.0303805Z * [new branch] gh/tugsbayasgalan/42/head -> origin/gh/tugsbayasgalan/42/head 2025-10-10T00:44:17.0305593Z 
* [new branch] gh/tugsbayasgalan/42/orig -> origin/gh/tugsbayasgalan/42/orig 2025-10-10T00:44:17.0308142Z * [new branch] gh/tugsbayasgalan/43/base -> origin/gh/tugsbayasgalan/43/base 2025-10-10T00:44:17.0309912Z * [new branch] gh/tugsbayasgalan/43/head -> origin/gh/tugsbayasgalan/43/head 2025-10-10T00:44:17.0311792Z * [new branch] gh/tugsbayasgalan/43/orig -> origin/gh/tugsbayasgalan/43/orig 2025-10-10T00:44:17.0314402Z * [new branch] gh/tugsbayasgalan/44/base -> origin/gh/tugsbayasgalan/44/base 2025-10-10T00:44:17.0316191Z * [new branch] gh/tugsbayasgalan/44/head -> origin/gh/tugsbayasgalan/44/head 2025-10-10T00:44:17.0317987Z * [new branch] gh/tugsbayasgalan/44/orig -> origin/gh/tugsbayasgalan/44/orig 2025-10-10T00:44:17.0320514Z * [new branch] gh/tugsbayasgalan/45/base -> origin/gh/tugsbayasgalan/45/base 2025-10-10T00:44:17.0322493Z * [new branch] gh/tugsbayasgalan/45/head -> origin/gh/tugsbayasgalan/45/head 2025-10-10T00:44:17.0324300Z * [new branch] gh/tugsbayasgalan/45/orig -> origin/gh/tugsbayasgalan/45/orig 2025-10-10T00:44:17.0327182Z * [new branch] gh/tugsbayasgalan/46/base -> origin/gh/tugsbayasgalan/46/base 2025-10-10T00:44:17.0329019Z * [new branch] gh/tugsbayasgalan/46/head -> origin/gh/tugsbayasgalan/46/head 2025-10-10T00:44:17.0330928Z * [new branch] gh/tugsbayasgalan/46/orig -> origin/gh/tugsbayasgalan/46/orig 2025-10-10T00:44:17.0333362Z * [new branch] gh/tugsbayasgalan/47/base -> origin/gh/tugsbayasgalan/47/base 2025-10-10T00:44:17.0335291Z * [new branch] gh/tugsbayasgalan/47/head -> origin/gh/tugsbayasgalan/47/head 2025-10-10T00:44:17.0337067Z * [new branch] gh/tugsbayasgalan/47/orig -> origin/gh/tugsbayasgalan/47/orig 2025-10-10T00:44:17.0339482Z * [new branch] gh/tugsbayasgalan/48/base -> origin/gh/tugsbayasgalan/48/base 2025-10-10T00:44:17.0341276Z * [new branch] gh/tugsbayasgalan/48/head -> origin/gh/tugsbayasgalan/48/head 2025-10-10T00:44:17.0343102Z * [new branch] gh/tugsbayasgalan/48/orig -> origin/gh/tugsbayasgalan/48/orig 
2025-10-10T00:44:17.0345517Z * [new branch] gh/tugsbayasgalan/49/base -> origin/gh/tugsbayasgalan/49/base 2025-10-10T00:44:17.0347448Z * [new branch] gh/tugsbayasgalan/49/head -> origin/gh/tugsbayasgalan/49/head 2025-10-10T00:44:17.0349348Z * [new branch] gh/tugsbayasgalan/49/orig -> origin/gh/tugsbayasgalan/49/orig 2025-10-10T00:44:17.0352051Z * [new branch] gh/tugsbayasgalan/50/base -> origin/gh/tugsbayasgalan/50/base 2025-10-10T00:44:17.0354020Z * [new branch] gh/tugsbayasgalan/50/head -> origin/gh/tugsbayasgalan/50/head 2025-10-10T00:44:17.0355829Z * [new branch] gh/tugsbayasgalan/50/orig -> origin/gh/tugsbayasgalan/50/orig 2025-10-10T00:44:17.0358277Z * [new branch] gh/tugsbayasgalan/51/base -> origin/gh/tugsbayasgalan/51/base 2025-10-10T00:44:17.0360027Z * [new branch] gh/tugsbayasgalan/51/head -> origin/gh/tugsbayasgalan/51/head 2025-10-10T00:44:17.0361851Z * [new branch] gh/tugsbayasgalan/51/orig -> origin/gh/tugsbayasgalan/51/orig 2025-10-10T00:44:17.0364550Z * [new branch] gh/tugsbayasgalan/52/base -> origin/gh/tugsbayasgalan/52/base 2025-10-10T00:44:17.0366798Z * [new branch] gh/tugsbayasgalan/52/head -> origin/gh/tugsbayasgalan/52/head 2025-10-10T00:44:17.0368870Z * [new branch] gh/tugsbayasgalan/52/orig -> origin/gh/tugsbayasgalan/52/orig 2025-10-10T00:44:17.0371642Z * [new branch] gh/tugsbayasgalan/53/base -> origin/gh/tugsbayasgalan/53/base 2025-10-10T00:44:17.0373275Z * [new branch] gh/tugsbayasgalan/53/head -> origin/gh/tugsbayasgalan/53/head 2025-10-10T00:44:17.0375130Z * [new branch] gh/tugsbayasgalan/53/orig -> origin/gh/tugsbayasgalan/53/orig 2025-10-10T00:44:17.0378093Z * [new branch] gh/tugsbayasgalan/54/base -> origin/gh/tugsbayasgalan/54/base 2025-10-10T00:44:17.0379882Z * [new branch] gh/tugsbayasgalan/54/head -> origin/gh/tugsbayasgalan/54/head 2025-10-10T00:44:17.0381779Z * [new branch] gh/tugsbayasgalan/54/orig -> origin/gh/tugsbayasgalan/54/orig 2025-10-10T00:44:17.0384349Z * [new branch] gh/tugsbayasgalan/6/base -> 
origin/gh/tugsbayasgalan/6/base 2025-10-10T00:44:17.0386087Z * [new branch] gh/tugsbayasgalan/6/head -> origin/gh/tugsbayasgalan/6/head 2025-10-10T00:44:17.0388345Z * [new branch] gh/tugsbayasgalan/6/orig -> origin/gh/tugsbayasgalan/6/orig 2025-10-10T00:44:17.0390975Z * [new branch] gh/tugsbayasgalan/7/base -> origin/gh/tugsbayasgalan/7/base 2025-10-10T00:44:17.0392789Z * [new branch] gh/tugsbayasgalan/7/head -> origin/gh/tugsbayasgalan/7/head 2025-10-10T00:44:17.0394600Z * [new branch] gh/tugsbayasgalan/7/orig -> origin/gh/tugsbayasgalan/7/orig 2025-10-10T00:44:17.0397182Z * [new branch] gh/tugsbayasgalan/8/base -> origin/gh/tugsbayasgalan/8/base 2025-10-10T00:44:17.0399620Z * [new branch] gh/tugsbayasgalan/8/head -> origin/gh/tugsbayasgalan/8/head 2025-10-10T00:44:17.0401452Z * [new branch] gh/tugsbayasgalan/8/orig -> origin/gh/tugsbayasgalan/8/orig 2025-10-10T00:44:17.0404135Z * [new branch] gh/tugsbayasgalan/9/base -> origin/gh/tugsbayasgalan/9/base 2025-10-10T00:44:17.0405821Z * [new branch] gh/tugsbayasgalan/9/head -> origin/gh/tugsbayasgalan/9/head 2025-10-10T00:44:17.0407767Z * [new branch] gh/tugsbayasgalan/9/orig -> origin/gh/tugsbayasgalan/9/orig 2025-10-10T00:44:17.0410741Z * [new branch] gh/v0i0/10/base -> origin/gh/v0i0/10/base 2025-10-10T00:44:17.0412551Z * [new branch] gh/v0i0/10/head -> origin/gh/v0i0/10/head 2025-10-10T00:44:17.0414425Z * [new branch] gh/v0i0/10/orig -> origin/gh/v0i0/10/orig 2025-10-10T00:44:17.0416974Z * [new branch] gh/v0i0/11/base -> origin/gh/v0i0/11/base 2025-10-10T00:44:17.0418749Z * [new branch] gh/v0i0/11/head -> origin/gh/v0i0/11/head 2025-10-10T00:44:17.0420511Z * [new branch] gh/v0i0/11/orig -> origin/gh/v0i0/11/orig 2025-10-10T00:44:17.0423128Z * [new branch] gh/v0i0/12/base -> origin/gh/v0i0/12/base 2025-10-10T00:44:17.0424940Z * [new branch] gh/v0i0/12/head -> origin/gh/v0i0/12/head 2025-10-10T00:44:17.0426784Z * [new branch] gh/v0i0/12/orig -> origin/gh/v0i0/12/orig 2025-10-10T00:44:17.0429588Z * [new branch] 
gh/v0i0/13/base -> origin/gh/v0i0/13/base 2025-10-10T00:44:17.0431425Z * [new branch] gh/v0i0/13/head -> origin/gh/v0i0/13/head 2025-10-10T00:44:17.0433101Z * [new branch] gh/v0i0/13/orig -> origin/gh/v0i0/13/orig 2025-10-10T00:44:17.0435726Z * [new branch] gh/v0i0/7/base -> origin/gh/v0i0/7/base 2025-10-10T00:44:17.0437658Z * [new branch] gh/v0i0/7/head -> origin/gh/v0i0/7/head 2025-10-10T00:44:17.0439386Z * [new branch] gh/v0i0/7/orig -> origin/gh/v0i0/7/orig 2025-10-10T00:44:17.0441911Z * [new branch] gh/v0i0/8/base -> origin/gh/v0i0/8/base 2025-10-10T00:44:17.0443725Z * [new branch] gh/v0i0/8/head -> origin/gh/v0i0/8/head 2025-10-10T00:44:17.0445727Z * [new branch] gh/v0i0/8/orig -> origin/gh/v0i0/8/orig 2025-10-10T00:44:17.0448270Z * [new branch] gh/v0i0/9/base -> origin/gh/v0i0/9/base 2025-10-10T00:44:17.0450067Z * [new branch] gh/v0i0/9/head -> origin/gh/v0i0/9/head 2025-10-10T00:44:17.0452123Z * [new branch] gh/v0i0/9/orig -> origin/gh/v0i0/9/orig 2025-10-10T00:44:17.0455197Z * [new branch] gh/vishal9-team/1/base -> origin/gh/vishal9-team/1/base 2025-10-10T00:44:17.0457039Z * [new branch] gh/vishal9-team/1/head -> origin/gh/vishal9-team/1/head 2025-10-10T00:44:17.0459396Z * [new branch] gh/vishal9-team/2/base -> origin/gh/vishal9-team/2/base 2025-10-10T00:44:17.0461148Z * [new branch] gh/vishal9-team/2/head -> origin/gh/vishal9-team/2/head 2025-10-10T00:44:17.0462876Z * [new branch] gh/vishal9-team/2/orig -> origin/gh/vishal9-team/2/orig 2025-10-10T00:44:17.0466004Z * [new branch] gh/vkuzo/1/next -> origin/gh/vkuzo/1/next 2025-10-10T00:44:17.0468556Z * [new branch] gh/vkuzo/2/next -> origin/gh/vkuzo/2/next 2025-10-10T00:44:17.0471042Z * [new branch] gh/vkuzo/3/next -> origin/gh/vkuzo/3/next 2025-10-10T00:44:17.0473723Z * [new branch] gh/vkuzo/7/base -> origin/gh/vkuzo/7/base 2025-10-10T00:44:17.0475486Z * [new branch] gh/vkuzo/7/head -> origin/gh/vkuzo/7/head 2025-10-10T00:44:17.0477311Z * [new branch] gh/vkuzo/7/orig -> origin/gh/vkuzo/7/orig 
2025-10-10T00:44:17.0480518Z * [new branch] gh/wconstab/419/base -> origin/gh/wconstab/419/base 2025-10-10T00:44:17.0482227Z * [new branch] gh/wconstab/419/head -> origin/gh/wconstab/419/head 2025-10-10T00:44:17.0484093Z * [new branch] gh/wconstab/419/orig -> origin/gh/wconstab/419/orig 2025-10-10T00:44:17.0486596Z * [new branch] gh/wconstab/424/base -> origin/gh/wconstab/424/base 2025-10-10T00:44:17.0488574Z * [new branch] gh/wconstab/424/head -> origin/gh/wconstab/424/head 2025-10-10T00:44:17.0490365Z * [new branch] gh/wconstab/424/orig -> origin/gh/wconstab/424/orig 2025-10-10T00:44:17.0492951Z * [new branch] gh/wconstab/435/base -> origin/gh/wconstab/435/base 2025-10-10T00:44:17.0494993Z * [new branch] gh/wconstab/435/head -> origin/gh/wconstab/435/head 2025-10-10T00:44:17.0496775Z * [new branch] gh/wconstab/435/orig -> origin/gh/wconstab/435/orig 2025-10-10T00:44:17.0503646Z * [new branch] gh/wconstab/438/base -> origin/gh/wconstab/438/base 2025-10-10T00:44:17.0504174Z * [new branch] gh/wconstab/438/head -> origin/gh/wconstab/438/head 2025-10-10T00:44:17.0504878Z * [new branch] gh/wconstab/438/orig -> origin/gh/wconstab/438/orig 2025-10-10T00:44:17.0507291Z * [new branch] gh/wconstab/444/base -> origin/gh/wconstab/444/base 2025-10-10T00:44:17.0509165Z * [new branch] gh/wconstab/444/head -> origin/gh/wconstab/444/head 2025-10-10T00:44:17.0510997Z * [new branch] gh/wconstab/444/orig -> origin/gh/wconstab/444/orig 2025-10-10T00:44:17.0513623Z * [new branch] gh/wconstab/447/base -> origin/gh/wconstab/447/base 2025-10-10T00:44:17.0515514Z * [new branch] gh/wconstab/447/head -> origin/gh/wconstab/447/head 2025-10-10T00:44:17.0517260Z * [new branch] gh/wconstab/447/orig -> origin/gh/wconstab/447/orig 2025-10-10T00:44:17.0520284Z * [new branch] gh/weifengpy/30/base -> origin/gh/weifengpy/30/base 2025-10-10T00:44:17.0522073Z * [new branch] gh/weifengpy/30/head -> origin/gh/weifengpy/30/head 2025-10-10T00:44:17.0523981Z * [new branch] gh/weifengpy/30/orig -> 
origin/gh/weifengpy/30/orig 2025-10-10T00:44:17.0526343Z * [new branch] gh/weifengpy/31/base -> origin/gh/weifengpy/31/base 2025-10-10T00:44:17.0528687Z * [new branch] gh/weifengpy/31/head -> origin/gh/weifengpy/31/head 2025-10-10T00:44:17.0530418Z * [new branch] gh/weifengpy/31/orig -> origin/gh/weifengpy/31/orig 2025-10-10T00:44:17.0532645Z * [new branch] gh/weifengpy/32/base -> origin/gh/weifengpy/32/base 2025-10-10T00:44:17.0535439Z * [new branch] gh/weifengpy/32/head -> origin/gh/weifengpy/32/head 2025-10-10T00:44:17.0537246Z * [new branch] gh/weifengpy/32/orig -> origin/gh/weifengpy/32/orig 2025-10-10T00:44:17.0540768Z * [new branch] gh/weifengpy/33/base -> origin/gh/weifengpy/33/base 2025-10-10T00:44:17.0541796Z * [new branch] gh/weifengpy/33/head -> origin/gh/weifengpy/33/head 2025-10-10T00:44:17.0543927Z * [new branch] gh/weifengpy/33/orig -> origin/gh/weifengpy/33/orig 2025-10-10T00:44:17.0546546Z * [new branch] gh/weifengpy/34/base -> origin/gh/weifengpy/34/base 2025-10-10T00:44:17.0548326Z * [new branch] gh/weifengpy/34/head -> origin/gh/weifengpy/34/head 2025-10-10T00:44:17.0550177Z * [new branch] gh/weifengpy/34/orig -> origin/gh/weifengpy/34/orig 2025-10-10T00:44:17.0552854Z * [new branch] gh/weifengpy/35/base -> origin/gh/weifengpy/35/base 2025-10-10T00:44:17.0554556Z * [new branch] gh/weifengpy/35/head -> origin/gh/weifengpy/35/head 2025-10-10T00:44:17.0556359Z * [new branch] gh/weifengpy/35/orig -> origin/gh/weifengpy/35/orig 2025-10-10T00:44:17.0558918Z * [new branch] gh/weifengpy/36/base -> origin/gh/weifengpy/36/base 2025-10-10T00:44:17.0560665Z * [new branch] gh/weifengpy/36/head -> origin/gh/weifengpy/36/head 2025-10-10T00:44:17.0562466Z * [new branch] gh/weifengpy/36/orig -> origin/gh/weifengpy/36/orig 2025-10-10T00:44:17.0564786Z * [new branch] gh/weifengpy/37/base -> origin/gh/weifengpy/37/base 2025-10-10T00:44:17.0566601Z * [new branch] gh/weifengpy/37/head -> origin/gh/weifengpy/37/head 2025-10-10T00:44:17.0568649Z * [new branch] 
gh/weifengpy/37/orig -> origin/gh/weifengpy/37/orig 2025-10-10T00:44:17.0571756Z * [new branch] gh/williamwen42/250/base -> origin/gh/williamwen42/250/base 2025-10-10T00:44:17.0573614Z * [new branch] gh/williamwen42/250/head -> origin/gh/williamwen42/250/head 2025-10-10T00:44:17.0575399Z * [new branch] gh/williamwen42/250/orig -> origin/gh/williamwen42/250/orig 2025-10-10T00:44:17.0578013Z * [new branch] gh/williamwen42/278/base -> origin/gh/williamwen42/278/base 2025-10-10T00:44:17.0579895Z * [new branch] gh/williamwen42/278/head -> origin/gh/williamwen42/278/head 2025-10-10T00:44:17.0581724Z * [new branch] gh/williamwen42/278/orig -> origin/gh/williamwen42/278/orig 2025-10-10T00:44:17.0584357Z * [new branch] gh/williamwen42/279/base -> origin/gh/williamwen42/279/base 2025-10-10T00:44:17.0586291Z * [new branch] gh/williamwen42/279/head -> origin/gh/williamwen42/279/head 2025-10-10T00:44:17.0588115Z * [new branch] gh/williamwen42/279/orig -> origin/gh/williamwen42/279/orig 2025-10-10T00:44:17.0590623Z * [new branch] gh/williamwen42/281/base -> origin/gh/williamwen42/281/base 2025-10-10T00:44:17.0592676Z * [new branch] gh/williamwen42/281/head -> origin/gh/williamwen42/281/head 2025-10-10T00:44:17.0594432Z * [new branch] gh/williamwen42/281/orig -> origin/gh/williamwen42/281/orig 2025-10-10T00:44:17.0596971Z * [new branch] gh/williamwen42/282/base -> origin/gh/williamwen42/282/base 2025-10-10T00:44:17.0599524Z * [new branch] gh/williamwen42/282/head -> origin/gh/williamwen42/282/head 2025-10-10T00:44:17.0601389Z * [new branch] gh/williamwen42/282/orig -> origin/gh/williamwen42/282/orig 2025-10-10T00:44:17.0603922Z * [new branch] gh/williamwen42/285/base -> origin/gh/williamwen42/285/base 2025-10-10T00:44:17.0605839Z * [new branch] gh/williamwen42/285/head -> origin/gh/williamwen42/285/head 2025-10-10T00:44:17.0607806Z * [new branch] gh/williamwen42/285/orig -> origin/gh/williamwen42/285/orig 2025-10-10T00:44:17.0610299Z * [new branch] gh/williamwen42/286/base -> 
origin/gh/williamwen42/286/base 2025-10-10T00:44:17.0611769Z * [new branch] gh/williamwen42/286/head -> origin/gh/williamwen42/286/head 2025-10-10T00:44:17.0613742Z * [new branch] gh/williamwen42/286/orig -> origin/gh/williamwen42/286/orig 2025-10-10T00:44:17.0616167Z * [new branch] gh/williamwen42/287/base -> origin/gh/williamwen42/287/base 2025-10-10T00:44:17.0618079Z * [new branch] gh/williamwen42/287/head -> origin/gh/williamwen42/287/head 2025-10-10T00:44:17.0619964Z * [new branch] gh/williamwen42/287/orig -> origin/gh/williamwen42/287/orig 2025-10-10T00:44:17.0622523Z * [new branch] gh/williamwen42/288/base -> origin/gh/williamwen42/288/base 2025-10-10T00:44:17.0624400Z * [new branch] gh/williamwen42/288/head -> origin/gh/williamwen42/288/head 2025-10-10T00:44:17.0626107Z * [new branch] gh/williamwen42/288/orig -> origin/gh/williamwen42/288/orig 2025-10-10T00:44:17.0628682Z * [new branch] gh/williamwen42/289/base -> origin/gh/williamwen42/289/base 2025-10-10T00:44:17.0630581Z * [new branch] gh/williamwen42/289/head -> origin/gh/williamwen42/289/head 2025-10-10T00:44:17.0632384Z * [new branch] gh/williamwen42/289/orig -> origin/gh/williamwen42/289/orig 2025-10-10T00:44:17.0634686Z * [new branch] gh/williamwen42/290/base -> origin/gh/williamwen42/290/base 2025-10-10T00:44:17.0636558Z * [new branch] gh/williamwen42/290/head -> origin/gh/williamwen42/290/head 2025-10-10T00:44:17.0638078Z * [new branch] gh/williamwen42/290/orig -> origin/gh/williamwen42/290/orig 2025-10-10T00:44:17.0640824Z * [new branch] gh/williamwen42/291/base -> origin/gh/williamwen42/291/base 2025-10-10T00:44:17.0642572Z * [new branch] gh/williamwen42/291/head -> origin/gh/williamwen42/291/head 2025-10-10T00:44:17.0644402Z * [new branch] gh/williamwen42/291/orig -> origin/gh/williamwen42/291/orig 2025-10-10T00:44:17.0646962Z * [new branch] gh/williamwen42/292/base -> origin/gh/williamwen42/292/base 2025-10-10T00:44:17.0648991Z * [new branch] gh/williamwen42/292/head -> 
origin/gh/williamwen42/292/head 2025-10-10T00:44:17.0650811Z * [new branch] gh/williamwen42/292/orig -> origin/gh/williamwen42/292/orig 2025-10-10T00:44:17.0653360Z * [new branch] gh/williamwen42/293/base -> origin/gh/williamwen42/293/base 2025-10-10T00:44:17.0654813Z * [new branch] gh/williamwen42/293/head -> origin/gh/williamwen42/293/head 2025-10-10T00:44:17.0656840Z * [new branch] gh/williamwen42/293/orig -> origin/gh/williamwen42/293/orig 2025-10-10T00:44:17.0659202Z * [new branch] gh/williamwen42/294/base -> origin/gh/williamwen42/294/base 2025-10-10T00:44:17.0661160Z * [new branch] gh/williamwen42/294/head -> origin/gh/williamwen42/294/head 2025-10-10T00:44:17.0662703Z * [new branch] gh/williamwen42/294/orig -> origin/gh/williamwen42/294/orig 2025-10-10T00:44:17.0665319Z * [new branch] gh/williamwen42/295/base -> origin/gh/williamwen42/295/base 2025-10-10T00:44:17.0667274Z * [new branch] gh/williamwen42/295/head -> origin/gh/williamwen42/295/head 2025-10-10T00:44:17.0669175Z * [new branch] gh/williamwen42/295/orig -> origin/gh/williamwen42/295/orig 2025-10-10T00:44:17.0672104Z * [new branch] gh/williamwen42/296/base -> origin/gh/williamwen42/296/base 2025-10-10T00:44:17.0674064Z * [new branch] gh/williamwen42/296/head -> origin/gh/williamwen42/296/head 2025-10-10T00:44:17.0675923Z * [new branch] gh/williamwen42/296/orig -> origin/gh/williamwen42/296/orig 2025-10-10T00:44:17.0678577Z * [new branch] gh/williamwen42/297/base -> origin/gh/williamwen42/297/base 2025-10-10T00:44:17.0680224Z * [new branch] gh/williamwen42/297/head -> origin/gh/williamwen42/297/head 2025-10-10T00:44:17.0681992Z * [new branch] gh/williamwen42/297/orig -> origin/gh/williamwen42/297/orig 2025-10-10T00:44:17.0684514Z * [new branch] gh/williamwen42/298/base -> origin/gh/williamwen42/298/base 2025-10-10T00:44:17.0686349Z * [new branch] gh/williamwen42/298/head -> origin/gh/williamwen42/298/head 2025-10-10T00:44:17.0688335Z * [new branch] gh/williamwen42/298/orig -> 
origin/gh/williamwen42/298/orig 2025-10-10T00:44:17.0690946Z * [new branch] gh/williamwen42/299/base -> origin/gh/williamwen42/299/base 2025-10-10T00:44:17.0692974Z * [new branch] gh/williamwen42/299/head -> origin/gh/williamwen42/299/head 2025-10-10T00:44:17.0694837Z * [new branch] gh/williamwen42/299/orig -> origin/gh/williamwen42/299/orig 2025-10-10T00:44:17.0697444Z * [new branch] gh/williamwen42/300/base -> origin/gh/williamwen42/300/base 2025-10-10T00:44:17.0699391Z * [new branch] gh/williamwen42/300/head -> origin/gh/williamwen42/300/head 2025-10-10T00:44:17.0701401Z * [new branch] gh/williamwen42/300/orig -> origin/gh/williamwen42/300/orig 2025-10-10T00:44:17.0704349Z * [new branch] gh/williamwen42/301/base -> origin/gh/williamwen42/301/base 2025-10-10T00:44:17.0706208Z * [new branch] gh/williamwen42/301/head -> origin/gh/williamwen42/301/head 2025-10-10T00:44:17.0707957Z * [new branch] gh/williamwen42/301/orig -> origin/gh/williamwen42/301/orig 2025-10-10T00:44:17.0710362Z * [new branch] gh/williamwen42/302/base -> origin/gh/williamwen42/302/base 2025-10-10T00:44:17.0712338Z * [new branch] gh/williamwen42/302/head -> origin/gh/williamwen42/302/head 2025-10-10T00:44:17.0714099Z * [new branch] gh/williamwen42/302/orig -> origin/gh/williamwen42/302/orig 2025-10-10T00:44:17.0716513Z * [new branch] gh/williamwen42/303/base -> origin/gh/williamwen42/303/base 2025-10-10T00:44:17.0718384Z * [new branch] gh/williamwen42/303/head -> origin/gh/williamwen42/303/head 2025-10-10T00:44:17.0720109Z * [new branch] gh/williamwen42/303/orig -> origin/gh/williamwen42/303/orig 2025-10-10T00:44:17.0722978Z * [new branch] gh/williamwen42/304/base -> origin/gh/williamwen42/304/base 2025-10-10T00:44:17.0724757Z * [new branch] gh/williamwen42/304/head -> origin/gh/williamwen42/304/head 2025-10-10T00:44:17.0726553Z * [new branch] gh/williamwen42/304/orig -> origin/gh/williamwen42/304/orig 2025-10-10T00:44:17.0729373Z * [new branch] gh/williamwen42/305/base -> 
origin/gh/williamwen42/305/base 2025-10-10T00:44:17.0731294Z * [new branch] gh/williamwen42/305/head -> origin/gh/williamwen42/305/head 2025-10-10T00:44:17.0733076Z * [new branch] gh/williamwen42/305/orig -> origin/gh/williamwen42/305/orig 2025-10-10T00:44:17.0735447Z * [new branch] gh/williamwen42/306/base -> origin/gh/williamwen42/306/base 2025-10-10T00:44:17.0737346Z * [new branch] gh/williamwen42/306/head -> origin/gh/williamwen42/306/head 2025-10-10T00:44:17.0739311Z * [new branch] gh/williamwen42/306/orig -> origin/gh/williamwen42/306/orig 2025-10-10T00:44:17.0741645Z * [new branch] gh/williamwen42/307/base -> origin/gh/williamwen42/307/base 2025-10-10T00:44:17.0743613Z * [new branch] gh/williamwen42/307/head -> origin/gh/williamwen42/307/head 2025-10-10T00:44:17.0745097Z * [new branch] gh/williamwen42/307/orig -> origin/gh/williamwen42/307/orig 2025-10-10T00:44:17.0748514Z * [new branch] gh/xmfan/169/base -> origin/gh/xmfan/169/base 2025-10-10T00:44:17.0750288Z * [new branch] gh/xmfan/169/head -> origin/gh/xmfan/169/head 2025-10-10T00:44:17.0752708Z * [new branch] gh/xmfan/170/base -> origin/gh/xmfan/170/base 2025-10-10T00:44:17.0754391Z * [new branch] gh/xmfan/170/head -> origin/gh/xmfan/170/head 2025-10-10T00:44:17.0756803Z * [new branch] gh/xmfan/244/base -> origin/gh/xmfan/244/base 2025-10-10T00:44:17.0758588Z * [new branch] gh/xmfan/244/head -> origin/gh/xmfan/244/head 2025-10-10T00:44:17.0760372Z * [new branch] gh/xmfan/244/orig -> origin/gh/xmfan/244/orig 2025-10-10T00:44:17.0762762Z * [new branch] gh/xmfan/246/base -> origin/gh/xmfan/246/base 2025-10-10T00:44:17.0764562Z * [new branch] gh/xmfan/246/head -> origin/gh/xmfan/246/head 2025-10-10T00:44:17.0766387Z * [new branch] gh/xmfan/246/orig -> origin/gh/xmfan/246/orig 2025-10-10T00:44:17.0769042Z * [new branch] gh/xmfan/253/base -> origin/gh/xmfan/253/base 2025-10-10T00:44:17.0770966Z * [new branch] gh/xmfan/253/head -> origin/gh/xmfan/253/head 2025-10-10T00:44:17.0772772Z * [new branch] 
gh/xmfan/253/orig -> origin/gh/xmfan/253/orig 2025-10-10T00:44:17.0775215Z * [new branch] gh/xmfan/260/base -> origin/gh/xmfan/260/base 2025-10-10T00:44:17.0776982Z * [new branch] gh/xmfan/260/head -> origin/gh/xmfan/260/head 2025-10-10T00:44:17.0778763Z * [new branch] gh/xmfan/260/orig -> origin/gh/xmfan/260/orig 2025-10-10T00:44:17.0781192Z * [new branch] gh/xmfan/262/base -> origin/gh/xmfan/262/base 2025-10-10T00:44:17.0782975Z * [new branch] gh/xmfan/262/head -> origin/gh/xmfan/262/head 2025-10-10T00:44:17.0784742Z * [new branch] gh/xmfan/262/orig -> origin/gh/xmfan/262/orig 2025-10-10T00:44:17.0787190Z * [new branch] gh/xmfan/274/base -> origin/gh/xmfan/274/base 2025-10-10T00:44:17.0788997Z * [new branch] gh/xmfan/274/head -> origin/gh/xmfan/274/head 2025-10-10T00:44:17.0790765Z * [new branch] gh/xmfan/274/orig -> origin/gh/xmfan/274/orig 2025-10-10T00:44:17.0793269Z * [new branch] gh/xmfan/277/base -> origin/gh/xmfan/277/base 2025-10-10T00:44:17.0795226Z * [new branch] gh/xmfan/277/head -> origin/gh/xmfan/277/head 2025-10-10T00:44:17.0796952Z * [new branch] gh/xmfan/277/orig -> origin/gh/xmfan/277/orig 2025-10-10T00:44:17.0799702Z * [new branch] gh/xmfan/281/base -> origin/gh/xmfan/281/base 2025-10-10T00:44:17.0801399Z * [new branch] gh/xmfan/281/head -> origin/gh/xmfan/281/head 2025-10-10T00:44:17.0810245Z * [new branch] gh/xmfan/281/orig -> origin/gh/xmfan/281/orig 2025-10-10T00:44:17.0810807Z * [new branch] gh/xmfan/284/base -> origin/gh/xmfan/284/base 2025-10-10T00:44:17.0811291Z * [new branch] gh/xmfan/284/head -> origin/gh/xmfan/284/head 2025-10-10T00:44:17.0811776Z * [new branch] gh/xmfan/284/orig -> origin/gh/xmfan/284/orig 2025-10-10T00:44:17.0812515Z * [new branch] gh/xmfan/285/base -> origin/gh/xmfan/285/base 2025-10-10T00:44:17.0814758Z * [new branch] gh/xmfan/285/head -> origin/gh/xmfan/285/head 2025-10-10T00:44:17.0816418Z * [new branch] gh/xmfan/285/orig -> origin/gh/xmfan/285/orig 2025-10-10T00:44:17.0819308Z * [new branch] gh/xmfan/286/base 
-> origin/gh/xmfan/286/base 2025-10-10T00:44:17.0821240Z * [new branch] gh/xmfan/286/head -> origin/gh/xmfan/286/head 2025-10-10T00:44:17.0823112Z * [new branch] gh/xmfan/286/orig -> origin/gh/xmfan/286/orig 2025-10-10T00:44:17.0825606Z * [new branch] gh/xmfan/287/base -> origin/gh/xmfan/287/base 2025-10-10T00:44:17.0827428Z * [new branch] gh/xmfan/287/head -> origin/gh/xmfan/287/head 2025-10-10T00:44:17.0829169Z * [new branch] gh/xmfan/287/orig -> origin/gh/xmfan/287/orig 2025-10-10T00:44:17.0831741Z * [new branch] gh/xmfan/288/base -> origin/gh/xmfan/288/base 2025-10-10T00:44:17.0833652Z * [new branch] gh/xmfan/288/head -> origin/gh/xmfan/288/head 2025-10-10T00:44:17.0835625Z * [new branch] gh/xmfan/288/orig -> origin/gh/xmfan/288/orig 2025-10-10T00:44:17.0838032Z * [new branch] gh/xmfan/289/base -> origin/gh/xmfan/289/base 2025-10-10T00:44:17.0839883Z * [new branch] gh/xmfan/289/head -> origin/gh/xmfan/289/head 2025-10-10T00:44:17.0841699Z * [new branch] gh/xmfan/289/orig -> origin/gh/xmfan/289/orig 2025-10-10T00:44:17.0845254Z * [new branch] gh/xmfan/290/base -> origin/gh/xmfan/290/base 2025-10-10T00:44:17.0847249Z * [new branch] gh/xmfan/290/head -> origin/gh/xmfan/290/head 2025-10-10T00:44:17.0849127Z * [new branch] gh/xmfan/290/orig -> origin/gh/xmfan/290/orig 2025-10-10T00:44:17.0851794Z * [new branch] gh/xmfan/291/base -> origin/gh/xmfan/291/base 2025-10-10T00:44:17.0853666Z * [new branch] gh/xmfan/291/head -> origin/gh/xmfan/291/head 2025-10-10T00:44:17.0855885Z * [new branch] gh/xmfan/291/orig -> origin/gh/xmfan/291/orig 2025-10-10T00:44:17.0858569Z * [new branch] gh/xmfan/292/base -> origin/gh/xmfan/292/base 2025-10-10T00:44:17.0860727Z * [new branch] gh/xmfan/292/head -> origin/gh/xmfan/292/head 2025-10-10T00:44:17.0862389Z * [new branch] gh/xmfan/292/orig -> origin/gh/xmfan/292/orig 2025-10-10T00:44:17.0864913Z * [new branch] gh/xmfan/293/base -> origin/gh/xmfan/293/base 2025-10-10T00:44:17.0866698Z * [new branch] gh/xmfan/293/head -> 
origin/gh/xmfan/293/head 2025-10-10T00:44:17.0868579Z * [new branch] gh/xmfan/293/orig -> origin/gh/xmfan/293/orig 2025-10-10T00:44:17.0870903Z * [new branch] gh/xmfan/294/base -> origin/gh/xmfan/294/base 2025-10-10T00:44:17.0872742Z * [new branch] gh/xmfan/294/head -> origin/gh/xmfan/294/head 2025-10-10T00:44:17.0874571Z * [new branch] gh/xmfan/294/orig -> origin/gh/xmfan/294/orig 2025-10-10T00:44:17.0877356Z * [new branch] gh/xmfan/295/base -> origin/gh/xmfan/295/base 2025-10-10T00:44:17.0879356Z * [new branch] gh/xmfan/295/head -> origin/gh/xmfan/295/head 2025-10-10T00:44:17.0881132Z * [new branch] gh/xmfan/295/orig -> origin/gh/xmfan/295/orig 2025-10-10T00:44:17.0884165Z * [new branch] gh/xmfan/296/base -> origin/gh/xmfan/296/base 2025-10-10T00:44:17.0886060Z * [new branch] gh/xmfan/296/head -> origin/gh/xmfan/296/head 2025-10-10T00:44:17.0887983Z * [new branch] gh/xmfan/296/orig -> origin/gh/xmfan/296/orig 2025-10-10T00:44:17.0890575Z * [new branch] gh/xmfan/297/base -> origin/gh/xmfan/297/base 2025-10-10T00:44:17.0892668Z * [new branch] gh/xmfan/297/head -> origin/gh/xmfan/297/head 2025-10-10T00:44:17.0894346Z * [new branch] gh/xmfan/297/orig -> origin/gh/xmfan/297/orig 2025-10-10T00:44:17.0896904Z * [new branch] gh/xmfan/298/base -> origin/gh/xmfan/298/base 2025-10-10T00:44:17.0898868Z * [new branch] gh/xmfan/298/head -> origin/gh/xmfan/298/head 2025-10-10T00:44:17.0902958Z * [new branch] gh/xmfan/298/orig -> origin/gh/xmfan/298/orig 2025-10-10T00:44:17.0905609Z * [new branch] gh/xmfan/299/base -> origin/gh/xmfan/299/base 2025-10-10T00:44:17.0907379Z * [new branch] gh/xmfan/299/head -> origin/gh/xmfan/299/head 2025-10-10T00:44:17.0909179Z * [new branch] gh/xmfan/299/orig -> origin/gh/xmfan/299/orig 2025-10-10T00:44:17.0911631Z * [new branch] gh/xmfan/300/base -> origin/gh/xmfan/300/base 2025-10-10T00:44:17.0913432Z * [new branch] gh/xmfan/300/head -> origin/gh/xmfan/300/head 2025-10-10T00:44:17.0915177Z * [new branch] gh/xmfan/300/orig -> 
origin/gh/xmfan/300/orig 2025-10-10T00:44:17.0917693Z * [new branch] gh/xmfan/301/base -> origin/gh/xmfan/301/base 2025-10-10T00:44:17.0919468Z * [new branch] gh/xmfan/301/head -> origin/gh/xmfan/301/head 2025-10-10T00:44:17.0921281Z * [new branch] gh/xmfan/301/orig -> origin/gh/xmfan/301/orig 2025-10-10T00:44:17.0923836Z * [new branch] gh/xmfan/302/base -> origin/gh/xmfan/302/base 2025-10-10T00:44:17.0925761Z * [new branch] gh/xmfan/302/head -> origin/gh/xmfan/302/head 2025-10-10T00:44:17.0927742Z * [new branch] gh/xmfan/302/orig -> origin/gh/xmfan/302/orig 2025-10-10T00:44:17.0930745Z * [new branch] gh/xmfan/303/base -> origin/gh/xmfan/303/base 2025-10-10T00:44:17.0932509Z * [new branch] gh/xmfan/303/head -> origin/gh/xmfan/303/head 2025-10-10T00:44:17.0934372Z * [new branch] gh/xmfan/303/orig -> origin/gh/xmfan/303/orig 2025-10-10T00:44:17.0936715Z * [new branch] gh/xmfan/304/base -> origin/gh/xmfan/304/base 2025-10-10T00:44:17.0938620Z * [new branch] gh/xmfan/304/head -> origin/gh/xmfan/304/head 2025-10-10T00:44:17.0940402Z * [new branch] gh/xmfan/304/orig -> origin/gh/xmfan/304/orig 2025-10-10T00:44:17.0943436Z * [new branch] gh/xuanzhang816/14/base -> origin/gh/xuanzhang816/14/base 2025-10-10T00:44:17.0945223Z * [new branch] gh/xuanzhang816/14/head -> origin/gh/xuanzhang816/14/head 2025-10-10T00:44:17.0947044Z * [new branch] gh/xuanzhang816/14/orig -> origin/gh/xuanzhang816/14/orig 2025-10-10T00:44:17.0950119Z * [new branch] gh/xuanzhang816/22/base -> origin/gh/xuanzhang816/22/base 2025-10-10T00:44:17.0952132Z * [new branch] gh/xuanzhang816/22/head -> origin/gh/xuanzhang816/22/head 2025-10-10T00:44:17.0954009Z * [new branch] gh/xuanzhang816/22/orig -> origin/gh/xuanzhang816/22/orig 2025-10-10T00:44:17.0956458Z * [new branch] gh/xuanzhang816/23/base -> origin/gh/xuanzhang816/23/base 2025-10-10T00:44:17.0958812Z * [new branch] gh/xuanzhang816/23/head -> origin/gh/xuanzhang816/23/head 2025-10-10T00:44:17.0960568Z * [new branch] gh/xuanzhang816/23/orig -> 
origin/gh/xuanzhang816/23/orig 2025-10-10T00:44:17.0963080Z * [new branch] gh/xuanzhang816/25/base -> origin/gh/xuanzhang816/25/base 2025-10-10T00:44:17.0964888Z * [new branch] gh/xuanzhang816/25/head -> origin/gh/xuanzhang816/25/head 2025-10-10T00:44:17.0966729Z * [new branch] gh/xuanzhang816/25/orig -> origin/gh/xuanzhang816/25/orig 2025-10-10T00:44:17.0969531Z * [new branch] gh/xuanzhang816/26/base -> origin/gh/xuanzhang816/26/base 2025-10-10T00:44:17.0971169Z * [new branch] gh/xuanzhang816/26/head -> origin/gh/xuanzhang816/26/head 2025-10-10T00:44:17.0973030Z * [new branch] gh/xuanzhang816/26/orig -> origin/gh/xuanzhang816/26/orig 2025-10-10T00:44:17.0975586Z * [new branch] gh/xuanzhang816/27/base -> origin/gh/xuanzhang816/27/base 2025-10-10T00:44:17.0977295Z * [new branch] gh/xuanzhang816/27/head -> origin/gh/xuanzhang816/27/head 2025-10-10T00:44:17.0979670Z * [new branch] gh/xuanzhang816/27/orig -> origin/gh/xuanzhang816/27/orig 2025-10-10T00:44:17.0982112Z * [new branch] gh/xuanzhang816/28/base -> origin/gh/xuanzhang816/28/base 2025-10-10T00:44:17.0983860Z * [new branch] gh/xuanzhang816/28/head -> origin/gh/xuanzhang816/28/head 2025-10-10T00:44:17.0985761Z * [new branch] gh/xuanzhang816/28/orig -> origin/gh/xuanzhang816/28/orig 2025-10-10T00:44:17.0988309Z * [new branch] gh/xuanzhang816/29/base -> origin/gh/xuanzhang816/29/base 2025-10-10T00:44:17.0990557Z * [new branch] gh/xuanzhang816/29/head -> origin/gh/xuanzhang816/29/head 2025-10-10T00:44:17.0992330Z * [new branch] gh/xuanzhang816/29/orig -> origin/gh/xuanzhang816/29/orig 2025-10-10T00:44:17.0994760Z * [new branch] gh/xuanzhang816/30/base -> origin/gh/xuanzhang816/30/base 2025-10-10T00:44:17.0996569Z * [new branch] gh/xuanzhang816/30/head -> origin/gh/xuanzhang816/30/head 2025-10-10T00:44:17.0998640Z * [new branch] gh/xuanzhang816/30/orig -> origin/gh/xuanzhang816/30/orig 2025-10-10T00:44:17.1001322Z * [new branch] gh/xuanzhang816/31/base -> origin/gh/xuanzhang816/31/base 2025-10-10T00:44:17.1003000Z * 
[new branch] gh/xuanzhang816/31/head -> origin/gh/xuanzhang816/31/head 2025-10-10T00:44:17.1004771Z * [new branch] gh/xuanzhang816/31/orig -> origin/gh/xuanzhang816/31/orig 2025-10-10T00:44:17.1007394Z * [new branch] gh/xuanzhang816/32/base -> origin/gh/xuanzhang816/32/base 2025-10-10T00:44:17.1009800Z * [new branch] gh/xuanzhang816/32/head -> origin/gh/xuanzhang816/32/head 2025-10-10T00:44:17.1012073Z * [new branch] gh/xuanzhang816/32/orig -> origin/gh/xuanzhang816/32/orig 2025-10-10T00:44:17.1014796Z * [new branch] gh/xuanzhang816/33/base -> origin/gh/xuanzhang816/33/base 2025-10-10T00:44:17.1016416Z * [new branch] gh/xuanzhang816/33/head -> origin/gh/xuanzhang816/33/head 2025-10-10T00:44:17.1018106Z * [new branch] gh/xuanzhang816/33/orig -> origin/gh/xuanzhang816/33/orig 2025-10-10T00:44:17.1021491Z * [new branch] gh/yanbing-j/11/base -> origin/gh/yanbing-j/11/base 2025-10-10T00:44:17.1023302Z * [new branch] gh/yanbing-j/11/head -> origin/gh/yanbing-j/11/head 2025-10-10T00:44:17.1025100Z * [new branch] gh/yanbing-j/11/orig -> origin/gh/yanbing-j/11/orig 2025-10-10T00:44:17.1027695Z * [new branch] gh/yanbing-j/12/base -> origin/gh/yanbing-j/12/base 2025-10-10T00:44:17.1029529Z * [new branch] gh/yanbing-j/12/head -> origin/gh/yanbing-j/12/head 2025-10-10T00:44:17.1031342Z * [new branch] gh/yanbing-j/12/orig -> origin/gh/yanbing-j/12/orig 2025-10-10T00:44:17.1034343Z * [new branch] gh/yanbing-j/13/base -> origin/gh/yanbing-j/13/base 2025-10-10T00:44:17.1036513Z * [new branch] gh/yanbing-j/13/head -> origin/gh/yanbing-j/13/head 2025-10-10T00:44:17.1038476Z * [new branch] gh/yanbing-j/13/orig -> origin/gh/yanbing-j/13/orig 2025-10-10T00:44:17.1041028Z * [new branch] gh/yanbing-j/14/base -> origin/gh/yanbing-j/14/base 2025-10-10T00:44:17.1042876Z * [new branch] gh/yanbing-j/14/head -> origin/gh/yanbing-j/14/head 2025-10-10T00:44:17.1045044Z * [new branch] gh/yanbing-j/14/orig -> origin/gh/yanbing-j/14/orig 2025-10-10T00:44:17.1047449Z * [new branch] 
gh/yanbing-j/15/base -> origin/gh/yanbing-j/15/base 2025-10-10T00:44:17.1049450Z * [new branch] gh/yanbing-j/15/head -> origin/gh/yanbing-j/15/head 2025-10-10T00:44:17.1051312Z * [new branch] gh/yanbing-j/15/orig -> origin/gh/yanbing-j/15/orig 2025-10-10T00:44:17.1054033Z * [new branch] gh/yanbing-j/18/base -> origin/gh/yanbing-j/18/base 2025-10-10T00:44:17.1055774Z * [new branch] gh/yanbing-j/18/head -> origin/gh/yanbing-j/18/head 2025-10-10T00:44:17.1057605Z * [new branch] gh/yanbing-j/18/orig -> origin/gh/yanbing-j/18/orig 2025-10-10T00:44:17.1060273Z * [new branch] gh/yanbing-j/19/base -> origin/gh/yanbing-j/19/base 2025-10-10T00:44:17.1062030Z * [new branch] gh/yanbing-j/19/head -> origin/gh/yanbing-j/19/head 2025-10-10T00:44:17.1063794Z * [new branch] gh/yanbing-j/19/orig -> origin/gh/yanbing-j/19/orig 2025-10-10T00:44:17.1066647Z * [new branch] gh/yanbing-j/20/base -> origin/gh/yanbing-j/20/base 2025-10-10T00:44:17.1068569Z * [new branch] gh/yanbing-j/20/head -> origin/gh/yanbing-j/20/head 2025-10-10T00:44:17.1070502Z * [new branch] gh/yanbing-j/20/orig -> origin/gh/yanbing-j/20/orig 2025-10-10T00:44:17.1073120Z * [new branch] gh/yanbing-j/21/base -> origin/gh/yanbing-j/21/base 2025-10-10T00:44:17.1075057Z * [new branch] gh/yanbing-j/21/head -> origin/gh/yanbing-j/21/head 2025-10-10T00:44:17.1077629Z * [new branch] gh/yanbing-j/22/base -> origin/gh/yanbing-j/22/base 2025-10-10T00:44:17.1079522Z * [new branch] gh/yanbing-j/22/head -> origin/gh/yanbing-j/22/head 2025-10-10T00:44:17.1081375Z * [new branch] gh/yanbing-j/22/orig -> origin/gh/yanbing-j/22/orig 2025-10-10T00:44:17.1083904Z * [new branch] gh/yanbing-j/23/base -> origin/gh/yanbing-j/23/base 2025-10-10T00:44:17.1085762Z * [new branch] gh/yanbing-j/23/head -> origin/gh/yanbing-j/23/head 2025-10-10T00:44:17.1088328Z * [new branch] gh/yanbing-j/23/orig -> origin/gh/yanbing-j/23/orig 2025-10-10T00:44:17.1090901Z * [new branch] gh/yanbing-j/24/base -> origin/gh/yanbing-j/24/base 
2025-10-10T00:44:17.1092785Z * [new branch] gh/yanbing-j/24/head -> origin/gh/yanbing-j/24/head 2025-10-10T00:44:17.1094646Z * [new branch] gh/yanbing-j/24/orig -> origin/gh/yanbing-j/24/orig 2025-10-10T00:44:17.1097721Z * [new branch] gh/yanbing-j/25/base -> origin/gh/yanbing-j/25/base 2025-10-10T00:44:17.1100109Z * [new branch] gh/yanbing-j/25/head -> origin/gh/yanbing-j/25/head 2025-10-10T00:44:17.1101907Z * [new branch] gh/yanbing-j/25/orig -> origin/gh/yanbing-j/25/orig 2025-10-10T00:44:17.1104587Z * [new branch] gh/yanbing-j/26/base -> origin/gh/yanbing-j/26/base 2025-10-10T00:44:17.1106324Z * [new branch] gh/yanbing-j/26/head -> origin/gh/yanbing-j/26/head 2025-10-10T00:44:17.1108704Z * [new branch] gh/yanbing-j/26/orig -> origin/gh/yanbing-j/26/orig 2025-10-10T00:44:17.1111271Z * [new branch] gh/yanbing-j/36/base -> origin/gh/yanbing-j/36/base 2025-10-10T00:44:17.1113102Z * [new branch] gh/yanbing-j/36/head -> origin/gh/yanbing-j/36/head 2025-10-10T00:44:17.1114955Z * [new branch] gh/yanbing-j/36/orig -> origin/gh/yanbing-j/36/orig 2025-10-10T00:44:17.1118533Z * [new branch] gh/yangw-dev/12/base -> origin/gh/yangw-dev/12/base 2025-10-10T00:44:17.1120353Z * [new branch] gh/yangw-dev/12/head -> origin/gh/yangw-dev/12/head 2025-10-10T00:44:17.1122344Z * [new branch] gh/yangw-dev/12/orig -> origin/gh/yangw-dev/12/orig 2025-10-10T00:44:17.1124749Z * [new branch] gh/yangw-dev/13/base -> origin/gh/yangw-dev/13/base 2025-10-10T00:44:17.1126722Z * [new branch] gh/yangw-dev/13/head -> origin/gh/yangw-dev/13/head 2025-10-10T00:44:17.1128740Z * [new branch] gh/yangw-dev/13/orig -> origin/gh/yangw-dev/13/orig 2025-10-10T00:44:17.1131386Z * [new branch] gh/yangw-dev/14/base -> origin/gh/yangw-dev/14/base 2025-10-10T00:44:17.1133242Z * [new branch] gh/yangw-dev/14/head -> origin/gh/yangw-dev/14/head 2025-10-10T00:44:17.1135107Z * [new branch] gh/yangw-dev/14/orig -> origin/gh/yangw-dev/14/orig 2025-10-10T00:44:17.1137653Z * [new branch] gh/yangw-dev/15/base -> 
origin/gh/yangw-dev/15/base 2025-10-10T00:44:17.1139519Z * [new branch] gh/yangw-dev/15/head -> origin/gh/yangw-dev/15/head 2025-10-10T00:44:17.1141388Z * [new branch] gh/yangw-dev/15/orig -> origin/gh/yangw-dev/15/orig 2025-10-10T00:44:17.1143944Z * [new branch] gh/yangw-dev/19/base -> origin/gh/yangw-dev/19/base 2025-10-10T00:44:17.1145807Z * [new branch] gh/yangw-dev/19/head -> origin/gh/yangw-dev/19/head 2025-10-10T00:44:17.1147736Z * [new branch] gh/yangw-dev/19/orig -> origin/gh/yangw-dev/19/orig 2025-10-10T00:44:17.1150384Z * [new branch] gh/yangw-dev/26/base -> origin/gh/yangw-dev/26/base 2025-10-10T00:44:17.1152242Z * [new branch] gh/yangw-dev/26/head -> origin/gh/yangw-dev/26/head 2025-10-10T00:44:17.1154095Z * [new branch] gh/yangw-dev/26/orig -> origin/gh/yangw-dev/26/orig 2025-10-10T00:44:17.1156694Z * [new branch] gh/yangw-dev/27/base -> origin/gh/yangw-dev/27/base 2025-10-10T00:44:17.1158541Z * [new branch] gh/yangw-dev/27/head -> origin/gh/yangw-dev/27/head 2025-10-10T00:44:17.1160274Z * [new branch] gh/yangw-dev/27/orig -> origin/gh/yangw-dev/27/orig 2025-10-10T00:44:17.1163525Z * [new branch] gh/ydwu4/262/base -> origin/gh/ydwu4/262/base 2025-10-10T00:44:17.1165431Z * [new branch] gh/ydwu4/262/head -> origin/gh/ydwu4/262/head 2025-10-10T00:44:17.1167329Z * [new branch] gh/ydwu4/262/orig -> origin/gh/ydwu4/262/orig 2025-10-10T00:44:17.1169943Z * [new branch] gh/ydwu4/263/base -> origin/gh/ydwu4/263/base 2025-10-10T00:44:17.1171802Z * [new branch] gh/ydwu4/263/head -> origin/gh/ydwu4/263/head 2025-10-10T00:44:17.1173663Z * [new branch] gh/ydwu4/263/orig -> origin/gh/ydwu4/263/orig 2025-10-10T00:44:17.1176456Z * [new branch] gh/ydwu4/269/base -> origin/gh/ydwu4/269/base 2025-10-10T00:44:17.1178472Z * [new branch] gh/ydwu4/269/head -> origin/gh/ydwu4/269/head 2025-10-10T00:44:17.1180245Z * [new branch] gh/ydwu4/269/orig -> origin/gh/ydwu4/269/orig 2025-10-10T00:44:17.1182860Z * [new branch] gh/ydwu4/270/base -> origin/gh/ydwu4/270/base 
2025-10-10T00:44:17.1184770Z * [new branch] gh/ydwu4/270/head -> origin/gh/ydwu4/270/head 2025-10-10T00:44:17.1186856Z * [new branch] gh/ydwu4/270/orig -> origin/gh/ydwu4/270/orig 2025-10-10T00:44:17.1189272Z * [new branch] gh/ydwu4/272/base -> origin/gh/ydwu4/272/base 2025-10-10T00:44:17.1191245Z * [new branch] gh/ydwu4/272/head -> origin/gh/ydwu4/272/head 2025-10-10T00:44:17.1193179Z * [new branch] gh/ydwu4/272/orig -> origin/gh/ydwu4/272/orig 2025-10-10T00:44:17.1195532Z * [new branch] gh/ydwu4/275/base -> origin/gh/ydwu4/275/base 2025-10-10T00:44:17.1197676Z * [new branch] gh/ydwu4/275/head -> origin/gh/ydwu4/275/head 2025-10-10T00:44:17.1199362Z * [new branch] gh/ydwu4/275/orig -> origin/gh/ydwu4/275/orig 2025-10-10T00:44:17.1202001Z * [new branch] gh/ydwu4/276/base -> origin/gh/ydwu4/276/base 2025-10-10T00:44:17.1203853Z * [new branch] gh/ydwu4/276/head -> origin/gh/ydwu4/276/head 2025-10-10T00:44:17.1205800Z * [new branch] gh/ydwu4/276/orig -> origin/gh/ydwu4/276/orig 2025-10-10T00:44:17.1208750Z * [new branch] gh/ydwu4/283/base -> origin/gh/ydwu4/283/base 2025-10-10T00:44:17.1210539Z * [new branch] gh/ydwu4/283/head -> origin/gh/ydwu4/283/head 2025-10-10T00:44:17.1212336Z * [new branch] gh/ydwu4/283/orig -> origin/gh/ydwu4/283/orig 2025-10-10T00:44:17.1214951Z * [new branch] gh/ydwu4/292/base -> origin/gh/ydwu4/292/base 2025-10-10T00:44:17.1216783Z * [new branch] gh/ydwu4/292/head -> origin/gh/ydwu4/292/head 2025-10-10T00:44:17.1218682Z * [new branch] gh/ydwu4/292/orig -> origin/gh/ydwu4/292/orig 2025-10-10T00:44:17.1221808Z * [new branch] gh/ydwu4/294/base -> origin/gh/ydwu4/294/base 2025-10-10T00:44:17.1223651Z * [new branch] gh/ydwu4/294/head -> origin/gh/ydwu4/294/head 2025-10-10T00:44:17.1225421Z * [new branch] gh/ydwu4/294/orig -> origin/gh/ydwu4/294/orig 2025-10-10T00:44:17.1228184Z * [new branch] gh/ydwu4/295/base -> origin/gh/ydwu4/295/base 2025-10-10T00:44:17.1230099Z * [new branch] gh/ydwu4/295/head -> origin/gh/ydwu4/295/head 
2025-10-10T00:44:17.1232119Z * [new branch] gh/ydwu4/295/orig -> origin/gh/ydwu4/295/orig 2025-10-10T00:44:17.1234633Z * [new branch] gh/ydwu4/296/base -> origin/gh/ydwu4/296/base 2025-10-10T00:44:17.1236501Z * [new branch] gh/ydwu4/296/head -> origin/gh/ydwu4/296/head 2025-10-10T00:44:17.1238285Z * [new branch] gh/ydwu4/296/orig -> origin/gh/ydwu4/296/orig 2025-10-10T00:44:17.1240909Z * [new branch] gh/ydwu4/306/base -> origin/gh/ydwu4/306/base 2025-10-10T00:44:17.1242901Z * [new branch] gh/ydwu4/306/head -> origin/gh/ydwu4/306/head 2025-10-10T00:44:17.1244853Z * [new branch] gh/ydwu4/306/orig -> origin/gh/ydwu4/306/orig 2025-10-10T00:44:17.1247481Z * [new branch] gh/ydwu4/312/base -> origin/gh/ydwu4/312/base 2025-10-10T00:44:17.1249412Z * [new branch] gh/ydwu4/312/head -> origin/gh/ydwu4/312/head 2025-10-10T00:44:17.1251265Z * [new branch] gh/ydwu4/312/orig -> origin/gh/ydwu4/312/orig 2025-10-10T00:44:17.1254059Z * [new branch] gh/ydwu4/318/base -> origin/gh/ydwu4/318/base 2025-10-10T00:44:17.1255967Z * [new branch] gh/ydwu4/318/head -> origin/gh/ydwu4/318/head 2025-10-10T00:44:17.1257976Z * [new branch] gh/ydwu4/318/orig -> origin/gh/ydwu4/318/orig 2025-10-10T00:44:17.1260549Z * [new branch] gh/ydwu4/319/base -> origin/gh/ydwu4/319/base 2025-10-10T00:44:17.1262619Z * [new branch] gh/ydwu4/319/head -> origin/gh/ydwu4/319/head 2025-10-10T00:44:17.1264482Z * [new branch] gh/ydwu4/319/orig -> origin/gh/ydwu4/319/orig 2025-10-10T00:44:17.1267579Z * [new branch] gh/ydwu4/320/base -> origin/gh/ydwu4/320/base 2025-10-10T00:44:17.1269426Z * [new branch] gh/ydwu4/320/head -> origin/gh/ydwu4/320/head 2025-10-10T00:44:17.1271693Z * [new branch] gh/ydwu4/320/orig -> origin/gh/ydwu4/320/orig 2025-10-10T00:44:17.1274422Z * [new branch] gh/ydwu4/321/base -> origin/gh/ydwu4/321/base 2025-10-10T00:44:17.1276431Z * [new branch] gh/ydwu4/321/head -> origin/gh/ydwu4/321/head 2025-10-10T00:44:17.1278151Z * [new branch] gh/ydwu4/321/orig -> origin/gh/ydwu4/321/orig 
2025-10-10T00:44:17.1280765Z * [new branch] gh/ydwu4/322/base -> origin/gh/ydwu4/322/base 2025-10-10T00:44:17.1282609Z * [new branch] gh/ydwu4/322/head -> origin/gh/ydwu4/322/head 2025-10-10T00:44:17.1284526Z * [new branch] gh/ydwu4/322/orig -> origin/gh/ydwu4/322/orig 2025-10-10T00:44:17.1287115Z * [new branch] gh/ydwu4/324/base -> origin/gh/ydwu4/324/base 2025-10-10T00:44:17.1289263Z * [new branch] gh/ydwu4/324/head -> origin/gh/ydwu4/324/head 2025-10-10T00:44:17.1291089Z * [new branch] gh/ydwu4/324/orig -> origin/gh/ydwu4/324/orig 2025-10-10T00:44:17.1294255Z * [new branch] gh/ydwu4/325/base -> origin/gh/ydwu4/325/base 2025-10-10T00:44:17.1296025Z * [new branch] gh/ydwu4/325/head -> origin/gh/ydwu4/325/head 2025-10-10T00:44:17.1297881Z * [new branch] gh/ydwu4/325/orig -> origin/gh/ydwu4/325/orig 2025-10-10T00:44:17.1300952Z * [new branch] gh/ydwu4/326/base -> origin/gh/ydwu4/326/base 2025-10-10T00:44:17.1302929Z * [new branch] gh/ydwu4/326/head -> origin/gh/ydwu4/326/head 2025-10-10T00:44:17.1304809Z * [new branch] gh/ydwu4/326/orig -> origin/gh/ydwu4/326/orig 2025-10-10T00:44:17.1307430Z * [new branch] gh/ydwu4/327/base -> origin/gh/ydwu4/327/base 2025-10-10T00:44:17.1309346Z * [new branch] gh/ydwu4/327/head -> origin/gh/ydwu4/327/head 2025-10-10T00:44:17.1311260Z * [new branch] gh/ydwu4/327/orig -> origin/gh/ydwu4/327/orig 2025-10-10T00:44:17.1314028Z * [new branch] gh/ydwu4/328/base -> origin/gh/ydwu4/328/base 2025-10-10T00:44:17.1315972Z * [new branch] gh/ydwu4/328/head -> origin/gh/ydwu4/328/head 2025-10-10T00:44:17.1317774Z * [new branch] gh/ydwu4/328/orig -> origin/gh/ydwu4/328/orig 2025-10-10T00:44:17.1320184Z * [new branch] gh/ydwu4/329/base -> origin/gh/ydwu4/329/base 2025-10-10T00:44:17.1322107Z * [new branch] gh/ydwu4/329/head -> origin/gh/ydwu4/329/head 2025-10-10T00:44:17.1323969Z * [new branch] gh/ydwu4/329/orig -> origin/gh/ydwu4/329/orig 2025-10-10T00:44:17.1326582Z * [new branch] gh/ydwu4/330/base -> origin/gh/ydwu4/330/base 
2025-10-10T00:44:17.1328590Z * [new branch] gh/ydwu4/330/head -> origin/gh/ydwu4/330/head 2025-10-10T00:44:17.1330521Z * [new branch] gh/ydwu4/330/orig -> origin/gh/ydwu4/330/orig 2025-10-10T00:44:17.1332951Z * [new branch] gh/ydwu4/331/base -> origin/gh/ydwu4/331/base 2025-10-10T00:44:17.1334777Z * [new branch] gh/ydwu4/331/head -> origin/gh/ydwu4/331/head 2025-10-10T00:44:17.1336875Z * [new branch] gh/ydwu4/331/orig -> origin/gh/ydwu4/331/orig 2025-10-10T00:44:17.1339313Z * [new branch] gh/ydwu4/332/base -> origin/gh/ydwu4/332/base 2025-10-10T00:44:17.1341116Z * [new branch] gh/ydwu4/332/head -> origin/gh/ydwu4/332/head 2025-10-10T00:44:17.1342969Z * [new branch] gh/ydwu4/332/orig -> origin/gh/ydwu4/332/orig 2025-10-10T00:44:17.1345427Z * [new branch] gh/ydwu4/333/base -> origin/gh/ydwu4/333/base 2025-10-10T00:44:17.1347392Z * [new branch] gh/ydwu4/333/head -> origin/gh/ydwu4/333/head 2025-10-10T00:44:17.1349213Z * [new branch] gh/ydwu4/333/orig -> origin/gh/ydwu4/333/orig 2025-10-10T00:44:17.1351716Z * [new branch] gh/ydwu4/334/base -> origin/gh/ydwu4/334/base 2025-10-10T00:44:17.1353964Z * [new branch] gh/ydwu4/334/head -> origin/gh/ydwu4/334/head 2025-10-10T00:44:17.1355653Z * [new branch] gh/ydwu4/334/orig -> origin/gh/ydwu4/334/orig 2025-10-10T00:44:17.1358202Z * [new branch] gh/ydwu4/335/base -> origin/gh/ydwu4/335/base 2025-10-10T00:44:17.1359991Z * [new branch] gh/ydwu4/335/head -> origin/gh/ydwu4/335/head 2025-10-10T00:44:17.1361923Z * [new branch] gh/ydwu4/335/orig -> origin/gh/ydwu4/335/orig 2025-10-10T00:44:17.1364820Z * [new branch] gh/ydwu4/336/base -> origin/gh/ydwu4/336/base 2025-10-10T00:44:17.1366300Z * [new branch] gh/ydwu4/336/head -> origin/gh/ydwu4/336/head 2025-10-10T00:44:17.1368208Z * [new branch] gh/ydwu4/336/orig -> origin/gh/ydwu4/336/orig 2025-10-10T00:44:17.1371245Z * [new branch] gh/ydwu4/337/base -> origin/gh/ydwu4/337/base 2025-10-10T00:44:17.1373080Z * [new branch] gh/ydwu4/337/head -> origin/gh/ydwu4/337/head 
2025-10-10T00:44:17.1374927Z * [new branch] gh/ydwu4/337/orig -> origin/gh/ydwu4/337/orig 2025-10-10T00:44:17.1378055Z * [new branch] gh/yf225/133/base -> origin/gh/yf225/133/base 2025-10-10T00:44:17.1380429Z * [new branch] gh/yf225/133/head -> origin/gh/yf225/133/head 2025-10-10T00:44:17.1383041Z * [new branch] gh/yf225/93/base -> origin/gh/yf225/93/base 2025-10-10T00:44:17.1384828Z * [new branch] gh/yf225/93/head -> origin/gh/yf225/93/head 2025-10-10T00:44:17.1388720Z * [new branch] gh/yifuwang/152/base -> origin/gh/yifuwang/152/base 2025-10-10T00:44:17.1390804Z * [new branch] gh/yifuwang/152/head -> origin/gh/yifuwang/152/head 2025-10-10T00:44:17.1392693Z * [new branch] gh/yifuwang/152/orig -> origin/gh/yifuwang/152/orig 2025-10-10T00:44:17.1395318Z * [new branch] gh/yifuwang/195/base -> origin/gh/yifuwang/195/base 2025-10-10T00:44:17.1397258Z * [new branch] gh/yifuwang/195/head -> origin/gh/yifuwang/195/head 2025-10-10T00:44:17.1399383Z * [new branch] gh/yifuwang/195/orig -> origin/gh/yifuwang/195/orig 2025-10-10T00:44:17.1403914Z * [new branch] gh/yiming0416/1/base -> origin/gh/yiming0416/1/base 2025-10-10T00:44:17.1405710Z * [new branch] gh/yiming0416/1/head -> origin/gh/yiming0416/1/head 2025-10-10T00:44:17.1408278Z * [new branch] gh/yiming0416/2/base -> origin/gh/yiming0416/2/base 2025-10-10T00:44:17.1410083Z * [new branch] gh/yiming0416/2/head -> origin/gh/yiming0416/2/head 2025-10-10T00:44:17.1413414Z * [new branch] gh/ysiraichi/88/base -> origin/gh/ysiraichi/88/base 2025-10-10T00:44:17.1415213Z * [new branch] gh/ysiraichi/88/head -> origin/gh/ysiraichi/88/head 2025-10-10T00:44:17.1417115Z * [new branch] gh/ysiraichi/88/orig -> origin/gh/ysiraichi/88/orig 2025-10-10T00:44:17.1420342Z * [new branch] gh/zhxchen17/25/base -> origin/gh/zhxchen17/25/base 2025-10-10T00:44:17.1422213Z * [new branch] gh/zhxchen17/25/head -> origin/gh/zhxchen17/25/head 2025-10-10T00:44:17.1424077Z * [new branch] gh/zhxchen17/25/orig -> origin/gh/zhxchen17/25/orig 
2025-10-10T00:44:17.1426793Z * [new branch] gh/zhxchen17/31/base -> origin/gh/zhxchen17/31/base 2025-10-10T00:44:17.1428719Z * [new branch] gh/zhxchen17/31/head -> origin/gh/zhxchen17/31/head 2025-10-10T00:44:17.1430621Z * [new branch] gh/zhxchen17/31/orig -> origin/gh/zhxchen17/31/orig 2025-10-10T00:44:17.1433155Z * [new branch] gh/zhxchen17/34/base -> origin/gh/zhxchen17/34/base 2025-10-10T00:44:17.1435089Z * [new branch] gh/zhxchen17/34/head -> origin/gh/zhxchen17/34/head 2025-10-10T00:44:17.1437616Z * [new branch] gh/zhxchen17/35/base -> origin/gh/zhxchen17/35/base 2025-10-10T00:44:17.1439374Z * [new branch] gh/zhxchen17/35/head -> origin/gh/zhxchen17/35/head 2025-10-10T00:44:17.1442550Z * [new branch] gh/zklaus/10/base -> origin/gh/zklaus/10/base 2025-10-10T00:44:17.1444443Z * [new branch] gh/zklaus/10/head -> origin/gh/zklaus/10/head 2025-10-10T00:44:17.1446218Z * [new branch] gh/zklaus/10/orig -> origin/gh/zklaus/10/orig 2025-10-10T00:44:17.1448927Z * [new branch] gh/zklaus/11/base -> origin/gh/zklaus/11/base 2025-10-10T00:44:17.1450890Z * [new branch] gh/zklaus/11/head -> origin/gh/zklaus/11/head 2025-10-10T00:44:17.1452730Z * [new branch] gh/zklaus/11/orig -> origin/gh/zklaus/11/orig 2025-10-10T00:44:17.1455298Z * [new branch] gh/zklaus/15/base -> origin/gh/zklaus/15/base 2025-10-10T00:44:17.1457244Z * [new branch] gh/zklaus/15/head -> origin/gh/zklaus/15/head 2025-10-10T00:44:17.1459186Z * [new branch] gh/zklaus/15/orig -> origin/gh/zklaus/15/orig 2025-10-10T00:44:17.1461706Z * [new branch] gh/zklaus/16/base -> origin/gh/zklaus/16/base 2025-10-10T00:44:17.1463567Z * [new branch] gh/zklaus/16/head -> origin/gh/zklaus/16/head 2025-10-10T00:44:17.1465458Z * [new branch] gh/zklaus/16/orig -> origin/gh/zklaus/16/orig 2025-10-10T00:44:17.1467957Z * [new branch] gh/zklaus/17/base -> origin/gh/zklaus/17/base 2025-10-10T00:44:17.1469814Z * [new branch] gh/zklaus/17/head -> origin/gh/zklaus/17/head 2025-10-10T00:44:17.1471712Z * [new branch] gh/zklaus/17/orig -> 
origin/gh/zklaus/17/orig 2025-10-10T00:44:17.1474277Z * [new branch] gh/zklaus/18/base -> origin/gh/zklaus/18/base 2025-10-10T00:44:17.1476131Z * [new branch] gh/zklaus/18/head -> origin/gh/zklaus/18/head 2025-10-10T00:44:17.1477989Z * [new branch] gh/zklaus/18/orig -> origin/gh/zklaus/18/orig 2025-10-10T00:44:17.1480450Z * [new branch] gh/zklaus/19/base -> origin/gh/zklaus/19/base 2025-10-10T00:44:17.1482314Z * [new branch] gh/zklaus/19/head -> origin/gh/zklaus/19/head 2025-10-10T00:44:17.1484314Z * [new branch] gh/zklaus/19/orig -> origin/gh/zklaus/19/orig 2025-10-10T00:44:17.1486939Z * [new branch] gh/zklaus/7/base -> origin/gh/zklaus/7/base 2025-10-10T00:44:17.1489101Z * [new branch] gh/zklaus/7/head -> origin/gh/zklaus/7/head 2025-10-10T00:44:17.1490892Z * [new branch] gh/zklaus/7/orig -> origin/gh/zklaus/7/orig 2025-10-10T00:44:17.1494433Z * [new branch] gh/zou3519/1177/base -> origin/gh/zou3519/1177/base 2025-10-10T00:44:17.1496424Z * [new branch] gh/zou3519/1177/head -> origin/gh/zou3519/1177/head 2025-10-10T00:44:17.1498485Z * [new branch] gh/zou3519/1177/orig -> origin/gh/zou3519/1177/orig 2025-10-10T00:44:17.1501293Z * [new branch] gh/zou3519/1195/base -> origin/gh/zou3519/1195/base 2025-10-10T00:44:17.1503243Z * [new branch] gh/zou3519/1195/head -> origin/gh/zou3519/1195/head 2025-10-10T00:44:17.1505037Z * [new branch] gh/zou3519/1195/orig -> origin/gh/zou3519/1195/orig 2025-10-10T00:44:17.1507789Z * [new branch] gh/zou3519/1196/base -> origin/gh/zou3519/1196/base 2025-10-10T00:44:17.1509652Z * [new branch] gh/zou3519/1196/head -> origin/gh/zou3519/1196/head 2025-10-10T00:44:17.1511602Z * [new branch] gh/zou3519/1196/orig -> origin/gh/zou3519/1196/orig 2025-10-10T00:44:17.1514600Z * [new branch] gh/zou3519/1197/base -> origin/gh/zou3519/1197/base 2025-10-10T00:44:17.1516232Z * [new branch] gh/zou3519/1197/head -> origin/gh/zou3519/1197/head 2025-10-10T00:44:17.1518090Z * [new branch] gh/zou3519/1197/orig -> origin/gh/zou3519/1197/orig 
2025-10-10T00:44:17.1521004Z * [new branch] gh/zou3519/1198/base -> origin/gh/zou3519/1198/base 2025-10-10T00:44:17.1522860Z * [new branch] gh/zou3519/1198/head -> origin/gh/zou3519/1198/head 2025-10-10T00:44:17.1524793Z * [new branch] gh/zou3519/1198/orig -> origin/gh/zou3519/1198/orig 2025-10-10T00:44:17.1527195Z * [new branch] gh/zou3519/1199/base -> origin/gh/zou3519/1199/base 2025-10-10T00:44:17.1529296Z * [new branch] gh/zou3519/1199/head -> origin/gh/zou3519/1199/head 2025-10-10T00:44:17.1531183Z * [new branch] gh/zou3519/1199/orig -> origin/gh/zou3519/1199/orig 2025-10-10T00:44:17.1533819Z * [new branch] gh/zou3519/1200/base -> origin/gh/zou3519/1200/base 2025-10-10T00:44:17.1535732Z * [new branch] gh/zou3519/1200/head -> origin/gh/zou3519/1200/head 2025-10-10T00:44:17.1537569Z * [new branch] gh/zou3519/1200/orig -> origin/gh/zou3519/1200/orig 2025-10-10T00:44:17.1540048Z * [new branch] gh/zou3519/1201/base -> origin/gh/zou3519/1201/base 2025-10-10T00:44:17.1542014Z * [new branch] gh/zou3519/1201/head -> origin/gh/zou3519/1201/head 2025-10-10T00:44:17.1543859Z * [new branch] gh/zou3519/1201/orig -> origin/gh/zou3519/1201/orig 2025-10-10T00:44:17.1547004Z * [new branch] gh/zpcore/1/base -> origin/gh/zpcore/1/base 2025-10-10T00:44:17.1548838Z * [new branch] gh/zpcore/1/head -> origin/gh/zpcore/1/head 2025-10-10T00:44:17.1551468Z * [new branch] gh/zpcore/11/base -> origin/gh/zpcore/11/base 2025-10-10T00:44:17.1553404Z * [new branch] gh/zpcore/11/head -> origin/gh/zpcore/11/head 2025-10-10T00:44:17.1555279Z * [new branch] gh/zpcore/11/orig -> origin/gh/zpcore/11/orig 2025-10-10T00:44:17.1558125Z * [new branch] gh/zpcore/12/base -> origin/gh/zpcore/12/base 2025-10-10T00:44:17.1560101Z * [new branch] gh/zpcore/12/head -> origin/gh/zpcore/12/head 2025-10-10T00:44:17.1561965Z * [new branch] gh/zpcore/12/orig -> origin/gh/zpcore/12/orig 2025-10-10T00:44:17.1564555Z * [new branch] gh/zpcore/13/base -> origin/gh/zpcore/13/base 2025-10-10T00:44:17.1566424Z * [new 
branch] gh/zpcore/13/head -> origin/gh/zpcore/13/head 2025-10-10T00:44:17.1568702Z * [new branch] gh/zpcore/13/orig -> origin/gh/zpcore/13/orig 2025-10-10T00:44:17.1571221Z * [new branch] gh/zpcore/14/base -> origin/gh/zpcore/14/base 2025-10-10T00:44:17.1573153Z * [new branch] gh/zpcore/14/head -> origin/gh/zpcore/14/head 2025-10-10T00:44:17.1575083Z * [new branch] gh/zpcore/14/orig -> origin/gh/zpcore/14/orig 2025-10-10T00:44:17.1577845Z * [new branch] gh/zpcore/15/base -> origin/gh/zpcore/15/base 2025-10-10T00:44:17.1579706Z * [new branch] gh/zpcore/15/head -> origin/gh/zpcore/15/head 2025-10-10T00:44:17.1581624Z * [new branch] gh/zpcore/15/orig -> origin/gh/zpcore/15/orig 2025-10-10T00:44:17.1584192Z * [new branch] gh/zpcore/16/base -> origin/gh/zpcore/16/base 2025-10-10T00:44:17.1586194Z * [new branch] gh/zpcore/16/head -> origin/gh/zpcore/16/head 2025-10-10T00:44:17.1588033Z * [new branch] gh/zpcore/16/orig -> origin/gh/zpcore/16/orig 2025-10-10T00:44:17.1590644Z * [new branch] gh/zpcore/17/base -> origin/gh/zpcore/17/base 2025-10-10T00:44:17.1592509Z * [new branch] gh/zpcore/17/head -> origin/gh/zpcore/17/head 2025-10-10T00:44:17.1594441Z * [new branch] gh/zpcore/17/orig -> origin/gh/zpcore/17/orig 2025-10-10T00:44:17.1597007Z * [new branch] gh/zpcore/18/base -> origin/gh/zpcore/18/base 2025-10-10T00:44:17.1598886Z * [new branch] gh/zpcore/18/head -> origin/gh/zpcore/18/head 2025-10-10T00:44:17.1601069Z * [new branch] gh/zpcore/18/orig -> origin/gh/zpcore/18/orig 2025-10-10T00:44:17.1603690Z * [new branch] gh/zpcore/19/base -> origin/gh/zpcore/19/base 2025-10-10T00:44:17.1605481Z * [new branch] gh/zpcore/19/head -> origin/gh/zpcore/19/head 2025-10-10T00:44:17.1607406Z * [new branch] gh/zpcore/19/orig -> origin/gh/zpcore/19/orig 2025-10-10T00:44:17.1610108Z * [new branch] gh/zpcore/2/base -> origin/gh/zpcore/2/base 2025-10-10T00:44:17.1612300Z * [new branch] gh/zpcore/2/head -> origin/gh/zpcore/2/head 2025-10-10T00:44:17.1614870Z * [new branch] 
gh/zpcore/20/base -> origin/gh/zpcore/20/base 2025-10-10T00:44:17.1617036Z * [new branch] gh/zpcore/20/head -> origin/gh/zpcore/20/head 2025-10-10T00:44:17.1618590Z * [new branch] gh/zpcore/20/orig -> origin/gh/zpcore/20/orig 2025-10-10T00:44:17.1621564Z * [new branch] gh/zpcore/21/base -> origin/gh/zpcore/21/base 2025-10-10T00:44:17.1623514Z * [new branch] gh/zpcore/21/head -> origin/gh/zpcore/21/head 2025-10-10T00:44:17.1625963Z * [new branch] gh/zpcore/21/orig -> origin/gh/zpcore/21/orig 2025-10-10T00:44:17.1628386Z * [new branch] gh/zpcore/3/base -> origin/gh/zpcore/3/base 2025-10-10T00:44:17.1630157Z * [new branch] gh/zpcore/3/head -> origin/gh/zpcore/3/head 2025-10-10T00:44:17.1632510Z * [new branch] gh/zpcore/4/base -> origin/gh/zpcore/4/base 2025-10-10T00:44:17.1634392Z * [new branch] gh/zpcore/4/head -> origin/gh/zpcore/4/head 2025-10-10T00:44:17.1636897Z * [new branch] gh/zpcore/5/base -> origin/gh/zpcore/5/base 2025-10-10T00:44:17.1638710Z * [new branch] gh/zpcore/5/head -> origin/gh/zpcore/5/head 2025-10-10T00:44:17.1641113Z * [new branch] gh/zpcore/6/base -> origin/gh/zpcore/6/base 2025-10-10T00:44:17.1642935Z * [new branch] gh/zpcore/6/head -> origin/gh/zpcore/6/head 2025-10-10T00:44:17.1645337Z * [new branch] gh/zpcore/7/base -> origin/gh/zpcore/7/base 2025-10-10T00:44:17.1647203Z * [new branch] gh/zpcore/7/head -> origin/gh/zpcore/7/head 2025-10-10T00:44:17.1649711Z * [new branch] gh/zpcore/8/base -> origin/gh/zpcore/8/base 2025-10-10T00:44:17.1651552Z * [new branch] gh/zpcore/8/head -> origin/gh/zpcore/8/head 2025-10-10T00:44:17.1653759Z * [new branch] google-main -> origin/google-main 2025-10-10T00:44:17.1655747Z * [new branch] greencontext -> origin/greencontext 2025-10-10T00:44:17.1658730Z * [new branch] guangyey/config -> origin/guangyey/config 2025-10-10T00:44:17.1660644Z * [new branch] guangyey/external_stream -> origin/guangyey/external_stream 2025-10-10T00:44:17.1662457Z * [new branch] guangyey/reimport -> origin/guangyey/reimport 
2025-10-10T00:44:17.1664275Z * [new branch] guangyey/test_2025 -> origin/guangyey/test_2025 2025-10-10T00:44:17.1667044Z * [new branch] guilhermeleobas/cherry-pick-55d87d9dfd9 -> origin/guilhermeleobas/cherry-pick-55d87d9dfd9 2025-10-10T00:44:17.1669773Z * [new branch] hameerabbasi/gradcheck-allclose -> origin/hameerabbasi/gradcheck-allclose 2025-10-10T00:44:17.1672150Z * [new branch] haozhe/bf16-dynamic-shape -> origin/haozhe/bf16-dynamic-shape 2025-10-10T00:44:17.1674030Z * [new branch] hc_baseline -> origin/hc_baseline 2025-10-10T00:44:17.1676068Z * [new branch] hhh_decomp_mul -> origin/hhh_decomp_mul 2025-10-10T00:44:17.1677999Z * [new branch] hhh_rand -> origin/hhh_rand 2025-10-10T00:44:17.1680676Z * [new branch] hoy/triton-PR3973 -> origin/hoy/triton-PR3973 2025-10-10T00:44:17.1683232Z * [new branch] huba/debug_mode -> origin/huba/debug_mode 2025-10-10T00:44:17.1685085Z * [new branch] huba/dtensor_equal -> origin/huba/dtensor_equal 2025-10-10T00:44:17.1686919Z * [new branch] huba/f1 -> origin/huba/f1 2025-10-10T00:44:17.1689149Z * [new branch] huba/local_tensor -> origin/huba/local_tensor 2025-10-10T00:44:17.1691131Z * [new branch] ideep-update -> origin/ideep-update 2025-10-10T00:44:17.1693419Z * [new branch] increase-asan-build-memory -> origin/increase-asan-build-memory 2025-10-10T00:44:17.1695436Z * [new branch] inductor-perf-increase-timeout -> origin/inductor-perf-increase-timeout 2025-10-10T00:44:17.1697253Z * [new branch] inductordecompfix -> origin/inductordecompfix 2025-10-10T00:44:17.1699481Z * [new branch] inline -> origin/inline 2025-10-10T00:44:17.1701764Z * [new branch] inlining -> origin/inlining 2025-10-10T00:44:17.1703793Z * [new branch] inlining-ezyang -> origin/inlining-ezyang 2025-10-10T00:44:17.1705675Z * [new branch] install-torchao-0.13.0 -> origin/install-torchao-0.13.0 2025-10-10T00:44:17.1707552Z * [new branch] install_free_tensors -> origin/install_free_tensors 2025-10-10T00:44:17.1710003Z * [new branch] int8_sdpa -> 
origin/int8_sdpa 2025-10-10T00:44:17.1712061Z * [new branch] invoke-subgraph -> origin/invoke-subgraph 2025-10-10T00:44:17.1714223Z * [new branch] issue#58739 -> origin/issue#58739 2025-10-10T00:44:17.1716173Z * [new branch] issue-161010-dynamo-stride-clone -> origin/issue-161010-dynamo-stride-clone 2025-10-10T00:44:17.1718824Z * [new branch] jathu/o3 -> origin/jathu/o3 2025-10-10T00:44:17.1720512Z * [new branch] jathu/sve -> origin/jathu/sve 2025-10-10T00:44:17.1723243Z * [new branch] jcaip/test-cusparselt-version-0.6.2 -> origin/jcaip/test-cusparselt-version-0.6.2 2025-10-10T00:44:17.1725249Z * [new branch] jcaip/update-cusparselt-0.6.2 -> origin/jcaip/update-cusparselt-0.6.2 2025-10-10T00:44:17.1727031Z * [new branch] jeanschmidt-patch-1 -> origin/jeanschmidt-patch-1 2025-10-10T00:44:17.1729226Z * [new branch] jerryzh168-patch-1 -> origin/jerryzh168-patch-1 2025-10-10T00:44:17.1731404Z * [new branch] jithunnair-amd-patch-1 -> origin/jithunnair-amd-patch-1 2025-10-10T00:44:17.1733160Z * [new branch] jithunnair-amd-patch-2 -> origin/jithunnair-amd-patch-2 2025-10-10T00:44:17.1735113Z * [new branch] jithunnair-amd-patch-3 -> origin/jithunnair-amd-patch-3 2025-10-10T00:44:17.1737140Z * [new branch] jithunnair-amd-patch-4 -> origin/jithunnair-amd-patch-4 2025-10-10T00:44:17.1739830Z * [new branch] justinchu/allowlist-api-onnx -> origin/justinchu/allowlist-api-onnx 2025-10-10T00:44:17.1741653Z * [new branch] justinchu/attention-tests -> origin/justinchu/attention-tests 2025-10-10T00:44:17.1743424Z * [new branch] justinchu/native-qdq -> origin/justinchu/native-qdq 2025-10-10T00:44:17.1746346Z * [new branch] justinchuby/typo-error -> origin/justinchuby/typo-error 2025-10-10T00:44:17.1748810Z * [new branch] kainan666/xlf_debug -> origin/kainan666/xlf_debug 2025-10-10T00:44:17.1750888Z * [new branch] kainan_test -> origin/kainan_test 2025-10-10T00:44:17.1754363Z * [new branch] leslie/test_group_gemm_epilogues -> origin/leslie/test_group_gemm_epilogues 
2025-10-10T00:44:17.1756769Z * [new branch] lessw2020/fix_cutlass_cache_error -> origin/lessw2020/fix_cutlass_cache_error 2025-10-10T00:44:17.1759250Z * [new branch] liaoxuan/shm_all_reduce -> origin/liaoxuan/shm_all_reduce 2025-10-10T00:44:17.1761174Z * [new branch] liaoxuan/test_fa_disable_softmax -> origin/liaoxuan/test_fa_disable_softmax 2025-10-10T00:44:17.1762840Z * [new branch] liaoxuan/test_int8_sdpa -> origin/liaoxuan/test_int8_sdpa 2025-10-10T00:44:17.1764824Z * [new branch] libtorch_free_so -> origin/libtorch_free_so 2025-10-10T00:44:17.1766894Z * [new branch] lintbuilddocker -> origin/lintbuilddocker 2025-10-10T00:44:17.1769390Z * [new branch] llama4-stable -> origin/llama4-stable 2025-10-10T00:44:17.1771354Z * [new branch] logdetfix -> origin/logdetfix 2025-10-10T00:44:17.1773502Z * [new branch] logsumexp -> origin/logsumexp 2025-10-10T00:44:17.1776773Z * [new branch] lts/release/1.8 -> origin/lts/release/1.8 2025-10-10T00:44:17.1779283Z * [new branch] lucaskabela/#94773 -> origin/lucaskabela/#94773 2025-10-10T00:44:17.1781284Z * [new branch] lucaskabela/cherrypick_163769 -> origin/lucaskabela/cherrypick_163769 2025-10-10T00:44:17.1783108Z * [new branch] lucaskabela/fix_164814 -> origin/lucaskabela/fix_164814 2025-10-10T00:44:17.1784888Z * [new branch] lucaskabela/fix_164823 -> origin/lucaskabela/fix_164823 2025-10-10T00:44:17.1786621Z * [new branch] lucaskabela/fix_164875 -> origin/lucaskabela/fix_164875 2025-10-10T00:44:17.1788457Z * [new branch] lucaskabela/flop_counter -> origin/lucaskabela/flop_counter 2025-10-10T00:44:17.1790217Z * [new branch] lucaskabela/func_under_decomp -> origin/lucaskabela/func_under_decomp 2025-10-10T00:44:17.1792438Z * [new branch] lucaskabela/functional_in_dynamo -> origin/lucaskabela/functional_in_dynamo 2025-10-10T00:44:17.1794821Z * [new branch] lucaskabela/install_params_as_graph_attr -> origin/lucaskabela/install_params_as_graph_attr 2025-10-10T00:44:17.1796750Z * [new branch] lucaskabela/parameters_as_graph_attr -> 
origin/lucaskabela/parameters_as_graph_attr 2025-10-10T00:44:17.1798809Z * [new branch] lucaskabela/remove_aot_dispatcher_metadata -> origin/lucaskabela/remove_aot_dispatcher_metadata 2025-10-10T00:44:17.1803248Z * [new branch] lucaskabela/rnn_decomp -> origin/lucaskabela/rnn_decomp 2025-10-10T00:44:17.1805184Z * [new branch] lucaskabela/typing_backends -> origin/lucaskabela/typing_backends 2025-10-10T00:44:17.1807289Z * [new branch] main -> origin/main 2025-10-10T00:44:17.1809494Z * [new branch] main-enable-b200-distributed-tests -> origin/main-enable-b200-distributed-tests 2025-10-10T00:44:17.1811359Z * [new branch] main-enable-b200-symm-mem-test -> origin/main-enable-b200-symm-mem-test 2025-10-10T00:44:17.1813513Z * [new branch] malfet-patch-1 -> origin/malfet-patch-1 2025-10-10T00:44:17.1815656Z * [new branch] malfet-patch-14 -> origin/malfet-patch-14 2025-10-10T00:44:17.1817599Z * [new branch] malfet-patch-2 -> origin/malfet-patch-2 2025-10-10T00:44:17.1819658Z * [new branch] malfet-patch-3 -> origin/malfet-patch-3 2025-10-10T00:44:17.1822169Z * [new branch] malfet-patch-4 -> origin/malfet-patch-4 2025-10-10T00:44:17.1823987Z * [new branch] malfet-patch-5 -> origin/malfet-patch-5 2025-10-10T00:44:17.1825889Z * [new branch] malfet-patch-6 -> origin/malfet-patch-6 2025-10-10T00:44:17.1827879Z * [new branch] malfet-patch-7 -> origin/malfet-patch-7 2025-10-10T00:44:17.1829943Z * [new branch] malfet-patch-8 -> origin/malfet-patch-8 2025-10-10T00:44:17.1832018Z * [new branch] malfet-patch-9 -> origin/malfet-patch-9 2025-10-10T00:44:17.1834979Z * [new branch] malfet/be-move-more-settings-to-checkout-pytorch -> origin/malfet/be-move-more-settings-to-checkout-pytorch 2025-10-10T00:44:17.1836622Z * [new branch] malfet/mps-implement-col2im -> origin/malfet/mps-implement-col2im 2025-10-10T00:44:17.1839244Z * [new branch] manuel/aoti_metal_shimify-thread_safe -> origin/manuel/aoti_metal_shimify-thread_safe 2025-10-10T00:44:17.1841020Z * [new branch] 
manuel/test-ops-common-allow-mps -> origin/manuel/test-ops-common-allow-mps 2025-10-10T00:44:17.1843488Z * [new branch] masnesral/metaconda -> origin/masnesral/metaconda 2025-10-10T00:44:17.1845389Z * [new branch] masnesral/pt2_internal_logging -> origin/masnesral/pt2_internal_logging 2025-10-10T00:44:17.1847406Z * [new branch] metascroy-patch-1 -> origin/metascroy-patch-1 2025-10-10T00:44:17.1849455Z * [new branch] mingw_constant_buffer -> origin/mingw_constant_buffer 2025-10-10T00:44:17.1852530Z * [new branch] mlazos/S429861-debug -> origin/mlazos/S429861-debug 2025-10-10T00:44:17.1854301Z * [new branch] mlazos/aa -> origin/mlazos/aa 2025-10-10T00:44:17.1856680Z * [new branch] mlazos/acts -> origin/mlazos/acts 2025-10-10T00:44:17.1858319Z * [new branch] mlazos/arg-renames -> origin/mlazos/arg-renames 2025-10-10T00:44:17.1860111Z * [new branch] mlazos/backup-test-branch -> origin/mlazos/backup-test-branch 2025-10-10T00:44:17.1861993Z * [new branch] mlazos/bad-cudagraphs -> origin/mlazos/bad-cudagraphs 2025-10-10T00:44:17.1863805Z * [new branch] mlazos/baseline -> origin/mlazos/baseline 2025-10-10T00:44:17.1865989Z * [new branch] mlazos/baseline-graph-breaks -> origin/mlazos/baseline-graph-breaks 2025-10-10T00:44:17.1868110Z * [new branch] mlazos/beta-tensor -> origin/mlazos/beta-tensor 2025-10-10T00:44:17.1870299Z * [new branch] mlazos/buffers -> origin/mlazos/buffers 2025-10-10T00:44:17.1872110Z * [new branch] mlazos/buffers2 -> origin/mlazos/buffers2 2025-10-10T00:44:17.1874174Z * [new branch] mlazos/buffers3 -> origin/mlazos/buffers3 2025-10-10T00:44:17.1876480Z * [new branch] mlazos/ck2 -> origin/mlazos/ck2 2025-10-10T00:44:17.1878487Z * [new branch] mlazos/combokernels -> origin/mlazos/combokernels 2025-10-10T00:44:17.1880398Z * [new branch] mlazos/ctx-cleanup -> origin/mlazos/ctx-cleanup 2025-10-10T00:44:17.1882278Z * [new branch] mlazos/cuda-cmd-log -> origin/mlazos/cuda-cmd-log 2025-10-10T00:44:17.1884335Z * [new branch] mlazos/cudagraph-tests -> 
origin/mlazos/cudagraph-tests 2025-10-10T00:44:17.1886236Z * [new branch] mlazos/cudagraphs-measurement -> origin/mlazos/cudagraphs-measurement 2025-10-10T00:44:17.1888456Z * [new branch] mlazos/cutlass-test -> origin/mlazos/cutlass-test 2025-10-10T00:44:17.1890244Z * [new branch] mlazos/cutlass-topo-bug -> origin/mlazos/cutlass-topo-bug 2025-10-10T00:44:17.1892242Z * [new branch] mlazos/dataclass-proxy -> origin/mlazos/dataclass-proxy 2025-10-10T00:44:17.1894161Z * [new branch] mlazos/dc-attrs -> origin/mlazos/dc-attrs 2025-10-10T00:44:17.1896068Z * [new branch] mlazos/dc-helion -> origin/mlazos/dc-helion 2025-10-10T00:44:17.1897913Z * [new branch] mlazos/dict-fix -> origin/mlazos/dict-fix 2025-10-10T00:44:17.1900025Z * [new branch] mlazos/disable-tf -> origin/mlazos/disable-tf 2025-10-10T00:44:17.1901939Z * [new branch] mlazos/dupe-fix -> origin/mlazos/dupe-fix 2025-10-10T00:44:17.1903906Z * [new branch] mlazos/dyn-batch -> origin/mlazos/dyn-batch 2025-10-10T00:44:17.1905694Z * [new branch] mlazos/evt -> origin/mlazos/evt 2025-10-10T00:44:17.1907629Z * [new branch] mlazos/extract-examples -> origin/mlazos/extract-examples 2025-10-10T00:44:17.1909480Z * [new branch] mlazos/foreach-op -> origin/mlazos/foreach-op 2025-10-10T00:44:17.1911364Z * [new branch] mlazos/fp8 -> origin/mlazos/fp8 2025-10-10T00:44:17.1913273Z * [new branch] mlazos/fp8-bias -> origin/mlazos/fp8-bias 2025-10-10T00:44:17.1915209Z * [new branch] mlazos/fp8-bias-fusion -> origin/mlazos/fp8-bias-fusion 2025-10-10T00:44:17.1917065Z * [new branch] mlazos/fp8-fixes -> origin/mlazos/fp8-fixes 2025-10-10T00:44:17.1919011Z * [new branch] mlazos/freezing -> origin/mlazos/freezing 2025-10-10T00:44:17.1920826Z * [new branch] mlazos/h-comp -> origin/mlazos/h-comp 2025-10-10T00:44:17.1922794Z * [new branch] mlazos/h-comp2 -> origin/mlazos/h-comp2 2025-10-10T00:44:17.1924828Z * [new branch] mlazos/hash-hop -> origin/mlazos/hash-hop 2025-10-10T00:44:17.1926837Z * [new branch] mlazos/hc -> origin/mlazos/hc 
2025-10-10T00:44:17.1929028Z * [new branch] mlazos/hc-cycles -> origin/mlazos/hc-cycles 2025-10-10T00:44:17.1930910Z * [new branch] mlazos/hc-fixes -> origin/mlazos/hc-fixes 2025-10-10T00:44:17.1932813Z * [new branch] mlazos/hc-fixes3 -> origin/mlazos/hc-fixes3 2025-10-10T00:44:17.1934697Z * [new branch] mlazos/hc-fixes4 -> origin/mlazos/hc-fixes4 2025-10-10T00:44:17.1936559Z * [new branch] mlazos/hc-hf -> origin/mlazos/hc-hf 2025-10-10T00:44:17.1938438Z * [new branch] mlazos/hc-mut -> origin/mlazos/hc-mut 2025-10-10T00:44:17.1940312Z * [new branch] mlazos/hc10 -> origin/mlazos/hc10 2025-10-10T00:44:17.1942137Z * [new branch] mlazos/hc11 -> origin/mlazos/hc11 2025-10-10T00:44:17.1944020Z * [new branch] mlazos/hc12 -> origin/mlazos/hc12 2025-10-10T00:44:17.1945900Z * [new branch] mlazos/hc13 -> origin/mlazos/hc13 2025-10-10T00:44:17.1947766Z * [new branch] mlazos/hc14 -> origin/mlazos/hc14 2025-10-10T00:44:17.1949707Z * [new branch] mlazos/hc15 -> origin/mlazos/hc15 2025-10-10T00:44:17.1951597Z * [new branch] mlazos/hc2 -> origin/mlazos/hc2 2025-10-10T00:44:17.1953473Z * [new branch] mlazos/hc4 -> origin/mlazos/hc4 2025-10-10T00:44:17.1955497Z * [new branch] mlazos/hc5 -> origin/mlazos/hc5 2025-10-10T00:44:17.1957337Z * [new branch] mlazos/hc6 -> origin/mlazos/hc6 2025-10-10T00:44:17.1959314Z * [new branch] mlazos/hc7 -> origin/mlazos/hc7 2025-10-10T00:44:17.1961109Z * [new branch] mlazos/hc8 -> origin/mlazos/hc8 2025-10-10T00:44:17.1963088Z * [new branch] mlazos/hc9 -> origin/mlazos/hc9 2025-10-10T00:44:17.1964878Z * [new branch] mlazos/hc_baseline2 -> origin/mlazos/hc_baseline2 2025-10-10T00:44:17.1966867Z * [new branch] mlazos/inductor-streams -> origin/mlazos/inductor-streams 2025-10-10T00:44:17.1968891Z * [new branch] mlazos/lr-composibility -> origin/mlazos/lr-composibility 2025-10-10T00:44:17.1970629Z * [new branch] mlazos/main -> origin/mlazos/main 2025-10-10T00:44:17.1972739Z * [new branch] mlazos/main-test-enablement -> origin/mlazos/main-test-enablement 
2025-10-10T00:44:17.1974634Z * [new branch] mlazos/mark-static-update -> origin/mlazos/mark-static-update 2025-10-10T00:44:17.1977035Z * [new branch] mlazos/mcg -> origin/mlazos/mcg 2025-10-10T00:44:17.1978961Z * [new branch] mlazos/mcg2 -> origin/mlazos/mcg2 2025-10-10T00:44:17.1981393Z * [new branch] mlazos/meta-guards -> origin/mlazos/meta-guards 2025-10-10T00:44:17.1983854Z * [new branch] mlazos/mlazos/ck2 -> origin/mlazos/mlazos/ck2 2025-10-10T00:44:17.1985834Z * [new branch] mlazos/mlazos/foreach-map-adam -> origin/mlazos/mlazos/foreach-map-adam 2025-10-10T00:44:17.1987795Z * [new branch] mlazos/mlazos/tf-mode-backup -> origin/mlazos/mlazos/tf-mode-backup 2025-10-10T00:44:17.1989681Z * [new branch] mlazos/mod-fix -> origin/mlazos/mod-fix 2025-10-10T00:44:17.1991601Z * [new branch] mlazos/mode-fix -> origin/mlazos/mode-fix 2025-10-10T00:44:17.1993503Z * [new branch] mlazos/more-tests -> origin/mlazos/more-tests 2025-10-10T00:44:17.1995388Z * [new branch] mlazos/offsets -> origin/mlazos/offsets 2025-10-10T00:44:17.1997490Z * [new branch] mlazos/proxy-ctors -> origin/mlazos/proxy-ctors 2025-10-10T00:44:17.1999340Z * [new branch] mlazos/quant-fix -> origin/mlazos/quant-fix 2025-10-10T00:44:17.2001451Z * [new branch] mlazos/resnet-fix -> origin/mlazos/resnet-fix 2025-10-10T00:44:17.2003445Z * [new branch] mlazos/rm-buf-names -> origin/mlazos/rm-buf-names 2025-10-10T00:44:17.2005330Z * [new branch] mlazos/rm-code -> origin/mlazos/rm-code 2025-10-10T00:44:17.2007245Z * [new branch] mlazos/rm-spam -> origin/mlazos/rm-spam 2025-10-10T00:44:17.2009756Z * [new branch] mlazos/rtp -> origin/mlazos/rtp 2025-10-10T00:44:17.2011696Z * [new branch] mlazos/static-idx-dbg -> origin/mlazos/static-idx-dbg 2025-10-10T00:44:17.2013594Z * [new branch] mlazos/static-inputs-log -> origin/mlazos/static-inputs-log 2025-10-10T00:44:17.2015435Z * [new branch] mlazos/td-fix2 -> origin/mlazos/td-fix2 2025-10-10T00:44:17.2017573Z * [new branch] mlazos/tensor-hasattr2 -> 
origin/mlazos/tensor-hasattr2 2025-10-10T00:44:17.2019529Z * [new branch] mlazos/test -> origin/mlazos/test 2025-10-10T00:44:17.2021480Z * [new branch] mlazos/tf-mode -> origin/mlazos/tf-mode 2025-10-10T00:44:17.2023417Z * [new branch] mlazos/tf-mode-backup2 -> origin/mlazos/tf-mode-backup2 2025-10-10T00:44:17.2025325Z * [new branch] mlazos/tf-mode-reland -> origin/mlazos/tf-mode-reland 2025-10-10T00:44:17.2027372Z * [new branch] mlazos/tf-mode-reland2 -> origin/mlazos/tf-mode-reland2 2025-10-10T00:44:17.2029193Z * [new branch] mlazos/tf-mode-reland3 -> origin/mlazos/tf-mode-reland3 2025-10-10T00:44:17.2031104Z * [new branch] mlazos/triton-no-epi -> origin/mlazos/triton-no-epi 2025-10-10T00:44:17.2033051Z * [new branch] mlazos/tune-proto -> origin/mlazos/tune-proto 2025-10-10T00:44:17.2035114Z * [new branch] mlazos/tuple-fixes -> origin/mlazos/tuple-fixes 2025-10-10T00:44:17.2036851Z * [new branch] mlazos/tuple-fixes2 -> origin/mlazos/tuple-fixes2 2025-10-10T00:44:17.2038826Z * [new branch] mlazos/tuple-handling -> origin/mlazos/tuple-handling 2025-10-10T00:44:17.2040934Z * [new branch] mlazos/user-streams -> origin/mlazos/user-streams 2025-10-10T00:44:17.2043037Z * [new branch] mlazos/user-streams-backup -> origin/mlazos/user-streams-backup 2025-10-10T00:44:17.2044894Z * [new branch] mlazos/vary-beta -> origin/mlazos/vary-beta 2025-10-10T00:44:17.2046787Z * [new branch] mlazos/vary-beta2 -> origin/mlazos/vary-beta2 2025-10-10T00:44:17.2048951Z * [new branch] mlazos/weird-perf1 -> origin/mlazos/weird-perf1 2025-10-10T00:44:17.2050989Z * [new branch] mm_out_dtype_compile -> origin/mm_out_dtype_compile 2025-10-10T00:44:17.2052901Z * [new branch] module-shim -> origin/module-shim 2025-10-10T00:44:17.2054858Z * [new branch] module-stack -> origin/module-stack 2025-10-10T00:44:17.2056743Z * [new branch] more_ck_Fixes -> origin/more_ck_Fixes 2025-10-10T00:44:17.2058781Z * [new branch] move-theme-out-docker -> origin/move-theme-out-docker 2025-10-10T00:44:17.2060769Z * 
[new branch] move_aws_steps_inside_setup_rocm -> origin/move_aws_steps_inside_setup_rocm 2025-10-10T00:44:17.2062676Z * [new branch] msaroufim-patch-1 -> origin/msaroufim-patch-1 2025-10-10T00:44:17.2065909Z * [new branch] msaroufim/be1 -> origin/msaroufim/be1 2025-10-10T00:44:17.2067759Z * [new branch] msaroufim/cn_path -> origin/msaroufim/cn_path 2025-10-10T00:44:17.2069493Z * [new branch] msaroufim/cub -> origin/msaroufim/cub 2025-10-10T00:44:17.2071423Z * [new branch] msaroufim/dtensorfusedadam -> origin/msaroufim/dtensorfusedadam 2025-10-10T00:44:17.2073141Z * [new branch] msaroufim/patchx -> origin/msaroufim/patchx 2025-10-10T00:44:17.2074907Z * [new branch] msaroufim/reduce -> origin/msaroufim/reduce 2025-10-10T00:44:17.2078030Z * [new branch] mtia/basic-cmake -> origin/mtia/basic-cmake 2025-10-10T00:44:17.2080677Z * [new branch] mwizak/fix-triton-block-shape -> origin/mwizak/fix-triton-block-shape 2025-10-10T00:44:17.2082505Z * [new branch] my_varlen_backup -> origin/my_varlen_backup 2025-10-10T00:44:17.2085198Z * [new branch] nWEIdia/skip-tests-for-pr-159494 -> origin/nWEIdia/skip-tests-for-pr-159494 2025-10-10T00:44:17.2087122Z * [new branch] nativert_num_outputs -> origin/nativert_num_outputs 2025-10-10T00:44:17.2089123Z * [new branch] new-codegen -> origin/new-codegen 2025-10-10T00:44:17.2091096Z * [new branch] newtest-base -> origin/newtest-base 2025-10-10T00:44:17.2093733Z * [new branch] ngimel/allgather_format -> origin/ngimel/allgather_format 2025-10-10T00:44:17.2095606Z * [new branch] ngimel/cat_perf2 -> origin/ngimel/cat_perf2 2025-10-10T00:44:17.2097269Z * [new branch] ngimel/error_index_list -> origin/ngimel/error_index_list 2025-10-10T00:44:17.2099153Z * [new branch] ngimel/gg_new -> origin/ngimel/gg_new 2025-10-10T00:44:17.2100910Z * [new branch] ngimel/scatter_add_multid -> origin/ngimel/scatter_add_multid 2025-10-10T00:44:17.2103089Z * [new branch] nightly -> origin/nightly 2025-10-10T00:44:17.2106244Z * [new branch] 
nikitaved/addmm_1_rowcol_lt_path_check -> origin/nikitaved/addmm_1_rowcol_lt_path_check 2025-10-10T00:44:17.2108228Z * [new branch] nikitaved/addmm_epilogue_fusions -> origin/nikitaved/addmm_epilogue_fusions 2025-10-10T00:44:17.2109969Z * [new branch] nikitaved/addmm_epilogue_fusions_scratch -> origin/nikitaved/addmm_epilogue_fusions_scratch 2025-10-10T00:44:17.2111713Z * [new branch] nikitaved/simpler_can_use_32bit_index -> origin/nikitaved/simpler_can_use_32bit_index 2025-10-10T00:44:17.2113341Z * [new branch] nikitaved/test -> origin/nikitaved/test 2025-10-10T00:44:17.2115395Z * [new branch] nmacchioni-patch-10 -> origin/nmacchioni-patch-10 2025-10-10T00:44:17.2117333Z * [new branch] nmacchioni-patch-7 -> origin/nmacchioni-patch-7 2025-10-10T00:44:17.2119396Z * [new branch] nmacchioni-patch-8 -> origin/nmacchioni-patch-8 2025-10-10T00:44:17.2121405Z * [new branch] nmacchioni-patch-9 -> origin/nmacchioni-patch-9 2025-10-10T00:44:17.2123386Z * [new branch] no_distributed_log_spew -> origin/no_distributed_log_spew 2025-10-10T00:44:17.2125273Z * [new branch] nofun-hack -> origin/nofun-hack 2025-10-10T00:44:17.2128076Z * [new branch] nullplay/fuse_matmul -> origin/nullplay/fuse_matmul 2025-10-10T00:44:17.2130019Z * [new branch] nullplay_fuse_matmul -> origin/nullplay_fuse_matmul 2025-10-10T00:44:17.2133398Z * [new branch] orig/release/1.10 -> origin/orig/release/1.10 2025-10-10T00:44:17.2135372Z * [new branch] orig/release/1.11 -> origin/orig/release/1.11 2025-10-10T00:44:17.2137321Z * [new branch] orig/release/1.12 -> origin/orig/release/1.12 2025-10-10T00:44:17.2139413Z * [new branch] orig/release/1.13 -> origin/orig/release/1.13 2025-10-10T00:44:17.2141336Z * [new branch] orig/release/1.6 -> origin/orig/release/1.6 2025-10-10T00:44:17.2143445Z * [new branch] orig/release/1.7 -> origin/orig/release/1.7 2025-10-10T00:44:17.2145381Z * [new branch] orig/release/1.8 -> origin/orig/release/1.8 2025-10-10T00:44:17.2147296Z * [new branch] orig/release/1.9 -> 
origin/orig/release/1.9 2025-10-10T00:44:17.2149254Z * [new branch] orig/release/2.0 -> origin/orig/release/2.0 2025-10-10T00:44:17.2151174Z * [new branch] orig/release/2.1 -> origin/orig/release/2.1 2025-10-10T00:44:17.2153089Z * [new branch] orig/release/2.2 -> origin/orig/release/2.2 2025-10-10T00:44:17.2154949Z * [new branch] orig/release/2.3 -> origin/orig/release/2.3 2025-10-10T00:44:17.2156846Z * [new branch] orig/release/2.4 -> origin/orig/release/2.4 2025-10-10T00:44:17.2163815Z * [new branch] orig/release/2.5 -> origin/orig/release/2.5 2025-10-10T00:44:17.2164417Z * [new branch] orig/release/2.6 -> origin/orig/release/2.6 2025-10-10T00:44:17.2164629Z * [new branch] orig/release/2.7 -> origin/orig/release/2.7 2025-10-10T00:44:17.2165237Z * [new branch] orig/release/2.8 -> origin/orig/release/2.8 2025-10-10T00:44:17.2167334Z * [new branch] orig/release/2.9 -> origin/orig/release/2.9 2025-10-10T00:44:17.2171709Z * [new branch] origin/gh/fxdawnn/1/base -> origin/origin/gh/fxdawnn/1/base 2025-10-10T00:44:17.2173475Z * [new branch] origin/gh/fxdawnn/1/orig -> origin/origin/gh/fxdawnn/1/orig 2025-10-10T00:44:17.2176659Z * [new branch] origin/gh/zpcore/14/orig -> origin/origin/gh/zpcore/14/orig 2025-10-10T00:44:17.2178530Z * [new branch] padded-tensor -> origin/padded-tensor 2025-10-10T00:44:17.2180583Z * [new branch] pca2 -> origin/pca2 2025-10-10T00:44:17.2182748Z * [new branch] perf_ops -> origin/perf_ops 2025-10-10T00:44:17.2184648Z * [new branch] perf_ops_2_9 -> origin/perf_ops_2_9 2025-10-10T00:44:17.2186733Z * [new branch] perserve_node_meta_decomp -> origin/perserve_node_meta_decomp 2025-10-10T00:44:17.2188665Z * [new branch] pianpwk-patch-1 -> origin/pianpwk-patch-1 2025-10-10T00:44:17.2191274Z * [new branch] pianpwk/__draft_debug_mode -> origin/pianpwk/__draft_debug_mode 2025-10-10T00:44:17.2193136Z * [new branch] pianpwk/_super_draft_debug_mode -> origin/pianpwk/_super_draft_debug_mode 2025-10-10T00:44:17.2194899Z * [new branch] 
pianpwk/backed_size_oblivious_export -> origin/pianpwk/backed_size_oblivious_export 2025-10-10T00:44:17.2196661Z * [new branch] pianpwk/base_view_shape_key -> origin/pianpwk/base_view_shape_key 2025-10-10T00:44:17.2198797Z * [new branch] pianpwk/bert_dynamic_perf -> origin/pianpwk/bert_dynamic_perf 2025-10-10T00:44:17.2200885Z * [new branch] pianpwk/debug_mode_hacks -> origin/pianpwk/debug_mode_hacks 2025-10-10T00:44:17.2203026Z * [new branch] pianpwk/debug_mode_inductor -> origin/pianpwk/debug_mode_inductor 2025-10-10T00:44:17.2204938Z * [new branch] pianpwk/debug_mode_show_ids -> origin/pianpwk/debug_mode_show_ids 2025-10-10T00:44:17.2206931Z * [new branch] pianpwk/debugmode_compile_tf -> origin/pianpwk/debugmode_compile_tf 2025-10-10T00:44:17.2208998Z * [new branch] pianpwk/debugmode_show_ids -> origin/pianpwk/debugmode_show_ids 2025-10-10T00:44:17.2210993Z * [new branch] pianpwk/dispatch_key_debugging_for_debug -> origin/pianpwk/dispatch_key_debugging_for_debug 2025-10-10T00:44:17.2212846Z * [new branch] pianpwk/draft_debug_mode_tfcompile -> origin/pianpwk/draft_debug_mode_tfcompile 2025-10-10T00:44:17.2214715Z * [new branch] pianpwk/draft_multikernel_nn -> origin/pianpwk/draft_multikernel_nn 2025-10-10T00:44:17.2216697Z * [new branch] pianpwk/draft_multikernel_status_10_5 -> origin/pianpwk/draft_multikernel_status_10_5 2025-10-10T00:44:17.2218627Z * [new branch] pianpwk/dtensor_shape_metadata_guard -> origin/pianpwk/dtensor_shape_metadata_guard 2025-10-10T00:44:17.2220501Z * [new branch] pianpwk/false_numel_refs -> origin/pianpwk/false_numel_refs 2025-10-10T00:44:17.2222401Z * [new branch] pianpwk/maybe_guard_rel -> origin/pianpwk/maybe_guard_rel 2025-10-10T00:44:17.2224277Z * [new branch] pianpwk/multi_kernel_l1 -> origin/pianpwk/multi_kernel_l1 2025-10-10T00:44:17.2226268Z * [new branch] pianpwk/multikernel_hints_draft -> origin/pianpwk/multikernel_hints_draft 2025-10-10T00:44:17.2228213Z * [new branch] pianpwk/no_size_oblivious_slice_scat -> 
origin/pianpwk/no_size_oblivious_slice_scat 2025-10-10T00:44:17.2230126Z * [new branch] pianpwk/oblivious_reshape_view_better -> origin/pianpwk/oblivious_reshape_view_better 2025-10-10T00:44:17.2231967Z * [new branch] pianpwk/pre_forward_hook -> origin/pianpwk/pre_forward_hook 2025-10-10T00:44:17.2234065Z * [new branch] pianpwk/skip_python_keys_in_guards -> origin/pianpwk/skip_python_keys_in_guards 2025-10-10T00:44:17.2235901Z * [new branch] pianpwk/slice_fresh_symbols -> origin/pianpwk/slice_fresh_symbols 2025-10-10T00:44:17.2237674Z * [new branch] pianpwk/sym_tokens_draft -> origin/pianpwk/sym_tokens_draft 2025-10-10T00:44:17.2239709Z * [new branch] pianpwk/test_pointwise_guard_or_false -> origin/pianpwk/test_pointwise_guard_or_false 2025-10-10T00:44:17.2241524Z * [new branch] pianpwk/totally_draft_sym_wrap -> origin/pianpwk/totally_draft_sym_wrap 2025-10-10T00:44:17.2243529Z * [new branch] pianpwk/triton_benchmark_hints -> origin/pianpwk/triton_benchmark_hints 2025-10-10T00:44:17.2245537Z * [new branch] pianpwk/try_dumb_stuff -> origin/pianpwk/try_dumb_stuff 2025-10-10T00:44:17.2247398Z * [new branch] pianpwk/try_dumb_stuff_2 -> origin/pianpwk/try_dumb_stuff_2 2025-10-10T00:44:17.2249395Z * [new branch] pianpwk/unbacked_channels_last -> origin/pianpwk/unbacked_channels_last 2025-10-10T00:44:17.2251858Z * [new branch] pianpwk/unbacked_should_swap_2 -> origin/pianpwk/unbacked_should_swap_2 2025-10-10T00:44:17.2253729Z * [new branch] pianpwk/user_symints -> origin/pianpwk/user_symints 2025-10-10T00:44:17.2257264Z * [new branch] pianpwk/wan21_reshape -> origin/pianpwk/wan21_reshape 2025-10-10T00:44:17.2257758Z * [new branch] pianpwk/whitelist_optimizer -> origin/pianpwk/whitelist_optimizer 2025-10-10T00:44:17.2259780Z * [new branch] piz/add_wait -> origin/piz/add_wait 2025-10-10T00:44:17.2261572Z * [new branch] piz/fall_back_missing_0716 -> origin/piz/fall_back_missing_0716 2025-10-10T00:44:17.2263404Z * [new branch] pool-separate -> origin/pool-separate 
2025-10-10T00:44:17.2265382Z * [new branch] pr-156087 -> origin/pr-156087 2025-10-10T00:44:17.2267976Z * [new branch] pr/131860 -> origin/pr/131860 2025-10-10T00:44:17.2270044Z * [new branch] pre_compile_checks -> origin/pre_compile_checks 2025-10-10T00:44:17.2271909Z * [new branch] predispatch_to -> origin/predispatch_to 2025-10-10T00:44:17.2274027Z * [new branch] prepare-perf-baseline-number-2.8 -> origin/prepare-perf-baseline-number-2.8 2025-10-10T00:44:17.2275970Z * [new branch] prepare-perf-number-2.9 -> origin/prepare-perf-number-2.9 2025-10-10T00:44:17.2278474Z * [new branch] profiler-enabled -> origin/profiler-enabled 2025-10-10T00:44:17.2280422Z * [new branch] provenance_doc_2 -> origin/provenance_doc_2 2025-10-10T00:44:17.2282361Z * [new branch] pt-opt-cuda3 -> origin/pt-opt-cuda3 2025-10-10T00:44:17.2284305Z * [new branch] pyobjectslot -> origin/pyobjectslot 2025-10-10T00:44:17.2286598Z * [new branch] python_compiled_autograd -> origin/python_compiled_autograd 2025-10-10T00:44:17.2289977Z * [new branch] qchip/export-D54134695 -> origin/qchip/export-D54134695 2025-10-10T00:44:17.2291821Z * [new branch] quantile-docs -> origin/quantile-docs 2025-10-10T00:44:17.2293815Z * [new branch] quint-bits -> origin/quint-bits 2025-10-10T00:44:17.2295801Z * [new branch] reland-fx-annotate -> origin/reland-fx-annotate 2025-10-10T00:44:17.2297746Z * [new branch] reland_req_nvsh -> origin/reland_req_nvsh 2025-10-10T00:44:17.2303493Z * [new branch] release/1.10 -> origin/release/1.10 2025-10-10T00:44:17.2305351Z * [new branch] release/1.11 -> origin/release/1.11 2025-10-10T00:44:17.2307300Z * [new branch] release/1.12 -> origin/release/1.12 2025-10-10T00:44:17.2309277Z * [new branch] release/1.13 -> origin/release/1.13 2025-10-10T00:44:17.2311102Z * [new branch] release/1.4 -> origin/release/1.4 2025-10-10T00:44:17.2312746Z * [new branch] release/1.4.1 -> origin/release/1.4.1 2025-10-10T00:44:17.2314624Z * [new branch] release/1.5 -> origin/release/1.5 
2025-10-10T00:44:17.2316514Z * [new branch] release/1.6 -> origin/release/1.6 2025-10-10T00:44:17.2318469Z * [new branch] release/1.7 -> origin/release/1.7 2025-10-10T00:44:17.2320460Z * [new branch] release/1.8 -> origin/release/1.8 2025-10-10T00:44:17.2322419Z * [new branch] release/1.9 -> origin/release/1.9 2025-10-10T00:44:17.2324173Z * [new branch] release/2.0 -> origin/release/2.0 2025-10-10T00:44:17.2326536Z * [new branch] release/2.1 -> origin/release/2.1 2025-10-10T00:44:17.2328701Z * [new branch] release/2.2 -> origin/release/2.2 2025-10-10T00:44:17.2330992Z * [new branch] release/2.3 -> origin/release/2.3 2025-10-10T00:44:17.2333381Z * [new branch] release/2.4 -> origin/release/2.4 2025-10-10T00:44:17.2335864Z * [new branch] release/2.5 -> origin/release/2.5 2025-10-10T00:44:17.2338301Z * [new branch] release/2.6 -> origin/release/2.6 2025-10-10T00:44:17.2340396Z * [new branch] release/2.7 -> origin/release/2.7 2025-10-10T00:44:17.2342466Z * [new branch] release/2.8 -> origin/release/2.8 2025-10-10T00:44:17.2344443Z * [new branch] release/2.9 -> origin/release/2.9 2025-10-10T00:44:17.2346464Z * [new branch] release_notes -> origin/release_notes 2025-10-10T00:44:17.2348361Z * [new branch] remove_header_code -> origin/remove_header_code 2025-10-10T00:44:17.2350355Z * [new branch] remove_pyinterpreter -> origin/remove_pyinterpreter 2025-10-10T00:44:17.2352854Z * [new branch] repackage-vllm-nightlies -> origin/repackage-vllm-nightlies 2025-10-10T00:44:17.2355014Z * [new branch] replace-pytorch-labs-20250812-195836 -> origin/replace-pytorch-labs-20250812-195836 2025-10-10T00:44:17.2356832Z * [new branch] replace-pytorch-labs-20250812-200248 -> origin/replace-pytorch-labs-20250812-200248 2025-10-10T00:44:17.2358694Z * [new branch] replace-pytorch-labs-20250812-200324 -> origin/replace-pytorch-labs-20250812-200324 2025-10-10T00:44:17.2360661Z * [new branch] replace-pytorch-labs-20250812-204020 -> origin/replace-pytorch-labs-20250812-204020 
2025-10-10T00:44:17.2364358Z * [new branch] revert-131069-gh/krzysztofjordan/1/head -> origin/revert-131069-gh/krzysztofjordan/1/head 2025-10-10T00:44:17.2368206Z * [new branch] revert-131469-gh/andrewor14/51/head -> origin/revert-131469-gh/andrewor14/51/head 2025-10-10T00:44:17.2371961Z * [new branch] revert-156870-gh/skarjala/3/head -> origin/revert-156870-gh/skarjala/3/head 2025-10-10T00:44:17.2374182Z * [new branch] revert-157914-cherry-pick-157503-by-pytorch_bot_bot_ -> origin/revert-157914-cherry-pick-157503-by-pytorch_bot_bot_ 2025-10-10T00:44:17.2376985Z * [new branch] revert-163802-camyll/cherrypick_3016616ccbba3dc9bb6a80eb4a81a846ddf49cc9 -> origin/revert-163802-camyll/cherrypick_3016616ccbba3dc9bb6a80eb4a81a846ddf49cc9 2025-10-10T00:44:17.2378737Z * [new branch] revert_always_build_distributed -> origin/revert_always_build_distributed 2025-10-10T00:44:17.2380698Z * [new branch] rocm-test-yml-update -> origin/rocm-test-yml-update 2025-10-10T00:44:17.2382560Z * [new branch] rocm_op_bench -> origin/rocm_op_bench 2025-10-10T00:44:17.2385210Z * [new branch] ruisi/aot_eager_pass -> origin/ruisi/aot_eager_pass 2025-10-10T00:44:17.2386925Z * [new branch] ruisi/placement_trace -> origin/ruisi/placement_trace 2025-10-10T00:44:17.2389809Z * [new branch] ryanguo99/cleanup-dynamo-expected-failures -> origin/ryanguo99/cleanup-dynamo-expected-failures 2025-10-10T00:44:17.2391474Z * [new branch] ryanguo99/fix-closure-var -> origin/ryanguo99/fix-closure-var 2025-10-10T00:44:17.2394177Z * [new branch] rzou/faketensor_bench -> origin/rzou/faketensor_bench 2025-10-10T00:44:17.2395889Z * [new branch] rzou/njt -> origin/rzou/njt 2025-10-10T00:44:17.2397817Z * [new branch] rzou/pca -> origin/rzou/pca 2025-10-10T00:44:17.2399760Z * [new branch] rzou/realprop -> origin/rzou/realprop 2025-10-10T00:44:17.2401636Z * [new branch] rzou/setup_context -> origin/rzou/setup_context 2025-10-10T00:44:17.2403525Z * [new branch] samplevllm -> origin/samplevllm 2025-10-10T00:44:17.2406644Z * 
[new branch] sanchitintel/weird_thing_with_test_cpu_select_algorithm -> origin/sanchitintel/weird_thing_with_test_cpu_select_algorithm 2025-10-10T00:44:17.2408738Z * [new branch] sapling-pr-archive-SS-JIA -> origin/sapling-pr-archive-SS-JIA 2025-10-10T00:44:17.2410618Z * [new branch] save -> origin/save 2025-10-10T00:44:17.2413496Z * [new branch] sdym/2.5.1 -> origin/sdym/2.5.1 2025-10-10T00:44:17.2415251Z * [new branch] sekyondaMeta-dynamoconfig-fix -> origin/sekyondaMeta-dynamoconfig-fix 2025-10-10T00:44:17.2417813Z * [new branch] shengf/fx-xform-perf -> origin/shengf/fx-xform-perf 2025-10-10T00:44:17.2419819Z * [new branch] shoumikhin-patch-1 -> origin/shoumikhin-patch-1 2025-10-10T00:44:17.2421784Z * [new branch] shoumikhin-patch-12 -> origin/shoumikhin-patch-12 2025-10-10T00:44:17.2423709Z * [new branch] solve-accuracy-fix -> origin/solve-accuracy-fix 2025-10-10T00:44:17.2426438Z * [new branch] soulitzer/reland-codev-grad-dtype -> origin/soulitzer/reland-codev-grad-dtype 2025-10-10T00:44:17.2428137Z * [new branch] soulitzer/stash-tls-ac -> origin/soulitzer/stash-tls-ac 2025-10-10T00:44:17.2430762Z * [new branch] sqzhang/flight4plus -> origin/sqzhang/flight4plus 2025-10-10T00:44:17.2433372Z * [new branch] sraikund16/test -> origin/sraikund16/test 2025-10-10T00:44:17.2435430Z * [new branch] stablize-compilation-time -> origin/stablize-compilation-time 2025-10-10T00:44:17.2437323Z * [new branch] starterTaskUpdate -> origin/starterTaskUpdate 2025-10-10T00:44:17.2439285Z * [new branch] suo -> origin/suo 2025-10-10T00:44:17.2441327Z * [new branch] support-uv-in-collect_env -> origin/support-uv-in-collect_env 2025-10-10T00:44:17.2443298Z * [new branch] sve-poc -> origin/sve-poc 2025-10-10T00:44:17.2445258Z * [new branch] svekars-patch-1 -> origin/svekars-patch-1 2025-10-10T00:44:17.2447249Z * [new branch] svekars-patch-2 -> origin/svekars-patch-2 2025-10-10T00:44:17.2449396Z * [new branch] svekars-patch-3 -> origin/svekars-patch-3 2025-10-10T00:44:17.2451394Z * [new 
branch] svekars-patch-4 -> origin/svekars-patch-4 2025-10-10T00:44:17.2453406Z * [new branch] svekars-patch-5 -> origin/svekars-patch-5 2025-10-10T00:44:17.2455344Z * [new branch] switch-bn -> origin/switch-bn 2025-10-10T00:44:17.2457360Z * [new branch] sympy-bottleneck-repro -> origin/sympy-bottleneck-repro 2025-10-10T00:44:17.2459954Z * [new branch] tenpercent/ck_rocm_ci_v3 -> origin/tenpercent/ck_rocm_ci_v3 2025-10-10T00:44:17.2461976Z * [new branch] tensordict_integration -> origin/tensordict_integration 2025-10-10T00:44:17.2464054Z * [new branch] test-move-conda-builds -> origin/test-move-conda-builds 2025-10-10T00:44:17.2466162Z * [new branch] test-myst-markdown-docstring -> origin/test-myst-markdown-docstring 2025-10-10T00:44:17.2468009Z * [new branch] test-old -> origin/test-old 2025-10-10T00:44:17.2470047Z * [new branch] test-vec-migration-internally -> origin/test-vec-migration-internally 2025-10-10T00:44:17.2472798Z * [new branch] test/bmm_heur -> origin/test/bmm_heur 2025-10-10T00:44:17.2474513Z * [new branch] test/inductor -> origin/test/inductor 2025-10-10T00:44:17.2476469Z * [new branch] test_quantization -> origin/test_quantization 2025-10-10T00:44:17.2479107Z * [new branch] tianren/customOp_autotune -> origin/tianren/customOp_autotune 2025-10-10T00:44:17.2480863Z * [new branch] tianren/customOp_autotune_fix -> origin/tianren/customOp_autotune_fix 2025-10-10T00:44:17.2482457Z * [new branch] tianren/customOp_fusion -> origin/tianren/customOp_fusion 2025-10-10T00:44:17.2484320Z * [new branch] tianren/flex_paged_attn_fix_temp -> origin/tianren/flex_paged_attn_fix_temp 2025-10-10T00:44:17.2485969Z * [new branch] tianren/remove_repeate -> origin/tianren/remove_repeate 2025-10-10T00:44:17.2487777Z * [new branch] tianren/test -> origin/tianren/test 2025-10-10T00:44:17.2489745Z * [new branch] tidy_performance_cyy -> origin/tidy_performance_cyy 2025-10-10T00:44:17.2491656Z * [new branch] torchtitan_ep -> origin/torchtitan_ep 2025-10-10T00:44:17.2493637Z * 
[new branch] trace_fsdp_torchtune_lora -> origin/trace_fsdp_torchtune_lora 2025-10-10T00:44:17.2495526Z * [new branch] traceable_fsdp_unit_tests -> origin/traceable_fsdp_unit_tests 2025-10-10T00:44:17.2497456Z * [new branch] transpose_pack_fusion -> origin/transpose_pack_fusion 2025-10-10T00:44:17.2499540Z * [new branch] tree_loop_vec_base -> origin/tree_loop_vec_base 2025-10-10T00:44:17.2503928Z * [new branch] triton_kernel -> origin/triton_kernel 2025-10-10T00:44:17.2505938Z * [new branch] trunk-tagging-multi-commits -> origin/trunk-tagging-multi-commits 2025-10-10T00:44:17.2507805Z * [new branch] tt_pkg_1908 -> origin/tt_pkg_1908 2025-10-10T00:44:17.2509875Z * [new branch] type_dec -> origin/type_dec 2025-10-10T00:44:17.2511805Z * [new branch] udate-sphinx-dependancies -> origin/udate-sphinx-dependancies 2025-10-10T00:44:17.2513858Z * [new branch] unlift -> origin/unlift 2025-10-10T00:44:17.2516612Z * [new branch] update-audio-commit-hash/17567864209-1799-1 -> origin/update-audio-commit-hash/17567864209-1799-1 2025-10-10T00:44:17.2518397Z * [new branch] update-audio-commit-hash/17599208654-1801-1 -> origin/update-audio-commit-hash/17599208654-1801-1 2025-10-10T00:44:17.2520153Z * [new branch] update-audio-commit-hash/17630256502-1803-1 -> origin/update-audio-commit-hash/17630256502-1803-1 2025-10-10T00:44:17.2522135Z * [new branch] update-audio-commit-hash/17657093113-1804-1 -> origin/update-audio-commit-hash/17657093113-1804-1 2025-10-10T00:44:17.2524039Z * [new branch] update-audio-commit-hash/17688961747-1806-1 -> origin/update-audio-commit-hash/17688961747-1806-1 2025-10-10T00:44:17.2526132Z * [new branch] update-audio-commit-hash/17703952853-1807-1 -> origin/update-audio-commit-hash/17703952853-1807-1 2025-10-10T00:44:17.2528750Z * [new branch] update-audio-commit-hash/18392707270-1874-1 -> origin/update-audio-commit-hash/18392707270-1874-1 2025-10-10T00:44:17.2530582Z * [new branch] update-dynamic-shapes-doc -> origin/update-dynamic-shapes-doc 
2025-10-10T00:44:17.2533373Z * [new branch] update-executorch-commit-hash/15694981040-1626-1 -> origin/update-executorch-commit-hash/15694981040-1626-1 2025-10-10T00:44:17.2535800Z * [new branch] update-triton-commit-hash/13663274526-1487-2 -> origin/update-triton-commit-hash/13663274526-1487-2 2025-10-10T00:44:17.2538570Z * [new branch] update-vision-commit-hash/15336342773-1607-1 -> origin/update-vision-commit-hash/15336342773-1607-1 2025-10-10T00:44:17.2540218Z * [new branch] update-vision-commit-hash/18361653903-1869-1 -> origin/update-vision-commit-hash/18361653903-1869-1 2025-10-10T00:44:17.2542737Z * [new branch] update-vllm-commit-hash/17536029887-1798-1 -> origin/update-vllm-commit-hash/17536029887-1798-1 2025-10-10T00:44:17.2544491Z * [new branch] update-vllm-commit-hash/17599208654-1801-1 -> origin/update-vllm-commit-hash/17599208654-1801-1 2025-10-10T00:44:17.2546229Z * [new branch] update-vllm-commit-hash/17657093113-1804-1 -> origin/update-vllm-commit-hash/17657093113-1804-1 2025-10-10T00:44:17.2547988Z * [new branch] update-vllm-commit-hash/17703952853-1807-1 -> origin/update-vllm-commit-hash/17703952853-1807-1 2025-10-10T00:44:17.2549784Z * [new branch] update-vllm-commit-hash/17718740812-1808-1 -> origin/update-vllm-commit-hash/17718740812-1808-1 2025-10-10T00:44:17.2551935Z * [new branch] update-vllm-commit-hash/17782703922-1813-1 -> origin/update-vllm-commit-hash/17782703922-1813-1 2025-10-10T00:44:17.2554108Z * [new branch] update-vllm-commit-hash/17814169036-1822-1 -> origin/update-vllm-commit-hash/17814169036-1822-1 2025-10-10T00:44:17.2556374Z * [new branch] update-vllm-commit-hash/17844794719-1823-1 -> origin/update-vllm-commit-hash/17844794719-1823-1 2025-10-10T00:44:17.2558273Z * [new branch] update-vllm-commit-hash/17872674059-1830-1 -> origin/update-vllm-commit-hash/17872674059-1830-1 2025-10-10T00:44:17.2560342Z * [new branch] update-vllm-commit-hash/17901034819-1833-1 -> origin/update-vllm-commit-hash/17901034819-1833-1 
2025-10-10T00:44:17.2562212Z * [new branch] update-vllm-commit-hash/17932176396-1836-1 -> origin/update-vllm-commit-hash/17932176396-1836-1 2025-10-10T00:44:17.2564034Z * [new branch] update-vllm-commit-hash/17962545886-1842-1 -> origin/update-vllm-commit-hash/17962545886-1842-1 2025-10-10T00:44:17.2565931Z * [new branch] update-vllm-commit-hash/17993166855-1844-1 -> origin/update-vllm-commit-hash/17993166855-1844-1 2025-10-10T00:44:17.2567890Z * [new branch] update-vllm-commit-hash/18052321282-1848-1 -> origin/update-vllm-commit-hash/18052321282-1848-1 2025-10-10T00:44:17.2569795Z * [new branch] update-vllm-commit-hash/18066820738-1849-1 -> origin/update-vllm-commit-hash/18066820738-1849-1 2025-10-10T00:44:17.2571657Z * [new branch] update-vllm-commit-hash/18081987460-1850-1 -> origin/update-vllm-commit-hash/18081987460-1850-1 2025-10-10T00:44:17.2573552Z * [new branch] update-vllm-commit-hash/18114584510-1852-1 -> origin/update-vllm-commit-hash/18114584510-1852-1 2025-10-10T00:44:17.2575487Z * [new branch] update-vllm-commit-hash/18147226974-1853-1 -> origin/update-vllm-commit-hash/18147226974-1853-1 2025-10-10T00:44:17.2577398Z * [new branch] update-vllm-commit-hash/18236802781-1857-1 -> origin/update-vllm-commit-hash/18236802781-1857-1 2025-10-10T00:44:17.2580071Z * [new branch] update-xla-commit-hash/17725712604-203-1 -> origin/update-xla-commit-hash/17725712604-203-1 2025-10-10T00:44:17.2581794Z * [new branch] update-xla-commit-hash/17908176340-204-1 -> origin/update-xla-commit-hash/17908176340-204-1 2025-10-10T00:44:17.2583663Z * [new branch] update-xla-commit-hash/18273597034-206-1 -> origin/update-xla-commit-hash/18273597034-206-1 2025-10-10T00:44:17.2585614Z * [new branch] update_docs_torch_multinomial_issue#125388 -> origin/update_docs_torch_multinomial_issue#125388 2025-10-10T00:44:17.2587933Z * [new branch] update_executorch_pin -> origin/update_executorch_pin 2025-10-10T00:44:17.2590039Z * [new branch] update_slow_tests_1722488736 -> 
origin/update_slow_tests_1722488736 2025-10-10T00:44:17.2591974Z * [new branch] update_slow_tests_1722879173 -> origin/update_slow_tests_1722879173 2025-10-10T00:44:17.2594050Z * [new branch] update_slow_tests_1757922057 -> origin/update_slow_tests_1757922057 2025-10-10T00:44:17.2595948Z * [new branch] update_slow_tests_1758526845 -> origin/update_slow_tests_1758526845 2025-10-10T00:44:17.2597796Z * [new branch] update_slow_tests_1759736444 -> origin/update_slow_tests_1759736444 2025-10-10T00:44:17.2599913Z * [new branch] update_submodule_FBGEMM -> origin/update_submodule_FBGEMM 2025-10-10T00:44:17.2601858Z * [new branch] update_submodule_kineto -> origin/update_submodule_kineto 2025-10-10T00:44:17.2603837Z * [new branch] update_submodule_tensorpipe -> origin/update_submodule_tensorpipe 2025-10-10T00:44:17.2605876Z * [new branch] v0.1.2 -> origin/v0.1.2 2025-10-10T00:44:17.2608098Z * [new branch] v1.0.1 -> origin/v1.0.1 2025-10-10T00:44:17.2610184Z * [new branch] v1.0.3 -> origin/v1.0.3 2025-10-10T00:44:17.2612269Z * [new branch] v1.1.0 -> origin/v1.1.0 2025-10-10T00:44:17.2614382Z * [new branch] v1.2.0 -> origin/v1.2.0 2025-10-10T00:44:17.2616869Z * [new branch] v1.3.0 -> origin/v1.3.0 2025-10-10T00:44:17.2618931Z * [new branch] v1.3.1 -> origin/v1.3.1 2025-10-10T00:44:17.2620915Z * [new branch] validate_fn -> origin/validate_fn 2025-10-10T00:44:17.2623062Z * [new branch] validations_2.6 -> origin/validations_2.6 2025-10-10T00:44:17.2625212Z * [new branch] validations_2.8 -> origin/validations_2.8 2025-10-10T00:44:17.2627112Z * [new branch] varlen-api -> origin/varlen-api 2025-10-10T00:44:17.2628914Z * [new branch] varlen_api -> origin/varlen_api 2025-10-10T00:44:17.2631593Z * [new branch] viable/strict -> origin/viable/strict 2025-10-10T00:44:17.2634514Z * [new branch] vishal9-team/dtensor_parallelism_toy -> origin/vishal9-team/dtensor_parallelism_toy 2025-10-10T00:44:17.2636320Z * [new branch] vllmbuildci -> origin/vllmbuildci 2025-10-10T00:44:17.2638282Z * [new 
branch] vllmpin -> origin/vllmpin 2025-10-10T00:44:17.2640823Z * [new branch] wdvr/iss_145259 -> origin/wdvr/iss_145259 2025-10-10T00:44:17.2643545Z * [new branch] whc/flight51 -> origin/whc/flight51 2025-10-10T00:44:17.2645370Z * [new branch] whc/flight53 -> origin/whc/flight53 2025-10-10T00:44:17.2647290Z * [new branch] whc/stage2 -> origin/whc/stage2 2025-10-10T00:44:17.2649251Z * [new branch] whc/uneven -> origin/whc/uneven 2025-10-10T00:44:17.2651386Z * [new branch] whc/uneven-merge -> origin/whc/uneven-merge 2025-10-10T00:44:17.2653651Z * [new branch] williamwen42-patch-1 -> origin/williamwen42-patch-1 2025-10-10T00:44:17.2655564Z * [new branch] win_warnings -> origin/win_warnings 2025-10-10T00:44:17.2657590Z * [new branch] windows_libtorch_free -> origin/windows_libtorch_free 2025-10-10T00:44:17.2659809Z * [new branch] windows_mmap -> origin/windows_mmap 2025-10-10T00:44:17.2661807Z * [new branch] xmfan-war -> origin/xmfan-war 2025-10-10T00:44:17.2664418Z * [new branch] xmfan/ca_0516 -> origin/xmfan/ca_0516 2025-10-10T00:44:17.2666214Z * [new branch] xmfan/ca_1051b93192 -> origin/xmfan/ca_1051b93192 2025-10-10T00:44:17.2668097Z * [new branch] xmfan/ca_1a722f62c248391fc4a542e8851a5559aa356ae8 -> origin/xmfan/ca_1a722f62c248391fc4a542e8851a5559aa356ae8 2025-10-10T00:44:17.2669816Z * [new branch] xmfan/ca_5a2be192d1 -> origin/xmfan/ca_5a2be192d1 2025-10-10T00:44:17.2671398Z * [new branch] xmfan/ca_9d59b516e9 -> origin/xmfan/ca_9d59b516e9 2025-10-10T00:44:17.2673257Z * [new branch] xmfan/ca_api -> origin/xmfan/ca_api 2025-10-10T00:44:17.2675001Z * [new branch] xmfan/ca_apr8 -> origin/xmfan/ca_apr8 2025-10-10T00:44:17.2677221Z * [new branch] xmfan/ca_base -> origin/xmfan/ca_base 2025-10-10T00:44:17.2679601Z * [new branch] xmfan/ca_cudagraphs -> origin/xmfan/ca_cudagraphs 2025-10-10T00:44:17.2681483Z * [new branch] xmfan/ca_dynamic -> origin/xmfan/ca_dynamic 2025-10-10T00:44:17.2683349Z * [new branch] xmfan/ca_fix_dyn -> origin/xmfan/ca_fix_dyn 
2025-10-10T00:44:17.2685293Z * [new branch] xmfan/ca_fix_lowering -> origin/xmfan/ca_fix_lowering 2025-10-10T00:44:17.2687179Z * [new branch] xmfan/ca_fix_polyfills -> origin/xmfan/ca_fix_polyfills 2025-10-10T00:44:17.2689026Z * [new branch] xmfan/ca_jan3 -> origin/xmfan/ca_jan3 2025-10-10T00:44:17.2690929Z * [new branch] xmfan/ca_jun18 -> origin/xmfan/ca_jun18 2025-10-10T00:44:17.2692872Z * [new branch] xmfan/ca_jun24 -> origin/xmfan/ca_jun24 2025-10-10T00:44:17.2694692Z * [new branch] xmfan/ca_mem_base -> origin/xmfan/ca_mem_base 2025-10-10T00:44:17.2697014Z * [new branch] xmfan/ca_mem_fix -> origin/xmfan/ca_mem_fix 2025-10-10T00:44:17.2699206Z * [new branch] xmfan/ca_move_to_cuda -> origin/xmfan/ca_move_to_cuda 2025-10-10T00:44:17.2703795Z * [new branch] xmfan/ca_nested -> origin/xmfan/ca_nested 2025-10-10T00:44:17.2705874Z * [new branch] xmfan/ca_overhead -> origin/xmfan/ca_overhead 2025-10-10T00:44:17.2707492Z * [new branch] xmfan/ca_overhead_0eba7e5451 -> origin/xmfan/ca_overhead_0eba7e5451 2025-10-10T00:44:17.2709375Z * [new branch] xmfan/ca_scalar -> origin/xmfan/ca_scalar 2025-10-10T00:44:17.2711396Z * [new branch] xmfan/ca_subclass_mem_fix -> origin/xmfan/ca_subclass_mem_fix 2025-10-10T00:44:17.2713270Z * [new branch] xmfan/ca_warm_mem -> origin/xmfan/ca_warm_mem 2025-10-10T00:44:17.2715149Z * [new branch] xmfan/ca_warm_mem_base -> origin/xmfan/ca_warm_mem_base 2025-10-10T00:44:17.2716937Z * [new branch] xmfan/cacu_jun18 -> origin/xmfan/cacu_jun18 2025-10-10T00:44:17.2718903Z * [new branch] xmfan/cacu_jun19 -> origin/xmfan/cacu_jun19 2025-10-10T00:44:17.2720665Z * [new branch] xmfan/cacu_jun4 -> origin/xmfan/cacu_jun4 2025-10-10T00:44:17.2722468Z * [new branch] xmfan/cacu_may27 -> origin/xmfan/cacu_may27 2025-10-10T00:44:17.2724468Z * [new branch] xmfan/disable_duck_shape -> origin/xmfan/disable_duck_shape 2025-10-10T00:44:17.2726362Z * [new branch] xmfan/fca_cpp_node_passthrough -> origin/xmfan/fca_cpp_node_passthrough 2025-10-10T00:44:17.2728495Z * [new 
branch] xmfan/issue_123374 -> origin/xmfan/issue_123374 2025-10-10T00:44:17.2730550Z * [new branch] xmfan/post_3945954741e2d37023c5d6954f9483008e0892f9 -> origin/xmfan/post_3945954741e2d37023c5d6954f9483008e0892f9 2025-10-10T00:44:17.2732526Z * [new branch] xmfan/pre_3945954741e2d37023c5d6954f9483008e0892f9 -> origin/xmfan/pre_3945954741e2d37023c5d6954f9483008e0892f9 2025-10-10T00:44:17.2734227Z * [new branch] xmfan/single_step -> origin/xmfan/single_step 2025-10-10T00:44:17.2736048Z * [new branch] xmfan/sth_0829 -> origin/xmfan/sth_0829 2025-10-10T00:44:17.2738060Z * [new branch] xmfan/test -> origin/xmfan/test 2025-10-10T00:44:17.2740772Z * [new branch] yguo/debug-0226-constexpr -> origin/yguo/debug-0226-constexpr 2025-10-10T00:44:17.2742626Z * [new branch] yguo/new_latest_changes -> origin/yguo/new_latest_changes 2025-10-10T00:44:17.2744629Z * [new branch] yguo/patch_constexpr_changes -> origin/yguo/patch_constexpr_changes 2025-10-10T00:44:17.2746474Z * [new branch] yihan_quantization -> origin/yihan_quantization 2025-10-10T00:44:17.2749154Z * [new branch] yiming/bootcamp -> origin/yiming/bootcamp 2025-10-10T00:44:17.2751040Z * [new branch] yiming/improve_sharding_error_msg -> origin/yiming/improve_sharding_error_msg 2025-10-10T00:44:17.2752702Z * [new branch] yiming/precompile_benchmark -> origin/yiming/precompile_benchmark 2025-10-10T00:44:17.2754632Z * [new branch] yolo-llama3 -> origin/yolo-llama3 2025-10-10T00:44:17.2757459Z * [new branch] ysiraichi/install-fmtlib-headers-v12 -> origin/ysiraichi/install-fmtlib-headers-v12 2025-10-10T00:44:17.2759917Z * [new branch] zainr/canary-test -> origin/zainr/canary-test 2025-10-10T00:44:17.2761750Z * [new branch] zainr/cleanup-gh-runners -> origin/zainr/cleanup-gh-runners 2025-10-10T00:44:17.2763516Z * [new branch] zainr/pull-migration-c -> origin/zainr/pull-migration-c 2025-10-10T00:44:17.2765260Z * [new branch] zainr/test2 -> origin/zainr/test2 2025-10-10T00:44:17.2767317Z * [new branch] zainr/unstable -> 
origin/zainr/unstable 2025-10-10T00:44:17.2769303Z * [new branch] zasdfgbnm-patch-3 -> origin/zasdfgbnm-patch-3 2025-10-10T00:44:17.2771310Z * [new branch] zb2p -> origin/zb2p 2025-10-10T00:44:17.2773241Z * [new branch] zeros-and-scatter-part2 -> origin/zeros-and-scatter-part2 2025-10-10T00:44:17.2775967Z * [new branch] zhxchen17/aot_compile_fix_load_guard_manager -> origin/zhxchen17/aot_compile_fix_load_guard_manager 2025-10-10T00:44:17.2778521Z * [new branch] zhxchen17/precompile/source_info -> origin/zhxchen17/precompile/source_info 2025-10-10T00:44:17.2780827Z * [new branch] zhxchen17/scratch/0 -> origin/zhxchen17/scratch/0 2025-10-10T00:44:17.2783448Z * [new branch] zhxhcen17/moodycamel -> origin/zhxhcen17/moodycamel 2025-10-10T00:44:17.2786064Z * [new branch] zxiiro/build-times -> origin/zxiiro/build-times 2025-10-10T00:44:17.2787868Z * [new branch] zxiiro/c7i-docs -> origin/zxiiro/c7i-docs 2025-10-10T00:44:17.2789842Z * [new branch] zxiiro/c7i-linux-4xlarge -> origin/zxiiro/c7i-linux-4xlarge 2025-10-10T00:44:17.2791685Z * [new branch] zxiiro/c7i-linux-build-yaml -> origin/zxiiro/c7i-linux-build-yaml 2025-10-10T00:44:17.2793675Z * [new branch] zxiiro/main -> origin/zxiiro/main 2025-10-10T00:44:17.2795562Z * [new branch] zxiiro/test-multicloud-arc -> origin/zxiiro/test-multicloud-arc 2025-10-10T00:44:17.2797310Z * [new tag] bc2caa7fdf006894eff7af936babde69ab5a40f8-huydhn-debug -> bc2caa7fdf006894eff7af936babde69ab5a40f8-huydhn-debug 2025-10-10T00:44:17.2799258Z * [new tag] ci/binaries/77164 -> ci/binaries/77164 2025-10-10T00:44:17.2801279Z * [new tag] ciflow/b200-symm-mem/163767 -> ciflow/b200-symm-mem/163767 2025-10-10T00:44:17.2802821Z * [new tag] ciflow/b200/163955 -> ciflow/b200/163955 2025-10-10T00:44:17.2804355Z * [new tag] ciflow/binaries/157432 -> ciflow/binaries/157432 2025-10-10T00:44:17.2805700Z * [new tag] ciflow/binaries/158104 -> ciflow/binaries/158104 2025-10-10T00:44:17.2807303Z * [new tag] ciflow/binaries/164769 -> ciflow/binaries/164769 
2025-10-10T00:44:17.2808832Z * [new tag] ciflow/binaries/164894 -> ciflow/binaries/164894 2025-10-10T00:44:17.2810291Z * [new tag] ciflow/binaries_libtorch/157432 -> ciflow/binaries_libtorch/157432 2025-10-10T00:44:17.2811774Z * [new tag] ciflow/binaries_wheel/157432 -> ciflow/binaries_wheel/157432 2025-10-10T00:44:17.2813210Z * [new tag] ciflow/binaries_wheel/159104 -> ciflow/binaries_wheel/159104 2025-10-10T00:44:17.2814672Z * [new tag] ciflow/binaries_wheel/164935 -> ciflow/binaries_wheel/164935 2025-10-10T00:44:17.2816423Z * [new tag] ciflow/h100-cutlass-backend/163767 -> ciflow/h100-cutlass-backend/163767 2025-10-10T00:44:17.2817240Z * [new tag] ciflow/h100-cutlass-backend/164747 -> ciflow/h100-cutlass-backend/164747 2025-10-10T00:44:17.2818976Z * [new tag] ciflow/h100-distributed/163767 -> ciflow/h100-distributed/163767 2025-10-10T00:44:17.2820826Z * [new tag] ciflow/h100-symm-mem/151845 -> ciflow/h100-symm-mem/151845 2025-10-10T00:44:17.2821656Z * [new tag] ciflow/h100-symm-mem/157635 -> ciflow/h100-symm-mem/157635 2025-10-10T00:44:17.2823170Z * [new tag] ciflow/h100-symm-mem/163767 -> ciflow/h100-symm-mem/163767 2025-10-10T00:44:17.2824581Z * [new tag] ciflow/h100-symm-mem/164747 -> ciflow/h100-symm-mem/164747 2025-10-10T00:44:17.2825536Z * [new tag] ciflow/h100-symm-mem/164965 -> ciflow/h100-symm-mem/164965 2025-10-10T00:44:17.2827150Z * [new tag] ciflow/h100-symm-mem/165101 -> ciflow/h100-symm-mem/165101 2025-10-10T00:44:17.2828642Z * [new tag] ciflow/h100/163955 -> ciflow/h100/163955 2025-10-10T00:44:17.2829949Z * [new tag] ciflow/h100/164474 -> ciflow/h100/164474 2025-10-10T00:44:17.2831382Z * [new tag] ciflow/h100/164705 -> ciflow/h100/164705 2025-10-10T00:44:17.2832695Z * [new tag] ciflow/h100/164790 -> ciflow/h100/164790 2025-10-10T00:44:17.2833609Z * [new tag] ciflow/h100/164930 -> ciflow/h100/164930 2025-10-10T00:44:17.2835147Z * [new tag] ciflow/h100/165055 -> ciflow/h100/165055 2025-10-10T00:44:17.2837207Z * [new tag] 
ciflow/inductor-micro-benchmark/164747 -> ciflow/inductor-micro-benchmark/164747 2025-10-10T00:44:17.2838795Z * [new tag] ciflow/inductor-perf-compare/163767 -> ciflow/inductor-perf-compare/163767 2025-10-10T00:44:17.2839712Z * [new tag] ciflow/inductor-perf-compare/164747 -> ciflow/inductor-perf-compare/164747 2025-10-10T00:44:17.2841683Z * [new tag] ciflow/inductor-perf-test-nightly-rocm/151845 -> ciflow/inductor-perf-test-nightly-rocm/151845 2025-10-10T00:44:17.2842659Z * [new tag] ciflow/inductor-perf-test-nightly-rocm/164747 -> ciflow/inductor-perf-test-nightly-rocm/164747 2025-10-10T00:44:17.2845016Z * [new tag] ciflow/inductor-perf-test-nightly-x86-zen/161512 -> ciflow/inductor-perf-test-nightly-x86-zen/161512 2025-10-10T00:44:17.2845980Z * [new tag] ciflow/inductor-perf-test-nightly-x86-zen/162954 -> ciflow/inductor-perf-test-nightly-x86-zen/162954 2025-10-10T00:44:17.2847590Z * [new tag] ciflow/inductor-perf-test-nightly-x86-zen/163767 -> ciflow/inductor-perf-test-nightly-x86-zen/163767 2025-10-10T00:44:17.2848717Z * [new tag] ciflow/inductor-perf-test-nightly-x86-zen/164126 -> ciflow/inductor-perf-test-nightly-x86-zen/164126 2025-10-10T00:44:17.2850290Z * [new tag] ciflow/inductor-perf-test-nightly-x86-zen/164747 -> ciflow/inductor-perf-test-nightly-x86-zen/164747 2025-10-10T00:44:17.2852062Z * [new tag] ciflow/inductor-periodic/0d39ecb2ce8556e85343d8da0c87450192c2fdf8 -> ciflow/inductor-periodic/0d39ecb2ce8556e85343d8da0c87450192c2fdf8 2025-10-10T00:44:17.2853042Z * [new tag] ciflow/inductor-periodic/156592 -> ciflow/inductor-periodic/156592 2025-10-10T00:44:17.2854613Z * [new tag] ciflow/inductor-periodic/164492 -> ciflow/inductor-periodic/164492 2025-10-10T00:44:17.2856480Z * [new tag] ciflow/inductor-periodic/73adac05d13babb75410c3e033fdce57aa16881a -> ciflow/inductor-periodic/73adac05d13babb75410c3e033fdce57aa16881a 2025-10-10T00:44:17.2857385Z * [new tag] ciflow/inductor-rocm/151845 -> ciflow/inductor-rocm/151845 2025-10-10T00:44:17.2859130Z * [new 
tag] ciflow/inductor-rocm/161280 -> ciflow/inductor-rocm/161280 2025-10-10T00:44:17.2860663Z * [new tag] ciflow/inductor-rocm/162478 -> ciflow/inductor-rocm/162478 2025-10-10T00:44:17.2861605Z * [new tag] ciflow/inductor-rocm/163767 -> ciflow/inductor-rocm/163767 2025-10-10T00:44:17.2863395Z * [new tag] ciflow/inductor-rocm/164618 -> ciflow/inductor-rocm/164618 2025-10-10T00:44:17.2864513Z * [new tag] ciflow/inductor-rocm/164747 -> ciflow/inductor-rocm/164747 2025-10-10T00:44:17.2866000Z * [new tag] ciflow/inductor-rocm/164769 -> ciflow/inductor-rocm/164769 2025-10-10T00:44:17.2867392Z * [new tag] ciflow/inductor-rocm/165080 -> ciflow/inductor-rocm/165080 2025-10-10T00:44:17.2868931Z * [new tag] ciflow/inductor/137400 -> ciflow/inductor/137400 2025-10-10T00:44:17.2869874Z * [new tag] ciflow/inductor/148180 -> ciflow/inductor/148180 2025-10-10T00:44:17.2871484Z * [new tag] ciflow/inductor/148328 -> ciflow/inductor/148328 2025-10-10T00:44:17.2872565Z * [new tag] ciflow/inductor/148484 -> ciflow/inductor/148484 2025-10-10T00:44:17.2874061Z * [new tag] ciflow/inductor/148492 -> ciflow/inductor/148492 2025-10-10T00:44:17.2875332Z * [new tag] ciflow/inductor/149003 -> ciflow/inductor/149003 2025-10-10T00:44:17.2876667Z * [new tag] ciflow/inductor/151845 -> ciflow/inductor/151845 2025-10-10T00:44:17.2877572Z * [new tag] ciflow/inductor/152624 -> ciflow/inductor/152624 2025-10-10T00:44:17.2879123Z * [new tag] ciflow/inductor/156592 -> ciflow/inductor/156592 2025-10-10T00:44:17.2880427Z * [new tag] ciflow/inductor/157635 -> ciflow/inductor/157635 2025-10-10T00:44:17.2881861Z * [new tag] ciflow/inductor/157743 -> ciflow/inductor/157743 2025-10-10T00:44:17.2883571Z * [new tag] ciflow/inductor/157994 -> ciflow/inductor/157994 2025-10-10T00:44:17.2885261Z * [new tag] ciflow/inductor/158104 -> ciflow/inductor/158104 2025-10-10T00:44:17.2886810Z * [new tag] ciflow/inductor/158872 -> ciflow/inductor/158872 2025-10-10T00:44:17.2888393Z * [new tag] ciflow/inductor/159523 -> 
ciflow/inductor/159523 2025-10-10T00:44:17.2889407Z * [new tag] ciflow/inductor/160266 -> ciflow/inductor/160266 2025-10-10T00:44:17.2891236Z * [new tag] ciflow/inductor/160324 -> ciflow/inductor/160324 2025-10-10T00:44:17.2892751Z * [new tag] ciflow/inductor/160325 -> ciflow/inductor/160325 2025-10-10T00:44:17.2894475Z * [new tag] ciflow/inductor/160326 -> ciflow/inductor/160326 2025-10-10T00:44:17.2895814Z * [new tag] ciflow/inductor/160327 -> ciflow/inductor/160327 2025-10-10T00:44:17.2897294Z * [new tag] ciflow/inductor/160328 -> ciflow/inductor/160328 2025-10-10T00:44:17.2898960Z * [new tag] ciflow/inductor/160329 -> ciflow/inductor/160329 2025-10-10T00:44:17.2901303Z * [new tag] ciflow/inductor/160539 -> ciflow/inductor/160539 2025-10-10T00:44:17.2902175Z * [new tag] ciflow/inductor/160611 -> ciflow/inductor/160611 2025-10-10T00:44:17.2903805Z * [new tag] ciflow/inductor/160843 -> ciflow/inductor/160843 2025-10-10T00:44:17.2905424Z * [new tag] ciflow/inductor/160903 -> ciflow/inductor/160903 2025-10-10T00:44:17.2906780Z * [new tag] ciflow/inductor/161118 -> ciflow/inductor/161118 2025-10-10T00:44:17.2908043Z * [new tag] ciflow/inductor/161158 -> ciflow/inductor/161158 2025-10-10T00:44:17.2909394Z * [new tag] ciflow/inductor/161280 -> ciflow/inductor/161280 2025-10-10T00:44:17.2910698Z * [new tag] ciflow/inductor/161320 -> ciflow/inductor/161320 2025-10-10T00:44:17.2912531Z * [new tag] ciflow/inductor/161485 -> ciflow/inductor/161485 2025-10-10T00:44:17.2913854Z * [new tag] ciflow/inductor/161495 -> ciflow/inductor/161495 2025-10-10T00:44:17.2915153Z * [new tag] ciflow/inductor/161512 -> ciflow/inductor/161512 2025-10-10T00:44:17.2916500Z * [new tag] ciflow/inductor/162031 -> ciflow/inductor/162031 2025-10-10T00:44:17.2917915Z * [new tag] ciflow/inductor/162066 -> ciflow/inductor/162066 2025-10-10T00:44:17.2919264Z * [new tag] ciflow/inductor/162294 -> ciflow/inductor/162294 2025-10-10T00:44:17.2920546Z * [new tag] ciflow/inductor/162340 -> 
ciflow/inductor/162340 2025-10-10T00:44:17.2921892Z * [new tag] ciflow/inductor/162470 -> ciflow/inductor/162470 2025-10-10T00:44:17.2923532Z * [new tag] ciflow/inductor/162523 -> ciflow/inductor/162523 2025-10-10T00:44:17.2925185Z * [new tag] ciflow/inductor/162542 -> ciflow/inductor/162542 2025-10-10T00:44:17.2926061Z * [new tag] ciflow/inductor/162768 -> ciflow/inductor/162768 2025-10-10T00:44:17.2927667Z * [new tag] ciflow/inductor/162899 -> ciflow/inductor/162899 2025-10-10T00:44:17.2929612Z * [new tag] ciflow/inductor/162900 -> ciflow/inductor/162900 2025-10-10T00:44:17.2931046Z * [new tag] ciflow/inductor/162901 -> ciflow/inductor/162901 2025-10-10T00:44:17.2932311Z * [new tag] ciflow/inductor/162903 -> ciflow/inductor/162903 2025-10-10T00:44:17.2933702Z * [new tag] ciflow/inductor/162905 -> ciflow/inductor/162905 2025-10-10T00:44:17.2935127Z * [new tag] ciflow/inductor/162954 -> ciflow/inductor/162954 2025-10-10T00:44:17.2936491Z * [new tag] ciflow/inductor/162990 -> ciflow/inductor/162990 2025-10-10T00:44:17.2937870Z * [new tag] ciflow/inductor/163027 -> ciflow/inductor/163027 2025-10-10T00:44:17.2939127Z * [new tag] ciflow/inductor/163028 -> ciflow/inductor/163028 2025-10-10T00:44:17.2940568Z * [new tag] ciflow/inductor/163053 -> ciflow/inductor/163053 2025-10-10T00:44:17.2941851Z * [new tag] ciflow/inductor/163185 -> ciflow/inductor/163185 2025-10-10T00:44:17.2942954Z * [new tag] ciflow/inductor/163335 -> ciflow/inductor/163335 2025-10-10T00:44:17.2944618Z * [new tag] ciflow/inductor/163490 -> ciflow/inductor/163490 2025-10-10T00:44:17.2945939Z * [new tag] ciflow/inductor/163503 -> ciflow/inductor/163503 2025-10-10T00:44:17.2947505Z * [new tag] ciflow/inductor/163517 -> ciflow/inductor/163517 2025-10-10T00:44:17.2948816Z * [new tag] ciflow/inductor/163527 -> ciflow/inductor/163527 2025-10-10T00:44:17.2950255Z * [new tag] ciflow/inductor/163533 -> ciflow/inductor/163533 2025-10-10T00:44:17.2951635Z * [new tag] ciflow/inductor/163602 -> 
ciflow/inductor/163602 2025-10-10T00:44:17.2953076Z * [new tag] ciflow/inductor/163617 -> ciflow/inductor/163617 2025-10-10T00:44:17.2954424Z * [new tag] ciflow/inductor/163667 -> ciflow/inductor/163667 2025-10-10T00:44:17.2955951Z * [new tag] ciflow/inductor/163671 -> ciflow/inductor/163671 2025-10-10T00:44:17.2956795Z * [new tag] ciflow/inductor/163767 -> ciflow/inductor/163767 2025-10-10T00:44:17.2958363Z * [new tag] ciflow/inductor/163772 -> ciflow/inductor/163772 2025-10-10T00:44:17.2959732Z * [new tag] ciflow/inductor/163806 -> ciflow/inductor/163806 2025-10-10T00:44:17.2961085Z * [new tag] ciflow/inductor/163936 -> ciflow/inductor/163936 2025-10-10T00:44:17.2962899Z * [new tag] ciflow/inductor/163976 -> ciflow/inductor/163976 2025-10-10T00:44:17.2964144Z * [new tag] ciflow/inductor/164039 -> ciflow/inductor/164039 2025-10-10T00:44:17.2965612Z * [new tag] ciflow/inductor/164040 -> ciflow/inductor/164040 2025-10-10T00:44:17.2967046Z * [new tag] ciflow/inductor/164130 -> ciflow/inductor/164130 2025-10-10T00:44:17.2968558Z * [new tag] ciflow/inductor/164144 -> ciflow/inductor/164144 2025-10-10T00:44:17.2969891Z * [new tag] ciflow/inductor/164202 -> ciflow/inductor/164202 2025-10-10T00:44:17.2971468Z * [new tag] ciflow/inductor/164212 -> ciflow/inductor/164212 2025-10-10T00:44:17.2972405Z * [new tag] ciflow/inductor/164273 -> ciflow/inductor/164273 2025-10-10T00:44:17.2973873Z * [new tag] ciflow/inductor/164277 -> ciflow/inductor/164277 2025-10-10T00:44:17.2975244Z * [new tag] ciflow/inductor/164291 -> ciflow/inductor/164291 2025-10-10T00:44:17.2976546Z * [new tag] ciflow/inductor/164296 -> ciflow/inductor/164296 2025-10-10T00:44:17.2977936Z * [new tag] ciflow/inductor/164304 -> ciflow/inductor/164304 2025-10-10T00:44:17.2979365Z * [new tag] ciflow/inductor/164318 -> ciflow/inductor/164318 2025-10-10T00:44:17.2980688Z * [new tag] ciflow/inductor/164321 -> ciflow/inductor/164321 2025-10-10T00:44:17.2982031Z * [new tag] ciflow/inductor/164324 -> 
ciflow/inductor/164324 2025-10-10T00:44:17.2983425Z * [new tag] ciflow/inductor/164341 -> ciflow/inductor/164341 2025-10-10T00:44:17.2984753Z * [new tag] ciflow/inductor/164343 -> ciflow/inductor/164343 2025-10-10T00:44:17.2986149Z * [new tag] ciflow/inductor/164344 -> ciflow/inductor/164344 2025-10-10T00:44:17.2987621Z * [new tag] ciflow/inductor/164359 -> ciflow/inductor/164359 2025-10-10T00:44:17.2989153Z * [new tag] ciflow/inductor/164373 -> ciflow/inductor/164373 2025-10-10T00:44:17.2990343Z * [new tag] ciflow/inductor/164379 -> ciflow/inductor/164379 2025-10-10T00:44:17.2991951Z * [new tag] ciflow/inductor/164384 -> ciflow/inductor/164384 2025-10-10T00:44:17.2992832Z * [new tag] ciflow/inductor/164404 -> ciflow/inductor/164404 2025-10-10T00:44:17.2994596Z * [new tag] ciflow/inductor/164405 -> ciflow/inductor/164405 2025-10-10T00:44:17.2995977Z * [new tag] ciflow/inductor/164414 -> ciflow/inductor/164414 2025-10-10T00:44:17.2997057Z * [new tag] ciflow/inductor/164422 -> ciflow/inductor/164422 2025-10-10T00:44:17.2998873Z * [new tag] ciflow/inductor/164433 -> ciflow/inductor/164433 2025-10-10T00:44:17.3000339Z * [new tag] ciflow/inductor/164474 -> ciflow/inductor/164474 2025-10-10T00:44:17.3001595Z * [new tag] ciflow/inductor/164488 -> ciflow/inductor/164488 2025-10-10T00:44:17.3002915Z * [new tag] ciflow/inductor/164492 -> ciflow/inductor/164492 2025-10-10T00:44:17.3004387Z * [new tag] ciflow/inductor/164497 -> ciflow/inductor/164497 2025-10-10T00:44:17.3005759Z * [new tag] ciflow/inductor/164498 -> ciflow/inductor/164498 2025-10-10T00:44:17.3007336Z * [new tag] ciflow/inductor/164500 -> ciflow/inductor/164500 2025-10-10T00:44:17.3008615Z * [new tag] ciflow/inductor/164507 -> ciflow/inductor/164507 2025-10-10T00:44:17.3009957Z * [new tag] ciflow/inductor/164519 -> ciflow/inductor/164519 2025-10-10T00:44:17.3011486Z * [new tag] ciflow/inductor/164521 -> ciflow/inductor/164521 2025-10-10T00:44:17.3012891Z * [new tag] ciflow/inductor/164522 -> 
ciflow/inductor/164522 2025-10-10T00:44:17.3014315Z * [new tag] ciflow/inductor/164523 -> ciflow/inductor/164523 2025-10-10T00:44:17.3015816Z * [new tag] ciflow/inductor/164524 -> ciflow/inductor/164524 2025-10-10T00:44:17.3016690Z * [new tag] ciflow/inductor/164525 -> ciflow/inductor/164525 2025-10-10T00:44:17.3018766Z * [new tag] ciflow/inductor/164526 -> ciflow/inductor/164526 2025-10-10T00:44:17.3020361Z * [new tag] ciflow/inductor/164527 -> ciflow/inductor/164527 2025-10-10T00:44:17.3021750Z * [new tag] ciflow/inductor/164533 -> ciflow/inductor/164533 2025-10-10T00:44:17.3022880Z * [new tag] ciflow/inductor/164537 -> ciflow/inductor/164537 2025-10-10T00:44:17.3024433Z * [new tag] ciflow/inductor/164548 -> ciflow/inductor/164548 2025-10-10T00:44:17.3026012Z * [new tag] ciflow/inductor/164557 -> ciflow/inductor/164557 2025-10-10T00:44:17.3027428Z * [new tag] ciflow/inductor/164558 -> ciflow/inductor/164558 2025-10-10T00:44:17.3029213Z * [new tag] ciflow/inductor/164560 -> ciflow/inductor/164560 2025-10-10T00:44:17.3030504Z * [new tag] ciflow/inductor/164565 -> ciflow/inductor/164565 2025-10-10T00:44:17.3031918Z * [new tag] ciflow/inductor/164577 -> ciflow/inductor/164577 2025-10-10T00:44:17.3033450Z * [new tag] ciflow/inductor/164609 -> ciflow/inductor/164609 2025-10-10T00:44:17.3034774Z * [new tag] ciflow/inductor/164610 -> ciflow/inductor/164610 2025-10-10T00:44:17.3036131Z * [new tag] ciflow/inductor/164611 -> ciflow/inductor/164611 2025-10-10T00:44:17.3037702Z * [new tag] ciflow/inductor/164612 -> ciflow/inductor/164612 2025-10-10T00:44:17.3039029Z * [new tag] ciflow/inductor/164613 -> ciflow/inductor/164613 2025-10-10T00:44:17.3040550Z * [new tag] ciflow/inductor/164614 -> ciflow/inductor/164614 2025-10-10T00:44:17.3041895Z * [new tag] ciflow/inductor/164623 -> ciflow/inductor/164623 2025-10-10T00:44:17.3043346Z * [new tag] ciflow/inductor/164626 -> ciflow/inductor/164626 2025-10-10T00:44:17.3044697Z * [new tag] ciflow/inductor/164628 -> 
ciflow/inductor/164628 2025-10-10T00:44:17.3045788Z * [new tag] ciflow/inductor/164631 -> ciflow/inductor/164631 2025-10-10T00:44:17.3047436Z * [new tag] ciflow/inductor/164632 -> ciflow/inductor/164632 2025-10-10T00:44:17.3048932Z * [new tag] ciflow/inductor/164633 -> ciflow/inductor/164633 2025-10-10T00:44:17.3050345Z * [new tag] ciflow/inductor/164640 -> ciflow/inductor/164640 2025-10-10T00:44:17.3051853Z * [new tag] ciflow/inductor/164641 -> ciflow/inductor/164641 2025-10-10T00:44:17.3053331Z * [new tag] ciflow/inductor/164645 -> ciflow/inductor/164645 2025-10-10T00:44:17.3054752Z * [new tag] ciflow/inductor/164648 -> ciflow/inductor/164648 2025-10-10T00:44:17.3056269Z * [new tag] ciflow/inductor/164653 -> ciflow/inductor/164653 2025-10-10T00:44:17.3057713Z * [new tag] ciflow/inductor/164655 -> ciflow/inductor/164655 2025-10-10T00:44:17.3059296Z * [new tag] ciflow/inductor/164657 -> ciflow/inductor/164657 2025-10-10T00:44:17.3060623Z * [new tag] ciflow/inductor/164659 -> ciflow/inductor/164659 2025-10-10T00:44:17.3062097Z * [new tag] ciflow/inductor/164669 -> ciflow/inductor/164669 2025-10-10T00:44:17.3063553Z * [new tag] ciflow/inductor/164690 -> ciflow/inductor/164690 2025-10-10T00:44:17.3064928Z * [new tag] ciflow/inductor/164691 -> ciflow/inductor/164691 2025-10-10T00:44:17.3066439Z * [new tag] ciflow/inductor/164692 -> ciflow/inductor/164692 2025-10-10T00:44:17.3067871Z * [new tag] ciflow/inductor/164711 -> ciflow/inductor/164711 2025-10-10T00:44:17.3069255Z * [new tag] ciflow/inductor/164714 -> ciflow/inductor/164714 2025-10-10T00:44:17.3070616Z * [new tag] ciflow/inductor/164717 -> ciflow/inductor/164717 2025-10-10T00:44:17.3071972Z * [new tag] ciflow/inductor/164718 -> ciflow/inductor/164718 2025-10-10T00:44:17.3073606Z * [new tag] ciflow/inductor/164723 -> ciflow/inductor/164723 2025-10-10T00:44:17.3074989Z * [new tag] ciflow/inductor/164724 -> ciflow/inductor/164724 2025-10-10T00:44:17.3076350Z * [new tag] ciflow/inductor/164734 -> 
ciflow/inductor/164734 2025-10-10T00:44:17.3077632Z * [new tag] ciflow/inductor/164740 -> ciflow/inductor/164740 2025-10-10T00:44:17.3079304Z * [new tag] ciflow/inductor/164746 -> ciflow/inductor/164746 2025-10-10T00:44:17.3080701Z * [new tag] ciflow/inductor/164747 -> ciflow/inductor/164747 2025-10-10T00:44:17.3082092Z * [new tag] ciflow/inductor/164776 -> ciflow/inductor/164776 2025-10-10T00:44:17.3083458Z * [new tag] ciflow/inductor/164778 -> ciflow/inductor/164778 2025-10-10T00:44:17.3084965Z * [new tag] ciflow/inductor/164780 -> ciflow/inductor/164780 2025-10-10T00:44:17.3086281Z * [new tag] ciflow/inductor/164794 -> ciflow/inductor/164794 2025-10-10T00:44:17.3087659Z * [new tag] ciflow/inductor/164802 -> ciflow/inductor/164802 2025-10-10T00:44:17.3089102Z * [new tag] ciflow/inductor/164806 -> ciflow/inductor/164806 2025-10-10T00:44:17.3090651Z * [new tag] ciflow/inductor/164808 -> ciflow/inductor/164808 2025-10-10T00:44:17.3092011Z * [new tag] ciflow/inductor/164810 -> ciflow/inductor/164810 2025-10-10T00:44:17.3093413Z * [new tag] ciflow/inductor/164811 -> ciflow/inductor/164811 2025-10-10T00:44:17.3094772Z * [new tag] ciflow/inductor/164812 -> ciflow/inductor/164812 2025-10-10T00:44:17.3096143Z * [new tag] ciflow/inductor/164819 -> ciflow/inductor/164819 2025-10-10T00:44:17.3097472Z * [new tag] ciflow/inductor/164820 -> ciflow/inductor/164820 2025-10-10T00:44:17.3098939Z * [new tag] ciflow/inductor/164821 -> ciflow/inductor/164821 2025-10-10T00:44:17.3101513Z * [new tag] ciflow/inductor/164839 -> ciflow/inductor/164839 2025-10-10T00:44:17.3102901Z * [new tag] ciflow/inductor/164842 -> ciflow/inductor/164842 2025-10-10T00:44:17.3104245Z * [new tag] ciflow/inductor/164847 -> ciflow/inductor/164847 2025-10-10T00:44:17.3105593Z * [new tag] ciflow/inductor/164852 -> ciflow/inductor/164852 2025-10-10T00:44:17.3106990Z * [new tag] ciflow/inductor/164863 -> ciflow/inductor/164863 2025-10-10T00:44:17.3108349Z * [new tag] ciflow/inductor/164865 -> 
ciflow/inductor/164865 2025-10-10T00:44:17.3109733Z * [new tag] ciflow/inductor/164866 -> ciflow/inductor/164866 2025-10-10T00:44:17.3111546Z * [new tag] ciflow/inductor/164867 -> ciflow/inductor/164867 2025-10-10T00:44:17.3113120Z * [new tag] ciflow/inductor/164869 -> ciflow/inductor/164869 2025-10-10T00:44:17.3114044Z * [new tag] ciflow/inductor/164873 -> ciflow/inductor/164873 2025-10-10T00:44:17.3115736Z * [new tag] ciflow/inductor/164889 -> ciflow/inductor/164889 2025-10-10T00:44:17.3117108Z * [new tag] ciflow/inductor/164897 -> ciflow/inductor/164897 2025-10-10T00:44:17.3118814Z * [new tag] ciflow/inductor/164902 -> ciflow/inductor/164902 2025-10-10T00:44:17.3119548Z * [new tag] ciflow/inductor/164903 -> ciflow/inductor/164903 2025-10-10T00:44:17.3121173Z * [new tag] ciflow/inductor/164906 -> ciflow/inductor/164906 2025-10-10T00:44:17.3122577Z * [new tag] ciflow/inductor/164914 -> ciflow/inductor/164914 2025-10-10T00:44:17.3123997Z * [new tag] ciflow/inductor/164919 -> ciflow/inductor/164919 2025-10-10T00:44:17.3125380Z * [new tag] ciflow/inductor/164933 -> ciflow/inductor/164933 2025-10-10T00:44:17.3126854Z * [new tag] ciflow/inductor/164938 -> ciflow/inductor/164938 2025-10-10T00:44:17.3128419Z * [new tag] ciflow/inductor/164948 -> ciflow/inductor/164948 2025-10-10T00:44:17.3129766Z * [new tag] ciflow/inductor/164956 -> ciflow/inductor/164956 2025-10-10T00:44:17.3131124Z * [new tag] ciflow/inductor/164965 -> ciflow/inductor/164965 2025-10-10T00:44:17.3132497Z * [new tag] ciflow/inductor/164978 -> ciflow/inductor/164978 2025-10-10T00:44:17.3133886Z * [new tag] ciflow/inductor/164979 -> ciflow/inductor/164979 2025-10-10T00:44:17.3135292Z * [new tag] ciflow/inductor/164980 -> ciflow/inductor/164980 2025-10-10T00:44:17.3136801Z * [new tag] ciflow/inductor/164984 -> ciflow/inductor/164984 2025-10-10T00:44:17.3138152Z * [new tag] ciflow/inductor/164989 -> ciflow/inductor/164989 2025-10-10T00:44:17.3139502Z * [new tag] ciflow/inductor/164991 -> 
ciflow/inductor/164991 2025-10-10T00:44:17.3141064Z * [new tag] ciflow/inductor/164992 -> ciflow/inductor/164992 2025-10-10T00:44:17.3142416Z * [new tag] ciflow/inductor/164994 -> ciflow/inductor/164994 2025-10-10T00:44:17.3143784Z * [new tag] ciflow/inductor/164999 -> ciflow/inductor/164999 2025-10-10T00:44:17.3145430Z * [new tag] ciflow/inductor/165001 -> ciflow/inductor/165001 2025-10-10T00:44:17.3146219Z * [new tag] ciflow/inductor/165005 -> ciflow/inductor/165005 2025-10-10T00:44:17.3147855Z * [new tag] ciflow/inductor/165006 -> ciflow/inductor/165006 2025-10-10T00:44:17.3149268Z * [new tag] ciflow/inductor/165010 -> ciflow/inductor/165010 2025-10-10T00:44:17.3150329Z * [new tag] ciflow/inductor/165012 -> ciflow/inductor/165012 2025-10-10T00:44:17.3151929Z * [new tag] ciflow/inductor/165017 -> ciflow/inductor/165017 2025-10-10T00:44:17.3153560Z * [new tag] ciflow/inductor/165018 -> ciflow/inductor/165018 2025-10-10T00:44:17.3154938Z * [new tag] ciflow/inductor/165024 -> ciflow/inductor/165024 2025-10-10T00:44:17.3156397Z * [new tag] ciflow/inductor/165029 -> ciflow/inductor/165029 2025-10-10T00:44:17.3157916Z * [new tag] ciflow/inductor/165030 -> ciflow/inductor/165030 2025-10-10T00:44:17.3159586Z * [new tag] ciflow/inductor/165031 -> ciflow/inductor/165031 2025-10-10T00:44:17.3160930Z * [new tag] ciflow/inductor/165033 -> ciflow/inductor/165033 2025-10-10T00:44:17.3162599Z * [new tag] ciflow/inductor/165036 -> ciflow/inductor/165036 2025-10-10T00:44:17.3164236Z * [new tag] ciflow/inductor/165037 -> ciflow/inductor/165037 2025-10-10T00:44:17.3165115Z * [new tag] ciflow/inductor/165039 -> ciflow/inductor/165039 2025-10-10T00:44:17.3166857Z * [new tag] ciflow/inductor/165047 -> ciflow/inductor/165047 2025-10-10T00:44:17.3168543Z * [new tag] ciflow/inductor/165059 -> ciflow/inductor/165059 2025-10-10T00:44:17.3169971Z * [new tag] ciflow/inductor/165063 -> ciflow/inductor/165063 2025-10-10T00:44:17.3171330Z * [new tag] ciflow/inductor/165064 -> 
ciflow/inductor/165064 2025-10-10T00:44:17.3172695Z * [new tag] ciflow/inductor/165066 -> ciflow/inductor/165066 2025-10-10T00:44:17.3174098Z * [new tag] ciflow/inductor/165074 -> ciflow/inductor/165074 2025-10-10T00:44:17.3175446Z * [new tag] ciflow/inductor/165076 -> ciflow/inductor/165076 2025-10-10T00:44:17.3176863Z * [new tag] ciflow/inductor/165091 -> ciflow/inductor/165091 2025-10-10T00:44:17.3178378Z * [new tag] ciflow/inductor/165092 -> ciflow/inductor/165092 2025-10-10T00:44:17.3179723Z * [new tag] ciflow/inductor/165106 -> ciflow/inductor/165106 2025-10-10T00:44:17.3181116Z * [new tag] ciflow/inductor/165107 -> ciflow/inductor/165107 2025-10-10T00:44:17.3182693Z * [new tag] ciflow/inductor/165112 -> ciflow/inductor/165112 2025-10-10T00:44:17.3184109Z * [new tag] ciflow/inductor/165113 -> ciflow/inductor/165113 2025-10-10T00:44:17.3185496Z * [new tag] ciflow/inductor/165114 -> ciflow/inductor/165114 2025-10-10T00:44:17.3187130Z * [new tag] ciflow/inductor/3b9a386 -> ciflow/inductor/3b9a386 2025-10-10T00:44:17.3188681Z * [new tag] ciflow/inductor/3d4b92b -> ciflow/inductor/3d4b92b 2025-10-10T00:44:17.3190189Z * [new tag] ciflow/inductor/d224ac7 -> ciflow/inductor/d224ac7 2025-10-10T00:44:17.3192316Z * [new tag] ciflow/linux-aarch64/157994 -> ciflow/linux-aarch64/157994 2025-10-10T00:44:17.3193804Z * [new tag] ciflow/linux-aarch64/163952 -> ciflow/linux-aarch64/163952 2025-10-10T00:44:17.3194750Z * [new tag] ciflow/linux-aarch64/164965 -> ciflow/linux-aarch64/164965 2025-10-10T00:44:17.3196279Z * [new tag] ciflow/linux-aarch64/165010 -> ciflow/linux-aarch64/165010 2025-10-10T00:44:17.3197873Z * [new tag] ciflow/mps/157553 -> ciflow/mps/157553 2025-10-10T00:44:17.3198859Z * [new tag] ciflow/mps/157554 -> ciflow/mps/157554 2025-10-10T00:44:17.3201078Z * [new tag] ciflow/mps/157635 -> ciflow/mps/157635 2025-10-10T00:44:17.3202403Z * [new tag] ciflow/mps/162340 -> ciflow/mps/162340 2025-10-10T00:44:17.3203738Z * [new tag] ciflow/mps/164416 -> ciflow/mps/164416 
2025-10-10T00:44:17.3205002Z * [new tag] ciflow/mps/164571 -> ciflow/mps/164571 2025-10-10T00:44:17.3206349Z * [new tag] ciflow/mps/164965 -> ciflow/mps/164965 2025-10-10T00:44:17.3208085Z * [new tag] ciflow/nightly/158104 -> ciflow/nightly/158104 2025-10-10T00:44:17.3209456Z * [new tag] ciflow/nightly/164747 -> ciflow/nightly/164747 2025-10-10T00:44:17.3210374Z * [new tag] ciflow/nightly/164901 -> ciflow/nightly/164901 2025-10-10T00:44:17.3212262Z * [new tag] ciflow/op-benchmark/157994 -> ciflow/op-benchmark/157994 2025-10-10T00:44:17.3213435Z * [new tag] ciflow/op-benchmark/163767 -> ciflow/op-benchmark/163767 2025-10-10T00:44:17.3214868Z * [new tag] ciflow/op-benchmark/164583 -> ciflow/op-benchmark/164583 2025-10-10T00:44:17.3215742Z * [new tag] ciflow/op-benchmark/164747 -> ciflow/op-benchmark/164747 2025-10-10T00:44:17.3217930Z * [new tag] ciflow/periodic-rocm-mi300/162478 -> ciflow/periodic-rocm-mi300/162478 2025-10-10T00:44:17.3218748Z * [new tag] ciflow/periodic-rocm-mi300/163767 -> ciflow/periodic-rocm-mi300/163767 2025-10-10T00:44:17.3220327Z * [new tag] ciflow/periodic-rocm-mi300/164618 -> ciflow/periodic-rocm-mi300/164618 2025-10-10T00:44:17.3221284Z * [new tag] ciflow/periodic-rocm-mi300/164747 -> ciflow/periodic-rocm-mi300/164747 2025-10-10T00:44:17.3222988Z * [new tag] ciflow/periodic-rocm-mi300/165011 -> ciflow/periodic-rocm-mi300/165011 2025-10-10T00:44:17.3223998Z * [new tag] ciflow/periodic-rocm-mi300/165080 -> ciflow/periodic-rocm-mi300/165080 2025-10-10T00:44:17.3226033Z * [new tag] ciflow/periodic/054a2fd -> ciflow/periodic/054a2fd 2025-10-10T00:44:17.3227276Z * [new tag] ciflow/periodic/0d39ecb2ce8556e85343d8da0c87450192c2fdf8 -> ciflow/periodic/0d39ecb2ce8556e85343d8da0c87450192c2fdf8 2025-10-10T00:44:17.3229225Z * [new tag] ciflow/periodic/0ea59c3c55dab37a6edefcc7002bb1428afd6456 -> ciflow/periodic/0ea59c3c55dab37a6edefcc7002bb1428afd6456 2025-10-10T00:44:17.3230104Z * [new tag] ciflow/periodic/156491 -> ciflow/periodic/156491 
2025-10-10T00:44:17.3231648Z * [new tag] ciflow/periodic/162990 -> ciflow/periodic/162990 2025-10-10T00:44:17.3232607Z * [new tag] ciflow/periodic/163667 -> ciflow/periodic/163667 2025-10-10T00:44:17.3234144Z * [new tag] ciflow/periodic/163767 -> ciflow/periodic/163767 2025-10-10T00:44:17.3235818Z * [new tag] ciflow/periodic/164747 -> ciflow/periodic/164747 2025-10-10T00:44:17.3237405Z * [new tag] ciflow/periodic/164769 -> ciflow/periodic/164769 2025-10-10T00:44:17.3238744Z * [new tag] ciflow/periodic/165011 -> ciflow/periodic/165011 2025-10-10T00:44:17.3240630Z * [new tag] ciflow/periodic/2a6cdba6e5f74c2294fecc2d1344537522efbaab -> ciflow/periodic/2a6cdba6e5f74c2294fecc2d1344537522efbaab 2025-10-10T00:44:17.3241582Z * [new tag] ciflow/periodic/2a6d37d -> ciflow/periodic/2a6d37d 2025-10-10T00:44:17.3243471Z * [new tag] ciflow/periodic/317eeb8 -> ciflow/periodic/317eeb8 2025-10-10T00:44:17.3244956Z * [new tag] ciflow/periodic/3c32 -> ciflow/periodic/3c32 2025-10-10T00:44:17.3246545Z * [new tag] ciflow/periodic/3e98831 -> ciflow/periodic/3e98831 2025-10-10T00:44:17.3248430Z * [new tag] ciflow/periodic/4bcc05777e780e834d44a2d06dd5321daec316f0 -> ciflow/periodic/4bcc05777e780e834d44a2d06dd5321daec316f0 2025-10-10T00:44:17.3249656Z * [new tag] ciflow/periodic/73adac05d13babb75410c3e033fdce57aa16881a -> ciflow/periodic/73adac05d13babb75410c3e033fdce57aa16881a 2025-10-10T00:44:17.3251361Z * [new tag] ciflow/periodic/94512-point -> ciflow/periodic/94512-point 2025-10-10T00:44:17.3253124Z * [new tag] ciflow/periodic/ac08556f674259ff5b117964e300124e8a92d45b -> ciflow/periodic/ac08556f674259ff5b117964e300124e8a92d45b 2025-10-10T00:44:17.3254940Z * [new tag] ciflow/periodic/csl/test87519 -> ciflow/periodic/csl/test87519 2025-10-10T00:44:17.3256341Z * [new tag] ciflow/periodic/csltest88275 -> ciflow/periodic/csltest88275 2025-10-10T00:44:17.3257867Z * [new tag] ciflow/periodic/csltest88761 -> ciflow/periodic/csltest88761 2025-10-10T00:44:17.3259466Z * [new tag] 
ciflow/periodic/release_1.12 -> ciflow/periodic/release_1.12 2025-10-10T00:44:17.3261716Z * [new tag] ciflow/periodic/release_1.12.0 -> ciflow/periodic/release_1.12.0 2025-10-10T00:44:17.3263473Z * [new tag] ciflow/periodic/sha-ec5b83 -> ciflow/periodic/sha-ec5b83 2025-10-10T00:44:17.3265597Z * [new tag] ciflow/quantization-periodic/163767 -> ciflow/quantization-periodic/163767 2025-10-10T00:44:17.3266703Z * [new tag] ciflow/quantization-periodic/164747 -> ciflow/quantization-periodic/164747 2025-10-10T00:44:17.3268372Z * [new tag] ciflow/riscv64/163767 -> ciflow/riscv64/163767 2025-10-10T00:44:17.3269454Z * [new tag] ciflow/riscv64/164747 -> ciflow/riscv64/164747 2025-10-10T00:44:17.3271201Z * [new tag] ciflow/rocm-mi300/161280 -> ciflow/rocm-mi300/161280 2025-10-10T00:44:17.3272296Z * [new tag] ciflow/rocm-mi300/162478 -> ciflow/rocm-mi300/162478 2025-10-10T00:44:17.3273829Z * [new tag] ciflow/rocm-mi300/163767 -> ciflow/rocm-mi300/163767 2025-10-10T00:44:17.3274743Z * [new tag] ciflow/rocm-mi300/163955 -> ciflow/rocm-mi300/163955 2025-10-10T00:44:17.3276295Z * [new tag] ciflow/rocm-mi300/164618 -> ciflow/rocm-mi300/164618 2025-10-10T00:44:17.3277563Z * [new tag] ciflow/rocm-mi300/164747 -> ciflow/rocm-mi300/164747 2025-10-10T00:44:17.3279021Z * [new tag] ciflow/rocm-mi300/164927 -> ciflow/rocm-mi300/164927 2025-10-10T00:44:17.3279908Z * [new tag] ciflow/rocm-mi300/164930 -> ciflow/rocm-mi300/164930 2025-10-10T00:44:17.3281588Z * [new tag] ciflow/rocm-mi300/165026 -> ciflow/rocm-mi300/165026 2025-10-10T00:44:17.3282933Z * [new tag] ciflow/rocm-mi300/165080 -> ciflow/rocm-mi300/165080 2025-10-10T00:44:17.3284529Z * [new tag] ciflow/rocm-mi355/163767 -> ciflow/rocm-mi355/163767 2025-10-10T00:44:17.3285843Z * [new tag] ciflow/rocm-mi355/164747 -> ciflow/rocm-mi355/164747 2025-10-10T00:44:17.3287990Z * [new tag] ciflow/rocm/148492 -> ciflow/rocm/148492 2025-10-10T00:44:17.3289265Z * [new tag] ciflow/rocm/151845 -> ciflow/rocm/151845 2025-10-10T00:44:17.3290272Z * 
[new tag] ciflow/rocm/156592 -> ciflow/rocm/156592 2025-10-10T00:44:17.3291707Z * [new tag] ciflow/rocm/161280 -> ciflow/rocm/161280 2025-10-10T00:44:17.3293041Z * [new tag] ciflow/rocm/163767 -> ciflow/rocm/163767 2025-10-10T00:44:17.3294057Z * [new tag] ciflow/rocm/163955 -> ciflow/rocm/163955 2025-10-10T00:44:17.3295673Z * [new tag] ciflow/rocm/163965 -> ciflow/rocm/163965 2025-10-10T00:44:17.3297104Z * [new tag] ciflow/rocm/164656 -> ciflow/rocm/164656 2025-10-10T00:44:17.3298021Z * [new tag] ciflow/rocm/164747 -> ciflow/rocm/164747 2025-10-10T00:44:17.3299798Z * [new tag] ciflow/rocm/164769 -> ciflow/rocm/164769 2025-10-10T00:44:17.3301214Z * [new tag] ciflow/rocm/164927 -> ciflow/rocm/164927 2025-10-10T00:44:17.3302775Z * [new tag] ciflow/rocm/164930 -> ciflow/rocm/164930 2025-10-10T00:44:17.3304510Z * [new tag] ciflow/rocm/165026 -> ciflow/rocm/165026 2025-10-10T00:44:17.3306105Z * [new tag] ciflow/rocm/165103 -> ciflow/rocm/165103 2025-10-10T00:44:17.3307696Z * [new tag] ciflow/s390/164747 -> ciflow/s390/164747 2025-10-10T00:44:17.3309100Z * [new tag] ciflow/s390/164917 -> ciflow/s390/164917 2025-10-10T00:44:17.3310810Z * [new tag] ciflow/slow/01c7106 -> ciflow/slow/01c7106 2025-10-10T00:44:17.3312221Z * [new tag] ciflow/slow/0577043 -> ciflow/slow/0577043 2025-10-10T00:44:17.3314094Z * [new tag] ciflow/slow/0d5b74da0cab798fbfdb9caa53fad816999c8386-sdym -> ciflow/slow/0d5b74da0cab798fbfdb9caa53fad816999c8386-sdym 2025-10-10T00:44:17.3314916Z * [new tag] ciflow/slow/0e81104 -> ciflow/slow/0e81104 2025-10-10T00:44:17.3316487Z * [new tag] ciflow/slow/163767 -> ciflow/slow/163767 2025-10-10T00:44:17.3317857Z * [new tag] ciflow/slow/164747 -> ciflow/slow/164747 2025-10-10T00:44:17.3318706Z * [new tag] ciflow/slow/164769 -> ciflow/slow/164769 2025-10-10T00:44:17.3320473Z * [new tag] ciflow/slow/1732077 -> ciflow/slow/1732077 2025-10-10T00:44:17.3322313Z * [new tag] ciflow/slow/187eb7c -> ciflow/slow/187eb7c 2025-10-10T00:44:17.3324250Z * [new tag] 
ciflow/slow/1faef89 -> ciflow/slow/1faef89 2025-10-10T00:44:17.3326127Z * [new tag] ciflow/slow/3920ec1 -> ciflow/slow/3920ec1 2025-10-10T00:44:17.3327536Z * [new tag] ciflow/slow/3b7c6b2 -> ciflow/slow/3b7c6b2 2025-10-10T00:44:17.3329193Z * [new tag] ciflow/slow/59a3759 -> ciflow/slow/59a3759 2025-10-10T00:44:17.3330853Z * [new tag] ciflow/slow/70ef0bb -> ciflow/slow/70ef0bb 2025-10-10T00:44:17.3332168Z * [new tag] ciflow/slow/788ff06 -> ciflow/slow/788ff06 2025-10-10T00:44:17.3334123Z * [new tag] ciflow/slow/8751002215790a3a88750faa8f4366933e296693-sdym -> ciflow/slow/8751002215790a3a88750faa8f4366933e296693-sdym 2025-10-10T00:44:17.3335193Z * [new tag] ciflow/slow/9d85864 -> ciflow/slow/9d85864 2025-10-10T00:44:17.3342184Z * [new tag] ciflow/slow/9ffad5b -> ciflow/slow/9ffad5b 2025-10-10T00:44:17.3342594Z * [new tag] ciflow/slow/a206e8b -> ciflow/slow/a206e8b 2025-10-10T00:44:17.3342782Z * [new tag] ciflow/slow/a837609 -> ciflow/slow/a837609 2025-10-10T00:44:17.3342952Z * [new tag] ciflow/slow/af841f3 -> ciflow/slow/af841f3 2025-10-10T00:44:17.3343418Z * [new tag] ciflow/slow/da3aba1e46157c4df504b067477cdf2b3c96b194-sdym -> ciflow/slow/da3aba1e46157c4df504b067477cdf2b3c96b194-sdym 2025-10-10T00:44:17.3344858Z * [new tag] ciflow/torchbench/164747 -> ciflow/torchbench/164747 2025-10-10T00:44:17.3346455Z * [new tag] ciflow/trunk/113258 -> ciflow/trunk/113258 2025-10-10T00:44:17.3347786Z * [new tag] ciflow/trunk/137400 -> ciflow/trunk/137400 2025-10-10T00:44:17.3348876Z * [new tag] ciflow/trunk/148180 -> ciflow/trunk/148180 2025-10-10T00:44:17.3350285Z * [new tag] ciflow/trunk/148328 -> ciflow/trunk/148328 2025-10-10T00:44:17.3351377Z * [new tag] ciflow/trunk/148492 -> ciflow/trunk/148492 2025-10-10T00:44:17.3352790Z * [new tag] ciflow/trunk/149003 -> ciflow/trunk/149003 2025-10-10T00:44:17.3354074Z * [new tag] ciflow/trunk/149536 -> ciflow/trunk/149536 2025-10-10T00:44:17.3355327Z * [new tag] ciflow/trunk/151845 -> ciflow/trunk/151845 2025-10-10T00:44:17.3356593Z * 
[new tag] ciflow/trunk/152624 -> ciflow/trunk/152624 2025-10-10T00:44:17.3358010Z * [new tag] ciflow/trunk/154279 -> ciflow/trunk/154279 2025-10-10T00:44:17.3359305Z * [new tag] ciflow/trunk/154983 -> ciflow/trunk/154983 2025-10-10T00:44:17.3361085Z * [new tag] ciflow/trunk/156418 -> ciflow/trunk/156418 2025-10-10T00:44:17.3362850Z * [new tag] ciflow/trunk/156592 -> ciflow/trunk/156592 2025-10-10T00:44:17.3364370Z * [new tag] ciflow/trunk/157432 -> ciflow/trunk/157432 2025-10-10T00:44:17.3365780Z * [new tag] ciflow/trunk/157994 -> ciflow/trunk/157994 2025-10-10T00:44:17.3367162Z * [new tag] ciflow/trunk/158104 -> ciflow/trunk/158104 2025-10-10T00:44:17.3368641Z * [new tag] ciflow/trunk/159104 -> ciflow/trunk/159104 2025-10-10T00:44:17.3369933Z * [new tag] ciflow/trunk/160266 -> ciflow/trunk/160266 2025-10-10T00:44:17.3371492Z * [new tag] ciflow/trunk/160328 -> ciflow/trunk/160328 2025-10-10T00:44:17.3372358Z * [new tag] ciflow/trunk/160329 -> ciflow/trunk/160329 2025-10-10T00:44:17.3373941Z * [new tag] ciflow/trunk/160539 -> ciflow/trunk/160539 2025-10-10T00:44:17.3375743Z * [new tag] ciflow/trunk/160610 -> ciflow/trunk/160610 2025-10-10T00:44:17.3377149Z * [new tag] ciflow/trunk/160843 -> ciflow/trunk/160843 2025-10-10T00:44:17.3378568Z * [new tag] ciflow/trunk/161035 -> ciflow/trunk/161035 2025-10-10T00:44:17.3380414Z * [new tag] ciflow/trunk/161320 -> ciflow/trunk/161320 2025-10-10T00:44:17.3381795Z * [new tag] ciflow/trunk/162031 -> ciflow/trunk/162031 2025-10-10T00:44:17.3383223Z * [new tag] ciflow/trunk/162066 -> ciflow/trunk/162066 2025-10-10T00:44:17.3384665Z * [new tag] ciflow/trunk/162203 -> ciflow/trunk/162203 2025-10-10T00:44:17.3386061Z * [new tag] ciflow/trunk/162340 -> ciflow/trunk/162340 2025-10-10T00:44:17.3387412Z * [new tag] ciflow/trunk/162542 -> ciflow/trunk/162542 2025-10-10T00:44:17.3388774Z * [new tag] ciflow/trunk/162899 -> ciflow/trunk/162899 2025-10-10T00:44:17.3390314Z * [new tag] ciflow/trunk/163034 -> ciflow/trunk/163034 
2025-10-10T00:44:17.3391873Z * [new tag] ciflow/trunk/163332 -> ciflow/trunk/163332 2025-10-10T00:44:17.3393396Z * [new tag] ciflow/trunk/163446 -> ciflow/trunk/163446 2025-10-10T00:44:17.3394837Z * [new tag] ciflow/trunk/163490 -> ciflow/trunk/163490 2025-10-10T00:44:17.3396113Z * [new tag] ciflow/trunk/163527 -> ciflow/trunk/163527 2025-10-10T00:44:17.3397463Z * [new tag] ciflow/trunk/163533 -> ciflow/trunk/163533 2025-10-10T00:44:17.3398825Z * [new tag] ciflow/trunk/163671 -> ciflow/trunk/163671 2025-10-10T00:44:17.3400579Z * [new tag] ciflow/trunk/163767 -> ciflow/trunk/163767 2025-10-10T00:44:17.3401853Z * [new tag] ciflow/trunk/163846 -> ciflow/trunk/163846 2025-10-10T00:44:17.3403242Z * [new tag] ciflow/trunk/163899 -> ciflow/trunk/163899 2025-10-10T00:44:17.3404659Z * [new tag] ciflow/trunk/163955 -> ciflow/trunk/163955 2025-10-10T00:44:17.3406042Z * [new tag] ciflow/trunk/163976 -> ciflow/trunk/163976 2025-10-10T00:44:17.3407474Z * [new tag] ciflow/trunk/164040 -> ciflow/trunk/164040 2025-10-10T00:44:17.3408892Z * [new tag] ciflow/trunk/164130 -> ciflow/trunk/164130 2025-10-10T00:44:17.3410227Z * [new tag] ciflow/trunk/164144 -> ciflow/trunk/164144 2025-10-10T00:44:17.3411610Z * [new tag] ciflow/trunk/164202 -> ciflow/trunk/164202 2025-10-10T00:44:17.3412933Z * [new tag] ciflow/trunk/164318 -> ciflow/trunk/164318 2025-10-10T00:44:17.3414299Z * [new tag] ciflow/trunk/164414 -> ciflow/trunk/164414 2025-10-10T00:44:17.3415564Z * [new tag] ciflow/trunk/164416 -> ciflow/trunk/164416 2025-10-10T00:44:17.3417026Z * [new tag] ciflow/trunk/164437 -> ciflow/trunk/164437 2025-10-10T00:44:17.3418334Z * [new tag] ciflow/trunk/164467 -> ciflow/trunk/164467 2025-10-10T00:44:17.3419688Z * [new tag] ciflow/trunk/164500 -> ciflow/trunk/164500 2025-10-10T00:44:17.3421052Z * [new tag] ciflow/trunk/164510 -> ciflow/trunk/164510 2025-10-10T00:44:17.3422465Z * [new tag] ciflow/trunk/164519 -> ciflow/trunk/164519 2025-10-10T00:44:17.3423766Z * [new tag] ciflow/trunk/164542 -> 
ciflow/trunk/164542 2025-10-10T00:44:17.3425300Z * [new tag] ciflow/trunk/164560 -> ciflow/trunk/164560 2025-10-10T00:44:17.3426673Z * [new tag] ciflow/trunk/164566 -> ciflow/trunk/164566 2025-10-10T00:44:17.3428000Z * [new tag] ciflow/trunk/164623 -> ciflow/trunk/164623 2025-10-10T00:44:17.3429358Z * [new tag] ciflow/trunk/164628 -> ciflow/trunk/164628 2025-10-10T00:44:17.3430708Z * [new tag] ciflow/trunk/164641 -> ciflow/trunk/164641 2025-10-10T00:44:17.3432206Z * [new tag] ciflow/trunk/164643 -> ciflow/trunk/164643 2025-10-10T00:44:17.3433597Z * [new tag] ciflow/trunk/164645 -> ciflow/trunk/164645 2025-10-10T00:44:17.3435056Z * [new tag] ciflow/trunk/164653 -> ciflow/trunk/164653 2025-10-10T00:44:17.3436402Z * [new tag] ciflow/trunk/164655 -> ciflow/trunk/164655 2025-10-10T00:44:17.3437785Z * [new tag] ciflow/trunk/164691 -> ciflow/trunk/164691 2025-10-10T00:44:17.3439092Z * [new tag] ciflow/trunk/164692 -> ciflow/trunk/164692 2025-10-10T00:44:17.3440452Z * [new tag] ciflow/trunk/164705 -> ciflow/trunk/164705 2025-10-10T00:44:17.3441779Z * [new tag] ciflow/trunk/164746 -> ciflow/trunk/164746 2025-10-10T00:44:17.3443143Z * [new tag] ciflow/trunk/164747 -> ciflow/trunk/164747 2025-10-10T00:44:17.3444498Z * [new tag] ciflow/trunk/164790 -> ciflow/trunk/164790 2025-10-10T00:44:17.3445857Z * [new tag] ciflow/trunk/164808 -> ciflow/trunk/164808 2025-10-10T00:44:17.3447295Z * [new tag] ciflow/trunk/164812 -> ciflow/trunk/164812 2025-10-10T00:44:17.3448982Z * [new tag] ciflow/trunk/164836 -> ciflow/trunk/164836 2025-10-10T00:44:17.3450347Z * [new tag] ciflow/trunk/164842 -> ciflow/trunk/164842 2025-10-10T00:44:17.3451687Z * [new tag] ciflow/trunk/164882 -> ciflow/trunk/164882 2025-10-10T00:44:17.3453043Z * [new tag] ciflow/trunk/164889 -> ciflow/trunk/164889 2025-10-10T00:44:17.3454418Z * [new tag] ciflow/trunk/164894 -> ciflow/trunk/164894 2025-10-10T00:44:17.3455716Z * [new tag] ciflow/trunk/164930 -> ciflow/trunk/164930 2025-10-10T00:44:17.3457055Z * [new tag] 
ciflow/trunk/164953 -> ciflow/trunk/164953 2025-10-10T00:44:17.3458397Z * [new tag] ciflow/trunk/164976 -> ciflow/trunk/164976 2025-10-10T00:44:17.3459768Z * [new tag] ciflow/trunk/164999 -> ciflow/trunk/164999 2025-10-10T00:44:17.3461158Z * [new tag] ciflow/trunk/165000 -> ciflow/trunk/165000 2025-10-10T00:44:17.3462635Z * [new tag] ciflow/trunk/165017 -> ciflow/trunk/165017 2025-10-10T00:44:17.3463630Z * [new tag] ciflow/trunk/165018 -> ciflow/trunk/165018 2025-10-10T00:44:17.3465525Z * [new tag] ciflow/trunk/165024 -> ciflow/trunk/165024 2025-10-10T00:44:17.3466932Z * [new tag] ciflow/trunk/165031 -> ciflow/trunk/165031 2025-10-10T00:44:17.3468244Z * [new tag] ciflow/trunk/165033 -> ciflow/trunk/165033 2025-10-10T00:44:17.3470040Z * [new tag] ciflow/trunk/165047 -> ciflow/trunk/165047 2025-10-10T00:44:17.3471432Z * [new tag] ciflow/trunk/165057 -> ciflow/trunk/165057 2025-10-10T00:44:17.3472749Z * [new tag] ciflow/trunk/165060 -> ciflow/trunk/165060 2025-10-10T00:44:17.3474098Z * [new tag] ciflow/trunk/165065 -> ciflow/trunk/165065 2025-10-10T00:44:17.3475467Z * [new tag] ciflow/trunk/165066 -> ciflow/trunk/165066 2025-10-10T00:44:17.3477196Z * [new tag] ciflow/trunk/165090 -> ciflow/trunk/165090 2025-10-10T00:44:17.3478619Z * [new tag] ciflow/trunk/165094 -> ciflow/trunk/165094 2025-10-10T00:44:17.3480061Z * [new tag] ciflow/trunk/165113 -> ciflow/trunk/165113 2025-10-10T00:44:17.3481972Z * [new tag] ciflow/unstable/123 -> ciflow/unstable/123 2025-10-10T00:44:17.3483637Z * [new tag] ciflow/vllm/164628 -> ciflow/vllm/164628 2025-10-10T00:44:17.3485162Z * [new tag] ciflow/win-arm64/158104 -> ciflow/win-arm64/158104 2025-10-10T00:44:17.3486694Z * [new tag] ciflow/xpu/157994 -> ciflow/xpu/157994 2025-10-10T00:44:17.3488112Z * [new tag] ciflow/xpu/161485 -> ciflow/xpu/161485 2025-10-10T00:44:17.3489556Z * [new tag] ciflow/xpu/162454 -> ciflow/xpu/162454 2025-10-10T00:44:17.3490830Z * [new tag] ciflow/xpu/163332 -> ciflow/xpu/163332 2025-10-10T00:44:17.3492176Z * 
[new tag] cslpull75 -> cslpull75 2025-10-10T00:44:17.3493474Z * [new tag] cslpull76 -> cslpull76 2025-10-10T00:44:17.3494802Z * [new tag] cslpull77 -> cslpull77 2025-10-10T00:44:17.3496190Z * [new tag] cslpull78 -> cslpull78 2025-10-10T00:44:17.3497788Z * [new tag] cslpull79 -> cslpull79 2025-10-10T00:44:17.3500063Z * [new tag] cslpull80 -> cslpull80 2025-10-10T00:44:17.3502206Z * [new tag] cslpull81 -> cslpull81 2025-10-10T00:44:17.3503809Z * [new tag] cslpull82 -> cslpull82 2025-10-10T00:44:17.3505128Z * [new tag] cslpull83 -> cslpull83 2025-10-10T00:44:17.3506643Z * [new tag] cslpull84 -> cslpull84 2025-10-10T00:44:17.3508090Z * [new tag] cslpull85 -> cslpull85 2025-10-10T00:44:17.3509466Z * [new tag] cslpull86 -> cslpull86 2025-10-10T00:44:17.3510907Z * [new tag] cslpull87 -> cslpull87 2025-10-10T00:44:17.3512468Z * [new tag] cslpull88 -> cslpull88 2025-10-10T00:44:17.3513780Z * [new tag] cslpull89 -> cslpull89 2025-10-10T00:44:17.3515013Z * [new tag] cslpull90 -> cslpull90 2025-10-10T00:44:17.3517076Z * [new tag] cslpull91 -> cslpull91 2025-10-10T00:44:17.3517967Z * [new tag] cslpull92 -> cslpull92 2025-10-10T00:44:17.3519681Z * [new tag] flight_5 -> flight_5 2025-10-10T00:44:17.3521108Z * [new tag] flight_5.1 -> flight_5.1 2025-10-10T00:44:17.3522768Z * [new tag] flight_5.2 -> flight_5.2 2025-10-10T00:44:17.3524170Z * [new tag] flight_5.3 -> flight_5.3 2025-10-10T00:44:17.3525631Z * [new tag] forpull1 -> forpull1 2025-10-10T00:44:17.3527355Z * [new tag] malfet/tag-2ef5611 -> malfet/tag-2ef5611 2025-10-10T00:44:17.3528902Z * [new tag] malfet/tag-317b1a0 -> malfet/tag-317b1a0 2025-10-10T00:44:17.3530731Z * [new tag] malfet/tag-ec6f767 -> malfet/tag-ec6f767 2025-10-10T00:44:17.3532301Z * [new tag] nightly-binary -> nightly-binary 2025-10-10T00:44:17.3533733Z * [new tag] sqzhang_flight4_plus -> sqzhang_flight4_plus 2025-10-10T00:44:17.3535500Z * [new tag] sqzhang_flight_3 -> sqzhang_flight_3 2025-10-10T00:44:17.3537290Z * [new tag] 
trunk/001e1d263746ae9d121d9c8cf55bc87f777d9dba -> trunk/001e1d263746ae9d121d9c8cf55bc87f777d9dba 2025-10-10T00:44:17.3538791Z * [new tag] trunk/005c3d449e4c655d2eb0d76949a8cd41ce88f979 -> trunk/005c3d449e4c655d2eb0d76949a8cd41ce88f979 2025-10-10T00:44:17.3540395Z * [new tag] trunk/00f0365b959323bab89dc0a5bd5d40589e78edc8 -> trunk/00f0365b959323bab89dc0a5bd5d40589e78edc8 2025-10-10T00:44:17.3542282Z * [new tag] trunk/01f3a43462da594b65a6c9e8b46c132cd360cea9 -> trunk/01f3a43462da594b65a6c9e8b46c132cd360cea9 2025-10-10T00:44:17.3543663Z * [new tag] trunk/0319556a35b01e8857f7bf75df9df3287e1e853a -> trunk/0319556a35b01e8857f7bf75df9df3287e1e853a 2025-10-10T00:44:17.3545396Z * [new tag] trunk/054268c9ebb3291c6fd442e4a1f6602a8ea43ab6 -> trunk/054268c9ebb3291c6fd442e4a1f6602a8ea43ab6 2025-10-10T00:44:17.3546418Z * [new tag] trunk/06d86e58d0309aa2c217256f88d1990a22ec6e4f -> trunk/06d86e58d0309aa2c217256f88d1990a22ec6e4f 2025-10-10T00:44:17.3548357Z * [new tag] trunk/078d475d3bb104823e70ce975c2ee0d4d2fb0952 -> trunk/078d475d3bb104823e70ce975c2ee0d4d2fb0952 2025-10-10T00:44:17.3549415Z * [new tag] trunk/086dec3235d463e751c12ce9eeeb2dfcc873e206 -> trunk/086dec3235d463e751c12ce9eeeb2dfcc873e206 2025-10-10T00:44:17.3551337Z * [new tag] trunk/0a3e4e894cbc0cc93568c5d016f3ad72650cf641 -> trunk/0a3e4e894cbc0cc93568c5d016f3ad72650cf641 2025-10-10T00:44:17.3552828Z * [new tag] trunk/0b01ff4de02035eb21c1bc6bf4b1b627bc1cefaa -> trunk/0b01ff4de02035eb21c1bc6bf4b1b627bc1cefaa 2025-10-10T00:44:17.3554348Z * [new tag] trunk/0b15f7ae059cf4fa3909bbb009d83c0253a6385a -> trunk/0b15f7ae059cf4fa3909bbb009d83c0253a6385a 2025-10-10T00:44:17.3555876Z * [new tag] trunk/0b4f2b46d9e14c1858dd3d0ca9b62e349ae316cf -> trunk/0b4f2b46d9e14c1858dd3d0ca9b62e349ae316cf 2025-10-10T00:44:17.3557627Z * [new tag] trunk/0b85236477fe8a0e32510bcc973b2f34ef981df2 -> trunk/0b85236477fe8a0e32510bcc973b2f34ef981df2 2025-10-10T00:44:17.3558531Z * [new tag] trunk/0d39ecb2ce8556e85343d8da0c87450192c2fdf8 -> 
trunk/0d39ecb2ce8556e85343d8da0c87450192c2fdf8 2025-10-10T00:44:17.3560200Z * [new tag] trunk/0e5773b7fadef9e29b006af470b771fad55b5206 -> trunk/0e5773b7fadef9e29b006af470b771fad55b5206 2025-10-10T00:44:17.3561823Z * [new tag] trunk/0e9b3a772ab96e998ab85591d5b2a9c1d41bacb0 -> trunk/0e9b3a772ab96e998ab85591d5b2a9c1d41bacb0 2025-10-10T00:44:17.3563293Z * [new tag] trunk/0fbe3f19c7e88ee1720d2e1579e3fd2cafdaabf9 -> trunk/0fbe3f19c7e88ee1720d2e1579e3fd2cafdaabf9 2025-10-10T00:44:17.3564897Z * [new tag] trunk/0fd976b65c6daf3799a501d9202e4f50144446d1 -> trunk/0fd976b65c6daf3799a501d9202e4f50144446d1 2025-10-10T00:44:17.3566425Z * [new tag] trunk/1051c1de5c0c1d34bec94c4a3199ac7b23bb19e1 -> trunk/1051c1de5c0c1d34bec94c4a3199ac7b23bb19e1 2025-10-10T00:44:17.3568184Z * [new tag] trunk/115af42e9d57e89c26777be72822107cd7b39e07 -> trunk/115af42e9d57e89c26777be72822107cd7b39e07 2025-10-10T00:44:17.3569866Z * [new tag] trunk/11f5f656867089dac1fa1e64e34c9966578fbddd -> trunk/11f5f656867089dac1fa1e64e34c9966578fbddd 2025-10-10T00:44:17.3571251Z * [new tag] trunk/12d2ef557f6e127100267c31a31572d8ab5cc788 -> trunk/12d2ef557f6e127100267c31a31572d8ab5cc788 2025-10-10T00:44:17.3572628Z * [new tag] trunk/144378615a5a2b347e39c6376cba7d75f7a82926 -> trunk/144378615a5a2b347e39c6376cba7d75f7a82926 2025-10-10T00:44:17.3574390Z * [new tag] trunk/14791ea947349fb5fa7b7d6230cfd3924c36ba27 -> trunk/14791ea947349fb5fa7b7d6230cfd3924c36ba27 2025-10-10T00:44:17.3575454Z * [new tag] trunk/15800888b697bacd555399b3a0ca2e8d0827528e -> trunk/15800888b697bacd555399b3a0ca2e8d0827528e 2025-10-10T00:44:17.3577860Z * [new tag] trunk/15c8bdcc5e3a6dfd14e5c977438f772031e064ff -> trunk/15c8bdcc5e3a6dfd14e5c977438f772031e064ff 2025-10-10T00:44:17.3579602Z * [new tag] trunk/15d726005ddc5558c934c3edd5f815c2e504e501 -> trunk/15d726005ddc5558c934c3edd5f815c2e504e501 2025-10-10T00:44:17.3581008Z * [new tag] trunk/16f9bef642b07b3090a6e4a04517eff84d41a197 -> trunk/16f9bef642b07b3090a6e4a04517eff84d41a197 
2025-10-10T00:44:17.3582533Z * [new tag] trunk/17c7170ca6e2efd5ead2b93bd12e226ff48f0669 -> trunk/17c7170ca6e2efd5ead2b93bd12e226ff48f0669 2025-10-10T00:44:17.3584201Z * [new tag] trunk/184817c7a81d5c01e107a84efeb269b063ddf5d6 -> trunk/184817c7a81d5c01e107a84efeb269b063ddf5d6 2025-10-10T00:44:17.3586038Z * [new tag] trunk/18940820006d2304460008575561e2e8e7fc59fc -> trunk/18940820006d2304460008575561e2e8e7fc59fc 2025-10-10T00:44:17.3587074Z * [new tag] trunk/18e18488e8c90e53cc113b1a5eddd9640ee80292 -> trunk/18e18488e8c90e53cc113b1a5eddd9640ee80292 2025-10-10T00:44:17.3588870Z * [new tag] trunk/1927783aa3ad676db6f4c34fc77ef3825a4e2ed5 -> trunk/1927783aa3ad676db6f4c34fc77ef3825a4e2ed5 2025-10-10T00:44:17.3590284Z * [new tag] trunk/19bf67be3286c0e2babe83af0d1593bae850362a -> trunk/19bf67be3286c0e2babe83af0d1593bae850362a 2025-10-10T00:44:17.3591706Z * [new tag] trunk/1bb68271b7ff1b582845384c6c7f7b1593ae1619 -> trunk/1bb68271b7ff1b582845384c6c7f7b1593ae1619 2025-10-10T00:44:17.3594242Z * [new tag] trunk/1d182dd81c3143697337e35d046fd02951dedb09 -> trunk/1d182dd81c3143697337e35d046fd02951dedb09 2025-10-10T00:44:17.3594615Z * [new tag] trunk/1e42fde45eff81845f269e8185f54a19f6d87c5b -> trunk/1e42fde45eff81845f269e8185f54a19f6d87c5b 2025-10-10T00:44:17.3596366Z * [new tag] trunk/1f73b96668bc6ae4c8e7ef5b630ff5f3c69ae005 -> trunk/1f73b96668bc6ae4c8e7ef5b630ff5f3c69ae005 2025-10-10T00:44:17.3597418Z * [new tag] trunk/1f8ee5da117952b03f0050a178d69f8e7189b0f8 -> trunk/1f8ee5da117952b03f0050a178d69f8e7189b0f8 2025-10-10T00:44:17.3600144Z * [new tag] trunk/1f9614cef8e0272c8e3bd99004d2978a6ecc5195 -> trunk/1f9614cef8e0272c8e3bd99004d2978a6ecc5195 2025-10-10T00:44:17.3600962Z * [new tag] trunk/1fb072ac2a33af93a77888dddbdd228b22a3f9c4 -> trunk/1fb072ac2a33af93a77888dddbdd228b22a3f9c4 2025-10-10T00:44:17.3602135Z * [new tag] trunk/1fc71d1b578badb1b3ba7cc2d5795f4f80463749 -> trunk/1fc71d1b578badb1b3ba7cc2d5795f4f80463749 2025-10-10T00:44:17.3603795Z * [new tag] 
trunk/20082d713666fa1eade588bebd523d86309bfa25 -> trunk/20082d713666fa1eade588bebd523d86309bfa25 2025-10-10T00:44:17.3605273Z * [new tag] trunk/2164b661219ab0a76aa018e955ba3d8e8f99c083 -> trunk/2164b661219ab0a76aa018e955ba3d8e8f99c083 2025-10-10T00:44:17.3606908Z * [new tag] trunk/228973df7f770505aafc6fc17b99f81ac58bdfe1 -> trunk/228973df7f770505aafc6fc17b99f81ac58bdfe1 2025-10-10T00:44:17.3608614Z * [new tag] trunk/22b1710252525d80d47ba95c762ccdbf577b2dc2 -> trunk/22b1710252525d80d47ba95c762ccdbf577b2dc2 2025-10-10T00:44:17.3609935Z * [new tag] trunk/22e219d9969ff3cee85bc5de32fa49d5a549a148 -> trunk/22e219d9969ff3cee85bc5de32fa49d5a549a148 2025-10-10T00:44:17.3611755Z * [new tag] trunk/235b995ce18de632ab816940319fcd66b46039b8 -> trunk/235b995ce18de632ab816940319fcd66b46039b8 2025-10-10T00:44:17.3612779Z * [new tag] trunk/23ab6a45e5c759fb4714905cb8c84ef74c70aa67 -> trunk/23ab6a45e5c759fb4714905cb8c84ef74c70aa67 2025-10-10T00:44:17.3614550Z * [new tag] trunk/24d69c57cbaa94cc828dbbdf83c889f5f244ae28 -> trunk/24d69c57cbaa94cc828dbbdf83c889f5f244ae28 2025-10-10T00:44:17.3616169Z * [new tag] trunk/263db92563f0ae71bf3e4fc265fbb48e79f9f23f -> trunk/263db92563f0ae71bf3e4fc265fbb48e79f9f23f 2025-10-10T00:44:17.3617678Z * [new tag] trunk/27234792add2ee9bedd84ca02dbf34f8f244bc5c -> trunk/27234792add2ee9bedd84ca02dbf34f8f244bc5c 2025-10-10T00:44:17.3619334Z * [new tag] trunk/27eb36debbe3fa2d43a2f893a5c46a6257a09460 -> trunk/27eb36debbe3fa2d43a2f893a5c46a6257a09460 2025-10-10T00:44:17.3620749Z * [new tag] trunk/2855a045b30dafad7a08d66e242be13770189c19 -> trunk/2855a045b30dafad7a08d66e242be13770189c19 2025-10-10T00:44:17.3622327Z * [new tag] trunk/2883b5ab773daf5861d43ff0b65be49a441ab3f9 -> trunk/2883b5ab773daf5861d43ff0b65be49a441ab3f9 2025-10-10T00:44:17.3623911Z * [new tag] trunk/29824067215f3ed9e4044ca0f31a71e9d95f237d -> trunk/29824067215f3ed9e4044ca0f31a71e9d95f237d 2025-10-10T00:44:17.3625546Z * [new tag] trunk/2a11ce2c787b2339ffb8941b849dd487d25b4121 -> 
trunk/2a11ce2c787b2339ffb8941b849dd487d25b4121 2025-10-10T00:44:17.3626605Z * [new tag] trunk/2a6cdba6e5f74c2294fecc2d1344537522efbaab -> trunk/2a6cdba6e5f74c2294fecc2d1344537522efbaab 2025-10-10T00:44:17.3628567Z * [new tag] trunk/2a760dc51e04d65845440cc09e7016cfc74f9132 -> trunk/2a760dc51e04d65845440cc09e7016cfc74f9132 2025-10-10T00:44:17.3629614Z * [new tag] trunk/2a7c48675010056f23d62b5c6ecb318782801723 -> trunk/2a7c48675010056f23d62b5c6ecb318782801723 2025-10-10T00:44:17.3631553Z * [new tag] trunk/2b58adc3bdcf9476e1cef49ad965b7d3c7b9ac24 -> trunk/2b58adc3bdcf9476e1cef49ad965b7d3c7b9ac24 2025-10-10T00:44:17.3633060Z * [new tag] trunk/2b9ff9953523a2e916234c9197d946f4cff976c7 -> trunk/2b9ff9953523a2e916234c9197d946f4cff976c7 2025-10-10T00:44:17.3634607Z * [new tag] trunk/2c2e1268b7aae8ed610d12f2d38d39f8d93888a3 -> trunk/2c2e1268b7aae8ed610d12f2d38d39f8d93888a3 2025-10-10T00:44:17.3636080Z * [new tag] trunk/2c5ed6e7c067573b093725cd15d13812d9647562 -> trunk/2c5ed6e7c067573b093725cd15d13812d9647562 2025-10-10T00:44:17.3637643Z * [new tag] trunk/2d50678dcc7ab2da13a9bca6af8f2333e8970344 -> trunk/2d50678dcc7ab2da13a9bca6af8f2333e8970344 2025-10-10T00:44:17.3639191Z * [new tag] trunk/2e027e874232fefe7b1c56ce8aeb26c0e6b97f15 -> trunk/2e027e874232fefe7b1c56ce8aeb26c0e6b97f15 2025-10-10T00:44:17.3640730Z * [new tag] trunk/2e1742dd63c2168fd9649dbba96a95abf1f57cae -> trunk/2e1742dd63c2168fd9649dbba96a95abf1f57cae 2025-10-10T00:44:17.3641992Z * [new tag] trunk/2fe37b5fde392535a3238f975c93dd202cd3e24b -> trunk/2fe37b5fde392535a3238f975c93dd202cd3e24b 2025-10-10T00:44:17.3643736Z * [new tag] trunk/3040a5d294bd30d3938d0043a5d93d6c23264827 -> trunk/3040a5d294bd30d3938d0043a5d93d6c23264827 2025-10-10T00:44:17.3645410Z * [new tag] trunk/321e6026925f6b6e8a36e3a8b7c0295cd7541911 -> trunk/321e6026925f6b6e8a36e3a8b7c0295cd7541911 2025-10-10T00:44:17.3646301Z * [new tag] trunk/322091d8d8542a0cbff524306029bef4d7338747 -> trunk/322091d8d8542a0cbff524306029bef4d7338747 
2025-10-10T00:44:17.3648128Z * [new tag] trunk/3288fbf374128610928e27d03615ac0d46a6ce14 -> trunk/3288fbf374128610928e27d03615ac0d46a6ce14 2025-10-10T00:44:17.3649817Z * [new tag] trunk/331191ce4b29b5d7d3bb7f0e7454ca70c06fbd26 -> trunk/331191ce4b29b5d7d3bb7f0e7454ca70c06fbd26 2025-10-10T00:44:17.3650915Z * [new tag] trunk/33b17bc619b044a0050797987efb8890d43319df -> trunk/33b17bc619b044a0050797987efb8890d43319df 2025-10-10T00:44:17.3652711Z * [new tag] trunk/34042a9145fe28033e7edb08f1fcf90ed197f4ac -> trunk/34042a9145fe28033e7edb08f1fcf90ed197f4ac 2025-10-10T00:44:17.3654397Z * [new tag] trunk/344e6365a0068c2d2847fcec0c55dd53291d475e -> trunk/344e6365a0068c2d2847fcec0c55dd53291d475e 2025-10-10T00:44:17.3655951Z * [new tag] trunk/34ac9b61cbfcf17328ccb8b729509829447fdddd -> trunk/34ac9b61cbfcf17328ccb8b729509829447fdddd 2025-10-10T00:44:17.3657484Z * [new tag] trunk/35c4130fd1358c98e12301ffa0f1b2294e0c795f -> trunk/35c4130fd1358c98e12301ffa0f1b2294e0c795f 2025-10-10T00:44:17.3658988Z * [new tag] trunk/35f66b83f89a571d0c0abe16c66a23120b92bdaf -> trunk/35f66b83f89a571d0c0abe16c66a23120b92bdaf 2025-10-10T00:44:17.3660366Z * [new tag] trunk/361c5d362c4ea1950e05116899cfcf753c345ebd -> trunk/361c5d362c4ea1950e05116899cfcf753c345ebd 2025-10-10T00:44:17.3661932Z * [new tag] trunk/37c6087334cce3ad4bc9838ea2ef63aba89f2253 -> trunk/37c6087334cce3ad4bc9838ea2ef63aba89f2253 2025-10-10T00:44:17.3663507Z * [new tag] trunk/3912ba3e940b9354622fa09b2ada677cd10723d8 -> trunk/3912ba3e940b9354622fa09b2ada677cd10723d8 2025-10-10T00:44:17.3664986Z * [new tag] trunk/39189592fd688979e56063430ed5a038d999908f -> trunk/39189592fd688979e56063430ed5a038d999908f 2025-10-10T00:44:17.3666556Z * [new tag] trunk/3924f784ba81f87fe09988d6fc9620b57e4d9f72 -> trunk/3924f784ba81f87fe09988d6fc9620b57e4d9f72 2025-10-10T00:44:17.3668230Z * [new tag] trunk/39b31a6bfde6e046383ae2b06fe0b68df5cdbdd2 -> trunk/39b31a6bfde6e046383ae2b06fe0b68df5cdbdd2 2025-10-10T00:44:17.3669267Z * [new tag] 
trunk/39c340ec9e2ee3011f1d260f581b5a95f3c99039 -> trunk/39c340ec9e2ee3011f1d260f581b5a95f3c99039 2025-10-10T00:44:17.3671081Z * [new tag] trunk/39d0c06ed0d7bc634d7f1a4e84b69f66d1ea0798 -> trunk/39d0c06ed0d7bc634d7f1a4e84b69f66d1ea0798 2025-10-10T00:44:17.3672714Z * [new tag] trunk/3c0577bd15778c96cecf0e7a5e5958d7fcab64f0 -> trunk/3c0577bd15778c96cecf0e7a5e5958d7fcab64f0 2025-10-10T00:44:17.3674230Z * [new tag] trunk/3c59351c6ea2fc29d346903e28e95c5f4d0ccdbb -> trunk/3c59351c6ea2fc29d346903e28e95c5f4d0ccdbb 2025-10-10T00:44:17.3676260Z * [new tag] trunk/3c5ca685d6f5b6f3971c0cd20a054aa355610419 -> trunk/3c5ca685d6f5b6f3971c0cd20a054aa355610419 2025-10-10T00:44:17.3677811Z * [new tag] trunk/3ca09d65f1bdf83142dc9fe47976227ae4a88e7b -> trunk/3ca09d65f1bdf83142dc9fe47976227ae4a88e7b 2025-10-10T00:44:17.3679885Z * [new tag] trunk/3cc8af2d67f42bf2a933796290446c5ab8978aac -> trunk/3cc8af2d67f42bf2a933796290446c5ab8978aac 2025-10-10T00:44:17.3680932Z * [new tag] trunk/3d1fa40ae1fee18ddf3dca89229e3ae828589e0c -> trunk/3d1fa40ae1fee18ddf3dca89229e3ae828589e0c 2025-10-10T00:44:17.3682708Z * [new tag] trunk/3d9d41c80168bcd3c569345a96682c42a5eba36a -> trunk/3d9d41c80168bcd3c569345a96682c42a5eba36a 2025-10-10T00:44:17.3684255Z * [new tag] trunk/3db21643417a04f6f2707a783ac32a538a98d53d -> trunk/3db21643417a04f6f2707a783ac32a538a98d53d 2025-10-10T00:44:17.3685855Z * [new tag] trunk/3ddf2018d0b7b4def0553dc092d928ef831a19c3 -> trunk/3ddf2018d0b7b4def0553dc092d928ef831a19c3 2025-10-10T00:44:17.3687190Z * [new tag] trunk/3e03deab6f3c268c85c8efd9546e28cdda0fa4cc -> trunk/3e03deab6f3c268c85c8efd9546e28cdda0fa4cc 2025-10-10T00:44:17.3689011Z * [new tag] trunk/3e0826c9d792ae87373dc0ff5d46c260020de29f -> trunk/3e0826c9d792ae87373dc0ff5d46c260020de29f 2025-10-10T00:44:17.3690625Z * [new tag] trunk/409aece3f9436a2f740f1b97f1243f738f6bbbf6 -> trunk/409aece3f9436a2f740f1b97f1243f738f6bbbf6 2025-10-10T00:44:17.3692135Z * [new tag] trunk/40b25578e4ecb7ef1c38201b3ce0014eb57c53eb -> 
trunk/40b25578e4ecb7ef1c38201b3ce0014eb57c53eb 2025-10-10T00:44:17.3693241Z * [new tag] trunk/412c6d28ec3869ef8ba962b290d755251e7cc3c1 -> trunk/412c6d28ec3869ef8ba962b290d755251e7cc3c1 2025-10-10T00:44:17.3694869Z * [new tag] trunk/415e641572473479fc9d9eaea12762e1a223a9e0 -> trunk/415e641572473479fc9d9eaea12762e1a223a9e0 2025-10-10T00:44:17.3696497Z * [new tag] trunk/41808b2ba9a61ab2f4c7af394c1668d09a4a0331 -> trunk/41808b2ba9a61ab2f4c7af394c1668d09a4a0331 2025-10-10T00:44:17.3698044Z * [new tag] trunk/4308b8a28fa332d23ad6d25a472559b354619131 -> trunk/4308b8a28fa332d23ad6d25a472559b354619131 2025-10-10T00:44:17.3699469Z * [new tag] trunk/43848b71d9af0223eafdd1755bf7444aafe9e993 -> trunk/43848b71d9af0223eafdd1755bf7444aafe9e993 2025-10-10T00:44:17.3701297Z * [new tag] trunk/43fc859625f9c0a794307b3ef30c26ab3fc2bfec -> trunk/43fc859625f9c0a794307b3ef30c26ab3fc2bfec 2025-10-10T00:44:17.3702920Z * [new tag] trunk/4412026949b562f940d4c24162de19d299725b62 -> trunk/4412026949b562f940d4c24162de19d299725b62 2025-10-10T00:44:17.3704018Z * [new tag] trunk/44a5d419935a77b3308f247279a457e6d0b9a292 -> trunk/44a5d419935a77b3308f247279a457e6d0b9a292 2025-10-10T00:44:17.3706011Z * [new tag] trunk/4661200125ba9c87aa7d54a55e585403b5ce5040 -> trunk/4661200125ba9c87aa7d54a55e585403b5ce5040 2025-10-10T00:44:17.3707574Z * [new tag] trunk/4691fe60700ac51a878775fd23a8f7c4548c6757 -> trunk/4691fe60700ac51a878775fd23a8f7c4548c6757 2025-10-10T00:44:17.3709193Z * [new tag] trunk/4725871a815fb880e89135a493c8c94ab9bbfece -> trunk/4725871a815fb880e89135a493c8c94ab9bbfece 2025-10-10T00:44:17.3710663Z * [new tag] trunk/47956196d99166fe9083beb2a52fd2e6c90b2011 -> trunk/47956196d99166fe9083beb2a52fd2e6c90b2011 2025-10-10T00:44:17.3712202Z * [new tag] trunk/483f4e0db91166128ad8922d86dc7222338d4ecc -> trunk/483f4e0db91166128ad8922d86dc7222338d4ecc 2025-10-10T00:44:17.3713829Z * [new tag] trunk/48b54b45d62af7ecafccc5afede04474cb236f1a -> trunk/48b54b45d62af7ecafccc5afede04474cb236f1a 
2025-10-10T00:44:17.3715392Z * [new tag] trunk/49f7d8d19d24f616b11ef050535a211245aed649 -> trunk/49f7d8d19d24f616b11ef050535a211245aed649 2025-10-10T00:44:17.3716655Z * [new tag] trunk/4a0df39f814afad087e8b29dd2914a8b54567694 -> trunk/4a0df39f814afad087e8b29dd2914a8b54567694 2025-10-10T00:44:17.3718285Z * [new tag] trunk/4a6abba0d9fb3dc0f29b5efe527e26b2962caec1 -> trunk/4a6abba0d9fb3dc0f29b5efe527e26b2962caec1 2025-10-10T00:44:17.3719838Z * [new tag] trunk/4ab847bbc7ba09f29a4e81494e8a752dcb411117 -> trunk/4ab847bbc7ba09f29a4e81494e8a752dcb411117 2025-10-10T00:44:17.3720890Z * [new tag] trunk/4bcc05777e780e834d44a2d06dd5321daec316f0 -> trunk/4bcc05777e780e834d44a2d06dd5321daec316f0 2025-10-10T00:44:17.3722736Z * [new tag] trunk/4bd1505f849e701a8e54f9d185c23f13e7324498 -> trunk/4bd1505f849e701a8e54f9d185c23f13e7324498 2025-10-10T00:44:17.3723864Z * [new tag] trunk/4c0fec3e4dac35b9e9dec2beacfb5967906a4701 -> trunk/4c0fec3e4dac35b9e9dec2beacfb5967906a4701 2025-10-10T00:44:17.3725707Z * [new tag] trunk/4c3c0ef2f10415f5d5b13f1c842f91bb90ee91d3 -> trunk/4c3c0ef2f10415f5d5b13f1c842f91bb90ee91d3 2025-10-10T00:44:17.3727306Z * [new tag] trunk/4d7f9f3aed68729380730ed46e29ff2052f05b73 -> trunk/4d7f9f3aed68729380730ed46e29ff2052f05b73 2025-10-10T00:44:17.3728919Z * [new tag] trunk/50e077beaaf71798f870552f3849e4a52c784df5 -> trunk/50e077beaaf71798f870552f3849e4a52c784df5 2025-10-10T00:44:17.3730443Z * [new tag] trunk/5103ecc5d8f0cc90e686763652e2d84c22d83ca9 -> trunk/5103ecc5d8f0cc90e686763652e2d84c22d83ca9 2025-10-10T00:44:17.3732125Z * [new tag] trunk/5178d0a480f8f4e21da3757de455c8215b249ec5 -> trunk/5178d0a480f8f4e21da3757de455c8215b249ec5 2025-10-10T00:44:17.3733731Z * [new tag] trunk/5209c8ce0704f34ba4bd2a58c19877fbf6cf0392 -> trunk/5209c8ce0704f34ba4bd2a58c19877fbf6cf0392 2025-10-10T00:44:17.3735311Z * [new tag] trunk/5390324984c43f1214b8abf731ad495ba2df5341 -> trunk/5390324984c43f1214b8abf731ad495ba2df5341 2025-10-10T00:44:17.3736884Z * [new tag] 
trunk/53f5af8c924aba3c0fab1fabc6baf7d6affcb8a1 -> trunk/53f5af8c924aba3c0fab1fabc6baf7d6affcb8a1 2025-10-10T00:44:17.3738441Z * [new tag] trunk/54ae61c573e91fa2a2c6430435059e2d94ecba2e -> trunk/54ae61c573e91fa2a2c6430435059e2d94ecba2e 2025-10-10T00:44:17.3739987Z * [new tag] trunk/5656d45c8ff03cf20fd7d5098247c2250395af8a -> trunk/5656d45c8ff03cf20fd7d5098247c2250395af8a 2025-10-10T00:44:17.3741548Z * [new tag] trunk/56d66ac0d74f44d7b656757795142b5b9a1802a1 -> trunk/56d66ac0d74f44d7b656757795142b5b9a1802a1 2025-10-10T00:44:17.3743159Z * [new tag] trunk/5743d731c1de495ecf3bb03682a2dcbe207ca895 -> trunk/5743d731c1de495ecf3bb03682a2dcbe207ca895 2025-10-10T00:44:17.3744302Z * [new tag] trunk/5a1fbf45ad727353e367740ecd8825ca7ee857e9 -> trunk/5a1fbf45ad727353e367740ecd8825ca7ee857e9 2025-10-10T00:44:17.3746064Z * [new tag] trunk/5a66ff4915ecfd86f1a68e7862e5a2ad473e5a79 -> trunk/5a66ff4915ecfd86f1a68e7862e5a2ad473e5a79 2025-10-10T00:44:17.3747688Z * [new tag] trunk/5b0b4cda4aa03bee16ee67d9d36012a539df3c50 -> trunk/5b0b4cda4aa03bee16ee67d9d36012a539df3c50 2025-10-10T00:44:17.3748964Z * [new tag] trunk/5b8174bc286725f9326fba6dc0ef17c316486bbd -> trunk/5b8174bc286725f9326fba6dc0ef17c316486bbd 2025-10-10T00:44:17.3750685Z * [new tag] trunk/5ba11df4f871717818b88c4eab514d31286601d1 -> trunk/5ba11df4f871717818b88c4eab514d31286601d1 2025-10-10T00:44:17.3752133Z * [new tag] trunk/5c827a4133da69108338d0363bb7ad7f62803c40 -> trunk/5c827a4133da69108338d0363bb7ad7f62803c40 2025-10-10T00:44:17.3753672Z * [new tag] trunk/5d459dd6099ef94d33db9a6d36bcce9f742f1da1 -> trunk/5d459dd6099ef94d33db9a6d36bcce9f742f1da1 2025-10-10T00:44:17.3754724Z * [new tag] trunk/5d7360bb03355c89c0b956df0ab428f5a7b5c9f8 -> trunk/5d7360bb03355c89c0b956df0ab428f5a7b5c9f8 2025-10-10T00:44:17.3756970Z * [new tag] trunk/5dbae1eae26159058f6199fc68fe73fc0e5bef5f -> trunk/5dbae1eae26159058f6199fc68fe73fc0e5bef5f 2025-10-10T00:44:17.3758255Z * [new tag] trunk/5e47b4dd60ff9efb253286af5a2479d9d800ce6a -> 
trunk/5e47b4dd60ff9efb253286af5a2479d9d800ce6a 2025-10-10T00:44:17.3759981Z * [new tag] trunk/5ed4270440fd0b62d3aa14692f9e377a0061061e -> trunk/5ed4270440fd0b62d3aa14692f9e377a0061061e 2025-10-10T00:44:17.3761632Z * [new tag] trunk/5f18f240de43fc24481ead4d740dda64f174fa86 -> trunk/5f18f240de43fc24481ead4d740dda64f174fa86 2025-10-10T00:44:17.3763313Z * [new tag] trunk/5f775bdfb766d9a2717ffbb64f2a51e53cddc778 -> trunk/5f775bdfb766d9a2717ffbb64f2a51e53cddc778 2025-10-10T00:44:17.3764405Z * [new tag] trunk/600267ea56cafcf8f9a1150a4379184960a757b2 -> trunk/600267ea56cafcf8f9a1150a4379184960a757b2 2025-10-10T00:44:17.3766305Z * [new tag] trunk/600db525bdb5e76c12f30f271d969d43a7f8efef -> trunk/600db525bdb5e76c12f30f271d969d43a7f8efef 2025-10-10T00:44:17.3768303Z * [new tag] trunk/608792153f42254d2d2b5a87d524807a0c2724f1 -> trunk/608792153f42254d2d2b5a87d524807a0c2724f1 2025-10-10T00:44:17.3769839Z * [new tag] trunk/6389658ec6b1ea58cb1de032266d865eeb8d48e9 -> trunk/6389658ec6b1ea58cb1de032266d865eeb8d48e9 2025-10-10T00:44:17.3771470Z * [new tag] trunk/64108bdbed2f099d527060b4c9fdd5a11cad2afc -> trunk/64108bdbed2f099d527060b4c9fdd5a11cad2afc 2025-10-10T00:44:17.3773004Z * [new tag] trunk/65aa62d50d1c83aa1b46ed4d584f12f509bab1c4 -> trunk/65aa62d50d1c83aa1b46ed4d584f12f509bab1c4 2025-10-10T00:44:17.3774605Z * [new tag] trunk/65f10becdf21f3a0947a735904fcce876ce3c4b0 -> trunk/65f10becdf21f3a0947a735904fcce876ce3c4b0 2025-10-10T00:44:17.3776586Z * [new tag] trunk/660e369a68dd8be60ce4eb67c25191ea66efc303 -> trunk/660e369a68dd8be60ce4eb67c25191ea66efc303 2025-10-10T00:44:17.3778219Z * [new tag] trunk/68350660ee2db8c21c84527929b92de9f0bcc3e2 -> trunk/68350660ee2db8c21c84527929b92de9f0bcc3e2 2025-10-10T00:44:17.3779823Z * [new tag] trunk/6861a270624b44954826688f8dad668eb0154452 -> trunk/6861a270624b44954826688f8dad668eb0154452 2025-10-10T00:44:17.3781471Z * [new tag] trunk/6861fa43e5fee7fedc0213e352fa983edea8aa78 -> trunk/6861fa43e5fee7fedc0213e352fa983edea8aa78 
2025-10-10T00:44:17.3783038Z * [new tag] trunk/688efd9741dbd18c176729aec3df7a73825f8463 -> trunk/688efd9741dbd18c176729aec3df7a73825f8463 2025-10-10T00:44:17.3784723Z * [new tag] trunk/6a09f9306cadd003b2e6abc3f6422a2d8607779b -> trunk/6a09f9306cadd003b2e6abc3f6422a2d8607779b 2025-10-10T00:44:17.3786505Z * [new tag] trunk/6a31f42da45c0f1cbdb021b3695f0e6388b8b532 -> trunk/6a31f42da45c0f1cbdb021b3695f0e6388b8b532 2025-10-10T00:44:17.3788132Z * [new tag] trunk/6a7f5c0d21a22959d014c8b06f3efe3408336aaf -> trunk/6a7f5c0d21a22959d014c8b06f3efe3408336aaf 2025-10-10T00:44:17.3789885Z * [new tag] trunk/6b768e1890a179122e91395c5532a382d69b96a0 -> trunk/6b768e1890a179122e91395c5532a382d69b96a0 2025-10-10T00:44:17.3791492Z * [new tag] trunk/6b7970192f5de47d29a4fe085f509389ac0bea7d -> trunk/6b7970192f5de47d29a4fe085f509389ac0bea7d 2025-10-10T00:44:17.3793136Z * [new tag] trunk/6bb021c12553755a4f64df0b60dc34b1efdb992b -> trunk/6bb021c12553755a4f64df0b60dc34b1efdb992b 2025-10-10T00:44:17.3794709Z * [new tag] trunk/6bb586eafd723d4972c729f37c14f27c88168adc -> trunk/6bb586eafd723d4972c729f37c14f27c88168adc 2025-10-10T00:44:17.3796274Z * [new tag] trunk/6c0125dbc0241aef962528651df4f67204a8b526 -> trunk/6c0125dbc0241aef962528651df4f67204a8b526 2025-10-10T00:44:17.3797812Z * [new tag] trunk/6c209bfc5c1e1e59e6a62f94151398d66164bb93 -> trunk/6c209bfc5c1e1e59e6a62f94151398d66164bb93 2025-10-10T00:44:17.3799634Z * [new tag] trunk/6c3c9414eb571b34ff0d932978e4733dbb08dc1d -> trunk/6c3c9414eb571b34ff0d932978e4733dbb08dc1d 2025-10-10T00:44:17.3801204Z * [new tag] trunk/6d27a8e5093ee2a21d44dceeeffcb272e6e0f655 -> trunk/6d27a8e5093ee2a21d44dceeeffcb272e6e0f655 2025-10-10T00:44:17.3802878Z * [new tag] trunk/702f6e703b1d3a942346848b65a9f2a37d12ae18 -> trunk/702f6e703b1d3a942346848b65a9f2a37d12ae18 2025-10-10T00:44:17.3804551Z * [new tag] trunk/7158aa22e8dc97fdc2657cf0d4cde34b277e7d9e -> trunk/7158aa22e8dc97fdc2657cf0d4cde34b277e7d9e 2025-10-10T00:44:17.3806140Z * [new tag] 
trunk/71aefd5595834dd97f38aa978ee32abbd13ac3d6 -> trunk/71aefd5595834dd97f38aa978ee32abbd13ac3d6 2025-10-10T00:44:17.3807780Z * [new tag] trunk/724463d5a2fba369cd14e89215b84d1b01435df7 -> trunk/724463d5a2fba369cd14e89215b84d1b01435df7 2025-10-10T00:44:17.3809284Z * [new tag] trunk/73adac05d13babb75410c3e033fdce57aa16881a -> trunk/73adac05d13babb75410c3e033fdce57aa16881a 2025-10-10T00:44:17.3810888Z * [new tag] trunk/7457d139c51124e5a31a6173f99f81f0deb52178 -> trunk/7457d139c51124e5a31a6173f99f81f0deb52178 2025-10-10T00:44:17.3812665Z * [new tag] trunk/746fe78ecd52f3e9cfddda41f0ac82dada7bdd0b -> trunk/746fe78ecd52f3e9cfddda41f0ac82dada7bdd0b 2025-10-10T00:44:17.3813746Z * [new tag] trunk/7617b113ad0045cdfe5cf1feb8efb634a41c6ce2 -> trunk/7617b113ad0045cdfe5cf1feb8efb634a41c6ce2 2025-10-10T00:44:17.3815667Z * [new tag] trunk/7a1ead755f2e2abe8be49a7a0fb88b6b13973147 -> trunk/7a1ead755f2e2abe8be49a7a0fb88b6b13973147 2025-10-10T00:44:17.3816953Z * [new tag] trunk/7b15534434aeaf59a4c9189f52b4ebd4a5d58803 -> trunk/7b15534434aeaf59a4c9189f52b4ebd4a5d58803 2025-10-10T00:44:17.3818767Z * [new tag] trunk/7b691546d2949790ffc8f6bd3c674faa6a46ff7c -> trunk/7b691546d2949790ffc8f6bd3c674faa6a46ff7c 2025-10-10T00:44:17.3820629Z * [new tag] trunk/7cfecd76b2141d81c90d722dc5e3262bdf7ea900 -> trunk/7cfecd76b2141d81c90d722dc5e3262bdf7ea900 2025-10-10T00:44:17.3822184Z * [new tag] trunk/7d570129e0cea8dd3de0175baff96723656ab8ab -> trunk/7d570129e0cea8dd3de0175baff96723656ab8ab 2025-10-10T00:44:17.3823797Z * [new tag] trunk/7e7ac2039d5d5f35373c4de6cdf0ccdee3734c7a -> trunk/7e7ac2039d5d5f35373c4de6cdf0ccdee3734c7a 2025-10-10T00:44:17.3824887Z * [new tag] trunk/7eb1eb4313cfa3db1beadc6d9d04ea6b76acc39c -> trunk/7eb1eb4313cfa3db1beadc6d9d04ea6b76acc39c 2025-10-10T00:44:17.3826590Z * [new tag] trunk/801e282f39e9ef4424dfd3ecfd2b550a44595229 -> trunk/801e282f39e9ef4424dfd3ecfd2b550a44595229 2025-10-10T00:44:17.3828342Z * [new tag] trunk/81994b08a078b30e076d408713f78c9bf4e329e7 -> 
trunk/81994b08a078b30e076d408713f78c9bf4e329e7 2025-10-10T00:44:17.3829338Z * [new tag] trunk/81dbeb06f4b3eb6c56625ec25d377eb7c7c6c573 -> trunk/81dbeb06f4b3eb6c56625ec25d377eb7c7c6c573 2025-10-10T00:44:17.3831238Z * [new tag] trunk/83458197d14921f797565135f0f45031c362338d -> trunk/83458197d14921f797565135f0f45031c362338d 2025-10-10T00:44:17.3832827Z * [new tag] trunk/83d71dfb2fd993a6242372b8123549acaa85ffdb -> trunk/83d71dfb2fd993a6242372b8123549acaa85ffdb 2025-10-10T00:44:17.3834415Z * [new tag] trunk/86474ce996d168b404592cbbdfcc30d6607c8bd4 -> trunk/86474ce996d168b404592cbbdfcc30d6607c8bd4 2025-10-10T00:44:17.3836013Z * [new tag] trunk/86c789849eac1f96d03cf273e7995dbc7d319c26 -> trunk/86c789849eac1f96d03cf273e7995dbc7d319c26 2025-10-10T00:44:17.3837901Z * [new tag] trunk/874efa2d72d83b00894097130f18062ce331a265 -> trunk/874efa2d72d83b00894097130f18062ce331a265 2025-10-10T00:44:17.3838992Z * [new tag] trunk/87c9fbda22c229d4e5512011e050efd6ffea1241 -> trunk/87c9fbda22c229d4e5512011e050efd6ffea1241 2025-10-10T00:44:17.3840758Z * [new tag] trunk/87eccf10e8484c9e59ef81ae7bdee68d3db4f605 -> trunk/87eccf10e8484c9e59ef81ae7bdee68d3db4f605 2025-10-10T00:44:17.3842379Z * [new tag] trunk/8c0bc879b97bc580aaa0777b2d266bdd068cb528 -> trunk/8c0bc879b97bc580aaa0777b2d266bdd068cb528 2025-10-10T00:44:17.3844034Z * [new tag] trunk/8c54101933bb7c6ed3f9c1a65629b7f30376f7e2 -> trunk/8c54101933bb7c6ed3f9c1a65629b7f30376f7e2 2025-10-10T00:44:17.3845669Z * [new tag] trunk/8ca986ee60febce075f9e3ff83726048cebbbf68 -> trunk/8ca986ee60febce075f9e3ff83726048cebbbf68 2025-10-10T00:44:17.3847368Z * [new tag] trunk/8d53d788fefc0370931063d91f0c342556c3cf4c -> trunk/8d53d788fefc0370931063d91f0c342556c3cf4c 2025-10-10T00:44:17.3849736Z * [new tag] trunk/8e1f409b8ccf64b2cf3933ece13587ad57e9d8a9 -> trunk/8e1f409b8ccf64b2cf3933ece13587ad57e9d8a9 2025-10-10T00:44:17.3851484Z * [new tag] trunk/8ec8c14aced9f3e7ff4ab663822bed792d6c34f4 -> trunk/8ec8c14aced9f3e7ff4ab663822bed792d6c34f4 
2025-10-10T00:44:17.3853088Z * [new tag] trunk/8f54e27e5decf41222f5d744069eb6572dbf275f -> trunk/8f54e27e5decf41222f5d744069eb6572dbf275f 2025-10-10T00:44:17.3854789Z * [new tag] trunk/8f705d019a64b1ca882e043b3eb98559273a9e59 -> trunk/8f705d019a64b1ca882e043b3eb98559273a9e59 2025-10-10T00:44:17.3856436Z * [new tag] trunk/8f83b3e71cb2af6244971af59bfbb6e2abb55f24 -> trunk/8f83b3e71cb2af6244971af59bfbb6e2abb55f24 2025-10-10T00:44:17.3858042Z * [new tag] trunk/90b4e130d6871bee4e1f15bb8294c1bbbf8f4ba5 -> trunk/90b4e130d6871bee4e1f15bb8294c1bbbf8f4ba5 2025-10-10T00:44:17.3859139Z * [new tag] trunk/90c0825e2deb0a46faf5cc2deb7184f6f8ea7a6d -> trunk/90c0825e2deb0a46faf5cc2deb7184f6f8ea7a6d 2025-10-10T00:44:17.3861031Z * [new tag] trunk/91040f49348646d79c6cd3434c34860d25c2e47a -> trunk/91040f49348646d79c6cd3434c34860d25c2e47a 2025-10-10T00:44:17.3862288Z * [new tag] trunk/91b94842645c1a781ab169b0df718545901ebb01 -> trunk/91b94842645c1a781ab169b0df718545901ebb01 2025-10-10T00:44:17.3864094Z * [new tag] trunk/91c211fb8c8ec3065be2a18dfc399ce849ea83bf -> trunk/91c211fb8c8ec3065be2a18dfc399ce849ea83bf 2025-10-10T00:44:17.3865693Z * [new tag] trunk/91c4db76cbb82dfa46d937b8dce4c942eaf5e226 -> trunk/91c4db76cbb82dfa46d937b8dce4c942eaf5e226 2025-10-10T00:44:17.3867310Z * [new tag] trunk/93e833de0f987f66d8c93b76ffe6aad35b714231 -> trunk/93e833de0f987f66d8c93b76ffe6aad35b714231 2025-10-10T00:44:17.3868880Z * [new tag] trunk/94b1ec8c7c5cc63541325abc923973f2fc2ad094 -> trunk/94b1ec8c7c5cc63541325abc923973f2fc2ad094 2025-10-10T00:44:17.3870623Z * [new tag] trunk/955f21dc2c628e09e0d112b3db1ee928cd1da344 -> trunk/955f21dc2c628e09e0d112b3db1ee928cd1da344 2025-10-10T00:44:17.3872225Z * [new tag] trunk/9580539e2f73d68e89544c713ff460bea3038701 -> trunk/9580539e2f73d68e89544c713ff460bea3038701 2025-10-10T00:44:17.3873830Z * [new tag] trunk/95a053284cd28e8d52bd55049bd45aea47adba0c -> trunk/95a053284cd28e8d52bd55049bd45aea47adba0c 2025-10-10T00:44:17.3875513Z * [new tag] 
trunk/960c4b9937251da01ea588efff0fc06a34eac35b -> trunk/960c4b9937251da01ea588efff0fc06a34eac35b 2025-10-10T00:44:17.3877001Z * [new tag] trunk/96181d6f7619acf938dc743123326c6b5dd25284 -> trunk/96181d6f7619acf938dc743123326c6b5dd25284 2025-10-10T00:44:17.3878706Z * [new tag] trunk/9697a7ce9ea095e933658cfee13f9bbef272551a -> trunk/9697a7ce9ea095e933658cfee13f9bbef272551a 2025-10-10T00:44:17.3880792Z * [new tag] trunk/96d91da792d4b50930318ecdfb8b5b8190c467cd -> trunk/96d91da792d4b50930318ecdfb8b5b8190c467cd 2025-10-10T00:44:17.3882365Z * [new tag] trunk/97463d4cf3c125557ef23502772b12a67dac4dc7 -> trunk/97463d4cf3c125557ef23502772b12a67dac4dc7 2025-10-10T00:44:17.3884019Z * [new tag] trunk/97ca21106d0179f425fc752ec867fe11669c2834 -> trunk/97ca21106d0179f425fc752ec867fe11669c2834 2025-10-10T00:44:17.3885648Z * [new tag] trunk/98a081a24c22072362dc536afd39a469e28939d4 -> trunk/98a081a24c22072362dc536afd39a469e28939d4 2025-10-10T00:44:17.3887331Z * [new tag] trunk/9944cac6e6a95159744a775a8bef40d89eef0f03 -> trunk/9944cac6e6a95159744a775a8bef40d89eef0f03 2025-10-10T00:44:17.3889097Z * [new tag] trunk/9aa92f246fa5fe5cfda17970d41d167b19a0612a -> trunk/9aa92f246fa5fe5cfda17970d41d167b19a0612a 2025-10-10T00:44:17.3890671Z * [new tag] trunk/9d1ab4f4bb508a72c7f549f0b5219c4601944ba1 -> trunk/9d1ab4f4bb508a72c7f549f0b5219c4601944ba1 2025-10-10T00:44:17.3892251Z * [new tag] trunk/9eb89a4ad5965b97c54e498d71fc765c0059acef -> trunk/9eb89a4ad5965b97c54e498d71fc765c0059acef 2025-10-10T00:44:17.3893748Z * [new tag] trunk/9ec10dc26a81dc618ff435edd4ca4819245ecb0f -> trunk/9ec10dc26a81dc618ff435edd4ca4819245ecb0f 2025-10-10T00:44:17.3895385Z * [new tag] trunk/9ecd092bd98f43d1cd4acc88eed6cbc39e946dbe -> trunk/9ecd092bd98f43d1cd4acc88eed6cbc39e946dbe 2025-10-10T00:44:17.3896445Z * [new tag] trunk/9f5e1beaf3c9248a335d2448103240a463187eb5 -> trunk/9f5e1beaf3c9248a335d2448103240a463187eb5 2025-10-10T00:44:17.3898561Z * [new tag] trunk/9fc2c6446d394dd313ed71e9d1ffc4f7f3916423 -> 
trunk/9fc2c6446d394dd313ed71e9d1ffc4f7f3916423 2025-10-10T00:44:17.3902763Z * [new tag] trunk/9fff8155c362da777e7ce31b85fb2dc7cfced2d5 -> trunk/9fff8155c362da777e7ce31b85fb2dc7cfced2d5 2025-10-10T00:44:17.3904306Z * [new tag] trunk/a029675f6f0b9cf48eb7943d4be8169c67960a8e -> trunk/a029675f6f0b9cf48eb7943d4be8169c67960a8e 2025-10-10T00:44:17.3906000Z * [new tag] trunk/a11a66ef320938cd0fd72b44b2b572b06937e100 -> trunk/a11a66ef320938cd0fd72b44b2b572b06937e100 2025-10-10T00:44:17.3907570Z * [new tag] trunk/a2f29bcd6388acdc3202d8a90974c50ffb605104 -> trunk/a2f29bcd6388acdc3202d8a90974c50ffb605104 2025-10-10T00:44:17.3909260Z * [new tag] trunk/a34797e031727f6a01a2f13a66db2f7e1fcc05b6 -> trunk/a34797e031727f6a01a2f13a66db2f7e1fcc05b6 2025-10-10T00:44:17.3910936Z * [new tag] trunk/a4110fedcf72eaede76324bb5c21a76589d75849 -> trunk/a4110fedcf72eaede76324bb5c21a76589d75849 2025-10-10T00:44:17.3912026Z * [new tag] trunk/a43c4c3972a611db169dde2aed803b91fe78c081 -> trunk/a43c4c3972a611db169dde2aed803b91fe78c081 2025-10-10T00:44:17.3913874Z * [new tag] trunk/a57a14868dcfd9dabf9bd19b6b11f31967c80c87 -> trunk/a57a14868dcfd9dabf9bd19b6b11f31967c80c87 2025-10-10T00:44:17.3915500Z * [new tag] trunk/a6fa4f9c283971c0fb6f60a89674a1f35370ac79 -> trunk/a6fa4f9c283971c0fb6f60a89674a1f35370ac79 2025-10-10T00:44:17.3917160Z * [new tag] trunk/a753ffa9aff47e005c31d6bcbf5b6a61cc54afed -> trunk/a753ffa9aff47e005c31d6bcbf5b6a61cc54afed 2025-10-10T00:44:17.3918410Z * [new tag] trunk/a7fa1a91e386c7708e4c8747680911b0c3174a66 -> trunk/a7fa1a91e386c7708e4c8747680911b0c3174a66 2025-10-10T00:44:17.3920011Z * [new tag] trunk/a9a9a3438a374f96a308b707a1718036aaec790d -> trunk/a9a9a3438a374f96a308b707a1718036aaec790d 2025-10-10T00:44:17.3921712Z * [new tag] trunk/ab01a0d7d352e7fd07989b8d6bf035bf82aea74e -> trunk/ab01a0d7d352e7fd07989b8d6bf035bf82aea74e 2025-10-10T00:44:17.3923487Z * [new tag] trunk/ab94a0d544503b5c27e889b45e45ef8cf75c8183 -> trunk/ab94a0d544503b5c27e889b45e45ef8cf75c8183 
2025-10-10T00:44:17.3924626Z * [new tag] trunk/abadea70f3eb5f2f764fd6448d42dd2c29fa28b3 -> trunk/abadea70f3eb5f2f764fd6448d42dd2c29fa28b3 2025-10-10T00:44:17.3926393Z * [new tag] trunk/ac08556f674259ff5b117964e300124e8a92d45b -> trunk/ac08556f674259ff5b117964e300124e8a92d45b 2025-10-10T00:44:17.3928289Z * [new tag] trunk/ac7b4e7fe4d233dcd7f6343d42b4fa3d64bce548 -> trunk/ac7b4e7fe4d233dcd7f6343d42b4fa3d64bce548 2025-10-10T00:44:17.3929494Z * [new tag] trunk/ac901bf79a2d78539ffec272bf32f4ae47035b23 -> trunk/ac901bf79a2d78539ffec272bf32f4ae47035b23 2025-10-10T00:44:17.3931488Z * [new tag] trunk/ad7b2bebc651c297d869f265deedef726bf17048 -> trunk/ad7b2bebc651c297d869f265deedef726bf17048 2025-10-10T00:44:17.3933094Z * [new tag] trunk/ae25ec569c614c2a2274837079578b71f3201a3b -> trunk/ae25ec569c614c2a2274837079578b71f3201a3b 2025-10-10T00:44:17.3935021Z * [new tag] trunk/aea57b3aa38a3d4a058e0a7eba08d0c6c28ed9c5 -> trunk/aea57b3aa38a3d4a058e0a7eba08d0c6c28ed9c5 2025-10-10T00:44:17.3936726Z * [new tag] trunk/aed5ed1076d3e73e0b6357dafac1002aa6a221e9 -> trunk/aed5ed1076d3e73e0b6357dafac1002aa6a221e9 2025-10-10T00:44:17.3938045Z * [new tag] trunk/aed66248a01d309eb2ac1149b5f51310545b0783 -> trunk/aed66248a01d309eb2ac1149b5f51310545b0783 2025-10-10T00:44:17.3939899Z * [new tag] trunk/af32d16a71681ca05c6d410fb1b9cee091d4577d -> trunk/af32d16a71681ca05c6d410fb1b9cee091d4577d 2025-10-10T00:44:17.3941790Z * [new tag] trunk/af40828bbb785f968eda18dbdc8750ba67f57366 -> trunk/af40828bbb785f968eda18dbdc8750ba67f57366 2025-10-10T00:44:17.3943410Z * [new tag] trunk/af4c29fea8f50ac3bb9e4a0e305da4a2c6b53d29 -> trunk/af4c29fea8f50ac3bb9e4a0e305da4a2c6b53d29 2025-10-10T00:44:17.3944971Z * [new tag] trunk/afee8062d511ad63e0af65ffac0e712d86aae8f1 -> trunk/afee8062d511ad63e0af65ffac0e712d86aae8f1 2025-10-10T00:44:17.3946538Z * [new tag] trunk/afeec56a5aa83dd0258565400551a99777c0023b -> trunk/afeec56a5aa83dd0258565400551a99777c0023b 2025-10-10T00:44:17.3948363Z * [new tag] 
trunk/b0985144b59db8fb20964829b5e0a9d2f9a3f0d6 -> trunk/b0985144b59db8fb20964829b5e0a9d2f9a3f0d6 2025-10-10T00:44:17.3949939Z * [new tag] trunk/b116c5133024be39a2db67cd0112b490b970b710 -> trunk/b116c5133024be39a2db67cd0112b490b970b710 2025-10-10T00:44:17.3951540Z * [new tag] trunk/b13cd141b3585c2ae89ad7747acd11203a2fb837 -> trunk/b13cd141b3585c2ae89ad7747acd11203a2fb837 2025-10-10T00:44:17.3953059Z * [new tag] trunk/b1ac252f55f4a4d0e5488fb2ac9154154decec87 -> trunk/b1ac252f55f4a4d0e5488fb2ac9154154decec87 2025-10-10T00:44:17.3954787Z * [new tag] trunk/b28b24a9fc7d391c5793a94489a3f2d5381f6ad7 -> trunk/b28b24a9fc7d391c5793a94489a3f2d5381f6ad7 2025-10-10T00:44:17.3956438Z * [new tag] trunk/b2b3947565fd0c27ebd4941152c964eab30370e2 -> trunk/b2b3947565fd0c27ebd4941152c964eab30370e2 2025-10-10T00:44:17.3957572Z * [new tag] trunk/b558c986e8ec693b531ad2817026393c55d72eb6 -> trunk/b558c986e8ec693b531ad2817026393c55d72eb6 2025-10-10T00:44:17.3959544Z * [new tag] trunk/b5e93ffdcf779c703af5c8119636b01f250eafcd -> trunk/b5e93ffdcf779c703af5c8119636b01f250eafcd 2025-10-10T00:44:17.3961201Z * [new tag] trunk/b63bbe16615cc7680836dbb151bd848bce4893d6 -> trunk/b63bbe16615cc7680836dbb151bd848bce4893d6 2025-10-10T00:44:17.3962181Z * [new tag] trunk/b6b7a44dec63495d57946cbfe8f2accb8f876db2 -> trunk/b6b7a44dec63495d57946cbfe8f2accb8f876db2 2025-10-10T00:44:17.3963842Z * [new tag] trunk/b9e73e639e36f3aa628752161711e68878231b30 -> trunk/b9e73e639e36f3aa628752161711e68878231b30 2025-10-10T00:44:17.3965501Z * [new tag] trunk/ba480d6bf78ea446d1268d9b5b3a0dbb490c9c88 -> trunk/ba480d6bf78ea446d1268d9b5b3a0dbb490c9c88 2025-10-10T00:44:17.3967241Z * [new tag] trunk/bac0f289a35f05052740076fc5671271a3d487c2 -> trunk/bac0f289a35f05052740076fc5671271a3d487c2 2025-10-10T00:44:17.3968980Z * [new tag] trunk/bc1690c7e859dee8c47a7f0bbd3c43cc27c6fd2a -> trunk/bc1690c7e859dee8c47a7f0bbd3c43cc27c6fd2a 2025-10-10T00:44:17.3970138Z * [new tag] trunk/bc33b10202fb7c3761bcabc166e02d96807d8739 -> 
trunk/bc33b10202fb7c3761bcabc166e02d96807d8739 2025-10-10T00:44:17.3972104Z * [new tag] trunk/bcafea5c92ca2ee1b0dc8f6d8b62ecabb6f40228 -> trunk/bcafea5c92ca2ee1b0dc8f6d8b62ecabb6f40228 2025-10-10T00:44:17.3973795Z * [new tag] trunk/bcd96cc6ff798281e66aabef6ce72542fdc97c7a -> trunk/bcd96cc6ff798281e66aabef6ce72542fdc97c7a 2025-10-10T00:44:17.3975354Z * [new tag] trunk/bd3b98a8a5d68ddc84b20a4609b9ea90998bf95b -> trunk/bd3b98a8a5d68ddc84b20a4609b9ea90998bf95b 2025-10-10T00:44:17.3976956Z * [new tag] trunk/bdc0a421d7bcc49db12f7593d2c213a6141da614 -> trunk/bdc0a421d7bcc49db12f7593d2c213a6141da614 2025-10-10T00:44:17.3978638Z * [new tag] trunk/bde18c445dcb1d83e8ea0afae52f9b9bf8171f45 -> trunk/bde18c445dcb1d83e8ea0afae52f9b9bf8171f45 2025-10-10T00:44:17.3979786Z * [new tag] trunk/bf717ce346203fc27e792f4bdcc31e979cd74fa9 -> trunk/bf717ce346203fc27e792f4bdcc31e979cd74fa9 2025-10-10T00:44:17.3981962Z * [new tag] trunk/c0510dc447a1f105cb8758d2721380f7a7c380d1 -> trunk/c0510dc447a1f105cb8758d2721380f7a7c380d1 2025-10-10T00:44:17.3983721Z * [new tag] trunk/c1f40d33c89b361a1edad17aa25cfff1ab4014fd -> trunk/c1f40d33c89b361a1edad17aa25cfff1ab4014fd 2025-10-10T00:44:17.3985296Z * [new tag] trunk/c32118dc3e50505fd285e6e448a90883fce11535 -> trunk/c32118dc3e50505fd285e6e448a90883fce11535 2025-10-10T00:44:17.3987309Z * [new tag] trunk/c45d56dd00546daa7d9044674233dba1ac7b6194 -> trunk/c45d56dd00546daa7d9044674233dba1ac7b6194 2025-10-10T00:44:17.3988873Z * [new tag] trunk/c6329524d8670d5f9295cddcf7ebc3040ed9179e -> trunk/c6329524d8670d5f9295cddcf7ebc3040ed9179e 2025-10-10T00:44:17.3990473Z * [new tag] trunk/c6a6c80a730ff4edaec0d2fc4a5ff9344edaed41 -> trunk/c6a6c80a730ff4edaec0d2fc4a5ff9344edaed41 2025-10-10T00:44:17.3992075Z * [new tag] trunk/c7e30ae4dd9a58ed4f4bcbdc6afc2249cac94f28 -> trunk/c7e30ae4dd9a58ed4f4bcbdc6afc2249cac94f28 2025-10-10T00:44:17.3993719Z * [new tag] trunk/c813617c53e6be91e77f47e9a3f713146d54f340 -> trunk/c813617c53e6be91e77f47e9a3f713146d54f340 
2025-10-10T00:44:17.3995392Z * [new tag] trunk/c855f8632e331b51d60d5f1bcc59d3181cb4bc82 -> trunk/c855f8632e331b51d60d5f1bcc59d3181cb4bc82 2025-10-10T00:44:17.3996961Z * [new tag] trunk/c965d6dbb23a8b2338ffebf3f01c6f92ce5847d2 -> trunk/c965d6dbb23a8b2338ffebf3f01c6f92ce5847d2 2025-10-10T00:44:17.3998786Z * [new tag] trunk/cac5e13e1384900c5acc4938c33d6037a61850d5 -> trunk/cac5e13e1384900c5acc4938c33d6037a61850d5 2025-10-10T00:44:17.4000260Z * [new tag] trunk/cc71ab86a6985e85645424b727c766e031047ff6 -> trunk/cc71ab86a6985e85645424b727c766e031047ff6 2025-10-10T00:44:17.4002191Z * [new tag] trunk/cd62a73dcb13102069aa827a6657f62d88cce095 -> trunk/cd62a73dcb13102069aa827a6657f62d88cce095 2025-10-10T00:44:17.4003285Z * [new tag] trunk/cf0a00d4f38775e5a82a166e367f40383c606963 -> trunk/cf0a00d4f38775e5a82a166e367f40383c606963 2025-10-10T00:44:17.4005211Z * [new tag] trunk/cfc5cc17dc4fa6be41b4b31eb6e63d3863479452 -> trunk/cfc5cc17dc4fa6be41b4b31eb6e63d3863479452 2025-10-10T00:44:17.4006842Z * [new tag] trunk/cfd46d13e6e1308add3a9f287b4855ccc3f2e66c -> trunk/cfd46d13e6e1308add3a9f287b4855ccc3f2e66c 2025-10-10T00:44:17.4008591Z * [new tag] trunk/d1a62c80363cf769552453eed187e935f905737d -> trunk/d1a62c80363cf769552453eed187e935f905737d 2025-10-10T00:44:17.4010139Z * [new tag] trunk/d1cbb74fb16406488a174832e1b58b7c242f418d -> trunk/d1cbb74fb16406488a174832e1b58b7c242f418d 2025-10-10T00:44:17.4011763Z * [new tag] trunk/d386325ca9a142419f45b987391f4bb175dd7d0b -> trunk/d386325ca9a142419f45b987391f4bb175dd7d0b 2025-10-10T00:44:17.4013387Z * [new tag] trunk/d40a9bfb8da0dc1ac1e6e56b33a25979112874de -> trunk/d40a9bfb8da0dc1ac1e6e56b33a25979112874de 2025-10-10T00:44:17.4015072Z * [new tag] trunk/d4443840036a00a30afcf066cb23f4525e590809 -> trunk/d4443840036a00a30afcf066cb23f4525e590809 2025-10-10T00:44:17.4016651Z * [new tag] trunk/d4752bc7f6818a3df5356a9de61afe1d3e27ade9 -> trunk/d4752bc7f6818a3df5356a9de61afe1d3e27ade9 2025-10-10T00:44:17.4018207Z * [new tag] 
trunk/da49a57d3462332b26cb7ee58910b5bc67e5772c -> trunk/da49a57d3462332b26cb7ee58910b5bc67e5772c 2025-10-10T00:44:17.4019839Z * [new tag] trunk/da903b6a8be422529d47649e89c0d50bb95c37ca -> trunk/da903b6a8be422529d47649e89c0d50bb95c37ca 2025-10-10T00:44:17.4021490Z * [new tag] trunk/dca73982c53e9f99f96246b5d9ed9bab83c7423f -> trunk/dca73982c53e9f99f96246b5d9ed9bab83c7423f 2025-10-10T00:44:17.4023187Z * [new tag] trunk/ddf8de28c25944a58e739ba9996b06753e4199cc -> trunk/ddf8de28c25944a58e739ba9996b06753e4199cc 2025-10-10T00:44:17.4024742Z * [new tag] trunk/df640df68a5275684eaae3080a9c97a0c61469c8 -> trunk/df640df68a5275684eaae3080a9c97a0c61469c8 2025-10-10T00:44:17.4026246Z * [new tag] trunk/e09fb44ef177005c4a11c28be24781429d416a3e -> trunk/e09fb44ef177005c4a11c28be24781429d416a3e 2025-10-10T00:44:17.4027835Z * [new tag] trunk/e0cb1848d0fd9fb4467ad8b844c565aea5071838 -> trunk/e0cb1848d0fd9fb4467ad8b844c565aea5071838 2025-10-10T00:44:17.4029386Z * [new tag] trunk/e3ae80fc036da356e3748d134689741583552f09 -> trunk/e3ae80fc036da356e3748d134689741583552f09 2025-10-10T00:44:17.4031033Z * [new tag] trunk/e40fe634b1a7aa33e278b1404ee02dea12277080 -> trunk/e40fe634b1a7aa33e278b1404ee02dea12277080 2025-10-10T00:44:17.4032772Z * [new tag] trunk/e438db254602cf39ba536aed0590b4144c019ee8 -> trunk/e438db254602cf39ba536aed0590b4144c019ee8 2025-10-10T00:44:17.4034618Z * [new tag] trunk/e532f62e0d96e56cb28fa6a0ba6d981896a65d52 -> trunk/e532f62e0d96e56cb28fa6a0ba6d981896a65d52 2025-10-10T00:44:17.4036146Z * [new tag] trunk/e6d4b26776842307475b368db60e27ac1bcede86 -> trunk/e6d4b26776842307475b368db60e27ac1bcede86 2025-10-10T00:44:17.4037728Z * [new tag] trunk/e7ed1a00eb5510d1c7dccd17b5c0ebb54231284f -> trunk/e7ed1a00eb5510d1c7dccd17b5c0ebb54231284f 2025-10-10T00:44:17.4039320Z * [new tag] trunk/e7fd2969303ab931f5e5875eca676018e1acd089 -> trunk/e7fd2969303ab931f5e5875eca676018e1acd089 2025-10-10T00:44:17.4040967Z * [new tag] trunk/e89d12bf5d6b69c153cd000ef278fca59f03226d -> 
trunk/e89d12bf5d6b69c153cd000ef278fca59f03226d 2025-10-10T00:44:17.4042566Z * [new tag] trunk/e98c4e835b1db22092fc93b49d2cddd7b3537d1f -> trunk/e98c4e835b1db22092fc93b49d2cddd7b3537d1f 2025-10-10T00:44:17.4044056Z * [new tag] trunk/ea42517e454d2e47391646bbb897f5fc51147b9d -> trunk/ea42517e454d2e47391646bbb897f5fc51147b9d 2025-10-10T00:44:17.4045760Z * [new tag] trunk/eaa02655eabd24609744c2251ac40d39d86ebb39 -> trunk/eaa02655eabd24609744c2251ac40d39d86ebb39 2025-10-10T00:44:17.4047338Z * [new tag] trunk/eccf561326147894d57482a5aba7a2290005b257 -> trunk/eccf561326147894d57482a5aba7a2290005b257 2025-10-10T00:44:17.4049041Z * [new tag] trunk/ece5e0f01b68509342f85fa388ca61936dc18b20 -> trunk/ece5e0f01b68509342f85fa388ca61936dc18b20 2025-10-10T00:44:17.4050697Z * [new tag] trunk/ed2d514ad860229f6d364688f9db27dad034cd83 -> trunk/ed2d514ad860229f6d364688f9db27dad034cd83 2025-10-10T00:44:17.4052273Z * [new tag] trunk/ed6156e3ea334b9b8d395e5a9f76fa3ba7408c06 -> trunk/ed6156e3ea334b9b8d395e5a9f76fa3ba7408c06 2025-10-10T00:44:17.4053924Z * [new tag] trunk/ee5389d520844db36374e86c986b9ff8f47ac4bb -> trunk/ee5389d520844db36374e86c986b9ff8f47ac4bb 2025-10-10T00:44:17.4055586Z * [new tag] trunk/ee6a1ecb0a1035f068484c8fcfba44b2efc9e837 -> trunk/ee6a1ecb0a1035f068484c8fcfba44b2efc9e837 2025-10-10T00:44:17.4057074Z * [new tag] trunk/ef50c6e3e3d83bfd67e50930eea9a3a9db084061 -> trunk/ef50c6e3e3d83bfd67e50930eea9a3a9db084061 2025-10-10T00:44:17.4058737Z * [new tag] trunk/ef7e2ca77e3f554ced81eb614f15fb84249d4a7e -> trunk/ef7e2ca77e3f554ced81eb614f15fb84249d4a7e 2025-10-10T00:44:17.4060477Z * [new tag] trunk/f006aee601cb72077f4b1dbc3f7f0f685e57a1a9 -> trunk/f006aee601cb72077f4b1dbc3f7f0f685e57a1a9 2025-10-10T00:44:17.4062025Z * [new tag] trunk/f05e23e1bc1439e19145e43e8ffca0051cda2f33 -> trunk/f05e23e1bc1439e19145e43e8ffca0051cda2f33 2025-10-10T00:44:17.4063648Z * [new tag] trunk/f0c9f3bddbf7ad77d5d3a8803c23bb47bfb71d79 -> trunk/f0c9f3bddbf7ad77d5d3a8803c23bb47bfb71d79 
2025-10-10T00:44:17.4065273Z * [new tag] trunk/f11ac803d73b90d7e1f7bde962b9afe6b5967eb7 -> trunk/f11ac803d73b90d7e1f7bde962b9afe6b5967eb7 2025-10-10T00:44:17.4066903Z * [new tag] trunk/f1229b6db946c290cc5bdea05dde69fc01e0bed0 -> trunk/f1229b6db946c290cc5bdea05dde69fc01e0bed0 2025-10-10T00:44:17.4068456Z * [new tag] trunk/f231be25c679adb47ac3e483dc68948e5ad137a4 -> trunk/f231be25c679adb47ac3e483dc68948e5ad137a4 2025-10-10T00:44:17.4070198Z * [new tag] trunk/f33201729416ed17467228e80b04d01d4d02b5f3 -> trunk/f33201729416ed17467228e80b04d01d4d02b5f3 2025-10-10T00:44:17.4071771Z * [new tag] trunk/f37a6523efe1b9bf7f6b5b5d0f36dc461a3fda2a -> trunk/f37a6523efe1b9bf7f6b5b5d0f36dc461a3fda2a 2025-10-10T00:44:17.4073569Z * [new tag] trunk/f39789cdabb6465f21666bd001829e1f7284d754 -> trunk/f39789cdabb6465f21666bd001829e1f7284d754 2025-10-10T00:44:17.4075133Z * [new tag] trunk/f3afbcf3407783e54ec2795b06ae744f645320ba -> trunk/f3afbcf3407783e54ec2795b06ae744f645320ba 2025-10-10T00:44:17.4076766Z * [new tag] trunk/f3e43ff2d73f375487b1b71483bbecb6cdad8920 -> trunk/f3e43ff2d73f375487b1b71483bbecb6cdad8920 2025-10-10T00:44:17.4078119Z * [new tag] trunk/f414aa8e0d17e8eff38a93cebd52436e53f50eba -> trunk/f414aa8e0d17e8eff38a93cebd52436e53f50eba 2025-10-10T00:44:17.4079855Z * [new tag] trunk/f465ea6752c91498de63eb57439a74f4836e568a -> trunk/f465ea6752c91498de63eb57439a74f4836e568a 2025-10-10T00:44:17.4081538Z * [new tag] trunk/f46bb04dcc37a9b394e414569aef8aef69f9bf53 -> trunk/f46bb04dcc37a9b394e414569aef8aef69f9bf53 2025-10-10T00:44:17.4083149Z * [new tag] trunk/f46ddb1e65b595c80f285dc42aa8549970736aae -> trunk/f46ddb1e65b595c80f285dc42aa8549970736aae 2025-10-10T00:44:17.4084813Z * [new tag] trunk/f4cf75688f0fd93466589addfb7d0ec33e46e3bf -> trunk/f4cf75688f0fd93466589addfb7d0ec33e46e3bf 2025-10-10T00:44:17.4086339Z * [new tag] trunk/f505caa71bd2e4d1e708e20a3665b834134e08fc -> trunk/f505caa71bd2e4d1e708e20a3665b834134e08fc 2025-10-10T00:44:17.4088141Z * [new tag] 
trunk/f5fd18f7e24378bd9eb91404f697f1c81a8187d5 -> trunk/f5fd18f7e24378bd9eb91404f697f1c81a8187d5 2025-10-10T00:44:17.4089488Z * [new tag] trunk/f6de195616432f42a545b98ea41cc816019d1c60 -> trunk/f6de195616432f42a545b98ea41cc816019d1c60 2025-10-10T00:44:17.4091741Z * [new tag] trunk/f6f76767563d4293a0f78551edf4675a5794c570 -> trunk/f6f76767563d4293a0f78551edf4675a5794c570 2025-10-10T00:44:17.4093413Z * [new tag] trunk/f7082e92b3635e89906fae514506152a2ec844a0 -> trunk/f7082e92b3635e89906fae514506152a2ec844a0 2025-10-10T00:44:17.4095073Z * [new tag] trunk/f713abab16cb98c15f486e9822dd261279cce252 -> trunk/f713abab16cb98c15f486e9822dd261279cce252 2025-10-10T00:44:17.4096640Z * [new tag] trunk/f76fdcaaf8b6d5f97c7f63705400ebed8984f869 -> trunk/f76fdcaaf8b6d5f97c7f63705400ebed8984f869 2025-10-10T00:44:17.4098204Z * [new tag] trunk/f79e212733ca89ce3cc99a3072e50351686e5568 -> trunk/f79e212733ca89ce3cc99a3072e50351686e5568 2025-10-10T00:44:17.4100115Z * [new tag] trunk/f7ad6dbad67161333a1473d1e0b478b7475a0ec1 -> trunk/f7ad6dbad67161333a1473d1e0b478b7475a0ec1 2025-10-10T00:44:17.4101675Z * [new tag] trunk/fa5306b4f5bea89d80b9f14926712119aab78161 -> trunk/fa5306b4f5bea89d80b9f14926712119aab78161 2025-10-10T00:44:17.4103321Z * [new tag] trunk/fac6f20ae3a68fa49e19571a1fc4bcdddbf87d80 -> trunk/fac6f20ae3a68fa49e19571a1fc4bcdddbf87d80 2025-10-10T00:44:17.4104863Z * [new tag] trunk/fac85fcfb5ad0e63438d808a2f9ba7ea2dff9ad4 -> trunk/fac85fcfb5ad0e63438d808a2f9ba7ea2dff9ad4 2025-10-10T00:44:17.4106523Z * [new tag] trunk/fd3e15c14f4fc474af610b482382a2c85729f50d -> trunk/fd3e15c14f4fc474af610b482382a2c85729f50d 2025-10-10T00:44:17.4108161Z * [new tag] trunk/fd4bde430a51e5f216295c950d962c6343119821 -> trunk/fd4bde430a51e5f216295c950d962c6343119821 2025-10-10T00:44:17.4109789Z * [new tag] trunk/fdc622b513610b53ddcdc0b40282df9beae369bd -> trunk/fdc622b513610b53ddcdc0b40282df9beae369bd 2025-10-10T00:44:17.4111388Z * [new tag] trunk/fdc8ccc5bc433478c2a114016e193f5665d1e370 -> 
trunk/fdc8ccc5bc433478c2a114016e193f5665d1e370 2025-10-10T00:44:17.4113104Z * [new tag] trunk/ff5faa744a52561f4c6a138089123fd8d41cab73 -> trunk/ff5faa744a52561f4c6a138089123fd8d41cab73 2025-10-10T00:44:17.4114359Z * [new tag] v0.1.1 -> v0.1.1 2025-10-10T00:44:17.4115870Z * [new tag] v0.1.10 -> v0.1.10 2025-10-10T00:44:17.4117777Z * [new tag] v0.1.11 -> v0.1.11 2025-10-10T00:44:17.4119237Z * [new tag] v0.1.12 -> v0.1.12 2025-10-10T00:44:17.4120606Z * [new tag] v0.1.2 -> v0.1.2 2025-10-10T00:44:17.4122007Z * [new tag] v0.1.3 -> v0.1.3 2025-10-10T00:44:17.4123441Z * [new tag] v0.1.4 -> v0.1.4 2025-10-10T00:44:17.4124733Z * [new tag] v0.1.5 -> v0.1.5 2025-10-10T00:44:17.4126277Z * [new tag] v0.1.6 -> v0.1.6 2025-10-10T00:44:17.4127763Z * [new tag] v0.1.7 -> v0.1.7 2025-10-10T00:44:17.4129317Z * [new tag] v0.1.8 -> v0.1.8 2025-10-10T00:44:17.4130197Z * [new tag] v0.1.9 -> v0.1.9 2025-10-10T00:44:17.4131888Z * [new tag] v0.2.0 -> v0.2.0 2025-10-10T00:44:17.4133605Z * [new tag] v0.3.0 -> v0.3.0 2025-10-10T00:44:17.4135106Z * [new tag] v0.3.1 -> v0.3.1 2025-10-10T00:44:17.4136602Z * [new tag] v0.4.0 -> v0.4.0 2025-10-10T00:44:17.4138063Z * [new tag] v0.4.1 -> v0.4.1 2025-10-10T00:44:17.4139511Z * [new tag] v1.0.0 -> v1.0.0 2025-10-10T00:44:17.4141140Z * [new tag] v1.0.0a0 -> v1.0.0a0 2025-10-10T00:44:17.4142219Z * [new tag] v1.0.1 -> v1.0.1 2025-10-10T00:44:17.4143944Z * [new tag] v1.0rc0 -> v1.0rc0 2025-10-10T00:44:17.4145314Z * [new tag] v1.0rc1 -> v1.0rc1 2025-10-10T00:44:17.4146789Z * [new tag] v1.1.0 -> v1.1.0 2025-10-10T00:44:17.4148321Z * [new tag] v1.1.0a0 -> v1.1.0a0 2025-10-10T00:44:17.4149998Z * [new tag] v1.10.0 -> v1.10.0 2025-10-10T00:44:17.4151624Z * [new tag] v1.10.0-rc1 -> v1.10.0-rc1 2025-10-10T00:44:17.4153118Z * [new tag] v1.10.0-rc2 -> v1.10.0-rc2 2025-10-10T00:44:17.4154165Z * [new tag] v1.10.0-rc3 -> v1.10.0-rc3 2025-10-10T00:44:17.4155892Z * [new tag] v1.10.1 -> v1.10.1 2025-10-10T00:44:17.4157173Z * [new tag] v1.10.1-rc1 -> v1.10.1-rc1 
2025-10-10T00:44:17.4158078Z * [new tag] v1.10.2 -> v1.10.2 2025-10-10T00:44:17.4159622Z * [new tag] v1.10.2-rc1 -> v1.10.2-rc1 2025-10-10T00:44:17.4161143Z * [new tag] v1.11.0 -> v1.11.0 2025-10-10T00:44:17.4162935Z * [new tag] v1.11.0-rc1 -> v1.11.0-rc1 2025-10-10T00:44:17.4164274Z * [new tag] v1.11.0-rc2 -> v1.11.0-rc2 2025-10-10T00:44:17.4165799Z * [new tag] v1.11.0-rc3 -> v1.11.0-rc3 2025-10-10T00:44:17.4167480Z * [new tag] v1.11.0-rc4 -> v1.11.0-rc4 2025-10-10T00:44:17.4169105Z * [new tag] v1.11.0-rc5 -> v1.11.0-rc5 2025-10-10T00:44:17.4170033Z * [new tag] v1.11.0-rc6 -> v1.11.0-rc6 2025-10-10T00:44:17.4171537Z * [new tag] v1.11.0-rc7 -> v1.11.0-rc7 2025-10-10T00:44:17.4173073Z * [new tag] v1.12.0 -> v1.12.0 2025-10-10T00:44:17.4174566Z * [new tag] v1.12.0-rc1 -> v1.12.0-rc1 2025-10-10T00:44:17.4176036Z * [new tag] v1.12.0-rc2 -> v1.12.0-rc2 2025-10-10T00:44:17.4177575Z * [new tag] v1.12.0-rc3 -> v1.12.0-rc3 2025-10-10T00:44:17.4179168Z * [new tag] v1.12.0-rc4 -> v1.12.0-rc4 2025-10-10T00:44:17.4180705Z * [new tag] v1.12.0-rc5 -> v1.12.0-rc5 2025-10-10T00:44:17.4182272Z * [new tag] v1.12.0-rc6 -> v1.12.0-rc6 2025-10-10T00:44:17.4183341Z * [new tag] v1.12.0-rc7 -> v1.12.0-rc7 2025-10-10T00:44:17.4184805Z * [new tag] v1.12.0-rc8 -> v1.12.0-rc8 2025-10-10T00:44:17.4186072Z * [new tag] v1.12.1 -> v1.12.1 2025-10-10T00:44:17.4187641Z * [new tag] v1.12.1-rc1 -> v1.12.1-rc1 2025-10-10T00:44:17.4189154Z * [new tag] v1.12.1-rc2 -> v1.12.1-rc2 2025-10-10T00:44:17.4190707Z * [new tag] v1.12.1-rc3 -> v1.12.1-rc3 2025-10-10T00:44:17.4192273Z * [new tag] v1.12.1-rc4 -> v1.12.1-rc4 2025-10-10T00:44:17.4193178Z * [new tag] v1.12.1-rc5 -> v1.12.1-rc5 2025-10-10T00:44:17.4195037Z * [new tag] v1.13.0 -> v1.13.0 2025-10-10T00:44:17.4196461Z * [new tag] v1.13.0-rc1 -> v1.13.0-rc1 2025-10-10T00:44:17.4197977Z * [new tag] v1.13.0-rc2 -> v1.13.0-rc2 2025-10-10T00:44:17.4199127Z * [new tag] v1.13.0-rc3 -> v1.13.0-rc3 2025-10-10T00:44:17.4201387Z * [new tag] v1.13.0-rc4 -> v1.13.0-rc4 
2025-10-10T00:44:17.4202240Z * [new tag] v1.13.0-rc5 -> v1.13.0-rc5 2025-10-10T00:44:17.4203790Z * [new tag] v1.13.0-rc6 -> v1.13.0-rc6 2025-10-10T00:44:17.4205338Z * [new tag] v1.13.1 -> v1.13.1 2025-10-10T00:44:17.4206585Z * [new tag] v1.13.1-rc1 -> v1.13.1-rc1 2025-10-10T00:44:17.4208164Z * [new tag] v1.2.0 -> v1.2.0 2025-10-10T00:44:17.4209720Z * [new tag] v1.2.0a0 -> v1.2.0a0 2025-10-10T00:44:17.4211592Z * [new tag] v1.3.0 -> v1.3.0 2025-10-10T00:44:17.4213134Z * [new tag] v1.3.0a0 -> v1.3.0a0 2025-10-10T00:44:17.4214183Z * [new tag] v1.3.1 -> v1.3.1 2025-10-10T00:44:17.4215861Z * [new tag] v1.4.0 -> v1.4.0 2025-10-10T00:44:17.4217345Z * [new tag] v1.4.0a0 -> v1.4.0a0 2025-10-10T00:44:17.4218328Z * [new tag] v1.4.1 -> v1.4.1 2025-10-10T00:44:17.4220048Z * [new tag] v1.5.0 -> v1.5.0 2025-10-10T00:44:17.4221809Z * [new tag] v1.5.0-rc1 -> v1.5.0-rc1 2025-10-10T00:44:17.4223299Z * [new tag] v1.5.0-rc2 -> v1.5.0-rc2 2025-10-10T00:44:17.4224827Z * [new tag] v1.5.0-rc3 -> v1.5.0-rc3 2025-10-10T00:44:17.4226234Z * [new tag] v1.5.0-rc4 -> v1.5.0-rc4 2025-10-10T00:44:17.4227615Z * [new tag] v1.5.0-rc5 -> v1.5.0-rc5 2025-10-10T00:44:17.4229145Z * [new tag] v1.5.1 -> v1.5.1 2025-10-10T00:44:17.4230500Z * [new tag] v1.5.1-rc1 -> v1.5.1-rc1 2025-10-10T00:44:17.4231765Z * [new tag] v1.6.0 -> v1.6.0 2025-10-10T00:44:17.4233336Z * [new tag] v1.6.0-rc1 -> v1.6.0-rc1 2025-10-10T00:44:17.4234825Z * [new tag] v1.6.0-rc2 -> v1.6.0-rc2 2025-10-10T00:44:17.4236346Z * [new tag] v1.6.0-rc3 -> v1.6.0-rc3 2025-10-10T00:44:17.4237882Z * [new tag] v1.6.0-rc4 -> v1.6.0-rc4 2025-10-10T00:44:17.4239489Z * [new tag] v1.6.0-rc5 -> v1.6.0-rc5 2025-10-10T00:44:17.4240899Z * [new tag] v1.6.0-rc6 -> v1.6.0-rc6 2025-10-10T00:44:17.4241939Z * [new tag] v1.6.0-rc7 -> v1.6.0-rc7 2025-10-10T00:44:17.4243647Z * [new tag] v1.7.0 -> v1.7.0 2025-10-10T00:44:17.4245191Z * [new tag] v1.7.0-rc1 -> v1.7.0-rc1 2025-10-10T00:44:17.4246799Z * [new tag] v1.7.0-rc2 -> v1.7.0-rc2 2025-10-10T00:44:17.4248368Z * [new 
tag] v1.7.0-rc3 -> v1.7.0-rc3 2025-10-10T00:44:17.4249277Z * [new tag] v1.7.0-rc4 -> v1.7.0-rc4 2025-10-10T00:44:17.4251061Z * [new tag] v1.7.1 -> v1.7.1 2025-10-10T00:44:17.4252786Z * [new tag] v1.7.1-rc1 -> v1.7.1-rc1 2025-10-10T00:44:17.4254282Z * [new tag] v1.7.1-rc2 -> v1.7.1-rc2 2025-10-10T00:44:17.4255334Z * [new tag] v1.7.1-rc3 -> v1.7.1-rc3 2025-10-10T00:44:17.4257184Z * [new tag] v1.8.0 -> v1.8.0 2025-10-10T00:44:17.4258182Z * [new tag] v1.8.0-rc1 -> v1.8.0-rc1 2025-10-10T00:44:17.4259957Z * [new tag] v1.8.0-rc2 -> v1.8.0-rc2 2025-10-10T00:44:17.4261634Z * [new tag] v1.8.0-rc3 -> v1.8.0-rc3 2025-10-10T00:44:17.4262928Z * [new tag] v1.8.0-rc4 -> v1.8.0-rc4 2025-10-10T00:44:17.4263979Z * [new tag] v1.8.0-rc5 -> v1.8.0-rc5 2025-10-10T00:44:17.4265404Z * [new tag] v1.8.1 -> v1.8.1 2025-10-10T00:44:17.4266989Z * [new tag] v1.8.1-rc1 -> v1.8.1-rc1 2025-10-10T00:44:17.4268284Z * [new tag] v1.8.1-rc2 -> v1.8.1-rc2 2025-10-10T00:44:17.4269191Z * [new tag] v1.8.1-rc3 -> v1.8.1-rc3 2025-10-10T00:44:17.4271453Z * [new tag] v1.8.2 -> v1.8.2 2025-10-10T00:44:17.4272765Z * [new tag] v1.8.2-rc1 -> v1.8.2-rc1 2025-10-10T00:44:17.4274448Z * [new tag] v1.9.0 -> v1.9.0 2025-10-10T00:44:17.4275896Z * [new tag] v1.9.0-rc1 -> v1.9.0-rc1 2025-10-10T00:44:17.4277471Z * [new tag] v1.9.0-rc2 -> v1.9.0-rc2 2025-10-10T00:44:17.4278991Z * [new tag] v1.9.0-rc3 -> v1.9.0-rc3 2025-10-10T00:44:17.4280312Z * [new tag] v1.9.0-rc4 -> v1.9.0-rc4 2025-10-10T00:44:17.4281775Z * [new tag] v1.9.1 -> v1.9.1 2025-10-10T00:44:17.4283398Z * [new tag] v1.9.1-rc1 -> v1.9.1-rc1 2025-10-10T00:44:17.4284735Z * [new tag] v1.9.1-rc2 -> v1.9.1-rc2 2025-10-10T00:44:17.4286310Z * [new tag] v2.0.0 -> v2.0.0 2025-10-10T00:44:17.4287799Z * [new tag] v2.0.0-rc1 -> v2.0.0-rc1 2025-10-10T00:44:17.4289405Z * [new tag] v2.0.0-rc2 -> v2.0.0-rc2 2025-10-10T00:44:17.4290996Z * [new tag] v2.0.0-rc3 -> v2.0.0-rc3 2025-10-10T00:44:17.4292578Z * [new tag] v2.0.0-rc4 -> v2.0.0-rc4 2025-10-10T00:44:17.4294081Z * [new tag] 
v2.0.0-rc5 -> v2.0.0-rc5 2025-10-10T00:44:17.4295124Z * [new tag] v2.0.0-rc6 -> v2.0.0-rc6 2025-10-10T00:44:17.4296833Z * [new tag] v2.0.1 -> v2.0.1 2025-10-10T00:44:17.4298589Z * [new tag] v2.0.1-rc1 -> v2.0.1-rc1 2025-10-10T00:44:17.4302674Z * [new tag] v2.0.1-rc2 -> v2.0.1-rc2 2025-10-10T00:44:17.4304140Z * [new tag] v2.0.1-rc3 -> v2.0.1-rc3 2025-10-10T00:44:17.4305195Z * [new tag] v2.0.1-rc4 -> v2.0.1-rc4 2025-10-10T00:44:17.4307437Z * [new tag] v2.1.0 -> v2.1.0 2025-10-10T00:44:17.4309376Z * [new tag] v2.1.0-rc1 -> v2.1.0-rc1 2025-10-10T00:44:17.4310850Z * [new tag] v2.1.0-rc2 -> v2.1.0-rc2 2025-10-10T00:44:17.4312551Z * [new tag] v2.1.0-rc3 -> v2.1.0-rc3 2025-10-10T00:44:17.4314123Z * [new tag] v2.1.0-rc4 -> v2.1.0-rc4 2025-10-10T00:44:17.4315649Z * [new tag] v2.1.0-rc5 -> v2.1.0-rc5 2025-10-10T00:44:17.4316936Z * [new tag] v2.1.0-rc6 -> v2.1.0-rc6 2025-10-10T00:44:17.4318448Z * [new tag] v2.1.1 -> v2.1.1 2025-10-10T00:44:17.4320054Z * [new tag] v2.1.1-rc1 -> v2.1.1-rc1 2025-10-10T00:44:17.4321443Z * [new tag] v2.1.1-rc2 -> v2.1.1-rc2 2025-10-10T00:44:17.4323097Z * [new tag] v2.1.1-rc3 -> v2.1.1-rc3 2025-10-10T00:44:17.4324634Z * [new tag] v2.1.1-rc4 -> v2.1.1-rc4 2025-10-10T00:44:17.4326461Z * [new tag] v2.1.1-rc5 -> v2.1.1-rc5 2025-10-10T00:44:17.4327245Z * [new tag] v2.1.1-rc6 -> v2.1.1-rc6 2025-10-10T00:44:17.4329110Z * [new tag] v2.1.2 -> v2.1.2 2025-10-10T00:44:17.4330717Z * [new tag] v2.1.2-rc1 -> v2.1.2-rc1 2025-10-10T00:44:17.4332433Z * [new tag] v2.1.2-rc2 -> v2.1.2-rc2 2025-10-10T00:44:17.4333337Z * [new tag] v2.1.2-rc3 -> v2.1.2-rc3 2025-10-10T00:44:17.4335184Z * [new tag] v2.2.0 -> v2.2.0 2025-10-10T00:44:17.4336671Z * [new tag] v2.2.0-rc1 -> v2.2.0-rc1 2025-10-10T00:44:17.4338147Z * [new tag] v2.2.0-rc2 -> v2.2.0-rc2 2025-10-10T00:44:17.4339559Z * [new tag] v2.2.0-rc3 -> v2.2.0-rc3 2025-10-10T00:44:17.4341005Z * [new tag] v2.2.0-rc4 -> v2.2.0-rc4 2025-10-10T00:44:17.4342453Z * [new tag] v2.2.0-rc5 -> v2.2.0-rc5 2025-10-10T00:44:17.4344041Z * 
[new tag] v2.2.0-rc6 -> v2.2.0-rc6 2025-10-10T00:44:17.4345278Z * [new tag] v2.2.0-rc7 -> v2.2.0-rc7 2025-10-10T00:44:17.4346568Z * [new tag] v2.2.0-rc8 -> v2.2.0-rc8 2025-10-10T00:44:17.4348170Z * [new tag] v2.2.1 -> v2.2.1 2025-10-10T00:44:17.4349882Z * [new tag] v2.2.1-rc1 -> v2.2.1-rc1 2025-10-10T00:44:17.4350716Z * [new tag] v2.2.1-rc2 -> v2.2.1-rc2 2025-10-10T00:44:17.4352263Z * [new tag] v2.2.1-rc3 -> v2.2.1-rc3 2025-10-10T00:44:17.4353629Z * [new tag] v2.2.2 -> v2.2.2 2025-10-10T00:44:17.4355326Z * [new tag] v2.2.2-rc1 -> v2.2.2-rc1 2025-10-10T00:44:17.4356373Z * [new tag] v2.2.2-rc2 -> v2.2.2-rc2 2025-10-10T00:44:17.4357823Z * [new tag] v2.2.2-rc3 -> v2.2.2-rc3 2025-10-10T00:44:17.4359342Z * [new tag] v2.3.0 -> v2.3.0 2025-10-10T00:44:17.4360851Z * [new tag] v2.3.0-rc1 -> v2.3.0-rc1 2025-10-10T00:44:17.4362508Z * [new tag] v2.3.0-rc10 -> v2.3.0-rc10 2025-10-10T00:44:17.4364092Z * [new tag] v2.3.0-rc11 -> v2.3.0-rc11 2025-10-10T00:44:17.4365540Z * [new tag] v2.3.0-rc12 -> v2.3.0-rc12 2025-10-10T00:44:17.4366913Z * [new tag] v2.3.0-rc2 -> v2.3.0-rc2 2025-10-10T00:44:17.4373502Z * [new tag] v2.3.0-rc3 -> v2.3.0-rc3 2025-10-10T00:44:17.4373749Z * [new tag] v2.3.0-rc4 -> v2.3.0-rc4 2025-10-10T00:44:17.4374061Z * [new tag] v2.3.0-rc5 -> v2.3.0-rc5 2025-10-10T00:44:17.4374210Z * [new tag] v2.3.0-rc6 -> v2.3.0-rc6 2025-10-10T00:44:17.4374350Z * [new tag] v2.3.0-rc7 -> v2.3.0-rc7 2025-10-10T00:44:17.4375919Z * [new tag] v2.3.0-rc8 -> v2.3.0-rc8 2025-10-10T00:44:17.4377258Z * [new tag] v2.3.0-rc9 -> v2.3.0-rc9 2025-10-10T00:44:17.4378511Z * [new tag] v2.3.1 -> v2.3.1 2025-10-10T00:44:17.4379986Z * [new tag] v2.3.1-rc1 -> v2.3.1-rc1 2025-10-10T00:44:17.4381674Z * [new tag] v2.3.1-rc2 -> v2.3.1-rc2 2025-10-10T00:44:17.4382995Z * [new tag] v2.3.1-rc3 -> v2.3.1-rc3 2025-10-10T00:44:17.4384436Z * [new tag] v2.4.0 -> v2.4.0 2025-10-10T00:44:17.4385921Z * [new tag] v2.4.0-rc1 -> v2.4.0-rc1 2025-10-10T00:44:17.4387491Z * [new tag] v2.4.0-rc2 -> v2.4.0-rc2 
2025-10-10T00:44:17.4389012Z * [new tag] v2.4.0-rc3 -> v2.4.0-rc3 2025-10-10T00:44:17.4390476Z * [new tag] v2.4.0-rc4 -> v2.4.0-rc4 2025-10-10T00:44:17.4391909Z * [new tag] v2.4.0-rc5 -> v2.4.0-rc5 2025-10-10T00:44:17.4393647Z * [new tag] v2.4.0-rc6 -> v2.4.0-rc6 2025-10-10T00:44:17.4394956Z * [new tag] v2.4.0-rc7 -> v2.4.0-rc7 2025-10-10T00:44:17.4396365Z * [new tag] v2.4.0-rc8 -> v2.4.0-rc8 2025-10-10T00:44:17.4398099Z * [new tag] v2.4.0-rc9 -> v2.4.0-rc9 2025-10-10T00:44:17.4399144Z * [new tag] v2.4.1 -> v2.4.1 2025-10-10T00:44:17.4401335Z * [new tag] v2.4.1-rc1 -> v2.4.1-rc1 2025-10-10T00:44:17.4402769Z * [new tag] v2.4.1-rc2 -> v2.4.1-rc2 2025-10-10T00:44:17.4404791Z * [new tag] v2.4.1-rc3 -> v2.4.1-rc3 2025-10-10T00:44:17.4406186Z * [new tag] v2.5.0 -> v2.5.0 2025-10-10T00:44:17.4408036Z * [new tag] v2.5.0-rc1 -> v2.5.0-rc1 2025-10-10T00:44:17.4409337Z * [new tag] v2.5.0-rc10 -> v2.5.0-rc10 2025-10-10T00:44:17.4410816Z * [new tag] v2.5.0-rc2 -> v2.5.0-rc2 2025-10-10T00:44:17.4412185Z * [new tag] v2.5.0-rc3 -> v2.5.0-rc3 2025-10-10T00:44:17.4413744Z * [new tag] v2.5.0-rc4 -> v2.5.0-rc4 2025-10-10T00:44:17.4415539Z * [new tag] v2.5.0-rc5 -> v2.5.0-rc5 2025-10-10T00:44:17.4416855Z * [new tag] v2.5.0-rc6 -> v2.5.0-rc6 2025-10-10T00:44:17.4418314Z * [new tag] v2.5.0-rc7 -> v2.5.0-rc7 2025-10-10T00:44:17.4419866Z * [new tag] v2.5.0-rc8 -> v2.5.0-rc8 2025-10-10T00:44:17.4421516Z * [new tag] v2.5.0-rc9 -> v2.5.0-rc9 2025-10-10T00:44:17.4422878Z * [new tag] v2.5.1 -> v2.5.1 2025-10-10T00:44:17.4424300Z * [new tag] v2.5.1-rc1 -> v2.5.1-rc1 2025-10-10T00:44:17.4425599Z * [new tag] v2.6.0 -> v2.6.0 2025-10-10T00:44:17.4427320Z * [new tag] v2.6.0-rc1 -> v2.6.0-rc1 2025-10-10T00:44:17.4428713Z * [new tag] v2.6.0-rc2 -> v2.6.0-rc2 2025-10-10T00:44:17.4430469Z * [new tag] v2.6.0-rc3 -> v2.6.0-rc3 2025-10-10T00:44:17.4431739Z * [new tag] v2.6.0-rc4 -> v2.6.0-rc4 2025-10-10T00:44:17.4433764Z * [new tag] v2.6.0-rc5 -> v2.6.0-rc5 2025-10-10T00:44:17.4435186Z * [new tag] 
v2.6.0-rc6 -> v2.6.0-rc6 2025-10-10T00:44:17.4436695Z * [new tag] v2.6.0-rc7 -> v2.6.0-rc7 2025-10-10T00:44:17.4438623Z * [new tag] v2.6.0-rc8 -> v2.6.0-rc8 2025-10-10T00:44:17.4439890Z * [new tag] v2.6.0-rc9 -> v2.6.0-rc9 2025-10-10T00:44:17.4441548Z * [new tag] v2.7.0 -> v2.7.0 2025-10-10T00:44:17.4443065Z * [new tag] v2.7.0-rc1 -> v2.7.0-rc1 2025-10-10T00:44:17.4444411Z * [new tag] v2.7.0-rc10 -> v2.7.0-rc10 2025-10-10T00:44:17.4446283Z * [new tag] v2.7.0-rc2 -> v2.7.0-rc2 2025-10-10T00:44:17.4447979Z * [new tag] v2.7.0-rc3 -> v2.7.0-rc3 2025-10-10T00:44:17.4449319Z * [new tag] v2.7.0-rc4 -> v2.7.0-rc4 2025-10-10T00:44:17.4451046Z * [new tag] v2.7.0-rc5 -> v2.7.0-rc5 2025-10-10T00:44:17.4452407Z * [new tag] v2.7.0-rc6 -> v2.7.0-rc6 2025-10-10T00:44:17.4453836Z * [new tag] v2.7.0-rc7 -> v2.7.0-rc7 2025-10-10T00:44:17.4455385Z * [new tag] v2.7.0-rc8 -> v2.7.0-rc8 2025-10-10T00:44:17.4457071Z * [new tag] v2.7.0-rc9 -> v2.7.0-rc9 2025-10-10T00:44:17.4458259Z * [new tag] v2.7.1 -> v2.7.1 2025-10-10T00:44:17.4459799Z * [new tag] v2.7.1-rc1 -> v2.7.1-rc1 2025-10-10T00:44:17.4461450Z * [new tag] v2.7.1-rc2 -> v2.7.1-rc2 2025-10-10T00:44:17.4463068Z * [new tag] v2.7.1-rc3 -> v2.7.1-rc3 2025-10-10T00:44:17.4464658Z * [new tag] v2.7.1-rc4 -> v2.7.1-rc4 2025-10-10T00:44:17.4466202Z * [new tag] v2.7.1-rc5 -> v2.7.1-rc5 2025-10-10T00:44:17.4467566Z * [new tag] v2.8.0 -> v2.8.0 2025-10-10T00:44:17.4469107Z * [new tag] v2.8.0-rc1 -> v2.8.0-rc1 2025-10-10T00:44:17.4470696Z * [new tag] v2.8.0-rc2 -> v2.8.0-rc2 2025-10-10T00:44:17.4472312Z * [new tag] v2.8.0-rc3 -> v2.8.0-rc3 2025-10-10T00:44:17.4473948Z * [new tag] v2.8.0-rc4 -> v2.8.0-rc4 2025-10-10T00:44:17.4475517Z * [new tag] v2.8.0-rc5 -> v2.8.0-rc5 2025-10-10T00:44:17.4477049Z * [new tag] v2.8.0-rc6 -> v2.8.0-rc6 2025-10-10T00:44:17.4478612Z * [new tag] v2.8.0-rc7 -> v2.8.0-rc7 2025-10-10T00:44:17.4480135Z * [new tag] v2.8.0-rc8 -> v2.8.0-rc8 2025-10-10T00:44:17.4481685Z * [new tag] v2.9.0-rc1 -> v2.9.0-rc1 
2025-10-10T00:44:17.4483767Z * [new tag] v2.9.0-rc2 -> v2.9.0-rc2 2025-10-10T00:44:17.4485113Z * [new tag] v2.9.0-rc3 -> v2.9.0-rc3 2025-10-10T00:44:17.4486961Z * [new tag] v2.9.0-rc4 -> v2.9.0-rc4 2025-10-10T00:44:17.4488533Z * [new tag] v2.9.0-rc5 -> v2.9.0-rc5 2025-10-10T00:44:17.4490128Z * [new tag] v2.9.0-rc6 -> v2.9.0-rc6 2025-10-10T00:44:17.4491490Z * [new tag] v2.9.0-rc7 -> v2.9.0-rc7 2025-10-10T00:44:17.4493281Z * [new tag] v2.9.0-rc8 -> v2.9.0-rc8 2025-10-10T00:44:17.4494643Z * [new tag] v2.9.0-rc9 -> v2.9.0-rc9 2025-10-10T00:44:17.4496736Z * [new tag] viable/strict/1759343184 -> viable/strict/1759343184 2025-10-10T00:44:17.4498583Z * [new tag] viable/strict/1759346540 -> viable/strict/1759346540 2025-10-10T00:44:17.4499719Z * [new tag] viable/strict/1759348181 -> viable/strict/1759348181 2025-10-10T00:44:17.4501350Z * [new tag] viable/strict/1759350324 -> viable/strict/1759350324 2025-10-10T00:44:17.4502712Z * [new tag] viable/strict/1759351793 -> viable/strict/1759351793 2025-10-10T00:44:17.4504451Z * [new tag] viable/strict/1759353844 -> viable/strict/1759353844 2025-10-10T00:44:17.4505657Z * [new tag] viable/strict/1759355374 -> viable/strict/1759355374 2025-10-10T00:44:17.4506906Z * [new tag] viable/strict/1759357472 -> viable/strict/1759357472 2025-10-10T00:44:17.4508633Z * [new tag] viable/strict/1759361002 -> viable/strict/1759361002 2025-10-10T00:44:17.4510309Z * [new tag] viable/strict/1759362585 -> viable/strict/1759362585 2025-10-10T00:44:17.4512638Z * [new tag] viable/strict/1759365359 -> viable/strict/1759365359 2025-10-10T00:44:17.4513613Z * [new tag] viable/strict/1759370089 -> viable/strict/1759370089 2025-10-10T00:44:17.4515412Z * [new tag] viable/strict/1759377554 -> viable/strict/1759377554 2025-10-10T00:44:17.4516708Z * [new tag] viable/strict/1759379133 -> viable/strict/1759379133 2025-10-10T00:44:17.4518154Z * [new tag] viable/strict/1759389871 -> viable/strict/1759389871 2025-10-10T00:44:17.4519541Z * [new tag] 
viable/strict/1759393562 -> viable/strict/1759393562 2025-10-10T00:44:17.4520946Z * [new tag] viable/strict/1759395076 -> viable/strict/1759395076 2025-10-10T00:44:17.4522283Z * [new tag] viable/strict/1759398579 -> viable/strict/1759398579 2025-10-10T00:44:17.4523618Z * [new tag] viable/strict/1759404142 -> viable/strict/1759404142 2025-10-10T00:44:17.4524998Z * [new tag] viable/strict/1759405773 -> viable/strict/1759405773 2025-10-10T00:44:17.4526349Z * [new tag] viable/strict/1759408041 -> viable/strict/1759408041 2025-10-10T00:44:17.4528069Z * [new tag] viable/strict/1759411593 -> viable/strict/1759411593 2025-10-10T00:44:17.4529012Z * [new tag] viable/strict/1759427395 -> viable/strict/1759427395 2025-10-10T00:44:17.4530826Z * [new tag] viable/strict/1759434582 -> viable/strict/1759434582 2025-10-10T00:44:17.4531780Z * [new tag] viable/strict/1759436720 -> viable/strict/1759436720 2025-10-10T00:44:17.4533603Z * [new tag] viable/strict/1759440219 -> viable/strict/1759440219 2025-10-10T00:44:17.4534534Z * [new tag] viable/strict/1759441948 -> viable/strict/1759441948 2025-10-10T00:44:17.4536179Z * [new tag] viable/strict/1759443860 -> viable/strict/1759443860 2025-10-10T00:44:17.4537756Z * [new tag] viable/strict/1759445377 -> viable/strict/1759445377 2025-10-10T00:44:17.4538619Z * [new tag] viable/strict/1759447415 -> viable/strict/1759447415 2025-10-10T00:44:17.4540452Z * [new tag] viable/strict/1759451750 -> viable/strict/1759451750 2025-10-10T00:44:17.4541396Z * [new tag] viable/strict/1759453910 -> viable/strict/1759453910 2025-10-10T00:44:17.4543196Z * [new tag] viable/strict/1759456483 -> viable/strict/1759456483 2025-10-10T00:44:17.4544440Z * [new tag] viable/strict/1759459279 -> viable/strict/1759459279 2025-10-10T00:44:17.4545763Z * [new tag] viable/strict/1759460742 -> viable/strict/1759460742 2025-10-10T00:44:17.4547190Z * [new tag] viable/strict/1759462025 -> viable/strict/1759462025 2025-10-10T00:44:17.4548601Z * [new tag] viable/strict/1759469086 
-> viable/strict/1759469086 2025-10-10T00:44:17.4549691Z * [new tag] viable/strict/1759470581 -> viable/strict/1759470581 2025-10-10T00:44:17.4551389Z * [new tag] viable/strict/1759472786 -> viable/strict/1759472786 2025-10-10T00:44:17.4552704Z * [new tag] viable/strict/1759476294 -> viable/strict/1759476294 2025-10-10T00:44:17.4553984Z * [new tag] viable/strict/1759479963 -> viable/strict/1759479963 2025-10-10T00:44:17.4555346Z * [new tag] viable/strict/1759492177 -> viable/strict/1759492177 2025-10-10T00:44:17.4556718Z * [new tag] viable/strict/1759519278 -> viable/strict/1759519278 2025-10-10T00:44:17.4558147Z * [new tag] viable/strict/1759524580 -> viable/strict/1759524580 2025-10-10T00:44:17.4559665Z * [new tag] viable/strict/1759528193 -> viable/strict/1759528193 2025-10-10T00:44:17.4561349Z * [new tag] viable/strict/1759533797 -> viable/strict/1759533797 2025-10-10T00:44:17.4562679Z * [new tag] viable/strict/1759542780 -> viable/strict/1759542780 2025-10-10T00:44:17.4564101Z * [new tag] viable/strict/1759549779 -> viable/strict/1759549779 2025-10-10T00:44:17.4565344Z * [new tag] viable/strict/1759555455 -> viable/strict/1759555455 2025-10-10T00:44:17.4566681Z * [new tag] viable/strict/1759559176 -> viable/strict/1759559176 2025-10-10T00:44:17.4568844Z * [new tag] viable/strict/1759560629 -> viable/strict/1759560629 2025-10-10T00:44:17.4570352Z * [new tag] viable/strict/1759569848 -> viable/strict/1759569848 2025-10-10T00:44:17.4571620Z * [new tag] viable/strict/1759571382 -> viable/strict/1759571382 2025-10-10T00:44:17.4573351Z * [new tag] viable/strict/1759573474 -> viable/strict/1759573474 2025-10-10T00:44:17.4574186Z * [new tag] viable/strict/1759618187 -> viable/strict/1759618187 2025-10-10T00:44:17.4576077Z * [new tag] viable/strict/1759626742 -> viable/strict/1759626742 2025-10-10T00:44:17.4576903Z * [new tag] viable/strict/1759632427 -> viable/strict/1759632427 2025-10-10T00:44:17.4578753Z * [new tag] viable/strict/1759634971 -> 
viable/strict/1759634971 2025-10-10T00:44:17.4579696Z * [new tag] viable/strict/1759661382 -> viable/strict/1759661382 2025-10-10T00:44:17.4581366Z * [new tag] viable/strict/1759663294 -> viable/strict/1759663294 2025-10-10T00:44:17.4582922Z * [new tag] viable/strict/1759708178 -> viable/strict/1759708178 2025-10-10T00:44:17.4584114Z * [new tag] viable/strict/1759715695 -> viable/strict/1759715695 2025-10-10T00:44:17.4585452Z * [new tag] viable/strict/1759728293 -> viable/strict/1759728293 2025-10-10T00:44:17.4586866Z * [new tag] viable/strict/1759735513 -> viable/strict/1759735513 2025-10-10T00:44:17.4588318Z * [new tag] viable/strict/1759739177 -> viable/strict/1759739177 2025-10-10T00:44:17.4589709Z * [new tag] viable/strict/1759758635 -> viable/strict/1759758635 2025-10-10T00:44:17.4591016Z * [new tag] viable/strict/1759765784 -> viable/strict/1759765784 2025-10-10T00:44:17.4592358Z * [new tag] viable/strict/1759767948 -> viable/strict/1759767948 2025-10-10T00:44:17.4593863Z * [new tag] viable/strict/1759771461 -> viable/strict/1759771461 2025-10-10T00:44:17.4595161Z * [new tag] viable/strict/1759776706 -> viable/strict/1759776706 2025-10-10T00:44:17.4596571Z * [new tag] viable/strict/1759782317 -> viable/strict/1759782317 2025-10-10T00:44:17.4598225Z * [new tag] viable/strict/1759783777 -> viable/strict/1759783777 2025-10-10T00:44:17.4599198Z * [new tag] viable/strict/1759785815 -> viable/strict/1759785815 2025-10-10T00:44:17.4601101Z * [new tag] viable/strict/1759789459 -> viable/strict/1759789459 2025-10-10T00:44:17.4602371Z * [new tag] viable/strict/1759790974 -> viable/strict/1759790974 2025-10-10T00:44:17.4603803Z * [new tag] viable/strict/1759794583 -> viable/strict/1759794583 2025-10-10T00:44:17.4605160Z * [new tag] viable/strict/1759797408 -> viable/strict/1759797408 2025-10-10T00:44:17.4606487Z * [new tag] viable/strict/1759799518 -> viable/strict/1759799518 2025-10-10T00:44:17.4608151Z * [new tag] viable/strict/1759804909 -> viable/strict/1759804909 
2025-10-10T00:44:17.4609524Z * [new tag] viable/strict/1759807643 -> viable/strict/1759807643 2025-10-10T00:44:17.4610965Z * [new tag] viable/strict/1759809089 -> viable/strict/1759809089 2025-10-10T00:44:17.4612284Z * [new tag] viable/strict/1759811145 -> viable/strict/1759811145 2025-10-10T00:44:17.4613812Z * [new tag] viable/strict/1759812581 -> viable/strict/1759812581 2025-10-10T00:44:17.4615081Z * [new tag] viable/strict/1759814683 -> viable/strict/1759814683 2025-10-10T00:44:17.4616527Z * [new tag] viable/strict/1759821889 -> viable/strict/1759821889 2025-10-10T00:44:17.4617833Z * [new tag] viable/strict/1759823376 -> viable/strict/1759823376 2025-10-10T00:44:17.4619175Z * [new tag] viable/strict/1759827107 -> viable/strict/1759827107 2025-10-10T00:44:17.4620537Z * [new tag] viable/strict/1759830577 -> viable/strict/1759830577 2025-10-10T00:44:17.4622277Z * [new tag] viable/strict/1759832720 -> viable/strict/1759832720 2025-10-10T00:44:17.4623124Z * [new tag] viable/strict/1759842063 -> viable/strict/1759842063 2025-10-10T00:44:17.4624811Z * [new tag] viable/strict/1759847121 -> viable/strict/1759847121 2025-10-10T00:44:17.4626080Z * [new tag] viable/strict/1759850721 -> viable/strict/1759850721 2025-10-10T00:44:17.4627548Z * [new tag] viable/strict/1759857870 -> viable/strict/1759857870 2025-10-10T00:44:17.4628825Z * [new tag] viable/strict/1759863143 -> viable/strict/1759863143 2025-10-10T00:44:17.4630179Z * [new tag] viable/strict/1759875874 -> viable/strict/1759875874 2025-10-10T00:44:17.4631688Z * [new tag] viable/strict/1759877385 -> viable/strict/1759877385 2025-10-10T00:44:17.4632991Z * [new tag] viable/strict/1759883801 -> viable/strict/1759883801 2025-10-10T00:44:17.4634402Z * [new tag] viable/strict/1759885922 -> viable/strict/1759885922 2025-10-10T00:44:17.4635800Z * [new tag] viable/strict/1759888488 -> viable/strict/1759888488 2025-10-10T00:44:17.4637135Z * [new tag] viable/strict/1759895471 -> viable/strict/1759895471 
2025-10-10T00:44:17.4638520Z * [new tag] viable/strict/1759904803 -> viable/strict/1759904803 2025-10-10T00:44:17.4639953Z * [new tag] viable/strict/1759908300 -> viable/strict/1759908300 2025-10-10T00:44:17.4641306Z * [new tag] viable/strict/1759915520 -> viable/strict/1759915520 2025-10-10T00:44:17.4642719Z * [new tag] viable/strict/1759916978 -> viable/strict/1759916978 2025-10-10T00:44:17.4644206Z * [new tag] viable/strict/1759930024 -> viable/strict/1759930024 2025-10-10T00:44:17.4645570Z * [new tag] viable/strict/1759948122 -> viable/strict/1759948122 2025-10-10T00:44:17.4646912Z * [new tag] viable/strict/1759952983 -> viable/strict/1759952983 2025-10-10T00:44:17.4648731Z * [new tag] viable/strict/1759955121 -> viable/strict/1759955121 2025-10-10T00:44:17.4649679Z * [new tag] viable/strict/1759962298 -> viable/strict/1759962298 2025-10-10T00:44:17.4651274Z * [new tag] viable/strict/1759965837 -> viable/strict/1759965837 2025-10-10T00:44:17.4652887Z * [new tag] viable/strict/1759970213 -> viable/strict/1759970213 2025-10-10T00:44:17.4654417Z * [new tag] viable/strict/1759974894 -> viable/strict/1759974894 2025-10-10T00:44:17.4655283Z * [new tag] viable/strict/1759977763 -> viable/strict/1759977763 2025-10-10T00:44:17.4656966Z * [new tag] viable/strict/1759979241 -> viable/strict/1759979241 2025-10-10T00:44:17.4658830Z * [new tag] viable/strict/1759985417 -> viable/strict/1759985417 2025-10-10T00:44:17.4660172Z * [new tag] viable/strict/1759987490 -> viable/strict/1759987490 2025-10-10T00:44:17.4661603Z * [new tag] viable/strict/1759996180 -> viable/strict/1759996180 2025-10-10T00:44:17.4663021Z * [new tag] whc_flight_1 -> whc_flight_1 2025-10-10T00:44:17.4664781Z * [new tag] whc_flight_2 -> whc_flight_2 2025-10-10T00:44:17.4666692Z * [new tag] whc_flight_4 -> whc_flight_4 2025-10-10T00:44:17.5908398Z [command]/usr/bin/git rev-parse --verify --quiet 344e6365a0068c2d2847fcec0c55dd53291d475e^{object} 2025-10-10T00:44:17.5941930Z 
344e6365a0068c2d2847fcec0c55dd53291d475e 2025-10-10T00:44:17.5946126Z ##[endgroup] 2025-10-10T00:44:17.5946557Z ##[group]Determining the checkout info 2025-10-10T00:44:17.5947828Z ##[endgroup] 2025-10-10T00:44:17.5952610Z [command]/usr/bin/git sparse-checkout disable 2025-10-10T00:44:17.5999155Z [command]/usr/bin/git config --local --unset-all extensions.worktreeConfig 2025-10-10T00:44:17.6032754Z ##[group]Checking out the ref 2025-10-10T00:44:17.6036200Z [command]/usr/bin/git checkout --progress --force 344e6365a0068c2d2847fcec0c55dd53291d475e 2025-10-10T00:44:18.6325562Z Updating files: 68% (13654/19929) 2025-10-10T00:44:18.6464291Z Updating files: 69% (13752/19929) 2025-10-10T00:44:18.6807461Z Updating files: 70% (13951/19929) 2025-10-10T00:44:18.6926151Z Updating files: 71% (14150/19929) 2025-10-10T00:44:18.6996712Z Updating files: 72% (14349/19929) 2025-10-10T00:44:18.7189765Z Updating files: 73% (14549/19929) 2025-10-10T00:44:18.7491561Z Updating files: 74% (14748/19929) 2025-10-10T00:44:18.7938947Z Updating files: 75% (14947/19929) 2025-10-10T00:44:18.8198123Z Updating files: 76% (15147/19929) 2025-10-10T00:44:18.8358092Z Updating files: 77% (15346/19929) 2025-10-10T00:44:18.8526688Z Updating files: 78% (15545/19929) 2025-10-10T00:44:18.8855195Z Updating files: 79% (15744/19929) 2025-10-10T00:44:18.9163498Z Updating files: 80% (15944/19929) 2025-10-10T00:44:18.9449013Z Updating files: 81% (16143/19929) 2025-10-10T00:44:18.9732683Z Updating files: 82% (16342/19929) 2025-10-10T00:44:18.9928357Z Updating files: 83% (16542/19929) 2025-10-10T00:44:19.0094136Z Updating files: 84% (16741/19929) 2025-10-10T00:44:19.0288871Z Updating files: 85% (16940/19929) 2025-10-10T00:44:19.0476352Z Updating files: 86% (17139/19929) 2025-10-10T00:44:19.0648056Z Updating files: 87% (17339/19929) 2025-10-10T00:44:19.0806199Z Updating files: 88% (17538/19929) 2025-10-10T00:44:19.0971438Z Updating files: 89% (17737/19929) 2025-10-10T00:44:19.1178454Z Updating files: 90% (17937/19929) 
2025-10-10T00:44:19.1345029Z Updating files: 91% (18136/19929) 2025-10-10T00:44:19.1519155Z Updating files: 92% (18335/19929) 2025-10-10T00:44:19.1748587Z Updating files: 93% (18534/19929) 2025-10-10T00:44:19.1991395Z Updating files: 94% (18734/19929) 2025-10-10T00:44:19.2207276Z Updating files: 95% (18933/19929) 2025-10-10T00:44:19.2399211Z Updating files: 96% (19132/19929) 2025-10-10T00:44:19.2597822Z Updating files: 97% (19332/19929) 2025-10-10T00:44:19.2906226Z Updating files: 98% (19531/19929) 2025-10-10T00:44:19.3114104Z Updating files: 99% (19730/19929) 2025-10-10T00:44:19.3114739Z Updating files: 100% (19929/19929) 2025-10-10T00:44:19.3115393Z Updating files: 100% (19929/19929), done. 2025-10-10T00:44:19.3398261Z Note: switching to '344e6365a0068c2d2847fcec0c55dd53291d475e'. 2025-10-10T00:44:19.3398779Z 2025-10-10T00:44:19.3399089Z You are in 'detached HEAD' state. You can look around, make experimental 2025-10-10T00:44:19.3399622Z changes and commit them, and you can discard any commits you make in this 2025-10-10T00:44:19.3400143Z state without impacting any branches by switching back to a branch. 2025-10-10T00:44:19.3400443Z 2025-10-10T00:44:19.3400669Z If you want to create a new branch to retain commits you create, you may 2025-10-10T00:44:19.3401160Z do so (now or later) by using -c with the switch command. 
Example: 2025-10-10T00:44:19.3401435Z 2025-10-10T00:44:19.3401562Z git switch -c <new-branch-name> 2025-10-10T00:44:19.3401758Z 2025-10-10T00:44:19.3401871Z Or undo this operation with: 2025-10-10T00:44:19.3402055Z 2025-10-10T00:44:19.3402149Z git switch - 2025-10-10T00:44:19.3402287Z 2025-10-10T00:44:19.3402520Z Turn off this advice by setting config variable advice.detachedHead to false 2025-10-10T00:44:19.3402861Z 2025-10-10T00:44:19.3403973Z HEAD is now at 344e6365a00 [inductor][eazy] change how torch.use_deterministic_algorithms affect inductor (#164905) 2025-10-10T00:44:19.3583907Z ##[endgroup] 2025-10-10T00:44:19.3584382Z ##[group]Setting up auth for fetching submodules 2025-10-10T00:44:19.3590158Z [command]/usr/bin/git config --global http.https://github.com/.extraheader AUTHORIZATION: basic *** 2025-10-10T00:44:19.3650784Z [command]/usr/bin/git config --global --unset-all url.https://github.com/.insteadOf 2025-10-10T00:44:19.3691355Z [command]/usr/bin/git config --global --add url.https://github.com/.insteadOf git@github.com: 2025-10-10T00:44:19.3731654Z [command]/usr/bin/git config --global --add url.https://github.com/.insteadOf org-21003710@github.com: 2025-10-10T00:44:19.3765076Z ##[endgroup] 2025-10-10T00:44:19.3765495Z ##[group]Fetching submodules 2025-10-10T00:44:19.3769323Z [command]/usr/bin/git submodule sync --recursive 2025-10-10T00:44:19.4189606Z [command]/usr/bin/git -c protocol.version=2 submodule update --init --force --recursive 2025-10-10T00:44:19.4593865Z Submodule 'android/libs/fbjni' (https://github.com/facebookincubator/fbjni.git) registered for path 'android/libs/fbjni' 2025-10-10T00:44:19.4595710Z Submodule 'third_party/NNPACK_deps/FP16' (https://github.com/Maratyszcza/FP16.git) registered for path 'third_party/FP16' 2025-10-10T00:44:19.4600350Z Submodule 'third_party/NNPACK_deps/FXdiv' (https://github.com/Maratyszcza/FXdiv.git) registered for path 'third_party/FXdiv' 2025-10-10T00:44:19.4604342Z Submodule 'third_party/NNPACK' 
(https://github.com/Maratyszcza/NNPACK.git) registered for path 'third_party/NNPACK' 2025-10-10T00:44:19.4608519Z Submodule 'third_party/NVTX' (https://github.com/NVIDIA/NVTX.git) registered for path 'third_party/NVTX' 2025-10-10T00:44:19.4613254Z Submodule 'third_party/VulkanMemoryAllocator' (https://github.com/GPUOpen-LibrariesAndSDKs/VulkanMemoryAllocator.git) registered for path 'third_party/VulkanMemoryAllocator' 2025-10-10T00:44:19.4617336Z Submodule 'third_party/XNNPACK' (https://github.com/google/XNNPACK.git) registered for path 'third_party/XNNPACK' 2025-10-10T00:44:19.4621664Z Submodule 'third_party/aiter' (https://github.com/ROCm/aiter.git) registered for path 'third_party/aiter' 2025-10-10T00:44:19.4626308Z Submodule 'third_party/benchmark' (https://github.com/google/benchmark.git) registered for path 'third_party/benchmark' 2025-10-10T00:44:19.4630848Z Submodule 'third_party/composable_kernel' (https://github.com/ROCm/composable_kernel.git) registered for path 'third_party/composable_kernel' 2025-10-10T00:44:19.4635589Z Submodule 'third_party/cpp-httplib' (https://github.com/yhirose/cpp-httplib.git) registered for path 'third_party/cpp-httplib' 2025-10-10T00:44:19.4640184Z Submodule 'third_party/cpuinfo' (https://github.com/pytorch/cpuinfo.git) registered for path 'third_party/cpuinfo' 2025-10-10T00:44:19.4644963Z Submodule 'third_party/cudnn_frontend' (https://github.com/NVIDIA/cudnn-frontend.git) registered for path 'third_party/cudnn_frontend' 2025-10-10T00:44:19.4649910Z Submodule 'third_party/cutlass' (https://github.com/NVIDIA/cutlass.git) registered for path 'third_party/cutlass' 2025-10-10T00:44:19.4654932Z Submodule 'third_party/fbgemm' (https://github.com/pytorch/fbgemm) registered for path 'third_party/fbgemm' 2025-10-10T00:44:19.4661188Z Submodule 'third_party/flash-attention' (https://github.com/Dao-AILab/flash-attention.git) registered for path 'third_party/flash-attention' 2025-10-10T00:44:19.4669660Z Submodule 'third_party/flatbuffers' 
(https://github.com/google/flatbuffers.git) registered for path 'third_party/flatbuffers' 2025-10-10T00:44:19.4674679Z Submodule 'third_party/fmt' (https://github.com/fmtlib/fmt.git) registered for path 'third_party/fmt' 2025-10-10T00:44:19.4679970Z Submodule 'third_party/gemmlowp/gemmlowp' (https://github.com/google/gemmlowp.git) registered for path 'third_party/gemmlowp/gemmlowp' 2025-10-10T00:44:19.4685228Z Submodule 'third_party/gloo' (https://github.com/pytorch/gloo) registered for path 'third_party/gloo' 2025-10-10T00:44:19.4690969Z Submodule 'third_party/googletest' (https://github.com/google/googletest.git) registered for path 'third_party/googletest' 2025-10-10T00:44:19.4696139Z Submodule 'third_party/ideep' (https://github.com/intel/ideep) registered for path 'third_party/ideep' 2025-10-10T00:44:19.4702139Z Submodule 'third_party/ittapi' (https://github.com/intel/ittapi.git) registered for path 'third_party/ittapi' 2025-10-10T00:44:19.4707695Z Submodule 'third_party/kineto' (https://github.com/pytorch/kineto) registered for path 'third_party/kineto' 2025-10-10T00:44:19.4713497Z Submodule 'third_party/kleidiai' (https://github.com/ARM-software/kleidiai.git) registered for path 'third_party/kleidiai' 2025-10-10T00:44:19.4719285Z Submodule 'third_party/mimalloc' (https://github.com/microsoft/mimalloc.git) registered for path 'third_party/mimalloc' 2025-10-10T00:44:19.4725403Z Submodule 'third_party/nlohmann' (https://github.com/nlohmann/json.git) registered for path 'third_party/nlohmann' 2025-10-10T00:44:19.4731469Z Submodule 'third_party/onnx' (https://github.com/onnx/onnx.git) registered for path 'third_party/onnx' 2025-10-10T00:44:19.4737426Z Submodule 'third_party/opentelemetry-cpp' (https://github.com/open-telemetry/opentelemetry-cpp.git) registered for path 'third_party/opentelemetry-cpp' 2025-10-10T00:44:19.4743287Z Submodule 'third_party/pocketfft' (https://github.com/mreineck/pocketfft) registered for path 'third_party/pocketfft' 
2025-10-10T00:44:19.4749305Z Submodule 'third_party/protobuf' (https://github.com/protocolbuffers/protobuf.git) registered for path 'third_party/protobuf' 2025-10-10T00:44:19.4755525Z Submodule 'third_party/NNPACK_deps/psimd' (https://github.com/Maratyszcza/psimd.git) registered for path 'third_party/psimd' 2025-10-10T00:44:19.4761842Z Submodule 'third_party/NNPACK_deps/pthreadpool' (https://github.com/Maratyszcza/pthreadpool.git) registered for path 'third_party/pthreadpool' 2025-10-10T00:44:19.4771667Z Submodule 'third_party/pybind11' (https://github.com/pybind/pybind11.git) registered for path 'third_party/pybind11' 2025-10-10T00:44:19.4778161Z Submodule 'third_party/python-peachpy' (https://github.com/malfet/PeachPy.git) registered for path 'third_party/python-peachpy' 2025-10-10T00:44:19.4784624Z Submodule 'third_party/sleef' (https://github.com/shibatch/sleef) registered for path 'third_party/sleef' 2025-10-10T00:44:19.4791315Z Submodule 'third_party/tensorpipe' (https://github.com/pytorch/tensorpipe.git) registered for path 'third_party/tensorpipe' 2025-10-10T00:44:19.4833157Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/android/libs/fbjni'... 2025-10-10T00:44:19.7404474Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/FXdiv'... 2025-10-10T00:44:19.7405598Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/FP16'... 2025-10-10T00:44:19.7445468Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fmt'... 2025-10-10T00:44:22.8442174Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/NNPACK'... 2025-10-10T00:44:22.8443204Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/NVTX'... 2025-10-10T00:44:22.8444541Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/benchmark'... 
2025-10-10T00:44:22.8445649Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/flash-attention'... 2025-10-10T00:44:22.8446656Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/gloo'... 2025-10-10T00:44:22.8447832Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/gemmlowp/gemmlowp'... 2025-10-10T00:44:22.8448914Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/cpuinfo'... 2025-10-10T00:44:22.8449917Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/cpp-httplib'... 2025-10-10T00:44:22.8450934Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/ideep'... 2025-10-10T00:44:22.8452228Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/ittapi'... 2025-10-10T00:44:22.8453205Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kleidiai'... 2025-10-10T00:44:22.8454859Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/pocketfft'... 2025-10-10T00:44:22.8455907Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/cudnn_frontend'... 2025-10-10T00:44:22.8456930Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/psimd'... 2025-10-10T00:44:22.8457958Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/mimalloc'... 2025-10-10T00:44:22.8458978Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/googletest'... 2025-10-10T00:44:22.8460028Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/pthreadpool'... 2025-10-10T00:44:22.8461189Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/VulkanMemoryAllocator'... 2025-10-10T00:44:22.8489324Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/flatbuffers'... 
2025-10-10T00:44:23.4783436Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/python-peachpy'... 2025-10-10T00:44:23.4860995Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto'... 2025-10-10T00:44:23.4933359Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe'... 2025-10-10T00:44:23.5934896Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/XNNPACK'... 2025-10-10T00:44:40.9256971Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/sleef'... 2025-10-10T00:44:40.9257657Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/pybind11'... 2025-10-10T00:44:40.9258307Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/cutlass'... 2025-10-10T00:44:40.9258995Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fbgemm'... 2025-10-10T00:44:40.9259627Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/onnx'... 2025-10-10T00:44:40.9260307Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/composable_kernel'... 2025-10-10T00:44:40.9260995Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/aiter'... 2025-10-10T00:44:40.9261644Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/nlohmann'... 2025-10-10T00:44:40.9262342Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp'... 2025-10-10T00:44:40.9263040Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/protobuf'... 
2025-10-10T00:44:40.9473149Z Submodule path 'android/libs/fbjni': checked out '7e1e1fe3858c63c251c637ae41a20de425dde96f' 2025-10-10T00:44:40.9649866Z Submodule path 'third_party/FP16': checked out '4dfe081cf6bcd15db339cf2680b9281b8451eeb3' 2025-10-10T00:44:40.9791197Z Submodule path 'third_party/FXdiv': checked out 'b408327ac2a15ec3e43352421954f5b1967701d1' 2025-10-10T00:44:41.0147341Z Submodule path 'third_party/NNPACK': checked out 'c07e3a0400713d546e0dea2d5466dd22ea389c73' 2025-10-10T00:44:41.1125822Z Submodule path 'third_party/NVTX': checked out '2942f167cc30c5e3a44a2aecd5b0d9c07ff61a07' 2025-10-10T00:44:41.1791100Z Submodule path 'third_party/VulkanMemoryAllocator': checked out '1d8f600fd424278486eade7ed3e877c99f0846b1' 2025-10-10T00:44:42.1555497Z Submodule path 'third_party/XNNPACK': checked out '51a0103656eff6fc9bfd39a4597923c4b542c883' 2025-10-10T00:44:42.3558196Z Submodule path 'third_party/aiter': checked out '01aae101b9e5e94d6c16a9514c9fb8df99c93150' 2025-10-10T00:44:42.3587918Z Submodule '3rdparty/composable_kernel' (https://github.com/ROCm/composable_kernel.git) registered for path 'third_party/aiter/3rdparty/composable_kernel' 2025-10-10T00:44:42.3623245Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/aiter/3rdparty/composable_kernel'... 
2025-10-10T00:44:46.4933183Z Submodule path 'third_party/aiter/3rdparty/composable_kernel': checked out 'cffe8fa2a442ac8e80dd236a1a5d24fe3d7e0cbf' 2025-10-10T00:44:46.5257446Z Submodule path 'third_party/benchmark': checked out '299e5928955cc62af9968370293b916f5130916f' 2025-10-10T00:44:46.9837709Z Submodule path 'third_party/composable_kernel': checked out '7fe50dc3da2069d6645d9deb8c017a876472a977' 2025-10-10T00:44:47.0430403Z Submodule path 'third_party/cpp-httplib': checked out '89c932f313c6437c38f2982869beacc89c2f2246' 2025-10-10T00:44:47.1616870Z Submodule path 'third_party/cpuinfo': checked out '5e3d2445e6a84d9599bee2bf78edbb4d80865e1d' 2025-10-10T00:44:47.2186505Z Submodule path 'third_party/cudnn_frontend': checked out 'f937055efc6d414d11f4c6577e3977fe74f35fb6' 2025-10-10T00:44:48.0401667Z Submodule path 'third_party/cutlass': checked out 'f3fde58372d33e9a5650ba7b80fc48b3b49d40c8' 2025-10-10T00:44:48.2407527Z Submodule path 'third_party/fbgemm': checked out '3cefe0564a8c3de514a152d40a2b4770f2ee5be0' 2025-10-10T00:44:48.2437073Z Submodule 'external/asmjit' (https://github.com/asmjit/asmjit.git) registered for path 'third_party/fbgemm/external/asmjit' 2025-10-10T00:44:48.2440300Z Submodule 'external/composable_kernel' (https://github.com/ROCm/composable_kernel.git) registered for path 'third_party/fbgemm/external/composable_kernel' 2025-10-10T00:44:48.2444430Z Submodule 'external/cpuinfo' (https://github.com/pytorch/cpuinfo) registered for path 'third_party/fbgemm/external/cpuinfo' 2025-10-10T00:44:48.2447871Z Submodule 'external/cutlass' (https://github.com/jwfromm/cutlass) registered for path 'third_party/fbgemm/external/cutlass' 2025-10-10T00:44:48.2452114Z Submodule 'external/googletest' (https://github.com/google/googletest) registered for path 'third_party/fbgemm/external/googletest' 2025-10-10T00:44:48.2456334Z Submodule 'external/hipify_torch' (https://github.com/ROCmSoftwarePlatform/hipify_torch.git) registered for path 
'third_party/fbgemm/external/hipify_torch' 2025-10-10T00:44:48.2460436Z Submodule 'external/json' (https://github.com/nlohmann/json.git) registered for path 'third_party/fbgemm/external/json' 2025-10-10T00:44:48.2498250Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fbgemm/external/asmjit'... 2025-10-10T00:44:49.6642825Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fbgemm/external/hipify_torch'... 2025-10-10T00:44:49.6643801Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fbgemm/external/cpuinfo'... 2025-10-10T00:44:49.6644599Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fbgemm/external/googletest'... 2025-10-10T00:44:49.7644234Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fbgemm/external/composable_kernel'... 2025-10-10T00:44:52.4289403Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fbgemm/external/cutlass'... 2025-10-10T00:44:52.5290256Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fbgemm/external/json'... 
2025-10-10T00:44:55.7986202Z Submodule path 'third_party/fbgemm/external/asmjit': checked out 'a3199e8857792cd10b7589ff5d58343d2c9008ea' 2025-10-10T00:44:56.2612056Z Submodule path 'third_party/fbgemm/external/composable_kernel': checked out '7fe50dc3da2069d6645d9deb8c017a876472a977' 2025-10-10T00:44:56.3822336Z Submodule path 'third_party/fbgemm/external/cpuinfo': checked out '6543fec09b2f04ac4a666882998b534afc9c1349' 2025-10-10T00:44:57.1779719Z Submodule path 'third_party/fbgemm/external/cutlass': checked out '311f3c8e51dc0eb56310cfc6980bf63d0fbd7917' 2025-10-10T00:44:57.2346069Z Submodule path 'third_party/fbgemm/external/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723' 2025-10-10T00:44:57.2514239Z Submodule path 'third_party/fbgemm/external/hipify_torch': checked out '63b6a7b541fa7f08f8475ca7d74054db36ff2691' 2025-10-10T00:44:57.3856945Z Submodule path 'third_party/fbgemm/external/json': checked out '9cca280a4d0ccf0c08f47a99aa71d1b0e52f8d03' 2025-10-10T00:44:57.4809059Z Submodule path 'third_party/flash-attention': checked out '979702c87a8713a8e0a5e9fee122b90d2ef13be5' 2025-10-10T00:44:57.4838017Z Submodule 'csrc/composable_kernel' (https://github.com/ROCm/composable_kernel.git) registered for path 'third_party/flash-attention/csrc/composable_kernel' 2025-10-10T00:44:57.4840867Z Submodule 'csrc/cutlass' (https://github.com/NVIDIA/cutlass.git) registered for path 'third_party/flash-attention/csrc/cutlass' 2025-10-10T00:44:57.4876110Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/flash-attention/csrc/composable_kernel'... 2025-10-10T00:45:01.3021727Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/flash-attention/csrc/cutlass'... 
2025-10-10T00:45:01.6325644Z Submodule path 'third_party/flash-attention/csrc/composable_kernel': checked out '888317e698e9803c62bd38568abc9e05d7709f33' 2025-10-10T00:45:02.3546996Z Submodule path 'third_party/flash-attention/csrc/cutlass': checked out 'c506e16788cb08416a4a57e11a9067beeee29420' 2025-10-10T00:45:02.5412227Z Submodule path 'third_party/flatbuffers': checked out 'a2cd1ea3b6d3fee220106b5fed3f7ce8da9eb757' 2025-10-10T00:45:02.5790065Z Submodule path 'third_party/fmt': checked out 'e424e3f2e607da02742f73db84873b8084fc714c' 2025-10-10T00:45:02.6281453Z Submodule path 'third_party/gemmlowp/gemmlowp': checked out '3fb5c176c17c765a3492cd2f0321b0dab712f350' 2025-10-10T00:45:02.6644950Z Submodule path 'third_party/gloo': checked out '54cbae0d3a67fa890b4c3d9ee162b7860315e341' 2025-10-10T00:45:02.7206234Z Submodule path 'third_party/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723' 2025-10-10T00:45:02.7389728Z Submodule path 'third_party/ideep': checked out '719d8e6cd7f7a0e01b155657526d693acf97c2b3' 2025-10-10T00:45:02.7416248Z Submodule 'mkl-dnn' (https://github.com/intel/mkl-dnn.git) registered for path 'third_party/ideep/mkl-dnn' 2025-10-10T00:45:02.7450486Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/ideep/mkl-dnn'... 
2025-10-10T00:45:19.3390925Z Submodule path 'third_party/ideep/mkl-dnn': checked out '8d263e693366ef8db40acc569cc7d8edf644556d' 2025-10-10T00:45:19.3673964Z Submodule path 'third_party/ittapi': checked out 'dec1d23ca65ab069d225dfe40dea14f455170959' 2025-10-10T00:45:19.4692729Z Submodule path 'third_party/kineto': checked out '001ba8eb519438592f79dbc8e86a349f5f6c6829' 2025-10-10T00:45:19.4720714Z Submodule 'libkineto/third_party/dynolog' (https://github.com/facebookincubator/dynolog.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog' 2025-10-10T00:45:19.4723493Z Submodule 'libkineto/third_party/fmt' (https://github.com/fmtlib/fmt.git) registered for path 'third_party/kineto/libkineto/third_party/fmt' 2025-10-10T00:45:19.4727595Z Submodule 'libkineto/third_party/googletest' (https://github.com/google/googletest.git) registered for path 'third_party/kineto/libkineto/third_party/googletest' 2025-10-10T00:45:19.4764461Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog'... 2025-10-10T00:45:20.5464812Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/fmt'... 2025-10-10T00:45:21.0467613Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/googletest'... 
2025-10-10T00:45:21.1569477Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog': checked out 'd2ffe0a4e3acace628db49974246b66fc3e85fb1' 2025-10-10T00:45:21.1597296Z Submodule 'third_party/DCGM' (https://github.com/NVIDIA/DCGM.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-10-10T00:45:21.1601709Z Submodule 'third_party/cpr' (https://github.com/libcpr/cpr.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-10-10T00:45:21.1605638Z Submodule 'third_party/fmt' (https://github.com/fmtlib/fmt.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-10-10T00:45:21.1610356Z Submodule 'third_party/gflags' (https://github.com/gflags/gflags.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-10-10T00:45:21.1614322Z Submodule 'third_party/glog' (https://github.com/google/glog.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-10-10T00:45:21.1619258Z Submodule 'third_party/googletest' (https://github.com/google/googletest.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-10-10T00:45:21.1624368Z Submodule 'third_party/json' (https://github.com/nlohmann/json.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-10-10T00:45:21.1629776Z Submodule 'third_party/pfs' (https://github.com/dtrugman/pfs.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-10-10T00:45:21.1635137Z Submodule 'third_party/prometheus-cpp' (https://github.com/jupp0r/prometheus-cpp.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-10-10T00:45:21.1675266Z Cloning into 
'/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM'... 2025-10-10T00:45:22.7212726Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/pfs'... 2025-10-10T00:45:22.7213988Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/gflags'... 2025-10-10T00:45:22.7215061Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp'... 2025-10-10T00:45:22.7216152Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/cpr'... 2025-10-10T00:45:22.7217377Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/glog'... 2025-10-10T00:45:22.7218830Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/googletest'... 2025-10-10T00:45:22.7286605Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/fmt'... 2025-10-10T00:45:22.8288946Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/json'... 
2025-10-10T00:45:29.0648057Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM': checked out 'ffde4e54bc7249a6039a5e6b45b395141e1217f9' 2025-10-10T00:45:29.0905837Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr': checked out '871ed52d350214a034f6ef8a3b8f51c5ce1bd400' 2025-10-10T00:45:29.1369311Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt': checked out 'cd4af11efc9c622896a3e4cb599fa28668ca3d05' 2025-10-10T00:45:29.1559311Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags': checked out 'e171aa2d15ed9eb17054558e0b3a6a413bb01067' 2025-10-10T00:45:29.1584235Z Submodule 'doc' (https://github.com/gflags/gflags.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-10-10T00:45:29.1625310Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc'... 
2025-10-10T00:45:29.4573268Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc': checked out '8411df715cf522606e3b1aca386ddfc0b63d34b4' 2025-10-10T00:45:29.4836589Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog': checked out 'b33e3bad4c46c8a6345525fd822af355e5ef9446' 2025-10-10T00:45:29.5403868Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723' 2025-10-10T00:45:29.6667327Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/json': checked out '4f8fba14066156b73f1189a2b8bd568bde5284c5' 2025-10-10T00:45:29.6900643Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs': checked out 'f68a2fa8ea36c783bdd760371411fcb495aa3150' 2025-10-10T00:45:29.7147844Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp': checked out 'b1234816facfdda29845c46696a02998a4af115a' 2025-10-10T00:45:29.7174084Z Submodule 'civetweb' (https://github.com/civetweb/civetweb.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-10-10T00:45:29.7178065Z Submodule 'googletest' (https://github.com/google/googletest.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-10-10T00:45:29.7214000Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb'... 2025-10-10T00:45:31.8364594Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest'... 
2025-10-10T00:45:32.1288430Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb': checked out 'd7ba35bbb649209c66e582d5a0244ba988a15159' 2025-10-10T00:45:32.1870426Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest': checked out 'e2239ee6043f73722e7aa812a459f54a28552929' 2025-10-10T00:45:32.2278476Z Submodule path 'third_party/kineto/libkineto/third_party/fmt': checked out '40626af88bd7df9a5fb80be7b25ac85b122d6c21' 2025-10-10T00:45:32.2843774Z Submodule path 'third_party/kineto/libkineto/third_party/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723' 2025-10-10T00:45:32.3426333Z Submodule path 'third_party/kleidiai': checked out 'cca02c2f69dd18e1f12647c1c0bdc8cf90e680c7' 2025-10-10T00:45:32.3936879Z Submodule path 'third_party/mimalloc': checked out 'fbd8b99c2b828428947d70fdc046bb55609be93e' 2025-10-10T00:45:32.5376738Z Submodule path 'third_party/nlohmann': checked out '55f93686c01528224f448c19128836e7df245f72' 2025-10-10T00:45:33.1593545Z Submodule path 'third_party/onnx': checked out 'e709452ef2bbc1d113faf678c24e6d3467696e83' 2025-10-10T00:45:33.1640635Z Submodule 'third_party/pybind11' (https://github.com/pybind/pybind11.git) registered for path 'third_party/onnx/third_party/pybind11' 2025-10-10T00:45:33.1673655Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/onnx/third_party/pybind11'... 
2025-10-10T00:45:34.2740708Z Submodule path 'third_party/onnx/third_party/pybind11': checked out 'a2e59f0e7065404b44dfe92a28aca47ba1378dc4' 2025-10-10T00:45:34.3744210Z Submodule path 'third_party/opentelemetry-cpp': checked out 'a799f4aed9c94b765dcdaabaeab7d5e7e2310878' 2025-10-10T00:45:34.3773039Z Submodule 'third_party/benchmark' (https://github.com/google/benchmark) registered for path 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-10-10T00:45:34.3774992Z Submodule 'third_party/googletest' (https://github.com/google/googletest) registered for path 'third_party/opentelemetry-cpp/third_party/googletest' 2025-10-10T00:45:34.3779918Z Submodule 'third_party/ms-gsl' (https://github.com/microsoft/GSL) registered for path 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-10-10T00:45:34.3784004Z Submodule 'third_party/nlohmann-json' (https://github.com/nlohmann/json) registered for path 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-10-10T00:45:34.3788576Z Submodule 'third_party/opentelemetry-proto' (https://github.com/open-telemetry/opentelemetry-proto) registered for path 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-10-10T00:45:34.3792556Z Submodule 'third_party/opentracing-cpp' (https://github.com/opentracing/opentracing-cpp.git) registered for path 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-10-10T00:45:34.3796908Z Submodule 'third_party/prometheus-cpp' (https://github.com/jupp0r/prometheus-cpp) registered for path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-10-10T00:45:34.3801766Z Submodule 'tools/vcpkg' (https://github.com/Microsoft/vcpkg) registered for path 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-10-10T00:45:34.3837475Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/benchmark'... 
2025-10-10T00:45:34.8442047Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/opentracing-cpp'... 2025-10-10T00:45:34.8443237Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/opentelemetry-proto'... 2025-10-10T00:45:34.8444390Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/ms-gsl'... 2025-10-10T00:45:34.8445503Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/prometheus-cpp'... 2025-10-10T00:45:34.9443202Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/googletest'... 2025-10-10T00:45:35.6959424Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/nlohmann-json'... 2025-10-10T00:45:42.4839796Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp/tools/vcpkg'... 
2025-10-10T00:45:43.1963249Z Submodule path 'third_party/opentelemetry-cpp/third_party/benchmark': checked out 'd572f4777349d43653b21d6c2fc63020ab326db2' 2025-10-10T00:45:43.2481275Z Submodule path 'third_party/opentelemetry-cpp/third_party/googletest': checked out 'b796f7d44681514f58a683a3a71ff17c94edb0c1' 2025-10-10T00:45:43.2708211Z Submodule path 'third_party/opentelemetry-cpp/third_party/ms-gsl': checked out '6f4529395c5b7c2d661812257cd6780c67e54afa' 2025-10-10T00:45:43.4066192Z Submodule path 'third_party/opentelemetry-cpp/third_party/nlohmann-json': checked out 'bc889afb4c5bf1c0d8ee29ef35eaaf4c8bef8a5d' 2025-10-10T00:45:43.4261248Z Submodule path 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto': checked out '4ca4f0335c63cda7ab31ea7ed70d6553aee14dce' 2025-10-10T00:45:43.4472996Z Submodule path 'third_party/opentelemetry-cpp/third_party/opentracing-cpp': checked out '06b57f48ded1fa3bdd3d4346f6ef29e40e08eaf5' 2025-10-10T00:45:43.4703982Z Submodule path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp': checked out 'c9ffcdda9086ffd9e1283ea7a0276d831f3c8a8d' 2025-10-10T00:45:43.4726388Z Submodule 'civetweb' (https://github.com/civetweb/civetweb.git) registered for path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-10-10T00:45:43.4730407Z Submodule 'googletest' (https://github.com/google/googletest.git) registered for path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-10-10T00:45:43.4764746Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb'... 2025-10-10T00:45:45.6449748Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest'... 
2025-10-10T00:45:45.9352111Z Submodule path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb': checked out 'eefb26f82b233268fc98577d265352720d477ba4' 2025-10-10T00:45:45.9926890Z Submodule path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest': checked out 'e2239ee6043f73722e7aa812a459f54a28552929' 2025-10-10T00:45:46.7089370Z Submodule path 'third_party/opentelemetry-cpp/tools/vcpkg': checked out '8eb57355a4ffb410a2e94c07b4dca2dffbee8e50' 2025-10-10T00:45:46.7255290Z Submodule path 'third_party/pocketfft': checked out '0fa0ef591e38c2758e3184c6c23e497b9f732ffa' 2025-10-10T00:45:47.0648452Z Submodule path 'third_party/protobuf': checked out 'd1eca4e4b421cd2997495c4b4e65cea6be4e9b8a' 2025-10-10T00:45:47.0679985Z Submodule 'third_party/benchmark' (https://github.com/google/benchmark.git) registered for path 'third_party/protobuf/third_party/benchmark' 2025-10-10T00:45:47.0682913Z Submodule 'third_party/googletest' (https://github.com/google/googletest.git) registered for path 'third_party/protobuf/third_party/googletest' 2025-10-10T00:45:47.0718369Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/protobuf/third_party/benchmark'... 2025-10-10T00:45:47.6388338Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/protobuf/third_party/googletest'... 
2025-10-10T00:45:48.1621758Z Submodule path 'third_party/protobuf/third_party/benchmark': checked out '5b7683f49e1e9223cf9927b24f6fd3d6bd82e3f8' 2025-10-10T00:45:48.2485630Z Submodule path 'third_party/protobuf/third_party/googletest': checked out '5ec7f0c4a113e2f18ac2c6cc7df51ad6afc24081' 2025-10-10T00:45:48.2629632Z Submodule path 'third_party/psimd': checked out '072586a71b55b7f8c584153d223e95687148a900' 2025-10-10T00:45:48.2804204Z Submodule path 'third_party/pthreadpool': checked out '4fe0e1e183925bf8cfa6aae24237e724a96479b8' 2025-10-10T00:45:48.3333500Z Submodule path 'third_party/pybind11': checked out 'f5fbe867d2d26e4a0a9177a51f6e568868ad3dc8' 2025-10-10T00:45:48.3707796Z Submodule path 'third_party/python-peachpy': checked out 'f45429b087dd7d5bc78bb40dc7cf06425c252d67' 2025-10-10T00:45:48.4253404Z Submodule path 'third_party/sleef': checked out '5a1d179df9cf652951b59010a2d2075372d67f68' 2025-10-10T00:45:48.4631547Z Submodule path 'third_party/tensorpipe': checked out 'af0118d13e52f5a08841464a768e01a0bf3e3075' 2025-10-10T00:45:48.4658094Z Submodule 'third_party/googletest' (https://github.com/google/googletest.git) registered for path 'third_party/tensorpipe/third_party/googletest' 2025-10-10T00:45:48.4660864Z Submodule 'third_party/libnop' (https://github.com/google/libnop.git) registered for path 'third_party/tensorpipe/third_party/libnop' 2025-10-10T00:45:48.4664768Z Submodule 'third_party/libuv' (https://github.com/libuv/libuv.git) registered for path 'third_party/tensorpipe/third_party/libuv' 2025-10-10T00:45:48.4668821Z Submodule 'third_party/pybind11' (https://github.com/pybind/pybind11.git) registered for path 'third_party/tensorpipe/third_party/pybind11' 2025-10-10T00:45:48.4704631Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe/third_party/googletest'... 2025-10-10T00:45:49.7379574Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe/third_party/libnop'... 
2025-10-10T00:45:49.7380943Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe/third_party/pybind11'... 2025-10-10T00:45:49.8381024Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe/third_party/libuv'... 2025-10-10T00:45:49.9964694Z Submodule path 'third_party/tensorpipe/third_party/googletest': checked out 'aee0f9d9b5b87796ee8a0ab26b7587ec30e8858e' 2025-10-10T00:45:50.0186369Z Submodule path 'third_party/tensorpipe/third_party/libnop': checked out '910b55815be16109f04f4180e9adee14fb4ce281' 2025-10-10T00:45:50.1095495Z Submodule path 'third_party/tensorpipe/third_party/libuv': checked out '5152db2cbfeb5582e9c27c5ea1dba2cd9e10759b' 2025-10-10T00:45:50.1476406Z Submodule path 'third_party/tensorpipe/third_party/pybind11': checked out 'a23996fce38ff6ccfbcdc09f1e63f2c4be5ea2ef' 2025-10-10T00:45:50.1500766Z Submodule 'tools/clang' (https://github.com/wjakob/clang-cindex-python3) registered for path 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-10-10T00:45:50.1539037Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe/third_party/pybind11/tools/clang'... 
2025-10-10T00:45:50.3652749Z Submodule path 'third_party/tensorpipe/third_party/pybind11/tools/clang': checked out '6a00cbc4a9b8e68b71caf7f774b3f9c753ae84d5' 2025-10-10T00:45:50.3709666Z [command]/usr/bin/git submodule foreach --recursive git config --local gc.auto 0 2025-10-10T00:45:50.4131819Z Entering 'android/libs/fbjni' 2025-10-10T00:45:50.4195579Z Entering 'third_party/FP16' 2025-10-10T00:45:50.4258717Z Entering 'third_party/FXdiv' 2025-10-10T00:45:50.4318844Z Entering 'third_party/NNPACK' 2025-10-10T00:45:50.4379964Z Entering 'third_party/NVTX' 2025-10-10T00:45:50.4441914Z Entering 'third_party/VulkanMemoryAllocator' 2025-10-10T00:45:50.4508597Z Entering 'third_party/XNNPACK' 2025-10-10T00:45:50.4584071Z Entering 'third_party/aiter' 2025-10-10T00:45:50.4651855Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-10-10T00:45:50.4723205Z Entering 'third_party/benchmark' 2025-10-10T00:45:50.4785677Z Entering 'third_party/composable_kernel' 2025-10-10T00:45:50.4857492Z Entering 'third_party/cpp-httplib' 2025-10-10T00:45:50.4920575Z Entering 'third_party/cpuinfo' 2025-10-10T00:45:50.4982272Z Entering 'third_party/cudnn_frontend' 2025-10-10T00:45:50.5044273Z Entering 'third_party/cutlass' 2025-10-10T00:45:50.5118218Z Entering 'third_party/fbgemm' 2025-10-10T00:45:50.5181304Z Entering 'third_party/fbgemm/external/asmjit' 2025-10-10T00:45:50.5241107Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-10-10T00:45:50.5314408Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-10-10T00:45:50.5377373Z Entering 'third_party/fbgemm/external/cutlass' 2025-10-10T00:45:50.5447381Z Entering 'third_party/fbgemm/external/googletest' 2025-10-10T00:45:50.5505968Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-10-10T00:45:50.5564048Z Entering 'third_party/fbgemm/external/json' 2025-10-10T00:45:50.5628917Z Entering 'third_party/flash-attention' 2025-10-10T00:45:50.5688209Z Entering 'third_party/flash-attention/csrc/composable_kernel' 
2025-10-10T00:45:50.5754533Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-10-10T00:45:50.5826198Z Entering 'third_party/flatbuffers' 2025-10-10T00:45:50.5889441Z Entering 'third_party/fmt' 2025-10-10T00:45:50.5958606Z Entering 'third_party/gemmlowp/gemmlowp' 2025-10-10T00:45:50.6027596Z Entering 'third_party/gloo' 2025-10-10T00:45:50.6088780Z Entering 'third_party/googletest' 2025-10-10T00:45:50.6150813Z Entering 'third_party/ideep' 2025-10-10T00:45:50.6213382Z Entering 'third_party/ideep/mkl-dnn' 2025-10-10T00:45:50.6282404Z Entering 'third_party/ittapi' 2025-10-10T00:45:50.6346836Z Entering 'third_party/kineto' 2025-10-10T00:45:50.6408909Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-10-10T00:45:50.6467738Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-10-10T00:45:50.6534852Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-10-10T00:45:50.6594546Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-10-10T00:45:50.6654921Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-10-10T00:45:50.6716536Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-10-10T00:45:50.6779957Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-10-10T00:45:50.6838712Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-10-10T00:45:50.6908993Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-10-10T00:45:50.6968785Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-10-10T00:45:50.7034308Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-10-10T00:45:50.7092053Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-10-10T00:45:50.7156680Z Entering 
'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-10-10T00:45:50.7225051Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-10-10T00:45:50.7283241Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-10-10T00:45:50.7346119Z Entering 'third_party/kleidiai' 2025-10-10T00:45:50.7407545Z Entering 'third_party/mimalloc' 2025-10-10T00:45:50.7468836Z Entering 'third_party/nlohmann' 2025-10-10T00:45:50.7532917Z Entering 'third_party/onnx' 2025-10-10T00:45:50.7612874Z Entering 'third_party/onnx/third_party/pybind11' 2025-10-10T00:45:50.7681097Z Entering 'third_party/opentelemetry-cpp' 2025-10-10T00:45:50.7745919Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-10-10T00:45:50.7805042Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-10-10T00:45:50.7862785Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-10-10T00:45:50.7921254Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-10-10T00:45:50.7980662Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-10-10T00:45:50.8038447Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-10-10T00:45:50.8095586Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-10-10T00:45:50.8152077Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-10-10T00:45:50.8216660Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-10-10T00:45:50.8279658Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-10-10T00:45:50.8367301Z Entering 'third_party/pocketfft' 2025-10-10T00:45:50.8429068Z Entering 'third_party/protobuf' 2025-10-10T00:45:50.8490066Z Entering 'third_party/protobuf/third_party/benchmark' 2025-10-10T00:45:50.8550137Z Entering 'third_party/protobuf/third_party/googletest' 2025-10-10T00:45:50.8611904Z Entering 'third_party/psimd' 
2025-10-10T00:45:50.8673296Z Entering 'third_party/pthreadpool' 2025-10-10T00:45:50.8734569Z Entering 'third_party/pybind11' 2025-10-10T00:45:50.8795739Z Entering 'third_party/python-peachpy' 2025-10-10T00:45:50.8855742Z Entering 'third_party/sleef' 2025-10-10T00:45:50.8916287Z Entering 'third_party/tensorpipe' 2025-10-10T00:45:50.8976148Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-10-10T00:45:50.9034032Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-10-10T00:45:50.9093965Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-10-10T00:45:50.9155934Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-10-10T00:45:50.9213603Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-10-10T00:45:50.9310101Z ##[endgroup] 2025-10-10T00:45:50.9310573Z ##[group]Persisting credentials for submodules 2025-10-10T00:45:50.9317725Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'url\.https\:\/\/github\.com\/\.insteadOf' && git config --local --unset-all 'url.https://github.com/.insteadOf' || :" 2025-10-10T00:45:50.9724781Z Entering 'android/libs/fbjni' 2025-10-10T00:45:50.9810045Z Entering 'third_party/FP16' 2025-10-10T00:45:50.9890748Z Entering 'third_party/FXdiv' 2025-10-10T00:45:50.9971753Z Entering 'third_party/NNPACK' 2025-10-10T00:45:51.0052312Z Entering 'third_party/NVTX' 2025-10-10T00:45:51.0134296Z Entering 'third_party/VulkanMemoryAllocator' 2025-10-10T00:45:51.0216722Z Entering 'third_party/XNNPACK' 2025-10-10T00:45:51.0311573Z Entering 'third_party/aiter' 2025-10-10T00:45:51.0391564Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-10-10T00:45:51.0486060Z Entering 'third_party/benchmark' 2025-10-10T00:45:51.0566678Z Entering 'third_party/composable_kernel' 2025-10-10T00:45:51.0659685Z Entering 'third_party/cpp-httplib' 2025-10-10T00:45:51.0739967Z Entering 'third_party/cpuinfo' 2025-10-10T00:45:51.0821574Z Entering 
'third_party/cudnn_frontend' 2025-10-10T00:45:51.0903889Z Entering 'third_party/cutlass' 2025-10-10T00:45:51.0994165Z Entering 'third_party/fbgemm' 2025-10-10T00:45:51.1074951Z Entering 'third_party/fbgemm/external/asmjit' 2025-10-10T00:45:51.1153085Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-10-10T00:45:51.1239774Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-10-10T00:45:51.1319044Z Entering 'third_party/fbgemm/external/cutlass' 2025-10-10T00:45:51.1405666Z Entering 'third_party/fbgemm/external/googletest' 2025-10-10T00:45:51.1483140Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-10-10T00:45:51.1563426Z Entering 'third_party/fbgemm/external/json' 2025-10-10T00:45:51.1647247Z Entering 'third_party/flash-attention' 2025-10-10T00:45:51.1725316Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-10-10T00:45:51.1809660Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-10-10T00:45:51.1902681Z Entering 'third_party/flatbuffers' 2025-10-10T00:45:51.1984647Z Entering 'third_party/fmt' 2025-10-10T00:45:51.2064502Z Entering 'third_party/gemmlowp/gemmlowp' 2025-10-10T00:45:51.2143844Z Entering 'third_party/gloo' 2025-10-10T00:45:51.2223939Z Entering 'third_party/googletest' 2025-10-10T00:45:51.2306166Z Entering 'third_party/ideep' 2025-10-10T00:45:51.2385293Z Entering 'third_party/ideep/mkl-dnn' 2025-10-10T00:45:51.2472888Z Entering 'third_party/ittapi' 2025-10-10T00:45:51.2553922Z Entering 'third_party/kineto' 2025-10-10T00:45:51.2632851Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-10-10T00:45:51.2708678Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-10-10T00:45:51.2788603Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-10-10T00:45:51.2868238Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-10-10T00:45:51.2948440Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 
2025-10-10T00:45:51.3024871Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-10-10T00:45:51.3107370Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-10-10T00:45:51.3185368Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-10-10T00:45:51.3265871Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-10-10T00:45:51.3345402Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-10-10T00:45:51.3422266Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-10-10T00:45:51.3499581Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-10-10T00:45:51.3583916Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-10-10T00:45:51.3670911Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-10-10T00:45:51.3748649Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-10-10T00:45:51.3830200Z Entering 'third_party/kleidiai' 2025-10-10T00:45:51.3911621Z Entering 'third_party/mimalloc' 2025-10-10T00:45:51.3991460Z Entering 'third_party/nlohmann' 2025-10-10T00:45:51.4074638Z Entering 'third_party/onnx' 2025-10-10T00:45:51.4170068Z Entering 'third_party/onnx/third_party/pybind11' 2025-10-10T00:45:51.4257829Z Entering 'third_party/opentelemetry-cpp' 2025-10-10T00:45:51.4337178Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-10-10T00:45:51.4413986Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-10-10T00:45:51.4489835Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-10-10T00:45:51.4565940Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-10-10T00:45:51.4645788Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-10-10T00:45:51.4725953Z 
Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-10-10T00:45:51.4801669Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-10-10T00:45:51.4878354Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-10-10T00:45:51.4959729Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-10-10T00:45:51.5039422Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-10-10T00:45:51.5144715Z Entering 'third_party/pocketfft' 2025-10-10T00:45:51.5224599Z Entering 'third_party/protobuf' 2025-10-10T00:45:51.5306453Z Entering 'third_party/protobuf/third_party/benchmark' 2025-10-10T00:45:51.5382621Z Entering 'third_party/protobuf/third_party/googletest' 2025-10-10T00:45:51.5467630Z Entering 'third_party/psimd' 2025-10-10T00:45:51.5549830Z Entering 'third_party/pthreadpool' 2025-10-10T00:45:51.5632040Z Entering 'third_party/pybind11' 2025-10-10T00:45:51.5713938Z Entering 'third_party/python-peachpy' 2025-10-10T00:45:51.5796472Z Entering 'third_party/sleef' 2025-10-10T00:45:51.5876398Z Entering 'third_party/tensorpipe' 2025-10-10T00:45:51.5955137Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-10-10T00:45:51.6032484Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-10-10T00:45:51.6110241Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-10-10T00:45:51.6186433Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-10-10T00:45:51.6263901Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-10-10T00:45:51.6369755Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local 'http.https://github.com/.extraheader' 'AUTHORIZATION: basic ***' && git config --local --show-origin --name-only --get-regexp remote.origin.url" 2025-10-10T00:45:51.6783847Z Entering 'android/libs/fbjni' 2025-10-10T00:45:51.6858113Z 
file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config remote.origin.url 2025-10-10T00:45:51.6883143Z Entering 'third_party/FP16' 2025-10-10T00:45:51.6962879Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config remote.origin.url 2025-10-10T00:45:51.6990594Z Entering 'third_party/FXdiv' 2025-10-10T00:45:51.7067842Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config remote.origin.url 2025-10-10T00:45:51.7094233Z Entering 'third_party/NNPACK' 2025-10-10T00:45:51.7167926Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config remote.origin.url 2025-10-10T00:45:51.7192809Z Entering 'third_party/NVTX' 2025-10-10T00:45:51.7271483Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NVTX/config remote.origin.url 2025-10-10T00:45:51.7296872Z Entering 'third_party/VulkanMemoryAllocator' 2025-10-10T00:45:51.7370459Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config remote.origin.url 2025-10-10T00:45:51.7395366Z Entering 'third_party/XNNPACK' 2025-10-10T00:45:51.7467655Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config remote.origin.url 2025-10-10T00:45:51.7511145Z Entering 'third_party/aiter' 2025-10-10T00:45:51.7583321Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/config remote.origin.url 2025-10-10T00:45:51.7611497Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-10-10T00:45:51.7684189Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/modules/3rdparty/composable_kernel/config remote.origin.url 2025-10-10T00:45:51.7720713Z Entering 'third_party/benchmark' 2025-10-10T00:45:51.7795618Z 
file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config remote.origin.url 2025-10-10T00:45:51.7822247Z Entering 'third_party/composable_kernel' 2025-10-10T00:45:51.7895694Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/composable_kernel/config remote.origin.url 2025-10-10T00:45:51.7928631Z Entering 'third_party/cpp-httplib' 2025-10-10T00:45:51.8003861Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/cpp-httplib/config remote.origin.url 2025-10-10T00:45:51.8030122Z Entering 'third_party/cpuinfo' 2025-10-10T00:45:51.8103991Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config remote.origin.url 2025-10-10T00:45:51.8131237Z Entering 'third_party/cudnn_frontend' 2025-10-10T00:45:51.8203858Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/cudnn_frontend/config remote.origin.url 2025-10-10T00:45:51.8230970Z Entering 'third_party/cutlass' 2025-10-10T00:45:51.8306862Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/cutlass/config remote.origin.url 2025-10-10T00:45:51.8341889Z Entering 'third_party/fbgemm' 2025-10-10T00:45:51.8415937Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/config remote.origin.url 2025-10-10T00:45:51.8441728Z Entering 'third_party/fbgemm/external/asmjit' 2025-10-10T00:45:51.8523678Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/asmjit/config remote.origin.url 2025-10-10T00:45:51.8547431Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-10-10T00:45:51.8619734Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/composable_kernel/config remote.origin.url 2025-10-10T00:45:51.8652594Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-10-10T00:45:51.8724648Z 
file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cpuinfo/config remote.origin.url 2025-10-10T00:45:51.8747886Z Entering 'third_party/fbgemm/external/cutlass' 2025-10-10T00:45:51.8818728Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cutlass/config remote.origin.url 2025-10-10T00:45:51.8852335Z Entering 'third_party/fbgemm/external/googletest' 2025-10-10T00:45:51.8926254Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/googletest/config remote.origin.url 2025-10-10T00:45:51.8950804Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-10-10T00:45:51.9026083Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/hipify_torch/config remote.origin.url 2025-10-10T00:45:51.9050592Z Entering 'third_party/fbgemm/external/json' 2025-10-10T00:45:51.9125460Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/json/config remote.origin.url 2025-10-10T00:45:51.9157156Z Entering 'third_party/flash-attention' 2025-10-10T00:45:51.9232288Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/config remote.origin.url 2025-10-10T00:45:51.9255721Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-10-10T00:45:51.9330452Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/composable_kernel/config remote.origin.url 2025-10-10T00:45:51.9363886Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-10-10T00:45:51.9436628Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/cutlass/config remote.origin.url 2025-10-10T00:45:51.9472027Z Entering 'third_party/flatbuffers' 2025-10-10T00:45:51.9547371Z 
file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config remote.origin.url 2025-10-10T00:45:51.9575390Z Entering 'third_party/fmt' 2025-10-10T00:45:51.9647024Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fmt/config remote.origin.url 2025-10-10T00:45:51.9672820Z Entering 'third_party/gemmlowp/gemmlowp' 2025-10-10T00:45:51.9746950Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config remote.origin.url 2025-10-10T00:45:51.9772756Z Entering 'third_party/gloo' 2025-10-10T00:45:51.9846662Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/gloo/config remote.origin.url 2025-10-10T00:45:51.9870875Z Entering 'third_party/googletest' 2025-10-10T00:45:51.9944567Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/googletest/config remote.origin.url 2025-10-10T00:45:51.9969644Z Entering 'third_party/ideep' 2025-10-10T00:45:52.0048014Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/config remote.origin.url 2025-10-10T00:45:52.0071307Z Entering 'third_party/ideep/mkl-dnn' 2025-10-10T00:45:52.0143090Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config remote.origin.url 2025-10-10T00:45:52.0178403Z Entering 'third_party/ittapi' 2025-10-10T00:45:52.0252223Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config remote.origin.url 2025-10-10T00:45:52.0279021Z Entering 'third_party/kineto' 2025-10-10T00:45:52.0358312Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/config remote.origin.url 2025-10-10T00:45:52.0380913Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-10-10T00:45:52.0454027Z 
file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/config remote.origin.url 2025-10-10T00:45:52.0477505Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-10-10T00:45:52.0551316Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/DCGM/config remote.origin.url 2025-10-10T00:45:52.0579719Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-10-10T00:45:52.0651240Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/cpr/config remote.origin.url 2025-10-10T00:45:52.0675854Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-10-10T00:45:52.0749646Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/fmt/config remote.origin.url 2025-10-10T00:45:52.0775377Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-10-10T00:45:52.0849170Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/config remote.origin.url 2025-10-10T00:45:52.0871096Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-10-10T00:45:52.0942809Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/modules/doc/config remote.origin.url 2025-10-10T00:45:52.0972120Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-10-10T00:45:52.1044668Z 
file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/glog/config remote.origin.url 2025-10-10T00:45:52.1068974Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-10-10T00:45:52.1142665Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/googletest/config remote.origin.url 2025-10-10T00:45:52.1167971Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-10-10T00:45:52.1243522Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/json/config remote.origin.url 2025-10-10T00:45:52.1268866Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-10-10T00:45:52.1342929Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/pfs/config remote.origin.url 2025-10-10T00:45:52.1367108Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-10-10T00:45:52.1438567Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/config remote.origin.url 2025-10-10T00:45:52.1460682Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-10-10T00:45:52.1538172Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2025-10-10T00:45:52.1565246Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-10-10T00:45:52.1640415Z 
file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2025-10-10T00:45:52.1672382Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-10-10T00:45:52.1744477Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config remote.origin.url 2025-10-10T00:45:52.1768728Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-10-10T00:45:52.1841979Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config remote.origin.url 2025-10-10T00:45:52.1870729Z Entering 'third_party/kleidiai' 2025-10-10T00:45:52.1944574Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kleidiai/config remote.origin.url 2025-10-10T00:45:52.1970310Z Entering 'third_party/mimalloc' 2025-10-10T00:45:52.2046104Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/mimalloc/config remote.origin.url 2025-10-10T00:45:52.2072066Z Entering 'third_party/nlohmann' 2025-10-10T00:45:52.2148684Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config remote.origin.url 2025-10-10T00:45:52.2176116Z Entering 'third_party/onnx' 2025-10-10T00:45:52.2247607Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/config remote.origin.url 2025-10-10T00:45:52.2289486Z Entering 'third_party/onnx/third_party/pybind11' 2025-10-10T00:45:52.2362333Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config remote.origin.url 2025-10-10T00:45:52.2391888Z Entering 'third_party/opentelemetry-cpp' 2025-10-10T00:45:52.2466266Z 
file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/config remote.origin.url 2025-10-10T00:45:52.2493639Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-10-10T00:45:52.2564300Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/benchmark/config remote.origin.url 2025-10-10T00:45:52.2587762Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-10-10T00:45:52.2660777Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/googletest/config remote.origin.url 2025-10-10T00:45:52.2685074Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-10-10T00:45:52.2755857Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/ms-gsl/config remote.origin.url 2025-10-10T00:45:52.2779458Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-10-10T00:45:52.2852669Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/nlohmann-json/config remote.origin.url 2025-10-10T00:45:52.2878304Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-10-10T00:45:52.2947727Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentelemetry-proto/config remote.origin.url 2025-10-10T00:45:52.2971333Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-10-10T00:45:52.3043781Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentracing-cpp/config remote.origin.url 2025-10-10T00:45:52.3067668Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-10-10T00:45:52.3142465Z 
file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/config remote.origin.url 2025-10-10T00:45:52.3163878Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-10-10T00:45:52.3238805Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2025-10-10T00:45:52.3265388Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-10-10T00:45:52.3337146Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2025-10-10T00:45:52.3365863Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-10-10T00:45:52.3437276Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/tools/vcpkg/config remote.origin.url 2025-10-10T00:45:52.3485216Z Entering 'third_party/pocketfft' 2025-10-10T00:45:52.3560020Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config remote.origin.url 2025-10-10T00:45:52.3588615Z Entering 'third_party/protobuf' 2025-10-10T00:45:52.3671613Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config remote.origin.url 2025-10-10T00:45:52.3699083Z Entering 'third_party/protobuf/third_party/benchmark' 2025-10-10T00:45:52.3770552Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config remote.origin.url 2025-10-10T00:45:52.3793632Z Entering 'third_party/protobuf/third_party/googletest' 2025-10-10T00:45:52.3864240Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config remote.origin.url 
2025-10-10T00:45:52.3891511Z Entering 'third_party/psimd' 2025-10-10T00:45:52.3966570Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config remote.origin.url 2025-10-10T00:45:52.3991664Z Entering 'third_party/pthreadpool' 2025-10-10T00:45:52.4063883Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config remote.origin.url 2025-10-10T00:45:52.4089696Z Entering 'third_party/pybind11' 2025-10-10T00:45:52.4163009Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config remote.origin.url 2025-10-10T00:45:52.4190830Z Entering 'third_party/python-peachpy' 2025-10-10T00:45:52.4262097Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config remote.origin.url 2025-10-10T00:45:52.4287227Z Entering 'third_party/sleef' 2025-10-10T00:45:52.4365448Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/sleef/config remote.origin.url 2025-10-10T00:45:52.4391140Z Entering 'third_party/tensorpipe' 2025-10-10T00:45:52.4464131Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config remote.origin.url 2025-10-10T00:45:52.4488368Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-10-10T00:45:52.4561076Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config remote.origin.url 2025-10-10T00:45:52.4584606Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-10-10T00:45:52.4655952Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config remote.origin.url 2025-10-10T00:45:52.4679534Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-10-10T00:45:52.4751589Z 
file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config remote.origin.url 2025-10-10T00:45:52.4775854Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-10-10T00:45:52.4846847Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config remote.origin.url 2025-10-10T00:45:52.4868557Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-10-10T00:45:52.4941048Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config remote.origin.url 2025-10-10T00:45:52.5752211Z [command]/usr/bin/git submodule foreach --recursive git config --local --add 'url.https://github.com/.insteadOf' 'git@github.com:' 2025-10-10T00:45:52.6169123Z Entering 'android/libs/fbjni' 2025-10-10T00:45:52.6231357Z Entering 'third_party/FP16' 2025-10-10T00:45:52.6295736Z Entering 'third_party/FXdiv' 2025-10-10T00:45:52.6357397Z Entering 'third_party/NNPACK' 2025-10-10T00:45:52.6420226Z Entering 'third_party/NVTX' 2025-10-10T00:45:52.6480738Z Entering 'third_party/VulkanMemoryAllocator' 2025-10-10T00:45:52.6545125Z Entering 'third_party/XNNPACK' 2025-10-10T00:45:52.6620425Z Entering 'third_party/aiter' 2025-10-10T00:45:52.6679969Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-10-10T00:45:52.6748811Z Entering 'third_party/benchmark' 2025-10-10T00:45:52.6809165Z Entering 'third_party/composable_kernel' 2025-10-10T00:45:52.6879621Z Entering 'third_party/cpp-httplib' 2025-10-10T00:45:52.6941566Z Entering 'third_party/cpuinfo' 2025-10-10T00:45:52.7003223Z Entering 'third_party/cudnn_frontend' 2025-10-10T00:45:52.7064012Z Entering 'third_party/cutlass' 2025-10-10T00:45:52.7135060Z Entering 'third_party/fbgemm' 2025-10-10T00:45:52.7201863Z Entering 'third_party/fbgemm/external/asmjit' 2025-10-10T00:45:52.7259216Z Entering 
'third_party/fbgemm/external/composable_kernel' 2025-10-10T00:45:52.7326745Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-10-10T00:45:52.7387391Z Entering 'third_party/fbgemm/external/cutlass' 2025-10-10T00:45:52.7457721Z Entering 'third_party/fbgemm/external/googletest' 2025-10-10T00:45:52.7515813Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-10-10T00:45:52.7573924Z Entering 'third_party/fbgemm/external/json' 2025-10-10T00:45:52.7638495Z Entering 'third_party/flash-attention' 2025-10-10T00:45:52.7699031Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-10-10T00:45:52.7764977Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-10-10T00:45:52.7834810Z Entering 'third_party/flatbuffers' 2025-10-10T00:45:52.7899796Z Entering 'third_party/fmt' 2025-10-10T00:45:52.7960041Z Entering 'third_party/gemmlowp/gemmlowp' 2025-10-10T00:45:52.8026944Z Entering 'third_party/gloo' 2025-10-10T00:45:52.8087311Z Entering 'third_party/googletest' 2025-10-10T00:45:52.8148615Z Entering 'third_party/ideep' 2025-10-10T00:45:52.8206593Z Entering 'third_party/ideep/mkl-dnn' 2025-10-10T00:45:52.8275751Z Entering 'third_party/ittapi' 2025-10-10T00:45:52.8337589Z Entering 'third_party/kineto' 2025-10-10T00:45:52.8395957Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-10-10T00:45:52.8453766Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-10-10T00:45:52.8514729Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-10-10T00:45:52.8574035Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-10-10T00:45:52.8633822Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-10-10T00:45:52.8696620Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-10-10T00:45:52.8761626Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-10-10T00:45:52.8820366Z Entering 
'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-10-10T00:45:52.8878268Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-10-10T00:45:52.8941701Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-10-10T00:45:52.9002079Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-10-10T00:45:52.9062137Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-10-10T00:45:52.9125448Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-10-10T00:45:52.9196000Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-10-10T00:45:52.9255429Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-10-10T00:45:52.9318123Z Entering 'third_party/kleidiai' 2025-10-10T00:45:52.9379919Z Entering 'third_party/mimalloc' 2025-10-10T00:45:52.9440426Z Entering 'third_party/nlohmann' 2025-10-10T00:45:52.9502653Z Entering 'third_party/onnx' 2025-10-10T00:45:52.9578933Z Entering 'third_party/onnx/third_party/pybind11' 2025-10-10T00:45:52.9646380Z Entering 'third_party/opentelemetry-cpp' 2025-10-10T00:45:52.9709095Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-10-10T00:45:52.9769481Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-10-10T00:45:52.9833243Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-10-10T00:45:52.9891758Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-10-10T00:45:52.9950951Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-10-10T00:45:53.0009561Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-10-10T00:45:53.0068152Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-10-10T00:45:53.0126954Z Entering 
'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-10-10T00:45:53.0190586Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-10-10T00:45:53.0254581Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-10-10T00:45:53.0342900Z Entering 'third_party/pocketfft' 2025-10-10T00:45:53.0403449Z Entering 'third_party/protobuf' 2025-10-10T00:45:53.0464235Z Entering 'third_party/protobuf/third_party/benchmark' 2025-10-10T00:45:53.0522411Z Entering 'third_party/protobuf/third_party/googletest' 2025-10-10T00:45:53.0585283Z Entering 'third_party/psimd' 2025-10-10T00:45:53.0646440Z Entering 'third_party/pthreadpool' 2025-10-10T00:45:53.0706312Z Entering 'third_party/pybind11' 2025-10-10T00:45:53.0766183Z Entering 'third_party/python-peachpy' 2025-10-10T00:45:53.0828994Z Entering 'third_party/sleef' 2025-10-10T00:45:53.0888138Z Entering 'third_party/tensorpipe' 2025-10-10T00:45:53.0947229Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-10-10T00:45:53.1005945Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-10-10T00:45:53.1065296Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-10-10T00:45:53.1125712Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-10-10T00:45:53.1183151Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-10-10T00:45:53.1274417Z [command]/usr/bin/git submodule foreach --recursive git config --local --add 'url.https://github.com/.insteadOf' 'org-21003710@github.com:' 2025-10-10T00:45:53.1683517Z Entering 'android/libs/fbjni' 2025-10-10T00:45:53.1746975Z Entering 'third_party/FP16' 2025-10-10T00:45:53.1808134Z Entering 'third_party/FXdiv' 2025-10-10T00:45:53.1870667Z Entering 'third_party/NNPACK' 2025-10-10T00:45:53.1932331Z Entering 'third_party/NVTX' 2025-10-10T00:45:53.1992649Z Entering 'third_party/VulkanMemoryAllocator' 2025-10-10T00:45:53.2053618Z Entering 'third_party/XNNPACK' 2025-10-10T00:45:53.2131381Z Entering 
'third_party/aiter' 2025-10-10T00:45:53.2197200Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-10-10T00:45:53.2266519Z Entering 'third_party/benchmark' 2025-10-10T00:45:53.2328088Z Entering 'third_party/composable_kernel' 2025-10-10T00:45:53.2398600Z Entering 'third_party/cpp-httplib' 2025-10-10T00:45:53.2460073Z Entering 'third_party/cpuinfo' 2025-10-10T00:45:53.2520414Z Entering 'third_party/cudnn_frontend' 2025-10-10T00:45:53.2585638Z Entering 'third_party/cutlass' 2025-10-10T00:45:53.2656521Z Entering 'third_party/fbgemm' 2025-10-10T00:45:53.2721017Z Entering 'third_party/fbgemm/external/asmjit' 2025-10-10T00:45:53.2778520Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-10-10T00:45:53.2849300Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-10-10T00:45:53.2907284Z Entering 'third_party/fbgemm/external/cutlass' 2025-10-10T00:45:53.2974259Z Entering 'third_party/fbgemm/external/googletest' 2025-10-10T00:45:53.3032486Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-10-10T00:45:53.3089446Z Entering 'third_party/fbgemm/external/json' 2025-10-10T00:45:53.3160252Z Entering 'third_party/flash-attention' 2025-10-10T00:45:53.3225339Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-10-10T00:45:53.3290102Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-10-10T00:45:53.3364399Z Entering 'third_party/flatbuffers' 2025-10-10T00:45:53.3428249Z Entering 'third_party/fmt' 2025-10-10T00:45:53.3487684Z Entering 'third_party/gemmlowp/gemmlowp' 2025-10-10T00:45:53.3548835Z Entering 'third_party/gloo' 2025-10-10T00:45:53.3609177Z Entering 'third_party/googletest' 2025-10-10T00:45:53.3671853Z Entering 'third_party/ideep' 2025-10-10T00:45:53.3730773Z Entering 'third_party/ideep/mkl-dnn' 2025-10-10T00:45:53.3800364Z Entering 'third_party/ittapi' 2025-10-10T00:45:53.3860875Z Entering 'third_party/kineto' 2025-10-10T00:45:53.3921254Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 
2025-10-10T00:45:53.3984493Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-10-10T00:45:53.4044823Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-10-10T00:45:53.4110124Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-10-10T00:45:53.4171191Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-10-10T00:45:53.4232266Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-10-10T00:45:53.4297272Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-10-10T00:45:53.4357942Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-10-10T00:45:53.4420329Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-10-10T00:45:53.4480842Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-10-10T00:45:53.4540125Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-10-10T00:45:53.4597239Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-10-10T00:45:53.4660022Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-10-10T00:45:53.4729574Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-10-10T00:45:53.4788637Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-10-10T00:45:53.4850936Z Entering 'third_party/kleidiai' 2025-10-10T00:45:53.4917759Z Entering 'third_party/mimalloc' 2025-10-10T00:45:53.4978482Z Entering 'third_party/nlohmann' 2025-10-10T00:45:53.5040389Z Entering 'third_party/onnx' 2025-10-10T00:45:53.5118036Z Entering 'third_party/onnx/third_party/pybind11' 2025-10-10T00:45:53.5185274Z Entering 'third_party/opentelemetry-cpp' 2025-10-10T00:45:53.5247715Z Entering 
'third_party/opentelemetry-cpp/third_party/benchmark' 2025-10-10T00:45:53.5308751Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-10-10T00:45:53.5367699Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-10-10T00:45:53.5426882Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-10-10T00:45:53.5488441Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-10-10T00:45:53.5546633Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-10-10T00:45:53.5604743Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-10-10T00:45:53.5660143Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-10-10T00:45:53.5722101Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-10-10T00:45:53.5784629Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-10-10T00:45:53.5866756Z Entering 'third_party/pocketfft' 2025-10-10T00:45:53.5927228Z Entering 'third_party/protobuf' 2025-10-10T00:45:53.5989842Z Entering 'third_party/protobuf/third_party/benchmark' 2025-10-10T00:45:53.6048291Z Entering 'third_party/protobuf/third_party/googletest' 2025-10-10T00:45:53.6112315Z Entering 'third_party/psimd' 2025-10-10T00:45:53.6171862Z Entering 'third_party/pthreadpool' 2025-10-10T00:45:53.6234735Z Entering 'third_party/pybind11' 2025-10-10T00:45:53.6295188Z Entering 'third_party/python-peachpy' 2025-10-10T00:45:53.6356508Z Entering 'third_party/sleef' 2025-10-10T00:45:53.6416893Z Entering 'third_party/tensorpipe' 2025-10-10T00:45:53.6475905Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-10-10T00:45:53.6534851Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-10-10T00:45:53.6592186Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-10-10T00:45:53.6649314Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-10-10T00:45:53.6704943Z Entering 
'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-10-10T00:45:53.6790181Z ##[endgroup] 2025-10-10T00:45:53.6839551Z [command]/usr/bin/git log -1 --format=%H 2025-10-10T00:45:53.6869031Z 344e6365a0068c2d2847fcec0c55dd53291d475e 2025-10-10T00:45:53.6994349Z ##[group]Run cd "${GITHUB_WORKSPACE}" 2025-10-10T00:45:53.6994697Z cd "${GITHUB_WORKSPACE}" 2025-10-10T00:45:53.6994999Z # Clean stale submodule dirs 2025-10-10T00:45:53.6995307Z if [ -z "${NO_SUDO}" ]; then 2025-10-10T00:45:53.6995670Z  sudo git submodule foreach --recursive git clean -ffdx 2025-10-10T00:45:53.6996037Z else 2025-10-10T00:45:53.6996339Z  git submodule foreach --recursive git clean -ffdx 2025-10-10T00:45:53.6996685Z fi 2025-10-10T00:45:53.7008056Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T00:45:53.7008412Z env: 2025-10-10T00:45:53.7008797Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:45:53.7009053Z NO_SUDO: true 2025-10-10T00:45:53.7009276Z ##[endgroup] 2025-10-10T00:45:53.7446874Z Entering 'android/libs/fbjni' 2025-10-10T00:45:53.7496232Z Entering 'third_party/FP16' 2025-10-10T00:45:53.7542438Z Entering 'third_party/FXdiv' 2025-10-10T00:45:53.7587405Z Entering 'third_party/NNPACK' 2025-10-10T00:45:53.7638670Z Entering 'third_party/NVTX' 2025-10-10T00:45:53.7694115Z Entering 'third_party/VulkanMemoryAllocator' 2025-10-10T00:45:53.7742030Z Entering 'third_party/XNNPACK' 2025-10-10T00:45:53.7904879Z Entering 'third_party/aiter' 2025-10-10T00:45:53.7965015Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-10-10T00:45:53.8115926Z Entering 'third_party/benchmark' 2025-10-10T00:45:53.8164829Z Entering 'third_party/composable_kernel' 2025-10-10T00:45:53.8324068Z Entering 'third_party/cpp-httplib' 2025-10-10T00:45:53.8372120Z Entering 'third_party/cpuinfo' 2025-10-10T00:45:53.8424239Z Entering 'third_party/cudnn_frontend' 2025-10-10T00:45:53.8475908Z Entering 'third_party/cutlass' 2025-10-10T00:45:53.8611397Z Entering 'third_party/fbgemm' 
2025-10-10T00:45:53.8695094Z Entering 'third_party/fbgemm/external/asmjit' 2025-10-10T00:45:53.8746658Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-10-10T00:45:53.8904564Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-10-10T00:45:53.8955007Z Entering 'third_party/fbgemm/external/cutlass' 2025-10-10T00:45:53.9085354Z Entering 'third_party/fbgemm/external/googletest' 2025-10-10T00:45:53.9135835Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-10-10T00:45:53.9178788Z Entering 'third_party/fbgemm/external/json' 2025-10-10T00:45:53.9242578Z Entering 'third_party/flash-attention' 2025-10-10T00:45:53.9299726Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-10-10T00:45:53.9431945Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-10-10T00:45:53.9550383Z Entering 'third_party/flatbuffers' 2025-10-10T00:45:53.9652793Z Entering 'third_party/fmt' 2025-10-10T00:45:53.9701453Z Entering 'third_party/gemmlowp/gemmlowp' 2025-10-10T00:45:53.9749723Z Entering 'third_party/gloo' 2025-10-10T00:45:53.9798769Z Entering 'third_party/googletest' 2025-10-10T00:45:53.9848995Z Entering 'third_party/ideep' 2025-10-10T00:45:53.9892652Z Entering 'third_party/ideep/mkl-dnn' 2025-10-10T00:45:54.0010948Z Entering 'third_party/ittapi' 2025-10-10T00:45:54.0061372Z Entering 'third_party/kineto' 2025-10-10T00:45:54.0112148Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-10-10T00:45:54.0166698Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-10-10T00:45:54.0233388Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-10-10T00:45:54.0278697Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-10-10T00:45:54.0325732Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-10-10T00:45:54.0367212Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-10-10T00:45:54.0414987Z 
Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-10-10T00:45:54.0460019Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-10-10T00:45:54.0510068Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-10-10T00:45:54.0565673Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-10-10T00:45:54.0616349Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-10-10T00:45:54.0660593Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-10-10T00:45:54.0732827Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-10-10T00:45:54.0798998Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-10-10T00:45:54.0845970Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-10-10T00:45:54.0897479Z Entering 'third_party/kleidiai' 2025-10-10T00:45:54.0952154Z Entering 'third_party/mimalloc' 2025-10-10T00:45:54.1002792Z Entering 'third_party/nlohmann' 2025-10-10T00:45:54.1067179Z Entering 'third_party/onnx' 2025-10-10T00:45:54.1526287Z Entering 'third_party/onnx/third_party/pybind11' 2025-10-10T00:45:54.1581657Z Entering 'third_party/opentelemetry-cpp' 2025-10-10T00:45:54.1658531Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-10-10T00:45:54.1703711Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-10-10T00:45:54.1751971Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-10-10T00:45:54.1797776Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-10-10T00:45:54.1858941Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-10-10T00:45:54.1903076Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-10-10T00:45:54.1949115Z Entering 
'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-10-10T00:45:54.1995599Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-10-10T00:45:54.2067543Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-10-10T00:45:54.2119733Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-10-10T00:45:54.2471890Z Entering 'third_party/pocketfft' 2025-10-10T00:45:54.2518939Z Entering 'third_party/protobuf' 2025-10-10T00:45:54.2630175Z Entering 'third_party/protobuf/third_party/benchmark' 2025-10-10T00:45:54.2675786Z Entering 'third_party/protobuf/third_party/googletest' 2025-10-10T00:45:54.2728136Z Entering 'third_party/psimd' 2025-10-10T00:45:54.2774102Z Entering 'third_party/pthreadpool' 2025-10-10T00:45:54.2820278Z Entering 'third_party/pybind11' 2025-10-10T00:45:54.2870741Z Entering 'third_party/python-peachpy' 2025-10-10T00:45:54.2918234Z Entering 'third_party/sleef' 2025-10-10T00:45:54.2968281Z Entering 'third_party/tensorpipe' 2025-10-10T00:45:54.3019586Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-10-10T00:45:54.3067231Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-10-10T00:45:54.3110492Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-10-10T00:45:54.3159509Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-10-10T00:45:54.3204591Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-10-10T00:45:54.3384931Z Prepare all required actions 2025-10-10T00:45:54.3385436Z Getting action download info 2025-10-10T00:45:54.5038658Z ##[group]Run ./.github/actions/setup-linux 2025-10-10T00:45:54.5038969Z env: 2025-10-10T00:45:54.5039183Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:45:54.5039433Z ##[endgroup] 2025-10-10T00:45:54.5086695Z ##[group]Run set -euo pipefail 2025-10-10T00:45:54.5087013Z set -euo pipefail 2025-10-10T00:45:54.5087388Z function get_ec2_metadata() { 2025-10-10T00:45:54.5087745Z  # Pulled from instance 
metadata endpoint for EC2 2025-10-10T00:45:54.5088335Z  # see https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instancedata-data-retrieval.html 2025-10-10T00:45:54.5088879Z  category=$1 2025-10-10T00:45:54.5089225Z  # If it is GCP runner (runner name contains gcp), do not run this 2025-10-10T00:45:54.5089631Z  runner_name_str=i-088ba17e0301f2c3f 2025-10-10T00:45:54.5089976Z  if [[ -f /.inarc ]]; then 2025-10-10T00:45:54.5090312Z  echo "ARC Runner, no info on ec2 metadata" 2025-10-10T00:45:54.5090681Z  elif [[ $runner_name_str == *"gcp"* ]]; then 2025-10-10T00:45:54.5091122Z  echo "Runner is from Google Cloud Platform, No info on ec2 metadata" 2025-10-10T00:45:54.5091525Z  else 2025-10-10T00:45:54.5092323Z  curl -H "X-aws-ec2-metadata-token: $(curl -s -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 30")" -fsSL "http://169.254.169.254/latest/meta-data/${category}" 2025-10-10T00:45:54.5093159Z  fi 2025-10-10T00:45:54.5093545Z } 2025-10-10T00:45:54.5093810Z echo "ami-id: $(get_ec2_metadata ami-id)" 2025-10-10T00:45:54.5094223Z echo "instance-id: $(get_ec2_metadata instance-id)" 2025-10-10T00:45:54.5094685Z echo "instance-type: $(get_ec2_metadata instance-type)" 2025-10-10T00:45:54.5095070Z echo "system info $(uname -a)" 2025-10-10T00:45:54.5105368Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T00:45:54.5105722Z env: 2025-10-10T00:45:54.5105933Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:45:54.5106185Z ##[endgroup] 2025-10-10T00:45:54.5284439Z ami-id: ami-08982f1c5bf93d976 2025-10-10T00:45:54.5413633Z instance-id: i-088ba17e0301f2c3f 2025-10-10T00:45:54.5538335Z instance-type: g5.4xlarge 2025-10-10T00:45:54.5553256Z system info Linux ip-10-0-20-73.ec2.internal 6.1.150-174.273.amzn2023.x86_64 #1 SMP PREEMPT_DYNAMIC Tue Sep 9 12:21:26 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux 2025-10-10T00:45:54.5588243Z ##[group]Run if [ -f /usr/bin/nvidia-smi ]; then nvidia-smi; fi 2025-10-10T00:45:54.5588732Z if [ -f 
/usr/bin/nvidia-smi ]; then nvidia-smi; fi 2025-10-10T00:45:54.5598823Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T00:45:54.5599194Z env: 2025-10-10T00:45:54.5599407Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:45:54.5599673Z ##[endgroup] 2025-10-10T00:45:56.2036384Z Fri Oct 10 00:45:56 2025 2025-10-10T00:45:56.2036807Z +-----------------------------------------------------------------------------------------+ 2025-10-10T00:45:56.2037323Z | NVIDIA-SMI 580.82.07 Driver Version: 580.82.07 CUDA Version: 13.0 | 2025-10-10T00:45:56.2037820Z +-----------------------------------------+------------------------+----------------------+ 2025-10-10T00:45:56.2038334Z | GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC | 2025-10-10T00:45:56.2038889Z | Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. | 2025-10-10T00:45:56.2039359Z | | | MIG M. | 2025-10-10T00:45:56.2039713Z |=========================================+========================+======================| 2025-10-10T00:45:56.2129037Z | 0 NVIDIA A10G Off | 00000000:00:1E.0 Off | 0 | 2025-10-10T00:45:56.2130418Z | 0% 25C P0 53W / 300W | 0MiB / 23028MiB | 3% Default | 2025-10-10T00:45:56.2131162Z | | | N/A | 2025-10-10T00:45:56.2131777Z +-----------------------------------------+------------------------+----------------------+ 2025-10-10T00:45:56.2132071Z 2025-10-10T00:45:56.2132258Z +-----------------------------------------------------------------------------------------+ 2025-10-10T00:45:56.2132701Z | Processes: | 2025-10-10T00:45:56.2133157Z | GPU GI CI PID Type Process name GPU Memory | 2025-10-10T00:45:56.2133587Z | ID ID Usage | 2025-10-10T00:45:56.2133955Z |=========================================================================================| 2025-10-10T00:45:56.2134398Z | No running processes found | 2025-10-10T00:45:56.2134884Z +-----------------------------------------------------------------------------------------+ 2025-10-10T00:45:56.6468725Z ##[group]Run 
echo "IN_CONTAINER_RUNNER=$(if [ -f /.inarc ] || [ -f /.incontainer ]; then echo true ; else echo false; fi)" >> "$GITHUB_OUTPUT" 2025-10-10T00:45:56.6469591Z echo "IN_CONTAINER_RUNNER=$(if [ -f /.inarc ] || [ -f /.incontainer ]; then echo true ; else echo false; fi)" >> "$GITHUB_OUTPUT" 2025-10-10T00:45:56.6482626Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T00:45:56.6483174Z env: 2025-10-10T00:45:56.6483389Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:45:56.6483657Z ##[endgroup] 2025-10-10T00:45:56.6556929Z ##[group]Run if systemctl is-active --quiet docker; then 2025-10-10T00:45:56.6557559Z if systemctl is-active --quiet docker; then 2025-10-10T00:45:56.6557937Z  echo "Docker daemon is running..."; 2025-10-10T00:45:56.6558273Z else 2025-10-10T00:45:56.6558615Z  echo "Starting docker daemon..." && sudo systemctl start docker; 2025-10-10T00:45:56.6559016Z fi 2025-10-10T00:45:56.6578847Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T00:45:56.6579215Z env: 2025-10-10T00:45:56.6579440Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:45:56.6579709Z ##[endgroup] 2025-10-10T00:45:56.6681333Z Docker daemon is running... 2025-10-10T00:45:56.6828435Z ##[group]Run nick-fields/retry@v3.0.0 2025-10-10T00:45:56.6828744Z with: 2025-10-10T00:45:56.6828953Z shell: bash 2025-10-10T00:45:56.6829175Z timeout_minutes: 5 2025-10-10T00:45:56.6829433Z max_attempts: 3 2025-10-10T00:45:56.6829675Z retry_wait_seconds: 30 2025-10-10T00:45:56.6831797Z command: AWS_ACCOUNT_ID=$(aws sts get-caller-identity|grep Account|cut -f4 -d\") aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \ --password-stdin "$AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com" # For LF Runners we need to make sure we also login to Meta's ECR docker registry too. 
META_AWS_ACCOUNT_ID=308535385114 if [ "$AWS_ACCOUNT_ID" != "$META_AWS_ACCOUNT_ID" ] ; then aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \ --password-stdin "$META_AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com" fi 2025-10-10T00:45:56.6833860Z polling_interval_seconds: 1 2025-10-10T00:45:56.6834143Z warning_on_retry: true 2025-10-10T00:45:56.6834411Z continue_on_error: false 2025-10-10T00:45:56.6834654Z env: 2025-10-10T00:45:56.6834871Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:45:56.6835148Z AWS_RETRY_MODE: standard 2025-10-10T00:45:56.6835408Z AWS_MAX_ATTEMPTS: 5 2025-10-10T00:45:56.6835657Z AWS_DEFAULT_REGION: us-east-1 2025-10-10T00:45:56.6835928Z ##[endgroup] 2025-10-10T00:45:57.8319433Z WARNING! Your password will be stored unencrypted in /home/ec2-user/.docker/config.json. 2025-10-10T00:45:57.8320392Z Configure a credential helper to remove this warning. See 2025-10-10T00:45:57.8320964Z https://docs.docker.com/engine/reference/commandline/login/#credentials-store 2025-10-10T00:45:57.8321374Z 2025-10-10T00:45:57.8321482Z Login Succeeded 2025-10-10T00:45:58.7650628Z Command completed after 1 attempt(s). 
2025-10-10T00:45:58.7737107Z ##[group]Run env | grep '^GITHUB' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2025-10-10T00:45:58.7737607Z env | grep '^GITHUB' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2025-10-10T00:45:58.7738038Z env | grep '^CI' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2025-10-10T00:45:58.7749142Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T00:45:58.7749509Z env: 2025-10-10T00:45:58.7749730Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:45:58.7749998Z ##[endgroup] 2025-10-10T00:45:58.7860256Z ##[group]Run # ignore expansion of "docker ps -q" since it could be empty 2025-10-10T00:45:58.7860789Z # ignore expansion of "docker ps -q" since it could be empty 2025-10-10T00:45:58.7861206Z # shellcheck disable=SC2046 2025-10-10T00:45:58.7861535Z docker stop $(docker ps -q) || true 2025-10-10T00:45:58.7861865Z # Prune all of the docker images 2025-10-10T00:45:58.7862187Z docker system prune -af 2025-10-10T00:45:58.7870533Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T00:45:58.7870883Z env: 2025-10-10T00:45:58.7871088Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:45:58.7871346Z ##[endgroup] 2025-10-10T00:45:58.8158064Z "docker stop" requires at least 1 argument. 2025-10-10T00:45:58.8158455Z See 'docker stop --help'. 2025-10-10T00:45:58.8158625Z 2025-10-10T00:45:58.8159092Z Usage: docker stop [OPTIONS] CONTAINER [CONTAINER...] 
2025-10-10T00:45:58.8159347Z 2025-10-10T00:45:58.8159463Z Stop one or more running containers 2025-10-10T00:45:58.8388351Z Total reclaimed space: 0B 2025-10-10T00:45:58.8589823Z ##[group]Run pytorch/test-infra/.github/actions/calculate-docker-image@main 2025-10-10T00:45:58.8590270Z with: 2025-10-10T00:45:58.8591013Z docker-image-name: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-d8be0384e085f551506bd739678109fa0f5ee7ac 2025-10-10T00:45:58.8591854Z use-custom-docker-registry: true 2025-10-10T00:45:58.8592163Z docker-build-dir: .ci/docker 2025-10-10T00:45:58.8592457Z docker-build-script: ./build.sh 2025-10-10T00:45:58.8592745Z working-directory: . 2025-10-10T00:45:58.8593081Z docker-registry: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-10-10T00:45:58.8593461Z force-push: false 2025-10-10T00:45:58.8593685Z env: 2025-10-10T00:45:58.8593896Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:45:58.8594156Z ##[endgroup] 2025-10-10T00:45:58.8624089Z ##[group]Run set -ex 2025-10-10T00:45:58.8624377Z set -ex 2025-10-10T00:45:58.8624606Z  2025-10-10T00:45:58.8625014Z # If the docker build directory or the build script doesn't exist, the action will 2025-10-10T00:45:58.8625639Z # gracefully return the docker image name as it is. Pulling docker image in Linux 2025-10-10T00:45:58.8626172Z # job could then download the pre-built image as usual 2025-10-10T00:45:58.8626812Z if [[ -d "${DOCKER_BUILD_DIR}" ]] && [[ -f "${DOCKER_BUILD_DIR}/${DOCKER_BUILD_SCRIPT}" ]] && [[ "${USE_CUSTOM_DOCKER_REGISTRY}" == "true" ]]; then 2025-10-10T00:45:58.8627423Z  echo "skip=false" >> "${GITHUB_OUTPUT}" 2025-10-10T00:45:58.8627747Z else 2025-10-10T00:45:58.8628018Z  echo "skip=true" >> "${GITHUB_OUTPUT}" 2025-10-10T00:45:58.8628443Z  echo "docker-image=${DOCKER_IMAGE_NAME}" >> "${GITHUB_OUTPUT}" 2025-10-10T00:45:58.8628845Z  2025-10-10T00:45:58.8629374Z  echo "Not using custom ECR registry. 
Either it was not requested or there is no Docker build script in the ${REPO_NAME} repo..." 2025-10-10T00:45:58.8629965Z  exit 0 2025-10-10T00:45:58.8630183Z fi 2025-10-10T00:45:58.8630396Z  2025-10-10T00:45:58.8630730Z if [[ "${DOCKER_IMAGE_NAME}" == *"${DOCKER_REGISTRY}/${REPO_NAME}"* ]]; then 2025-10-10T00:45:58.8631293Z  # The docker image name already includes the ECR prefix and tag, so we can just 2025-10-10T00:45:58.8631896Z  # use it as it is, but first let's extract the tag 2025-10-10T00:45:58.8632467Z  DOCKER_TAG=$(echo "${DOCKER_IMAGE_NAME}" | awk -F '[:,]' '{print $2}') 2025-10-10T00:45:58.8633072Z  echo "docker-tag=${DOCKER_TAG}" >> "${GITHUB_OUTPUT}" 2025-10-10T00:45:58.8633653Z  echo "docker-image=${DOCKER_IMAGE_NAME}" >> "${GITHUB_OUTPUT}" 2025-10-10T00:45:58.8634090Z else 2025-10-10T00:45:58.8634357Z  if [[ "${DOCKER_IMAGE_NAME}" == *:* ]]; then 2025-10-10T00:45:58.8634724Z  CUSTOM_TAG_PREFIX=${DOCKER_IMAGE_NAME#*:} 2025-10-10T00:45:58.8635107Z  DOCKER_IMAGE_NAME=${DOCKER_IMAGE_NAME%%:*} 2025-10-10T00:45:58.8635434Z  fi 2025-10-10T00:45:58.8635869Z  DOCKER_TAG=${CUSTOM_TAG_PREFIX:+${CUSTOM_TAG_PREFIX}-}$(git rev-parse HEAD:"${DOCKER_BUILD_DIR}") 2025-10-10T00:45:58.8636433Z  echo "docker-tag=${DOCKER_TAG}" >> "${GITHUB_OUTPUT}" 2025-10-10T00:45:58.8637025Z  echo "docker-image=${DOCKER_REGISTRY}/${REPO_NAME}/${DOCKER_IMAGE_NAME}:${DOCKER_TAG}" >> "${GITHUB_OUTPUT}" 2025-10-10T00:45:58.8637670Z  echo "custom-tag-prefix=${CUSTOM_TAG_PREFIX}" >> "${GITHUB_OUTPUT}" 2025-10-10T00:45:58.8638069Z fi 2025-10-10T00:45:58.8647565Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T00:45:58.8647914Z env: 2025-10-10T00:45:58.8648131Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:45:58.8648555Z REPO_NAME: pytorch 2025-10-10T00:45:58.8649458Z DOCKER_IMAGE_NAME: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-d8be0384e085f551506bd739678109fa0f5ee7ac 2025-10-10T00:45:58.8650276Z 
DOCKER_BUILD_DIR: .ci/docker 2025-10-10T00:45:58.8650560Z DOCKER_BUILD_SCRIPT: ./build.sh 2025-10-10T00:45:58.8650934Z DOCKER_REGISTRY: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-10-10T00:45:58.8651323Z USE_CUSTOM_DOCKER_REGISTRY: true 2025-10-10T00:45:58.8651613Z CUSTOM_TAG_PREFIX: 2025-10-10T00:45:58.8651851Z ##[endgroup] 2025-10-10T00:45:58.8683487Z + [[ -d .ci/docker ]] 2025-10-10T00:45:58.8683872Z + [[ -f .ci/docker/./build.sh ]] 2025-10-10T00:45:58.8684228Z + [[ true == \t\r\u\e ]] 2025-10-10T00:45:58.8684478Z + echo skip=false 2025-10-10T00:45:58.8685465Z + [[ 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-d8be0384e085f551506bd739678109fa0f5ee7ac == *\3\0\8\5\3\5\3\8\5\1\1\4\.\d\k\r\.\e\c\r\.\u\s\-\e\a\s\t\-\1\.\a\m\a\z\o\n\a\w\s\.\c\o\m\/\p\y\t\o\r\c\h* ]] 2025-10-10T00:45:58.8692705Z ++ echo 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-d8be0384e085f551506bd739678109fa0f5ee7ac 2025-10-10T00:45:58.8693560Z ++ awk -F '[:,]' '{print $2}' 2025-10-10T00:45:58.8723521Z + DOCKER_TAG=pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-d8be0384e085f551506bd739678109fa0f5ee7ac 2025-10-10T00:45:58.8724381Z + echo docker-tag=pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-d8be0384e085f551506bd739678109fa0f5ee7ac 2025-10-10T00:45:58.8725485Z + echo docker-image=308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-d8be0384e085f551506bd739678109fa0f5ee7ac 2025-10-10T00:45:58.8777569Z ##[group]Run set +e 2025-10-10T00:45:58.8777868Z set +e 2025-10-10T00:45:58.8778105Z set -x 2025-10-10T00:45:58.8778319Z  2025-10-10T00:45:58.8778552Z login() { 2025-10-10T00:45:58.8779014Z  aws ecr get-login-password --region us-east-1 | docker login -u AWS --password-stdin "$1" 2025-10-10T00:45:58.8779499Z } 2025-10-10T00:45:58.8779714Z  2025-10-10T00:45:58.8779928Z retry () { 2025-10-10T00:45:58.8780207Z  $* || 
(sleep 1 && $*) || (sleep 2 && $*) 2025-10-10T00:45:58.8780509Z } 2025-10-10T00:45:58.8780709Z  2025-10-10T00:45:58.8780951Z retry login "${DOCKER_REGISTRY}" 2025-10-10T00:45:58.8781246Z  2025-10-10T00:45:58.8781466Z START_TIME=$(date +%s) 2025-10-10T00:45:58.8781758Z # Wait up to 120 minutes 2025-10-10T00:45:58.8782109Z while [[ $(( $(date +%s) - 7200 )) -lt $START_TIME ]]; do 2025-10-10T00:45:58.8782575Z  # Check if image already exists, if it does then skip building it 2025-10-10T00:45:58.8783045Z  if docker manifest inspect "${DOCKER_IMAGE}"; then 2025-10-10T00:45:58.8783390Z  exit 0 2025-10-10T00:45:58.8783620Z  fi 2025-10-10T00:45:58.8783839Z  2025-10-10T00:45:58.8784213Z  # NB: This flag is used by Docker build workflow to push the image to ECR, so we can 2025-10-10T00:45:58.8784832Z  # use this to differentiate between the Docker build and regular build jobs. For the 2025-10-10T00:45:58.8785440Z  # latter, it will wait for the Docker images to become available before continuing 2025-10-10T00:45:58.8785933Z  if [ "${DOCKER_PUSH:-false}" == "true" ]; then 2025-10-10T00:45:58.8786324Z  # It's a Docker build job, let's build the image 2025-10-10T00:45:58.8786662Z  break 2025-10-10T00:45:58.8786885Z  else 2025-10-10T00:45:58.8787213Z  # It's a regular build job, wait for the image to become available 2025-10-10T00:45:58.8787602Z  sleep 300 2025-10-10T00:45:58.8787849Z  fi 2025-10-10T00:45:58.8788251Z done 2025-10-10T00:45:58.8788463Z  2025-10-10T00:45:58.8788797Z # NB: This part requires a full checkout. Otherwise, the merge base will 2025-10-10T00:45:58.8789490Z # be empty. 
The default action would be to continue rebuild the image 2025-10-10T00:45:58.8789991Z if [[ "$BASE_REVISION" = "$(git rev-parse HEAD)" ]]; then 2025-10-10T00:45:58.8790416Z  # if we're on the base branch then use the parent commit 2025-10-10T00:45:58.8790804Z  MERGE_BASE=$(git rev-parse HEAD~) 2025-10-10T00:45:58.8791113Z else 2025-10-10T00:45:58.8791444Z  # otherwise we're on a PR, so use the most recent base commit 2025-10-10T00:45:58.8791892Z  MERGE_BASE=$(git merge-base HEAD "$BASE_REVISION") 2025-10-10T00:45:58.8792245Z fi 2025-10-10T00:45:58.8792469Z  2025-10-10T00:45:58.8792714Z if [[ -z "${MERGE_BASE}" ]]; then 2025-10-10T00:45:58.8793067Z  echo "rebuild=true" >> "${GITHUB_OUTPUT}" 2025-10-10T00:45:58.8793388Z  2025-10-10T00:45:58.8793845Z  echo "Finding merge base only works with full checkout, please set fetch-depth to 0, continuing ..." 2025-10-10T00:45:58.8794369Z  exit 0 2025-10-10T00:45:58.8794602Z fi 2025-10-10T00:45:58.8794809Z  2025-10-10T00:45:58.8795111Z if ! git rev-parse "${MERGE_BASE}:${DOCKER_BUILD_DIR}"; then 2025-10-10T00:45:58.8795753Z  echo "Directory '${DOCKER_BUILD_DIR}' not found in commit $MERGE_BASE, you should rebase onto a more recent commit" 2025-10-10T00:45:58.8796302Z  exit 1 2025-10-10T00:45:58.8796519Z fi 2025-10-10T00:45:58.8796736Z  2025-10-10T00:45:58.8797089Z PREVIOUS_DOCKER_TAG=$(git rev-parse "${MERGE_BASE}:${DOCKER_BUILD_DIR}") 2025-10-10T00:45:58.8797702Z # If no image exists but the hash is the same as the previous hash then we should error out here 2025-10-10T00:45:58.8798257Z if [[ "${PREVIOUS_DOCKER_TAG}" == "${DOCKER_TAG}" ]]; then 2025-10-10T00:45:58.8799442Z  echo "WARNING: Something has gone wrong and the previous image isn't available for the merge-base of your branch" 2025-10-10T00:45:58.8800164Z  echo " Will re-build docker image to store in local cache, TTS may be longer" 2025-10-10T00:45:58.8800599Z fi 2025-10-10T00:45:58.8800817Z  2025-10-10T00:45:58.8801081Z echo "rebuild=true" >> "${GITHUB_OUTPUT}" 
2025-10-10T00:45:58.8810354Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T00:45:58.8810708Z env: 2025-10-10T00:45:58.8810923Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:45:58.8811186Z DOCKER_BUILD_DIR: .ci/docker 2025-10-10T00:45:58.8811526Z BASE_REVISION: 344e6365a0068c2d2847fcec0c55dd53291d475e 2025-10-10T00:45:58.8812416Z DOCKER_IMAGE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-d8be0384e085f551506bd739678109fa0f5ee7ac 2025-10-10T00:45:58.8813472Z DOCKER_TAG: pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-d8be0384e085f551506bd739678109fa0f5ee7ac 2025-10-10T00:45:58.8814116Z DOCKER_REGISTRY: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-10-10T00:45:58.8814491Z DOCKER_PUSH: 2025-10-10T00:45:58.8814706Z ##[endgroup] 2025-10-10T00:45:58.8850302Z + retry login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-10-10T00:45:58.8850831Z + login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-10-10T00:45:58.8854448Z + aws ecr get-login-password --region us-east-1 2025-10-10T00:45:58.8855150Z + docker login -u AWS --password-stdin 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-10-10T00:45:59.4085272Z WARNING! Your password will be stored unencrypted in /home/ec2-user/.docker/config.json. 2025-10-10T00:45:59.4085870Z Configure a credential helper to remove this warning. 
See 2025-10-10T00:45:59.4086558Z https://docs.docker.com/engine/reference/commandline/login/#credentials-store 2025-10-10T00:45:59.4086961Z 2025-10-10T00:45:59.4087546Z Login Succeeded 2025-10-10T00:45:59.4115125Z ++ date +%s 2025-10-10T00:45:59.4128810Z + START_TIME=1760057159 2025-10-10T00:45:59.4132021Z ++ date +%s 2025-10-10T00:45:59.4146380Z + [[ 1760049959 -lt 1760057159 ]] 2025-10-10T00:45:59.4147348Z + docker manifest inspect 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-d8be0384e085f551506bd739678109fa0f5ee7ac 2025-10-10T00:45:59.6740889Z { 2025-10-10T00:45:59.6741651Z "schemaVersion": 2, 2025-10-10T00:45:59.6742233Z "mediaType": "application/vnd.docker.distribution.manifest.v2+json", 2025-10-10T00:45:59.6742804Z "config": { 2025-10-10T00:45:59.6743221Z "mediaType": "application/vnd.docker.container.image.v1+json", 2025-10-10T00:45:59.6743618Z "size": 31326, 2025-10-10T00:45:59.6744024Z "digest": "sha256:2b326b7b17db730c6c973cebcc035ac7bd2de93f7608c304eb88a054c020cb51" 2025-10-10T00:45:59.6744488Z }, 2025-10-10T00:45:59.6744693Z "layers": [ 2025-10-10T00:45:59.6744909Z { 2025-10-10T00:45:59.6745256Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6745667Z "size": 30447990, 2025-10-10T00:45:59.6746088Z "digest": "sha256:828c1365039a657352c737a62d13e1932951b5658eb6bd9b9096ea9b73562453" 2025-10-10T00:45:59.6746547Z }, 2025-10-10T00:45:59.6746744Z { 2025-10-10T00:45:59.6747195Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6747735Z "size": 1554, 2025-10-10T00:45:59.6748249Z "digest": "sha256:bb2a9ef82ad25cad5b8b42333f1ad9021e173ba817f20bea4750c3a2fac61a0f" 2025-10-10T00:45:59.6748706Z }, 2025-10-10T00:45:59.6748890Z { 2025-10-10T00:45:59.6749322Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6749722Z "size": 313649854, 2025-10-10T00:45:59.6750127Z "digest": 
"sha256:0f76777182a77d74abe645a336fb40383387e35b785ebac305aa6a107b2e7998" 2025-10-10T00:45:59.6750559Z }, 2025-10-10T00:45:59.6750750Z { 2025-10-10T00:45:59.6751071Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6751481Z "size": 791, 2025-10-10T00:45:59.6751923Z "digest": "sha256:5c003cead6fcb32042a1cc6d0acce01b4801aecb822c4b59c9a9923339b90744" 2025-10-10T00:45:59.6752397Z }, 2025-10-10T00:45:59.6752586Z { 2025-10-10T00:45:59.6752901Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6753288Z "size": 106, 2025-10-10T00:45:59.6753683Z "digest": "sha256:0c32f5cb2c476d0602ac51d1ba79c514253ef564c67abdf96c715b43478b6350" 2025-10-10T00:45:59.6754125Z }, 2025-10-10T00:45:59.6754314Z { 2025-10-10T00:45:59.6754625Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6755029Z "size": 703, 2025-10-10T00:45:59.6755417Z "digest": "sha256:70689f59dfb338fa8873920df080deffcd3f25861090773b7d045d0c5a1285d0" 2025-10-10T00:45:59.6755856Z }, 2025-10-10T00:45:59.6756042Z { 2025-10-10T00:45:59.6756363Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6756764Z "size": 1215, 2025-10-10T00:45:59.6757159Z "digest": "sha256:e0e260d97e48adf66986d23f296852f807477ca40ef84e996eccb966a00d1007" 2025-10-10T00:45:59.6757597Z }, 2025-10-10T00:45:59.6757798Z { 2025-10-10T00:45:59.6758117Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6758516Z "size": 484, 2025-10-10T00:45:59.6758892Z "digest": "sha256:447b9f68411929ed23d1985366cee316c228c567732004c0251851a5226e3346" 2025-10-10T00:45:59.6759324Z }, 2025-10-10T00:45:59.6759519Z { 2025-10-10T00:45:59.6759845Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6760237Z "size": 110343522, 2025-10-10T00:45:59.6760651Z "digest": "sha256:fc4a90a2e4bea59386a1521e5b9258825d5a6938b035dc686f1cdf6abb614a98" 2025-10-10T00:45:59.6761098Z }, 
2025-10-10T00:45:59.6761287Z { 2025-10-10T00:45:59.6761602Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6762271Z "size": 5036, 2025-10-10T00:45:59.6762673Z "digest": "sha256:bddc1d18a3d6ffb4a405012119238059a9ff6b0b5cf99ff07d8b450da5e7a62c" 2025-10-10T00:45:59.6763123Z }, 2025-10-10T00:45:59.6763309Z { 2025-10-10T00:45:59.6763774Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6764176Z "size": 1709, 2025-10-10T00:45:59.6764568Z "digest": "sha256:6541008b1d6a3623c69c59142dc4bce9e776540d8cdc1870114ddda7a20f93f8" 2025-10-10T00:45:59.6765010Z }, 2025-10-10T00:45:59.6765199Z { 2025-10-10T00:45:59.6765516Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6765912Z "size": 724, 2025-10-10T00:45:59.6766312Z "digest": "sha256:954a2d7b7a43ef19f402520fa8ddf2025bdc12bf25cab78f9ed33fa49649ec80" 2025-10-10T00:45:59.6766763Z }, 2025-10-10T00:45:59.6766953Z { 2025-10-10T00:45:59.6767358Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6767749Z "size": 543, 2025-10-10T00:45:59.6768149Z "digest": "sha256:757b602ab7ba23e8992f66ee2d27b3f81ff0395b74d8b935211e172a14da967f" 2025-10-10T00:45:59.6768596Z }, 2025-10-10T00:45:59.6768783Z { 2025-10-10T00:45:59.6769102Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6769498Z "size": 3252549426, 2025-10-10T00:45:59.6769899Z "digest": "sha256:7a66b563415761efd6546abf65ebe5999d7a06af8857e1506be24205987b384c" 2025-10-10T00:45:59.6770336Z }, 2025-10-10T00:45:59.6770518Z { 2025-10-10T00:45:59.6770862Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6771410Z "size": 32, 2025-10-10T00:45:59.6771942Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2025-10-10T00:45:59.6772500Z }, 2025-10-10T00:45:59.6772688Z { 2025-10-10T00:45:59.6773006Z "mediaType": 
"application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6773403Z "size": 381, 2025-10-10T00:45:59.6774122Z "digest": "sha256:a272ef38f8ce49245ad653e0cd55b38aab7b9d5180bc744400b5cf69c4580a0d" 2025-10-10T00:45:59.6774600Z }, 2025-10-10T00:45:59.6774789Z { 2025-10-10T00:45:59.6775112Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6775512Z "size": 236069, 2025-10-10T00:45:59.6775904Z "digest": "sha256:a4e16af38ef98db595d32b48a525f677c01909e643e48c81d1cb195c3fee8ebc" 2025-10-10T00:45:59.6776343Z }, 2025-10-10T00:45:59.6776531Z { 2025-10-10T00:45:59.6776850Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6777238Z "size": 231, 2025-10-10T00:45:59.6777627Z "digest": "sha256:015ebdb069f7feab1651069139a0cb3cc9803ba197f5c46fc133779d31b61030" 2025-10-10T00:45:59.6778077Z }, 2025-10-10T00:45:59.6778270Z { 2025-10-10T00:45:59.6778585Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6778992Z "size": 3539938, 2025-10-10T00:45:59.6779388Z "digest": "sha256:774f7b49312f0299a66eb62f46346ea5f71819890c9bc278dbab666305a7b59d" 2025-10-10T00:45:59.6779831Z }, 2025-10-10T00:45:59.6780018Z { 2025-10-10T00:45:59.6780343Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6780746Z "size": 1479, 2025-10-10T00:45:59.6781153Z "digest": "sha256:0424a3772840ccdb1709d0599a2a23ecad3ecac66f1d78dee33ce2a34cca3aed" 2025-10-10T00:45:59.6781593Z }, 2025-10-10T00:45:59.6781789Z { 2025-10-10T00:45:59.6782108Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6782509Z "size": 482, 2025-10-10T00:45:59.6782896Z "digest": "sha256:4355c82d0b076e33626dcda0ab63757f43e07c1860845ee837ce0d27f91b1008" 2025-10-10T00:45:59.6783347Z }, 2025-10-10T00:45:59.6783541Z { 2025-10-10T00:45:59.6783863Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6784256Z "size": 200, 
2025-10-10T00:45:59.6784652Z "digest": "sha256:540f54cb26f4a87582a229a42d14fdf7845fa87c3ddac44073d4fa4983808159" 2025-10-10T00:45:59.6785093Z }, 2025-10-10T00:45:59.6785393Z { 2025-10-10T00:45:59.6785706Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6786103Z "size": 608, 2025-10-10T00:45:59.6786594Z "digest": "sha256:fb3109e064fb68fcdf55d805bd5d505d4b2d7ba683a7239616f3205bfc22b984" 2025-10-10T00:45:59.6787047Z }, 2025-10-10T00:45:59.6787232Z { 2025-10-10T00:45:59.6787555Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6787955Z "size": 7871298247, 2025-10-10T00:45:59.6788365Z "digest": "sha256:b97b9f73a17c453e588434989d1f1dac003e68d7adf901f4d93aa26fa620bf7c" 2025-10-10T00:45:59.6788812Z }, 2025-10-10T00:45:59.6789001Z { 2025-10-10T00:45:59.6789318Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6789719Z "size": 828, 2025-10-10T00:45:59.6790113Z "digest": "sha256:916d375b255ab9ba770536591f11e96ccfb511090b6e6397f3bb346a511ea8cd" 2025-10-10T00:45:59.6790554Z }, 2025-10-10T00:45:59.6790750Z { 2025-10-10T00:45:59.6791080Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6791495Z "size": 33451725, 2025-10-10T00:45:59.6791945Z "digest": "sha256:dfbfc2d91df60b30357ae6548e620847b73cbeff8164dea1130ddd967b038965" 2025-10-10T00:45:59.6792424Z }, 2025-10-10T00:45:59.6792619Z { 2025-10-10T00:45:59.6792946Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6793335Z "size": 104, 2025-10-10T00:45:59.6793732Z "digest": "sha256:012785c1f3f5422d9e3266af997b56162502b0dc9caeb30fe4f73ac15ead3e4d" 2025-10-10T00:45:59.6794176Z }, 2025-10-10T00:45:59.6794383Z { 2025-10-10T00:45:59.6794706Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6795104Z "size": 1495, 2025-10-10T00:45:59.6795501Z "digest": 
"sha256:aaef93273d5dc8ba224c173dd276d33d4516cf0be3e0ab06da7a9bb346ee91a3" 2025-10-10T00:45:59.6795959Z }, 2025-10-10T00:45:59.6796153Z { 2025-10-10T00:45:59.6796469Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6796866Z "size": 453792519, 2025-10-10T00:45:59.6797279Z "digest": "sha256:732deb59e36e883ad74c429e2c50a99789dfd31ca377c747aab8b1432e306b90" 2025-10-10T00:45:59.6797727Z }, 2025-10-10T00:45:59.6797912Z { 2025-10-10T00:45:59.6798226Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6798875Z "size": 163, 2025-10-10T00:45:59.6799263Z "digest": "sha256:8d7304dd5719df3109a2c34be0fe309ddc9403958af441c2d693ec85fc16a7ad" 2025-10-10T00:45:59.6799701Z }, 2025-10-10T00:45:59.6799883Z { 2025-10-10T00:45:59.6800198Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6800600Z "size": 346, 2025-10-10T00:45:59.6800980Z "digest": "sha256:9427facc45489f426177706ff763d3427cf4e04bb0d898d1ba280027829c89d8" 2025-10-10T00:45:59.6801409Z }, 2025-10-10T00:45:59.6801594Z { 2025-10-10T00:45:59.6801907Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6802303Z "size": 32, 2025-10-10T00:45:59.6802690Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2025-10-10T00:45:59.6803144Z }, 2025-10-10T00:45:59.6803336Z { 2025-10-10T00:45:59.6803660Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6804045Z "size": 106, 2025-10-10T00:45:59.6804443Z "digest": "sha256:094a1b5be591ab0f554c9878b44790486fb69d8b4b7c8b973bdde2cfc28a319b" 2025-10-10T00:45:59.6804890Z }, 2025-10-10T00:45:59.6805086Z { 2025-10-10T00:45:59.6805394Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6805795Z "size": 425, 2025-10-10T00:45:59.6806193Z "digest": "sha256:3194bad7208279dbf4b6ab6fbe10d2abc486216750e1cdad06e57a4662a1bdaf" 2025-10-10T00:45:59.6806645Z }, 
2025-10-10T00:45:59.6806832Z { 2025-10-10T00:45:59.6807218Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6807611Z "size": 19309415, 2025-10-10T00:45:59.6808141Z "digest": "sha256:363217a81a98f330190b0a947c38d916b55314c4afc797152164dfd2b73a24a9" 2025-10-10T00:45:59.6808560Z }, 2025-10-10T00:45:59.6808753Z { 2025-10-10T00:45:59.6809225Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6809629Z "size": 108, 2025-10-10T00:45:59.6810033Z "digest": "sha256:e3c423926c143a5a642bd860db3344a08bb0e25a926bcacbfb880dae359812ab" 2025-10-10T00:45:59.6810487Z }, 2025-10-10T00:45:59.6810686Z { 2025-10-10T00:45:59.6811009Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6811403Z "size": 639, 2025-10-10T00:45:59.6811803Z "digest": "sha256:b0da2665c54862c0c7bc7fe48bb4bccd145a0661780a03aa993545d78b3ddea7" 2025-10-10T00:45:59.6812260Z }, 2025-10-10T00:45:59.6812451Z { 2025-10-10T00:45:59.6812768Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6813169Z "size": 724, 2025-10-10T00:45:59.6813577Z "digest": "sha256:954a2d7b7a43ef19f402520fa8ddf2025bdc12bf25cab78f9ed33fa49649ec80" 2025-10-10T00:45:59.6814031Z }, 2025-10-10T00:45:59.6814215Z { 2025-10-10T00:45:59.6814542Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6814949Z "size": 148, 2025-10-10T00:45:59.6815349Z "digest": "sha256:2348e9e015bffc9fa96b06ce47bb33567a4c148e942fdf2795d61b341ee156a0" 2025-10-10T00:45:59.6815798Z }, 2025-10-10T00:45:59.6816002Z { 2025-10-10T00:45:59.6816331Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6816727Z "size": 135, 2025-10-10T00:45:59.6817112Z "digest": "sha256:2a52776326f7d5686493d3cef7252cb528c71cd5dda8239aaa617ee1481d4a3d" 2025-10-10T00:45:59.6817554Z }, 2025-10-10T00:45:59.6817746Z { 2025-10-10T00:45:59.6818068Z "mediaType": 
"application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6818465Z "size": 140, 2025-10-10T00:45:59.6818862Z "digest": "sha256:034e6954f81ccaa27b149b815708660cdd2881ed9741444ff77b6edcff67abbc" 2025-10-10T00:45:59.6819317Z }, 2025-10-10T00:45:59.6819512Z { 2025-10-10T00:45:59.6819833Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6820257Z "size": 32, 2025-10-10T00:45:59.6820658Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2025-10-10T00:45:59.6821116Z }, 2025-10-10T00:45:59.6821304Z { 2025-10-10T00:45:59.6821632Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6822033Z "size": 222, 2025-10-10T00:45:59.6822424Z "digest": "sha256:846309eb8d412fe2691832129ce8f4a9d8a7eabd0e784975b3951727a592d1aa" 2025-10-10T00:45:59.6822876Z }, 2025-10-10T00:45:59.6823069Z { 2025-10-10T00:45:59.6823402Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6823812Z "size": 312300632, 2025-10-10T00:45:59.6824209Z "digest": "sha256:d55a9a44329d60720f399c31761f0267874fe1805b49e82488de4ca8f6fc89b3" 2025-10-10T00:45:59.6824657Z }, 2025-10-10T00:45:59.6824855Z { 2025-10-10T00:45:59.6825187Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6825585Z "size": 3136957676, 2025-10-10T00:45:59.6826003Z "digest": "sha256:95a331d8771d577f375297638de00712e150bac6d7e1d85b6e462b1e5d84644f" 2025-10-10T00:45:59.6826459Z }, 2025-10-10T00:45:59.6826656Z { 2025-10-10T00:45:59.6826980Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6827387Z "size": 128, 2025-10-10T00:45:59.6827790Z "digest": "sha256:1e8ec4d415bd4f7262b44fdcf078ebb5237ca4783e0224223fb1b41787da516b" 2025-10-10T00:45:59.6828252Z }, 2025-10-10T00:45:59.6828456Z { 2025-10-10T00:45:59.6828789Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6829212Z "size": 880, 
2025-10-10T00:45:59.6829637Z "digest": "sha256:f201e431abd382ad3ca7f251d7a97828bb378171252a0c223160bd7d730f1004" 2025-10-10T00:45:59.6830108Z }, 2025-10-10T00:45:59.6830391Z { 2025-10-10T00:45:59.6830731Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6831145Z "size": 724, 2025-10-10T00:45:59.6831651Z "digest": "sha256:954a2d7b7a43ef19f402520fa8ddf2025bdc12bf25cab78f9ed33fa49649ec80" 2025-10-10T00:45:59.6832113Z }, 2025-10-10T00:45:59.6832322Z { 2025-10-10T00:45:59.6832661Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6833070Z "size": 141, 2025-10-10T00:45:59.6833487Z "digest": "sha256:feab5475b34a50d2dccdbcd5ab6690f1585c4e7c4a502b09e9ceb13850cf1bde" 2025-10-10T00:45:59.6833961Z }, 2025-10-10T00:45:59.6834176Z { 2025-10-10T00:45:59.6834509Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6834916Z "size": 32, 2025-10-10T00:45:59.6835332Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2025-10-10T00:45:59.6835810Z }, 2025-10-10T00:45:59.6836021Z { 2025-10-10T00:45:59.6836365Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6836796Z "size": 160, 2025-10-10T00:45:59.6837191Z "digest": "sha256:378fd2234507125805f303057645359a726ac2ccb138f684cb2969edc4e98c4a" 2025-10-10T00:45:59.6837637Z }, 2025-10-10T00:45:59.6837831Z { 2025-10-10T00:45:59.6838170Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6838570Z "size": 1012, 2025-10-10T00:45:59.6838983Z "digest": "sha256:1fae04b5f599ad99b51e54a8ed8e53f3fc9329126fbbb6f4382df9ae3c2cf455" 2025-10-10T00:45:59.6839439Z }, 2025-10-10T00:45:59.6839636Z { 2025-10-10T00:45:59.6839964Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6840369Z "size": 724, 2025-10-10T00:45:59.6840775Z "digest": "sha256:954a2d7b7a43ef19f402520fa8ddf2025bdc12bf25cab78f9ed33fa49649ec80" 
2025-10-10T00:45:59.6841241Z }, 2025-10-10T00:45:59.6841446Z { 2025-10-10T00:45:59.6841780Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6842186Z "size": 135, 2025-10-10T00:45:59.6842600Z "digest": "sha256:f659162a95c52dc73ae220b54e26d905db7a4c013acee1cae18ddc327374e15a" 2025-10-10T00:45:59.6843061Z }, 2025-10-10T00:45:59.6843263Z { 2025-10-10T00:45:59.6843584Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6843989Z "size": 32, 2025-10-10T00:45:59.6844399Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2025-10-10T00:45:59.6844867Z }, 2025-10-10T00:45:59.6845051Z { 2025-10-10T00:45:59.6845374Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6845781Z "size": 158, 2025-10-10T00:45:59.6846176Z "digest": "sha256:2058b9a977f21ecc711e48161566bdc41c149c7c37663dc4b059b30d8c864e37" 2025-10-10T00:45:59.6846612Z }, 2025-10-10T00:45:59.6846804Z { 2025-10-10T00:45:59.6847179Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6847581Z "size": 1371, 2025-10-10T00:45:59.6847979Z "digest": "sha256:dab6b0cf2780455b96f8cce7d7fb4d6eefd4bfeb2752fc63942f391b60318133" 2025-10-10T00:45:59.6848426Z }, 2025-10-10T00:45:59.6848616Z { 2025-10-10T00:45:59.6848942Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6849328Z "size": 32, 2025-10-10T00:45:59.6849725Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2025-10-10T00:45:59.6850180Z }, 2025-10-10T00:45:59.6850372Z { 2025-10-10T00:45:59.6850683Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6851089Z "size": 137, 2025-10-10T00:45:59.6851481Z "digest": "sha256:80b892828c2018f5fcf5f7e0cd5a46d548f62864ccdd5db253932b48bdb61ba6" 2025-10-10T00:45:59.6851924Z }, 2025-10-10T00:45:59.6852101Z { 2025-10-10T00:45:59.6852421Z "mediaType": 
"application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6852819Z "size": 527, 2025-10-10T00:45:59.6853322Z "digest": "sha256:39c871222ef73b099b88a6dac2e570adabb31aa07c118b552a1d3f2ba99ffda2" 2025-10-10T00:45:59.6853776Z }, 2025-10-10T00:45:59.6853979Z { 2025-10-10T00:45:59.6854392Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6854800Z "size": 32, 2025-10-10T00:45:59.6855199Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2025-10-10T00:45:59.6855670Z }, 2025-10-10T00:45:59.6855871Z { 2025-10-10T00:45:59.6856202Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6856606Z "size": 104, 2025-10-10T00:45:59.6857008Z "digest": "sha256:435d0b76171940790f1d319d7f1174d5a0a53da17e173a09b471486dea73483f" 2025-10-10T00:45:59.6857446Z }, 2025-10-10T00:45:59.6857643Z { 2025-10-10T00:45:59.6857969Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6858367Z "size": 429, 2025-10-10T00:45:59.6858756Z "digest": "sha256:eceb7605f7313336b4f766e60e8653f47601d1f7076bc63ba69588a0195e3229" 2025-10-10T00:45:59.6859213Z }, 2025-10-10T00:45:59.6859411Z { 2025-10-10T00:45:59.6859746Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6860141Z "size": 32, 2025-10-10T00:45:59.6860545Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2025-10-10T00:45:59.6861007Z }, 2025-10-10T00:45:59.6861200Z { 2025-10-10T00:45:59.6861514Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6861905Z "size": 109, 2025-10-10T00:45:59.6862308Z "digest": "sha256:a304fa6d1d55c5fbebb8ec8e5a1c7bb142e8af2e951d473b397efbda2a45c510" 2025-10-10T00:45:59.6862762Z }, 2025-10-10T00:45:59.6862939Z { 2025-10-10T00:45:59.6863251Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6863645Z "size": 1896, 
2025-10-10T00:45:59.6864052Z "digest": "sha256:dddcb55ac23cfb9ef5a3f950702cc267d698b5402aa2dc47e930f83bacd7c980" 2025-10-10T00:45:59.6864502Z }, 2025-10-10T00:45:59.6864695Z { 2025-10-10T00:45:59.6865015Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6865410Z "size": 244516374, 2025-10-10T00:45:59.6865828Z "digest": "sha256:13fafff4fe6da8831ff415365691d8cb9d7a15bdfc76d0b76bb2de39170ce19d" 2025-10-10T00:45:59.6866281Z }, 2025-10-10T00:45:59.6866470Z { 2025-10-10T00:45:59.6866786Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6867179Z "size": 106, 2025-10-10T00:45:59.6867569Z "digest": "sha256:99bbffa40cd4b1427108a83be4e7c6a0e1c92776b9ab36f970dacb05b47e0c90" 2025-10-10T00:45:59.6868030Z }, 2025-10-10T00:45:59.6868226Z { 2025-10-10T00:45:59.6868542Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6868937Z "size": 164, 2025-10-10T00:45:59.6869321Z "digest": "sha256:8e33d89cd243b62e2c943572c35c7cc768da414b81d5de615810ef4859ff346e" 2025-10-10T00:45:59.6869770Z }, 2025-10-10T00:45:59.6869963Z { 2025-10-10T00:45:59.6870378Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6870861Z "size": 7943, 2025-10-10T00:45:59.6871554Z "digest": "sha256:631373e616c1a3053aef9f876d586e3b4cbca24922396f58fcdb67f1a5f19dc0" 2025-10-10T00:45:59.6872052Z }, 2025-10-10T00:45:59.6872345Z { 2025-10-10T00:45:59.6872828Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6873344Z "size": 8067, 2025-10-10T00:45:59.6883997Z "digest": "sha256:e481af7e1aecf5d9650753ed342ff9c26bae052b68d8518277a8df0e502ae9f4" 2025-10-10T00:45:59.6884453Z }, 2025-10-10T00:45:59.6884641Z { 2025-10-10T00:45:59.6884953Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6885342Z "size": 303, 2025-10-10T00:45:59.6885739Z "digest": "sha256:8ca97272458cf79b384bbce300aec89d97a5eb77177b1af3ae65622b588ff126" 
2025-10-10T00:45:59.6886182Z }, 2025-10-10T00:45:59.6886371Z { 2025-10-10T00:45:59.6886829Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6887305Z "size": 13360640, 2025-10-10T00:45:59.6887796Z "digest": "sha256:55788cbe41f5bf13f05cba31189b0375d867b50a900c1ccb7c5336ef3f001cd2" 2025-10-10T00:45:59.6888235Z }, 2025-10-10T00:45:59.6888415Z { 2025-10-10T00:45:59.6888730Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6889122Z "size": 105, 2025-10-10T00:45:59.6889521Z "digest": "sha256:cbf6e78aa3723a7132ef2ed135e43b6f5b51b8de69639852d63bf9a7d103c26b" 2025-10-10T00:45:59.6889959Z }, 2025-10-10T00:45:59.6890148Z { 2025-10-10T00:45:59.6890462Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6890852Z "size": 54145652, 2025-10-10T00:45:59.6891245Z "digest": "sha256:b7e76793f7847e2d50a6b22400872d7e586e148707376ec566f29db0c45d8f08" 2025-10-10T00:45:59.6891676Z }, 2025-10-10T00:45:59.6891861Z { 2025-10-10T00:45:59.6892176Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-10-10T00:45:59.6892569Z "size": 32, 2025-10-10T00:45:59.6892955Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2025-10-10T00:45:59.6893405Z } 2025-10-10T00:45:59.6893594Z ] 2025-10-10T00:45:59.6893775Z } 2025-10-10T00:45:59.6893975Z + exit 0 2025-10-10T00:45:59.6930699Z ##[group]Run set -eux 2025-10-10T00:45:59.6930974Z set -eux 2025-10-10T00:45:59.6931379Z # It's ok if this steps fails, it would then be an anonymous user like what we used to have 2025-10-10T00:45:59.6932431Z aws secretsmanager get-secret-value --secret-id docker_hub_readonly_token | jq --raw-output '.SecretString' | jq -r .docker_hub_readonly_token | docker login --username pytorchbot --password-stdin || true 2025-10-10T00:45:59.6942209Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T00:45:59.6942555Z env: 2025-10-10T00:45:59.6942770Z 
GIT_DEFAULT_BRANCH: main 2025-10-10T00:45:59.6943031Z ##[endgroup] 2025-10-10T00:45:59.6977328Z + aws secretsmanager get-secret-value --secret-id docker_hub_readonly_token 2025-10-10T00:45:59.6978619Z + jq --raw-output .SecretString 2025-10-10T00:45:59.6979595Z + jq -r .docker_hub_readonly_token 2025-10-10T00:45:59.6982008Z + docker login --username pytorchbot --password-stdin 2025-10-10T00:46:00.2709898Z WARNING! Your password will be stored unencrypted in /home/ec2-user/.docker/config.json. 2025-10-10T00:46:00.2710606Z Configure a credential helper to remove this warning. See 2025-10-10T00:46:00.2711162Z https://docs.docker.com/engine/reference/commandline/login/#credentials-store 2025-10-10T00:46:00.2711532Z 2025-10-10T00:46:00.2712157Z Login Succeeded 2025-10-10T00:46:00.2816067Z ##[group]Run tag=${ECR_DOCKER_IMAGE##*:} 2025-10-10T00:46:00.2816431Z tag=${ECR_DOCKER_IMAGE##*:} 2025-10-10T00:46:00.2816810Z echo "docker pull ghcr.io/pytorch/ci-image:${tag/:/-}" 2025-10-10T00:46:00.2825913Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T00:46:00.2826283Z env: 2025-10-10T00:46:00.2826493Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:46:00.2827278Z ECR_DOCKER_IMAGE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-d8be0384e085f551506bd739678109fa0f5ee7ac 2025-10-10T00:46:00.2828061Z ##[endgroup] 2025-10-10T00:46:00.2860754Z docker pull ghcr.io/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-d8be0384e085f551506bd739678109fa0f5ee7ac 2025-10-10T00:46:00.2913108Z ##[group]Run pytorch/test-infra/.github/actions/pull-docker-image@main 2025-10-10T00:46:00.2913517Z with: 2025-10-10T00:46:00.2914236Z docker-image: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-d8be0384e085f551506bd739678109fa0f5ee7ac 2025-10-10T00:46:00.2915101Z docker-registry: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-10-10T00:46:00.2915458Z env: 
2025-10-10T00:46:00.2915670Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:46:00.2916105Z ##[endgroup] 2025-10-10T00:46:00.2940738Z ##[group]Run set -x 2025-10-10T00:46:00.2940998Z set -x 2025-10-10T00:46:00.2941232Z set +e 2025-10-10T00:46:00.2941458Z  2025-10-10T00:46:00.2941676Z login() { 2025-10-10T00:46:00.2942176Z  aws ecr get-login-password --region us-east-1 | docker login -u AWS --password-stdin "$1" 2025-10-10T00:46:00.2942918Z } 2025-10-10T00:46:00.2943235Z  2025-10-10T00:46:00.2955468Z retry () { 2025-10-10T00:46:00.2955752Z  $* || (sleep 1 && $*) || (sleep 2 && $*) 2025-10-10T00:46:00.2956065Z } 2025-10-10T00:46:00.2956296Z  2025-10-10T00:46:00.2956534Z retry login "${DOCKER_REGISTRY}" 2025-10-10T00:46:00.2956830Z  2025-10-10T00:46:00.2957300Z IMAGE_SIZE=$(docker manifest inspect "${DOCKER_IMAGE}" | jq '[.layers[].size, .config.size] | add / 1024 / 1024') 2025-10-10T00:46:00.2957927Z echo "Compressed size of image in MB: ${IMAGE_SIZE}" 2025-10-10T00:46:00.2958306Z  2025-10-10T00:46:00.2958531Z set -e 2025-10-10T00:46:00.2958866Z # ignore output since only exit code is used for conditional 2025-10-10T00:46:00.2959343Z # only pull docker image if it's not available locally 2025-10-10T00:46:00.2959868Z if ! 
docker inspect --type=image "${DOCKER_IMAGE}" >/dev/null 2>/dev/null; then 2025-10-10T00:46:00.2960363Z  retry docker pull "${DOCKER_IMAGE}" 2025-10-10T00:46:00.2960667Z fi 2025-10-10T00:46:00.2969117Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T00:46:00.2969473Z env: 2025-10-10T00:46:00.2969695Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:46:00.2970466Z DOCKER_IMAGE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-d8be0384e085f551506bd739678109fa0f5ee7ac 2025-10-10T00:46:00.2971331Z DOCKER_REGISTRY: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-10-10T00:46:00.2971706Z ##[endgroup] 2025-10-10T00:46:00.3004915Z + set +e 2025-10-10T00:46:00.3005353Z + retry login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-10-10T00:46:00.3005788Z + login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-10-10T00:46:00.3009527Z + aws ecr get-login-password --region us-east-1 2025-10-10T00:46:00.3010229Z + docker login -u AWS --password-stdin 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-10-10T00:46:00.8325403Z WARNING! Your password will be stored unencrypted in /home/ec2-user/.docker/config.json. 2025-10-10T00:46:00.8326563Z Configure a credential helper to remove this warning. 
See 2025-10-10T00:46:00.8327721Z https://docs.docker.com/engine/reference/commandline/login/#credentials-store 2025-10-10T00:46:00.8328478Z 2025-10-10T00:46:00.8328843Z Login Succeeded 2025-10-10T00:46:00.8360737Z ++ jq '[.layers[].size, .config.size] | add / 1024 / 1024' 2025-10-10T00:46:00.8361895Z ++ docker manifest inspect 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-d8be0384e085f551506bd739678109fa0f5ee7ac 2025-10-10T00:46:01.1105636Z + IMAGE_SIZE=15115.717358589172 2025-10-10T00:46:01.1106113Z Compressed size of image in MB: 15115.717358589172 2025-10-10T00:46:01.1106667Z + echo 'Compressed size of image in MB: 15115.717358589172' 2025-10-10T00:46:01.1107027Z + set -e 2025-10-10T00:46:01.1108113Z + docker inspect --type=image 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-d8be0384e085f551506bd739678109fa0f5ee7ac 2025-10-10T00:46:01.1248082Z + retry docker pull 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-d8be0384e085f551506bd739678109fa0f5ee7ac 2025-10-10T00:46:01.1249491Z + docker pull 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-d8be0384e085f551506bd739678109fa0f5ee7ac 2025-10-10T00:46:01.3812458Z pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-d8be0384e085f551506bd739678109fa0f5ee7ac: Pulling from pytorch/ci-image 2025-10-10T00:46:01.3814454Z 828c1365039a: Pulling fs layer 2025-10-10T00:46:01.3814864Z bb2a9ef82ad2: Pulling fs layer 2025-10-10T00:46:01.3815285Z 0f76777182a7: Pulling fs layer 2025-10-10T00:46:01.3815675Z 5c003cead6fc: Pulling fs layer 2025-10-10T00:46:01.3816073Z 0c32f5cb2c47: Pulling fs layer 2025-10-10T00:46:01.3816441Z 70689f59dfb3: Pulling fs layer 2025-10-10T00:46:01.3817016Z e0e260d97e48: Pulling fs layer 2025-10-10T00:46:01.3817398Z 447b9f684119: Pulling fs layer 2025-10-10T00:46:01.3817808Z 
fc4a90a2e4be: Pulling fs layer 2025-10-10T00:46:01.3818209Z bddc1d18a3d6: Pulling fs layer 2025-10-10T00:46:01.3818606Z 6541008b1d6a: Pulling fs layer 2025-10-10T00:46:01.3819015Z 954a2d7b7a43: Pulling fs layer 2025-10-10T00:46:01.3819395Z 757b602ab7ba: Pulling fs layer 2025-10-10T00:46:01.3819756Z 7a66b5634157: Pulling fs layer 2025-10-10T00:46:01.3820127Z 4f4fb700ef54: Pulling fs layer 2025-10-10T00:46:01.3820499Z a272ef38f8ce: Pulling fs layer 2025-10-10T00:46:01.3820885Z a4e16af38ef9: Pulling fs layer 2025-10-10T00:46:01.3821243Z 015ebdb069f7: Pulling fs layer 2025-10-10T00:46:01.3821602Z 774f7b49312f: Pulling fs layer 2025-10-10T00:46:01.3821962Z 0424a3772840: Pulling fs layer 2025-10-10T00:46:01.3822239Z 4355c82d0b07: Pulling fs layer 2025-10-10T00:46:01.3822528Z 540f54cb26f4: Pulling fs layer 2025-10-10T00:46:01.3822923Z fb3109e064fb: Pulling fs layer 2025-10-10T00:46:01.3823288Z b97b9f73a17c: Pulling fs layer 2025-10-10T00:46:01.3823641Z 5c003cead6fc: Waiting 2025-10-10T00:46:01.3823946Z e0e260d97e48: Waiting 2025-10-10T00:46:01.3824271Z 916d375b255a: Pulling fs layer 2025-10-10T00:46:01.3824629Z fc4a90a2e4be: Waiting 2025-10-10T00:46:01.3824965Z dfbfc2d91df6: Pulling fs layer 2025-10-10T00:46:01.3825365Z 012785c1f3f5: Pulling fs layer 2025-10-10T00:46:01.3825700Z 0c32f5cb2c47: Waiting 2025-10-10T00:46:01.3826122Z aaef93273d5d: Pulling fs layer 2025-10-10T00:46:01.3826495Z 757b602ab7ba: Waiting 2025-10-10T00:46:01.3826814Z bddc1d18a3d6: Waiting 2025-10-10T00:46:01.3827113Z 7a66b5634157: Waiting 2025-10-10T00:46:01.3827430Z 732deb59e36e: Pulling fs layer 2025-10-10T00:46:01.3827783Z a272ef38f8ce: Waiting 2025-10-10T00:46:01.3828112Z 8d7304dd5719: Pulling fs layer 2025-10-10T00:46:01.3828539Z 9427facc4548: Pulling fs layer 2025-10-10T00:46:01.3828889Z 954a2d7b7a43: Waiting 2025-10-10T00:46:01.3829201Z a4e16af38ef9: Waiting 2025-10-10T00:46:01.3829514Z 094a1b5be591: Pulling fs layer 2025-10-10T00:46:01.3829787Z 3194bad72082: Pulling fs layer 
2025-10-10T00:46:01.3830045Z 447b9f684119: Waiting 2025-10-10T00:46:01.3830279Z 015ebdb069f7: Waiting 2025-10-10T00:46:01.3830517Z 6541008b1d6a: Waiting 2025-10-10T00:46:01.3830739Z 774f7b49312f: Waiting 2025-10-10T00:46:01.3830986Z 363217a81a98: Pulling fs layer 2025-10-10T00:46:01.3831249Z 4f4fb700ef54: Waiting 2025-10-10T00:46:01.3831476Z 0424a3772840: Waiting 2025-10-10T00:46:01.3831721Z e3c423926c14: Pulling fs layer 2025-10-10T00:46:01.3831983Z fb3109e064fb: Waiting 2025-10-10T00:46:01.3832237Z b0da2665c548: Pulling fs layer 2025-10-10T00:46:01.3832494Z 732deb59e36e: Waiting 2025-10-10T00:46:01.3832743Z 8d7304dd5719: Waiting 2025-10-10T00:46:01.3833023Z aaef93273d5d: Waiting 2025-10-10T00:46:01.3833273Z 2348e9e015bf: Pulling fs layer 2025-10-10T00:46:01.3833603Z 094a1b5be591: Waiting 2025-10-10T00:46:01.3833839Z 540f54cb26f4: Waiting 2025-10-10T00:46:01.3834134Z 2a52776326f7: Pulling fs layer 2025-10-10T00:46:01.3834406Z dfbfc2d91df6: Waiting 2025-10-10T00:46:01.3834931Z 034e6954f81c: Pulling fs layer 2025-10-10T00:46:01.3835202Z 363217a81a98: Waiting 2025-10-10T00:46:01.3835435Z 9427facc4548: Waiting 2025-10-10T00:46:01.3835665Z e3c423926c14: Waiting 2025-10-10T00:46:01.3835901Z 846309eb8d41: Pulling fs layer 2025-10-10T00:46:01.3836160Z 012785c1f3f5: Waiting 2025-10-10T00:46:01.3836402Z d55a9a44329d: Pulling fs layer 2025-10-10T00:46:01.3836665Z 2a52776326f7: Waiting 2025-10-10T00:46:01.3836888Z 70689f59dfb3: Waiting 2025-10-10T00:46:01.3837132Z 95a331d8771d: Pulling fs layer 2025-10-10T00:46:01.3837499Z 1e8ec4d415bd: Pulling fs layer 2025-10-10T00:46:01.3837769Z d55a9a44329d: Waiting 2025-10-10T00:46:01.3838011Z f201e431abd3: Pulling fs layer 2025-10-10T00:46:01.3838273Z 1e8ec4d415bd: Waiting 2025-10-10T00:46:01.3838512Z 2348e9e015bf: Waiting 2025-10-10T00:46:01.3838745Z 95a331d8771d: Waiting 2025-10-10T00:46:01.3838972Z f201e431abd3: Waiting 2025-10-10T00:46:01.3839206Z 916d375b255a: Waiting 2025-10-10T00:46:01.3839453Z 3194bad72082: Waiting 
2025-10-10T00:46:01.3839697Z 034e6954f81c: Waiting 2025-10-10T00:46:01.3839930Z 846309eb8d41: Waiting 2025-10-10T00:46:01.3840174Z feab5475b34a: Pulling fs layer 2025-10-10T00:46:01.3840447Z feab5475b34a: Waiting 2025-10-10T00:46:01.3840694Z 378fd2234507: Pulling fs layer 2025-10-10T00:46:01.3840977Z 1fae04b5f599: Pulling fs layer 2025-10-10T00:46:01.3841234Z 378fd2234507: Waiting 2025-10-10T00:46:01.3841481Z f659162a95c5: Pulling fs layer 2025-10-10T00:46:01.3841750Z 1fae04b5f599: Waiting 2025-10-10T00:46:01.3842048Z b0da2665c548: Waiting 2025-10-10T00:46:01.3842293Z 2058b9a977f2: Pulling fs layer 2025-10-10T00:46:01.3842571Z dab6b0cf2780: Pulling fs layer 2025-10-10T00:46:01.3842880Z 2058b9a977f2: Waiting 2025-10-10T00:46:01.3843135Z 80b892828c20: Pulling fs layer 2025-10-10T00:46:01.3843391Z dab6b0cf2780: Waiting 2025-10-10T00:46:01.3843636Z 4355c82d0b07: Waiting 2025-10-10T00:46:01.3843883Z 39c871222ef7: Pulling fs layer 2025-10-10T00:46:01.3844158Z 80b892828c20: Waiting 2025-10-10T00:46:01.3844382Z f659162a95c5: Waiting 2025-10-10T00:46:01.3844631Z 435d0b761719: Pulling fs layer 2025-10-10T00:46:01.3844904Z eceb7605f731: Pulling fs layer 2025-10-10T00:46:01.3845166Z 39c871222ef7: Waiting 2025-10-10T00:46:01.3845392Z 435d0b761719: Waiting 2025-10-10T00:46:01.3845623Z eceb7605f731: Waiting 2025-10-10T00:46:01.3845869Z a304fa6d1d55: Pulling fs layer 2025-10-10T00:46:01.3846148Z dddcb55ac23c: Pulling fs layer 2025-10-10T00:46:01.3846422Z 13fafff4fe6d: Pulling fs layer 2025-10-10T00:46:01.3846703Z 99bbffa40cd4: Pulling fs layer 2025-10-10T00:46:01.3846986Z 8e33d89cd243: Pulling fs layer 2025-10-10T00:46:01.3847383Z 631373e616c1: Pulling fs layer 2025-10-10T00:46:01.3847639Z a304fa6d1d55: Waiting 2025-10-10T00:46:01.3847890Z e481af7e1aec: Pulling fs layer 2025-10-10T00:46:01.3848163Z 13fafff4fe6d: Waiting 2025-10-10T00:46:01.3848412Z 8ca97272458c: Pulling fs layer 2025-10-10T00:46:01.3848670Z 99bbffa40cd4: Waiting 2025-10-10T00:46:01.3848920Z 55788cbe41f5: Pulling 
fs layer 2025-10-10T00:46:01.3849194Z cbf6e78aa372: Pulling fs layer 2025-10-10T00:46:01.3849522Z e481af7e1aec: Waiting 2025-10-10T00:46:01.3849757Z dddcb55ac23c: Waiting 2025-10-10T00:46:01.3850042Z 8ca97272458c: Waiting 2025-10-10T00:46:01.3850274Z 631373e616c1: Waiting 2025-10-10T00:46:01.3850506Z 55788cbe41f5: Waiting 2025-10-10T00:46:01.3850790Z b7e76793f784: Pulling fs layer 2025-10-10T00:46:01.3851053Z 8e33d89cd243: Waiting 2025-10-10T00:46:01.3851469Z cbf6e78aa372: Waiting 2025-10-10T00:46:01.4809848Z bb2a9ef82ad2: Download complete 2025-10-10T00:46:01.5567627Z 5c003cead6fc: Download complete 2025-10-10T00:46:01.6627043Z 0c32f5cb2c47: Download complete 2025-10-10T00:46:01.7469535Z 828c1365039a: Verifying Checksum 2025-10-10T00:46:01.7469891Z 828c1365039a: Download complete 2025-10-10T00:46:01.7599364Z 70689f59dfb3: Download complete 2025-10-10T00:46:01.8438063Z 447b9f684119: Verifying Checksum 2025-10-10T00:46:01.8438410Z 447b9f684119: Download complete 2025-10-10T00:46:01.8505793Z e0e260d97e48: Download complete 2025-10-10T00:46:01.9438607Z bddc1d18a3d6: Verifying Checksum 2025-10-10T00:46:01.9439466Z bddc1d18a3d6: Download complete 2025-10-10T00:46:02.0403498Z 6541008b1d6a: Verifying Checksum 2025-10-10T00:46:02.0403907Z 6541008b1d6a: Download complete 2025-10-10T00:46:02.1176637Z 954a2d7b7a43: Verifying Checksum 2025-10-10T00:46:02.2078326Z 757b602ab7ba: Verifying Checksum 2025-10-10T00:46:02.2078829Z 757b602ab7ba: Download complete 2025-10-10T00:46:02.9403221Z 828c1365039a: Pull complete 2025-10-10T00:46:02.9626672Z bb2a9ef82ad2: Pull complete 2025-10-10T00:46:03.0126009Z fc4a90a2e4be: Verifying Checksum 2025-10-10T00:46:03.0126647Z fc4a90a2e4be: Download complete 2025-10-10T00:46:03.0228357Z 4f4fb700ef54: Download complete 2025-10-10T00:46:03.1106526Z a272ef38f8ce: Download complete 2025-10-10T00:46:03.1998745Z a4e16af38ef9: Download complete 2025-10-10T00:46:03.2550891Z 015ebdb069f7: Verifying Checksum 2025-10-10T00:46:03.2551241Z 015ebdb069f7: 
Download complete 2025-10-10T00:46:03.3728808Z 774f7b49312f: Verifying Checksum 2025-10-10T00:46:03.3729313Z 774f7b49312f: Download complete 2025-10-10T00:46:03.4789681Z 0424a3772840: Download complete 2025-10-10T00:46:03.5565854Z 4355c82d0b07: Verifying Checksum 2025-10-10T00:46:03.5566194Z 4355c82d0b07: Download complete 2025-10-10T00:46:03.6431806Z 540f54cb26f4: Verifying Checksum 2025-10-10T00:46:03.6432117Z 540f54cb26f4: Download complete 2025-10-10T00:46:03.7174694Z fb3109e064fb: Verifying Checksum 2025-10-10T00:46:03.7175221Z fb3109e064fb: Download complete 2025-10-10T00:46:04.5817835Z 0f76777182a7: Verifying Checksum 2025-10-10T00:46:04.5818229Z 0f76777182a7: Download complete 2025-10-10T00:46:04.6692067Z 916d375b255a: Verifying Checksum 2025-10-10T00:46:04.6692554Z 916d375b255a: Download complete 2025-10-10T00:46:05.0801989Z dfbfc2d91df6: Verifying Checksum 2025-10-10T00:46:05.0802366Z dfbfc2d91df6: Download complete 2025-10-10T00:46:05.1793995Z 012785c1f3f5: Verifying Checksum 2025-10-10T00:46:05.1794357Z 012785c1f3f5: Download complete 2025-10-10T00:46:05.2639770Z aaef93273d5d: Verifying Checksum 2025-10-10T00:46:05.2640162Z aaef93273d5d: Download complete 2025-10-10T00:46:09.8539730Z 732deb59e36e: Verifying Checksum 2025-10-10T00:46:09.8540205Z 732deb59e36e: Download complete 2025-10-10T00:46:09.9449007Z 8d7304dd5719: Download complete 2025-10-10T00:46:10.0126882Z 9427facc4548: Download complete 2025-10-10T00:46:10.0938027Z 094a1b5be591: Verifying Checksum 2025-10-10T00:46:10.0938489Z 094a1b5be591: Download complete 2025-10-10T00:46:10.1562022Z 3194bad72082: Verifying Checksum 2025-10-10T00:46:10.1562427Z 3194bad72082: Download complete 2025-10-10T00:46:10.3929159Z 363217a81a98: Verifying Checksum 2025-10-10T00:46:10.3929503Z 363217a81a98: Download complete 2025-10-10T00:46:10.4555265Z e3c423926c14: Download complete 2025-10-10T00:46:10.5235666Z b0da2665c548: Download complete 2025-10-10T00:46:10.5904197Z 2348e9e015bf: Verifying Checksum 
2025-10-10T00:46:10.5904528Z 2348e9e015bf: Download complete 2025-10-10T00:46:10.6974278Z 2a52776326f7: Verifying Checksum 2025-10-10T00:46:10.6974784Z 2a52776326f7: Download complete 2025-10-10T00:46:10.7775745Z 034e6954f81c: Download complete 2025-10-10T00:46:10.8839640Z 846309eb8d41: Download complete 2025-10-10T00:46:14.0833584Z d55a9a44329d: Verifying Checksum 2025-10-10T00:46:14.0833956Z d55a9a44329d: Download complete 2025-10-10T00:46:14.6430582Z 0f76777182a7: Pull complete 2025-10-10T00:46:14.8390560Z 5c003cead6fc: Pull complete 2025-10-10T00:46:15.0389452Z 0c32f5cb2c47: Pull complete 2025-10-10T00:46:15.2468012Z 70689f59dfb3: Pull complete 2025-10-10T00:46:15.4661542Z e0e260d97e48: Pull complete 2025-10-10T00:46:15.6787007Z 447b9f684119: Pull complete 2025-10-10T00:46:18.3831147Z fc4a90a2e4be: Pull complete 2025-10-10T00:46:18.5173552Z bddc1d18a3d6: Pull complete 2025-10-10T00:46:18.6474460Z 6541008b1d6a: Pull complete 2025-10-10T00:46:18.7624127Z 954a2d7b7a43: Pull complete 2025-10-10T00:46:18.8968456Z 757b602ab7ba: Pull complete 2025-10-10T00:46:38.1101947Z 7a66b5634157: Verifying Checksum 2025-10-10T00:46:38.1102385Z 7a66b5634157: Download complete 2025-10-10T00:46:38.2028913Z 1e8ec4d415bd: Download complete 2025-10-10T00:46:38.2908192Z f201e431abd3: Verifying Checksum 2025-10-10T00:46:38.2908542Z f201e431abd3: Download complete 2025-10-10T00:46:38.3680656Z feab5475b34a: Download complete 2025-10-10T00:46:38.4574992Z 378fd2234507: Download complete 2025-10-10T00:46:38.5307638Z 1fae04b5f599: Verifying Checksum 2025-10-10T00:46:38.5307995Z 1fae04b5f599: Download complete 2025-10-10T00:46:38.6123885Z f659162a95c5: Verifying Checksum 2025-10-10T00:46:38.6124322Z f659162a95c5: Download complete 2025-10-10T00:46:38.6804300Z 2058b9a977f2: Verifying Checksum 2025-10-10T00:46:38.6804708Z 2058b9a977f2: Download complete 2025-10-10T00:46:38.7695175Z dab6b0cf2780: Download complete 2025-10-10T00:46:38.8532784Z 80b892828c20: Verifying Checksum 
2025-10-10T00:46:38.8533202Z 80b892828c20: Download complete 2025-10-10T00:46:38.9378465Z 39c871222ef7: Verifying Checksum 2025-10-10T00:46:38.9378943Z 39c871222ef7: Download complete 2025-10-10T00:46:39.0187118Z 435d0b761719: Verifying Checksum 2025-10-10T00:46:39.0187507Z 435d0b761719: Download complete 2025-10-10T00:46:39.1002375Z eceb7605f731: Verifying Checksum 2025-10-10T00:46:39.1002789Z eceb7605f731: Download complete 2025-10-10T00:46:39.1673557Z a304fa6d1d55: Download complete 2025-10-10T00:46:39.2497987Z dddcb55ac23c: Download complete 2025-10-10T00:46:41.7609892Z 13fafff4fe6d: Verifying Checksum 2025-10-10T00:46:41.7610259Z 13fafff4fe6d: Download complete 2025-10-10T00:46:41.8370032Z 99bbffa40cd4: Verifying Checksum 2025-10-10T00:46:41.8370388Z 99bbffa40cd4: Download complete 2025-10-10T00:46:41.8968816Z 8e33d89cd243: Verifying Checksum 2025-10-10T00:46:41.8969175Z 8e33d89cd243: Download complete 2025-10-10T00:46:41.9924720Z 631373e616c1: Verifying Checksum 2025-10-10T00:46:41.9927382Z 631373e616c1: Download complete 2025-10-10T00:46:42.0613708Z e481af7e1aec: Verifying Checksum 2025-10-10T00:46:42.0614199Z e481af7e1aec: Download complete 2025-10-10T00:46:42.1366682Z 8ca97272458c: Verifying Checksum 2025-10-10T00:46:42.1367110Z 8ca97272458c: Download complete 2025-10-10T00:46:42.3877968Z 55788cbe41f5: Verifying Checksum 2025-10-10T00:46:42.3878454Z 55788cbe41f5: Download complete 2025-10-10T00:46:42.4726974Z cbf6e78aa372: Verifying Checksum 2025-10-10T00:46:42.4727747Z cbf6e78aa372: Download complete 2025-10-10T00:46:43.4980058Z b7e76793f784: Verifying Checksum 2025-10-10T00:46:43.4980427Z b7e76793f784: Download complete 2025-10-10T00:46:45.5126185Z 95a331d8771d: Download complete 2025-10-10T00:47:23.1795919Z b97b9f73a17c: Verifying Checksum 2025-10-10T00:47:23.1796285Z b97b9f73a17c: Download complete 2025-10-10T00:47:55.6537697Z 7a66b5634157: Pull complete 2025-10-10T00:47:55.8715659Z 4f4fb700ef54: Pull complete 2025-10-10T00:47:56.0814939Z a272ef38f8ce: 
Pull complete 2025-10-10T00:47:56.3206647Z a4e16af38ef9: Pull complete 2025-10-10T00:47:56.5248634Z 015ebdb069f7: Pull complete 2025-10-10T00:47:56.7984615Z 774f7b49312f: Pull complete 2025-10-10T00:47:57.0192631Z 0424a3772840: Pull complete 2025-10-10T00:47:57.2505035Z 4355c82d0b07: Pull complete 2025-10-10T00:47:57.4728612Z 540f54cb26f4: Pull complete 2025-10-10T00:47:57.6736756Z fb3109e064fb: Pull complete 2025-10-10T00:49:30.3811540Z b97b9f73a17c: Pull complete 2025-10-10T00:49:30.5777906Z 916d375b255a: Pull complete 2025-10-10T00:49:31.2864801Z dfbfc2d91df6: Pull complete 2025-10-10T00:49:31.4976426Z 012785c1f3f5: Pull complete 2025-10-10T00:49:31.6516325Z aaef93273d5d: Pull complete 2025-10-10T00:49:40.5584297Z 732deb59e36e: Pull complete 2025-10-10T00:49:40.7735590Z 8d7304dd5719: Pull complete 2025-10-10T00:49:40.9823863Z 9427facc4548: Pull complete 2025-10-10T00:49:41.4117086Z 094a1b5be591: Pull complete 2025-10-10T00:49:41.6230045Z 3194bad72082: Pull complete 2025-10-10T00:49:42.0142471Z 363217a81a98: Pull complete 2025-10-10T00:49:42.1479782Z e3c423926c14: Pull complete 2025-10-10T00:49:42.2894205Z b0da2665c548: Pull complete 2025-10-10T00:49:42.5521530Z 2348e9e015bf: Pull complete 2025-10-10T00:49:42.7331901Z 2a52776326f7: Pull complete 2025-10-10T00:49:42.9557830Z 034e6954f81c: Pull complete 2025-10-10T00:49:43.3803709Z 846309eb8d41: Pull complete 2025-10-10T00:49:44.8446105Z d55a9a44329d: Pull complete 2025-10-10T00:50:46.2331153Z 95a331d8771d: Pull complete 2025-10-10T00:50:46.2560408Z 1e8ec4d415bd: Pull complete 2025-10-10T00:50:46.2786672Z f201e431abd3: Pull complete 2025-10-10T00:50:46.3226851Z feab5475b34a: Pull complete 2025-10-10T00:50:46.3665898Z 378fd2234507: Pull complete 2025-10-10T00:50:46.3887845Z 1fae04b5f599: Pull complete 2025-10-10T00:50:46.4345420Z f659162a95c5: Pull complete 2025-10-10T00:50:46.4787925Z 2058b9a977f2: Pull complete 2025-10-10T00:50:46.5015685Z dab6b0cf2780: Pull complete 2025-10-10T00:50:46.5472183Z 80b892828c20: Pull 
complete 2025-10-10T00:50:46.5688916Z 39c871222ef7: Pull complete 2025-10-10T00:50:46.6127129Z 435d0b761719: Pull complete 2025-10-10T00:50:46.6351370Z eceb7605f731: Pull complete 2025-10-10T00:50:46.6791659Z a304fa6d1d55: Pull complete 2025-10-10T00:50:46.7014298Z dddcb55ac23c: Pull complete 2025-10-10T00:50:53.7801498Z 13fafff4fe6d: Pull complete 2025-10-10T00:50:53.8172730Z 99bbffa40cd4: Pull complete 2025-10-10T00:50:53.8512339Z 8e33d89cd243: Pull complete 2025-10-10T00:50:53.8861323Z 631373e616c1: Pull complete 2025-10-10T00:50:53.9205970Z e481af7e1aec: Pull complete 2025-10-10T00:50:53.9568911Z 8ca97272458c: Pull complete 2025-10-10T00:50:55.6422550Z 55788cbe41f5: Pull complete 2025-10-10T00:50:55.8092579Z cbf6e78aa372: Pull complete 2025-10-10T00:50:57.4439373Z b7e76793f784: Pull complete 2025-10-10T00:50:57.6758984Z Digest: sha256:02f790fd120c15e1b00388d4689c10992f2a801ea5dd956622d4a5688e229be6 2025-10-10T00:50:57.7014688Z Status: Downloaded newer image for 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-d8be0384e085f551506bd739678109fa0f5ee7ac 2025-10-10T00:50:57.7051653Z 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-d8be0384e085f551506bd739678109fa0f5ee7ac 2025-10-10T00:50:57.7115716Z ##[group]Run echo "IN_CONTAINER_RUNNER=$(if [ -f /.inarc ] || [ -f /.incontainer ]; then echo true ; else echo false; fi)" >> "$GITHUB_OUTPUT" 2025-10-10T00:50:57.7116590Z echo "IN_CONTAINER_RUNNER=$(if [ -f /.inarc ] || [ -f /.incontainer ]; then echo true ; else echo false; fi)" >> "$GITHUB_OUTPUT" 2025-10-10T00:50:57.7128523Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T00:50:57.7128916Z env: 2025-10-10T00:50:57.7129133Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:50:57.7129387Z ##[endgroup] 2025-10-10T00:50:57.7295052Z ##[group]Run pytorch/test-infra/.github/actions/setup-nvidia@main 2025-10-10T00:50:57.7295466Z with: 
2025-10-10T00:50:57.7295698Z driver-version: 580.82.07 2025-10-10T00:50:57.7295958Z env: 2025-10-10T00:50:57.7296162Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:50:57.7296417Z ##[endgroup] 2025-10-10T00:50:57.7436782Z ##[group]Run nick-fields/retry@3e91a01664abd3c5cd539100d10d33b9c5b68482 2025-10-10T00:50:57.7437181Z with: 2025-10-10T00:50:57.7437399Z timeout_minutes: 10 2025-10-10T00:50:57.7437642Z max_attempts: 3 2025-10-10T00:50:57.7462597Z command: # Is it disgusting to have a full shell script here in this github action? Sure # But is it the best way to make it so that this action relies on nothing else? Absolutely set -eou pipefail DISTRIBUTION=$(. /etc/os-release;echo $ID$VERSION_ID) DRIVER_FN="NVIDIA-Linux-x86_64-${DRIVER_VERSION}.run" install_nvidia_docker2_amzn2() { ( set -x # Needed for yum-config-manager sudo yum install -y yum-utils if [[ "${DISTRIBUTION}" == "amzn2023" ]] ; then YUM_REPO_URL="https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo" else # Amazon Linux 2 YUM_REPO_URL="https://nvidia.github.io/nvidia-docker/${DISTRIBUTION}/nvidia-docker.repo" fi sudo yum-config-manager --add-repo "${YUM_REPO_URL}" sudo yum install -y \ nvidia-container-toolkit-1.17.8 \ libnvidia-container-tools-1.17.8 \ libnvidia-container1-1.17.8 \ nvidia-container-toolkit-base-1.17.8 sudo systemctl restart docker ) } install_nvidia_docker2_ubuntu20() { ( set -x # Install nvidia-driver package if not installed status="$(dpkg-query -W --showformat='${db:Status-Status}' nvidia-docker2 2>&1)" if [ ! $? = 0 ] || [ ! 
"$status" = installed ]; then sudo apt-get install -y nvidia-container-toolkit-1.17.8 sudo systemctl restart docker fi ) } pre_install_nvidia_driver_amzn2() { ( # Purge any nvidia driver installed from RHEL repo sudo yum remove -y nvidia-driver-latest-dkms ) } install_nvidia_driver_common() { ( # Try to gather more information about the runner and its existing NVIDIA driver if any echo "Before installing NVIDIA driver" lspci lsmod modinfo nvidia || true HAS_NVIDIA_DRIVER=0 # Check if NVIDIA driver has already been installed if [ -x "$(command -v nvidia-smi)" ]; then set +e # The driver exists, check its version next. Also check only the first GPU if there are more than one of them # so that the same driver version is not print over multiple lines INSTALLED_DRIVER_VERSION=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader --id=0) NVIDIA_SMI_STATUS=$? if [ "$NVIDIA_SMI_STATUS" -ne 0 ] && [ "$NVIDIA_SMI_STATUS" -ne 14 ]; then echo "Failed to get NVIDIA driver version ($INSTALLED_DRIVER_VERSION). Continuing" elif [ "$INSTALLED_DRIVER_VERSION" != "$DRIVER_VERSION" ]; then echo "NVIDIA driver ($INSTALLED_DRIVER_VERSION) has been installed, but we expect to have $DRIVER_VERSION instead. Continuing" # Turn off persistent mode so that the installation script can unload the kernel module sudo killall nvidia-persistenced || true else HAS_NVIDIA_DRIVER=1 echo "NVIDIA driver ($INSTALLED_DRIVER_VERSION) has already been installed. 
Skipping NVIDIA driver installation" fi set -e fi if [ "$HAS_NVIDIA_DRIVER" -eq 0 ]; then # CAUTION: this may need to be updated in future if [ "${DISTRIBUTION}" != ubuntu20.04 ]; then sudo yum groupinstall -y "Development Tools" # ensure our kernel install is the same as our underlying kernel, # groupinstall "Development Tools" has a habit of mismatching kernel headers sudo yum install -y "kernel-devel-uname-r == $(uname -r)" sudo modprobe backlight fi sudo curl -fsL -o /tmp/nvidia_driver "https://s3.amazonaws.com/ossci-linux/nvidia_driver/$DRIVER_FN" set +e sudo /bin/bash /tmp/nvidia_driver -s --no-drm NVIDIA_INSTALLATION_STATUS=$? RESET_GPU=0 if [ "$NVIDIA_INSTALLATION_STATUS" -ne 0 ]; then sudo cat /var/log/nvidia-installer.log # Fail to install NVIDIA driver, try to reset the GPU RESET_GPU=1 elif [ -x "$(command -v nvidia-smi)" ]; then # Check again if nvidia-smi works even if the driver installation completes successfully INSTALLED_DRIVER_VERSION=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader --id=0) NVIDIA_SMI_STATUS=$? if [ "$NVIDIA_SMI_STATUS" -ne 0 ] && [ "$NVIDIA_SMI_STATUS" -ne 14 ]; then RESET_GPU=1 fi fi if [ "$RESET_GPU" -eq 1 ]; then NVIDIA_DEVICES=$(lspci -D | grep -i NVIDIA | cut -d' ' -f1) # The GPU can get stuck in a failure state if somehow the test crashs the GPU microcode. 
When this # happens, we'll try to reset all NVIDIA devices https://github.com/pytorch/pytorch/issues/88388 for PCI_ID in $NVIDIA_DEVICES; do DEVICE_ENABLED=$(cat /sys/bus/pci/devices/$PCI_ID/enable) echo "Reseting $PCI_ID (enabled state: $DEVICE_ENABLED)" # This requires sudo permission of course echo "1" | sudo tee /sys/bus/pci/devices/$PCI_ID/reset sleep 1 done fi sudo rm -fv /tmp/nvidia_driver set -e fi ) } post_install_nvidia_driver_common() { ( sudo modprobe nvidia || true echo "After installing NVIDIA driver" lspci lsmod modinfo nvidia || true ( set +e nvidia-smi # NB: Annoyingly, nvidia-smi command returns successfully with return code 0 even in # the case where the driver has already crashed as it still can get the driver version # and some basic information like the bus ID. However, the rest of the information # would be missing (ERR!), for example: # # +-----------------------------------------------------------------------------+ # | NVIDIA-SMI 525.89.02 Driver Version: 525.89.02 CUDA Version: 12.0 | # |-------------------------------+----------------------+----------------------+ # | GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC | # | Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. | # | | | MIG M. | # |===============================+======================+======================| # | 0 ERR! Off | 00000000:00:1E.0 Off | ERR! | # |ERR! ERR! ERR! ERR! / ERR! | 4184MiB / 23028MiB | ERR! Default | # | | | ERR! 
| # +-------------------------------+----------------------+----------------------+ # # +-----------------------------------------------------------------------------+ # | Processes: | # | GPU GI CI PID Type Process name GPU Memory | # | ID ID Usage | # |=============================================================================| # +-----------------------------------------------------------------------------+ # # This should be reported as a failure instead as it will guarantee to fail when # Docker tries to run with --gpus all # # So, the correct check here is to query one of the missing piece of info like # GPU name, so that the command can fail accordingly nvidia-smi --query-gpu=gpu_name --format=csv,noheader --id=0 NVIDIA_SMI_STATUS=$? # Allowable exit statuses for nvidia-smi, see: https://github.com/NVIDIA/gpu-operator/issues/285 if [ "$NVIDIA_SMI_STATUS" -eq 0 ] || [ "$NVIDIA_SMI_STATUS" -eq 14 ]; then echo "INFO: Ignoring allowed status ${NVIDIA_SMI_STATUS}" else echo "ERROR: nvidia-smi exited with unresolved status ${NVIDIA_SMI_STATUS}" exit ${NVIDIA_SMI_STATUS} fi set -e ) ) } install_nvidia_driver_amzn2() { ( set -x pre_install_nvidia_driver_amzn2 install_nvidia_driver_common post_install_nvidia_driver_common ) } install_nvidia_driver_ubuntu20() { ( set -x install_nvidia_driver_common post_install_nvidia_driver_common ) } echo "== Installing nvidia driver ${DRIVER_FN} ==" case "${DISTRIBUTION}" in amzn*) install_nvidia_driver_amzn2 ;; ubuntu20.04) install_nvidia_driver_ubuntu20 ;; *) echo "ERROR: Unknown distribution ${DISTRIBUTION}" exit 1 ;; esac # Install container toolkit based on distribution echo "== Installing nvidia container toolkit for ${DISTRIBUTION} ==" case "${DISTRIBUTION}" in amzn*) install_nvidia_docker2_amzn2 ;; ubuntu20.04) install_nvidia_docker2_ubuntu20 ;; *) echo "ERROR: Unknown distribution ${DISTRIBUTION}" exit 1 ;; esac echo "GPU_FLAG=--gpus all -e NVIDIA_DRIVER_CAPABILITIES=all" >> "${GITHUB_ENV}" # Fix 
https://github.com/NVIDIA/nvidia-docker/issues/1648 on runners with # more than one GPUs. This just needs to be run once. The command fails # on subsequent runs and complains that the mode is already on, but that's # ok sudo nvidia-persistenced || true # This should show persistence mode ON nvidia-smi # check if the container-toolkit is correctly installed and CUDA is available inside a container docker run --rm -t --gpus=all public.ecr.aws/docker/library/python:3.13 nvidia-smi 2025-10-10T00:50:57.7486898Z retry_wait_seconds: 10 2025-10-10T00:50:57.7487165Z polling_interval_seconds: 1 2025-10-10T00:50:57.7487434Z warning_on_retry: true 2025-10-10T00:50:57.7487677Z continue_on_error: false 2025-10-10T00:50:57.7487920Z env: 2025-10-10T00:50:57.7488123Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:50:57.7488372Z DRIVER_VERSION: 580.82.07 2025-10-10T00:50:57.7488613Z ##[endgroup] 2025-10-10T00:50:57.8690942Z == Installing nvidia driver NVIDIA-Linux-x86_64-580.82.07.run == 2025-10-10T00:50:57.8691872Z + pre_install_nvidia_driver_amzn2 2025-10-10T00:50:57.8697135Z + sudo yum remove -y nvidia-driver-latest-dkms 2025-10-10T00:50:58.4589915Z No match for argument: nvidia-driver-latest-dkms 2025-10-10T00:50:58.4590540Z No packages marked for removal. 2025-10-10T00:50:58.4660065Z Dependencies resolved. 2025-10-10T00:50:58.4670858Z Nothing to do. 2025-10-10T00:50:58.4671239Z Complete! 
2025-10-10T00:50:58.5555310Z + install_nvidia_driver_common 2025-10-10T00:50:58.5559750Z + echo 'Before installing NVIDIA driver' 2025-10-10T00:50:58.5560073Z + lspci 2025-10-10T00:50:58.5561682Z Before installing NVIDIA driver 2025-10-10T00:50:58.6775173Z 00:00.0 Host bridge: Intel Corporation 440FX - 82441FX PMC [Natoma] 2025-10-10T00:50:58.6775683Z 00:01.0 ISA bridge: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II] 2025-10-10T00:50:58.6776238Z 00:01.3 Non-VGA unclassified device: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 08) 2025-10-10T00:50:58.6776988Z 00:03.0 VGA compatible controller: Amazon.com, Inc. Device 1111 2025-10-10T00:50:58.6777496Z 00:04.0 Non-Volatile memory controller: Amazon.com, Inc. NVMe EBS Controller 2025-10-10T00:50:58.6778026Z 00:05.0 Ethernet controller: Amazon.com, Inc. Elastic Network Adapter (ENA) 2025-10-10T00:50:58.6778513Z 00:1e.0 3D controller: NVIDIA Corporation GA102GL [A10G] (rev a1) 2025-10-10T00:50:58.6778998Z 00:1f.0 Non-Volatile memory controller: Amazon.com, Inc. 
NVMe SSD Controller 2025-10-10T00:50:58.6779431Z + lsmod 2025-10-10T00:50:58.6833105Z Module Size Used by 2025-10-10T00:50:58.6835704Z nvidia_uvm 1925120 0 2025-10-10T00:50:58.6836162Z nvidia 14286848 1 nvidia_uvm 2025-10-10T00:50:58.6836464Z drm 602112 1 nvidia 2025-10-10T00:50:58.6836782Z drm_panel_orientation_quirks 32768 1 drm 2025-10-10T00:50:58.6837105Z backlight 24576 1 drm 2025-10-10T00:50:58.6837487Z i2c_core 110592 2 nvidia,drm 2025-10-10T00:50:58.6837869Z xt_conntrack 16384 1 2025-10-10T00:50:58.6838222Z nft_chain_nat 16384 3 2025-10-10T00:50:58.6838517Z xt_MASQUERADE 20480 1 2025-10-10T00:50:58.6838835Z nf_nat 57344 2 nft_chain_nat,xt_MASQUERADE 2025-10-10T00:50:58.6839180Z nf_conntrack_netlink 57344 0 2025-10-10T00:50:58.6839600Z nf_conntrack 184320 4 xt_conntrack,nf_nat,nf_conntrack_netlink,xt_MASQUERADE 2025-10-10T00:50:58.6840054Z nf_defrag_ipv6 24576 1 nf_conntrack 2025-10-10T00:50:58.6840383Z nf_defrag_ipv4 16384 1 nf_conntrack 2025-10-10T00:50:58.6840688Z xfrm_user 57344 1 2025-10-10T00:50:58.6841025Z xfrm_algo 16384 1 xfrm_user 2025-10-10T00:50:58.6841336Z xt_addrtype 16384 2 2025-10-10T00:50:58.6841610Z nft_compat 20480 4 2025-10-10T00:50:58.6841925Z nf_tables 311296 57 nft_compat,nft_chain_nat 2025-10-10T00:50:58.6842356Z nfnetlink 20480 4 nft_compat,nf_conntrack_netlink,nf_tables 2025-10-10T00:50:58.6842751Z br_netfilter 36864 0 2025-10-10T00:50:58.6843046Z bridge 323584 1 br_netfilter 2025-10-10T00:50:58.6843619Z stp 16384 1 bridge 2025-10-10T00:50:58.6843929Z llc 16384 2 bridge,stp 2025-10-10T00:50:58.6844228Z overlay 167936 0 2025-10-10T00:50:58.6844494Z tls 139264 0 2025-10-10T00:50:58.6844755Z nls_ascii 16384 1 2025-10-10T00:50:58.6845023Z nls_cp437 20480 1 2025-10-10T00:50:58.6845302Z vfat 24576 1 2025-10-10T00:50:58.6845571Z fat 86016 1 vfat 2025-10-10T00:50:58.6845848Z sunrpc 700416 1 2025-10-10T00:50:58.6846113Z i8042 45056 0 2025-10-10T00:50:58.6846372Z ena 184320 0 2025-10-10T00:50:58.6846644Z serio 28672 3 i8042 
2025-10-10T00:50:58.6847038Z button 24576 0 2025-10-10T00:50:58.6847312Z ghash_clmulni_intel 16384 0 2025-10-10T00:50:58.6847595Z sch_fq_codel 20480 17 2025-10-10T00:50:58.6847871Z fuse 184320 1 2025-10-10T00:50:58.6848136Z dm_mod 188416 0 2025-10-10T00:50:58.6848401Z configfs 57344 1 2025-10-10T00:50:58.6848668Z loop 36864 0 2025-10-10T00:50:58.6848935Z dmi_sysfs 20480 0 2025-10-10T00:50:58.6849199Z crc32_pclmul 16384 0 2025-10-10T00:50:58.6849470Z crc32c_intel 24576 0 2025-10-10T00:50:58.6849741Z efivarfs 24576 1 2025-10-10T00:50:58.6850012Z + modinfo nvidia 2025-10-10T00:50:58.6860387Z filename: /lib/modules/6.1.150-174.273.amzn2023.x86_64/kernel/drivers/video/nvidia.ko 2025-10-10T00:50:58.6860904Z import_ns: DMA_BUF 2025-10-10T00:50:58.6861167Z alias: char-major-195-* 2025-10-10T00:50:58.6861451Z version: 580.82.07 2025-10-10T00:50:58.6861714Z supported: external 2025-10-10T00:50:58.6861980Z license: Dual MIT/GPL 2025-10-10T00:50:58.6862275Z firmware: nvidia/580.82.07/gsp_tu10x.bin 2025-10-10T00:50:58.6862816Z firmware: nvidia/580.82.07/gsp_ga10x.bin 2025-10-10T00:50:58.6863168Z srcversion: BA7240A71DCF7DC6FE88C1D 2025-10-10T00:50:58.6863518Z alias: of:N*T*Cnvidia,tegra264-displayC* 2025-10-10T00:50:58.6863877Z alias: of:N*T*Cnvidia,tegra264-display 2025-10-10T00:50:58.6864235Z alias: of:N*T*Cnvidia,tegra234-displayC* 2025-10-10T00:50:58.6864592Z alias: of:N*T*Cnvidia,tegra234-display 2025-10-10T00:50:58.6864947Z alias: pci:v000010DEd*sv*sd*bc06sc80i00* 2025-10-10T00:50:58.6865287Z alias: pci:v000010DEd*sv*sd*bc03sc02i00* 2025-10-10T00:50:58.6865634Z alias: pci:v000010DEd*sv*sd*bc03sc00i00* 2025-10-10T00:50:58.6865953Z depends: i2c-core,drm 2025-10-10T00:50:58.6866218Z retpoline: Y 2025-10-10T00:50:58.6866445Z name: nvidia 2025-10-10T00:50:58.6866816Z vermagic: 6.1.150-174.273.amzn2023.x86_64 SMP preempt mod_unload modversions 2025-10-10T00:50:58.6867301Z parm: NvSwitchRegDwords:NvSwitch regkey (charp) 2025-10-10T00:50:58.6867924Z parm: 
NvSwitchBlacklist:NvSwitchBlacklist=uuid[,uuid...] (charp) 2025-10-10T00:50:58.6868366Z parm: NVreg_ResmanDebugLevel:int 2025-10-10T00:50:58.6868696Z parm: NVreg_RmLogonRC:int 2025-10-10T00:50:58.6869017Z parm: NVreg_ModifyDeviceFiles:int 2025-10-10T00:50:58.6869351Z parm: NVreg_DeviceFileUID:int 2025-10-10T00:50:58.6869662Z parm: NVreg_DeviceFileGID:int 2025-10-10T00:50:58.6869978Z parm: NVreg_DeviceFileMode:int 2025-10-10T00:50:58.6870357Z parm: NVreg_InitializeSystemMemoryAllocations:int 2025-10-10T00:50:58.6870755Z parm: NVreg_UsePageAttributeTable:int 2025-10-10T00:50:58.6871095Z parm: NVreg_EnablePCIeGen3:int 2025-10-10T00:50:58.6871518Z parm: NVreg_EnableMSI:int 2025-10-10T00:50:58.6871844Z parm: NVreg_EnableStreamMemOPs:int 2025-10-10T00:50:58.6872220Z parm: NVreg_RestrictProfilingToAdminUsers:int 2025-10-10T00:50:58.6872645Z parm: NVreg_PreserveVideoMemoryAllocations:int 2025-10-10T00:50:58.6873145Z parm: NVreg_EnableS0ixPowerManagement:int 2025-10-10T00:50:58.6873571Z parm: NVreg_S0ixPowerManagementVideoMemoryThreshold:int 2025-10-10T00:50:58.6873993Z parm: NVreg_DynamicPowerManagement:int 2025-10-10T00:50:58.6874426Z parm: NVreg_DynamicPowerManagementVideoMemoryThreshold:int 2025-10-10T00:50:58.6874844Z parm: NVreg_EnableGpuFirmware:int 2025-10-10T00:50:58.6875195Z parm: NVreg_EnableGpuFirmwareLogs:int 2025-10-10T00:50:58.6875581Z parm: NVreg_OpenRmEnableUnsupportedGpus:int 2025-10-10T00:50:58.6875967Z parm: NVreg_EnableUserNUMAManagement:int 2025-10-10T00:50:58.6876315Z parm: NVreg_MemoryPoolSize:int 2025-10-10T00:50:58.6876651Z parm: NVreg_KMallocHeapMaxSize:int 2025-10-10T00:50:58.6877004Z parm: NVreg_VMallocHeapMaxSize:int 2025-10-10T00:50:58.6877338Z parm: NVreg_IgnoreMMIOCheck:int 2025-10-10T00:50:58.6877657Z parm: NVreg_NvLinkDisable:int 2025-10-10T00:50:58.6878023Z parm: NVreg_EnablePCIERelaxedOrderingMode:int 2025-10-10T00:50:58.6878408Z parm: NVreg_RegisterPCIDriver:int 2025-10-10T00:50:58.6878782Z parm: NVreg_RegisterPlatformDeviceDriver:int 
2025-10-10T00:50:58.6879151Z parm: NVreg_EnableResizableBar:int 2025-10-10T00:50:58.6879502Z parm: NVreg_EnableDbgBreakpoint:int 2025-10-10T00:50:58.6879860Z parm: NVreg_EnableNonblockingOpen:int 2025-10-10T00:50:58.6880233Z parm: NVreg_CoherentGPUMemoryMode:charp 2025-10-10T00:50:58.6880583Z parm: NVreg_RegistryDwords:charp 2025-10-10T00:50:58.6880942Z parm: NVreg_RegistryDwordsPerDevice:charp 2025-10-10T00:50:58.6881285Z parm: NVreg_RmMsg:charp 2025-10-10T00:50:58.6881587Z parm: NVreg_GpuBlacklist:charp 2025-10-10T00:50:58.6881928Z parm: NVreg_TemporaryFilePath:charp 2025-10-10T00:50:58.6882349Z parm: NVreg_ExcludedGpus:charp 2025-10-10T00:50:58.6882682Z parm: NVreg_DmaRemapPeerMmio:int 2025-10-10T00:50:58.6883031Z parm: NVreg_RmNvlinkBandwidth:charp 2025-10-10T00:50:58.6883398Z parm: NVreg_RmNvlinkBandwidthLinkCount:int 2025-10-10T00:50:58.6883755Z parm: NVreg_ImexChannelCount:int 2025-10-10T00:50:58.6884095Z parm: NVreg_CreateImexChannel0:int 2025-10-10T00:50:58.6884457Z parm: NVreg_GrdmaPciTopoCheckOverride:int 2025-10-10T00:50:58.6884808Z parm: rm_firmware_active:charp 2025-10-10T00:50:58.6885106Z + HAS_NVIDIA_DRIVER=0 2025-10-10T00:50:58.6885366Z ++ command -v nvidia-smi 2025-10-10T00:50:58.6885638Z + '[' -x /usr/bin/nvidia-smi ']' 2025-10-10T00:50:58.6885906Z + set +e 2025-10-10T00:50:58.6886222Z ++ nvidia-smi --query-gpu=driver_version --format=csv,noheader --id=0 2025-10-10T00:51:00.4235888Z + INSTALLED_DRIVER_VERSION=580.82.07 2025-10-10T00:51:00.4236263Z + NVIDIA_SMI_STATUS=0 2025-10-10T00:51:00.4236525Z + '[' 0 -ne 0 ']' 2025-10-10T00:51:00.4236792Z + '[' 580.82.07 '!=' 580.82.07 ']' 2025-10-10T00:51:00.4237095Z + HAS_NVIDIA_DRIVER=1 2025-10-10T00:51:00.4237546Z + echo 'NVIDIA driver (580.82.07) has already been installed. Skipping NVIDIA driver installation' 2025-10-10T00:51:00.4238015Z + set -e 2025-10-10T00:51:00.4238217Z + '[' 1 -eq 0 ']' 2025-10-10T00:51:00.4238615Z NVIDIA driver (580.82.07) has already been installed. 
Skipping NVIDIA driver installation 2025-10-10T00:51:00.4239641Z + post_install_nvidia_driver_common 2025-10-10T00:51:00.4243847Z + sudo modprobe nvidia 2025-10-10T00:51:00.5503340Z + echo 'After installing NVIDIA driver' 2025-10-10T00:51:00.5503773Z + lspci 2025-10-10T00:51:00.5504024Z After installing NVIDIA driver 2025-10-10T00:51:00.5627294Z 00:00.0 Host bridge: Intel Corporation 440FX - 82441FX PMC [Natoma] 2025-10-10T00:51:00.5627959Z 00:01.0 ISA bridge: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II] 2025-10-10T00:51:00.5628544Z 00:01.3 Non-VGA unclassified device: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 08) 2025-10-10T00:51:00.5629084Z 00:03.0 VGA compatible controller: Amazon.com, Inc. Device 1111 2025-10-10T00:51:00.5629793Z 00:04.0 Non-Volatile memory controller: Amazon.com, Inc. NVMe EBS Controller 2025-10-10T00:51:00.5630445Z 00:05.0 Ethernet controller: Amazon.com, Inc. Elastic Network Adapter (ENA) 2025-10-10T00:51:00.5630936Z 00:1e.0 3D controller: NVIDIA Corporation GA102GL [A10G] (rev a1) 2025-10-10T00:51:00.5631423Z 00:1f.0 Non-Volatile memory controller: Amazon.com, Inc. 
NVMe SSD Controller 2025-10-10T00:51:00.5631832Z + lsmod 2025-10-10T00:51:00.5667307Z Module Size Used by 2025-10-10T00:51:00.5667695Z nvidia_uvm 1925120 0 2025-10-10T00:51:00.5668079Z nvidia 14286848 1 nvidia_uvm 2025-10-10T00:51:00.5668481Z drm 602112 1 nvidia 2025-10-10T00:51:00.5668881Z drm_panel_orientation_quirks 32768 1 drm 2025-10-10T00:51:00.5669196Z backlight 24576 1 drm 2025-10-10T00:51:00.5669496Z i2c_core 110592 2 nvidia,drm 2025-10-10T00:51:00.5669791Z xt_conntrack 16384 1 2025-10-10T00:51:00.5670081Z nft_chain_nat 16384 3 2025-10-10T00:51:00.5670448Z xt_MASQUERADE 20480 1 2025-10-10T00:51:00.5670863Z nf_nat 57344 2 nft_chain_nat,xt_MASQUERADE 2025-10-10T00:51:00.5671312Z nf_conntrack_netlink 57344 0 2025-10-10T00:51:00.5671720Z nf_conntrack 184320 4 xt_conntrack,nf_nat,nf_conntrack_netlink,xt_MASQUERADE 2025-10-10T00:51:00.5672157Z nf_defrag_ipv6 24576 1 nf_conntrack 2025-10-10T00:51:00.5672466Z nf_defrag_ipv4 16384 1 nf_conntrack 2025-10-10T00:51:00.5672763Z xfrm_user 57344 1 2025-10-10T00:51:00.5673033Z xfrm_algo 16384 1 xfrm_user 2025-10-10T00:51:00.5673324Z xt_addrtype 16384 2 2025-10-10T00:51:00.5673581Z nft_compat 20480 4 2025-10-10T00:51:00.5673892Z nf_tables 311296 57 nft_compat,nft_chain_nat 2025-10-10T00:51:00.5674515Z nfnetlink 20480 4 nft_compat,nf_conntrack_netlink,nf_tables 2025-10-10T00:51:00.5674911Z br_netfilter 36864 0 2025-10-10T00:51:00.5675189Z bridge 323584 1 br_netfilter 2025-10-10T00:51:00.5675489Z stp 16384 1 bridge 2025-10-10T00:51:00.5675806Z llc 16384 2 bridge,stp 2025-10-10T00:51:00.5676122Z overlay 167936 0 2025-10-10T00:51:00.5676375Z tls 139264 0 2025-10-10T00:51:00.5676633Z nls_ascii 16384 1 2025-10-10T00:51:00.5676889Z nls_cp437 20480 1 2025-10-10T00:51:00.5677145Z vfat 24576 1 2025-10-10T00:51:00.5677399Z fat 86016 1 vfat 2025-10-10T00:51:00.5677676Z sunrpc 700416 1 2025-10-10T00:51:00.5677933Z i8042 45056 0 2025-10-10T00:51:00.5678186Z ena 184320 0 2025-10-10T00:51:00.5678439Z serio 28672 3 i8042 
2025-10-10T00:51:00.5678716Z button 24576 0 2025-10-10T00:51:00.5678985Z ghash_clmulni_intel 16384 0 2025-10-10T00:51:00.5679253Z sch_fq_codel 20480 17 2025-10-10T00:51:00.5679513Z fuse 184320 1 2025-10-10T00:51:00.5679768Z dm_mod 188416 0 2025-10-10T00:51:00.5680027Z configfs 57344 1 2025-10-10T00:51:00.5680285Z loop 36864 0 2025-10-10T00:51:00.5680534Z dmi_sysfs 20480 0 2025-10-10T00:51:00.5680790Z crc32_pclmul 16384 0 2025-10-10T00:51:00.5681047Z crc32c_intel 24576 0 2025-10-10T00:51:00.5681305Z efivarfs 24576 1 2025-10-10T00:51:00.5681553Z + modinfo nvidia 2025-10-10T00:51:00.5691331Z filename: /lib/modules/6.1.150-174.273.amzn2023.x86_64/kernel/drivers/video/nvidia.ko 2025-10-10T00:51:00.5691916Z import_ns: DMA_BUF 2025-10-10T00:51:00.5692198Z alias: char-major-195-* 2025-10-10T00:51:00.5692605Z version: 580.82.07 2025-10-10T00:51:00.5692976Z supported: external 2025-10-10T00:51:00.5693320Z license: Dual MIT/GPL 2025-10-10T00:51:00.5693624Z firmware: nvidia/580.82.07/gsp_tu10x.bin 2025-10-10T00:51:00.5694110Z firmware: nvidia/580.82.07/gsp_ga10x.bin 2025-10-10T00:51:00.5694446Z srcversion: BA7240A71DCF7DC6FE88C1D 2025-10-10T00:51:00.5694794Z alias: of:N*T*Cnvidia,tegra264-displayC* 2025-10-10T00:51:00.5695160Z alias: of:N*T*Cnvidia,tegra264-display 2025-10-10T00:51:00.5695523Z alias: of:N*T*Cnvidia,tegra234-displayC* 2025-10-10T00:51:00.5695877Z alias: of:N*T*Cnvidia,tegra234-display 2025-10-10T00:51:00.5696228Z alias: pci:v000010DEd*sv*sd*bc06sc80i00* 2025-10-10T00:51:00.5696566Z alias: pci:v000010DEd*sv*sd*bc03sc02i00* 2025-10-10T00:51:00.5696907Z alias: pci:v000010DEd*sv*sd*bc03sc00i00* 2025-10-10T00:51:00.5697230Z depends: i2c-core,drm 2025-10-10T00:51:00.5697495Z retpoline: Y 2025-10-10T00:51:00.5697711Z name: nvidia 2025-10-10T00:51:00.5698210Z vermagic: 6.1.150-174.273.amzn2023.x86_64 SMP preempt mod_unload modversions 2025-10-10T00:51:00.5699228Z parm: NvSwitchRegDwords:NvSwitch regkey (charp) 2025-10-10T00:51:00.5712617Z parm: 
NvSwitchBlacklist:NvSwitchBlacklist=uuid[,uuid...] (charp) 2025-10-10T00:51:00.5713089Z parm: NVreg_ResmanDebugLevel:int 2025-10-10T00:51:00.5713397Z parm: NVreg_RmLogonRC:int 2025-10-10T00:51:00.5713697Z parm: NVreg_ModifyDeviceFiles:int 2025-10-10T00:51:00.5714017Z parm: NVreg_DeviceFileUID:int 2025-10-10T00:51:00.5714320Z parm: NVreg_DeviceFileGID:int 2025-10-10T00:51:00.5714618Z parm: NVreg_DeviceFileMode:int 2025-10-10T00:51:00.5714980Z parm: NVreg_InitializeSystemMemoryAllocations:int 2025-10-10T00:51:00.5715368Z parm: NVreg_UsePageAttributeTable:int 2025-10-10T00:51:00.5715704Z parm: NVreg_EnablePCIeGen3:int 2025-10-10T00:51:00.5716005Z parm: NVreg_EnableMSI:int 2025-10-10T00:51:00.5716316Z parm: NVreg_EnableStreamMemOPs:int 2025-10-10T00:51:00.5716892Z parm: NVreg_RestrictProfilingToAdminUsers:int 2025-10-10T00:51:00.5717302Z parm: NVreg_PreserveVideoMemoryAllocations:int 2025-10-10T00:51:00.5717678Z parm: NVreg_EnableS0ixPowerManagement:int 2025-10-10T00:51:00.5718082Z parm: NVreg_S0ixPowerManagementVideoMemoryThreshold:int 2025-10-10T00:51:00.5718494Z parm: NVreg_DynamicPowerManagement:int 2025-10-10T00:51:00.5718919Z parm: NVreg_DynamicPowerManagementVideoMemoryThreshold:int 2025-10-10T00:51:00.5719332Z parm: NVreg_EnableGpuFirmware:int 2025-10-10T00:51:00.5719667Z parm: NVreg_EnableGpuFirmwareLogs:int 2025-10-10T00:51:00.5720046Z parm: NVreg_OpenRmEnableUnsupportedGpus:int 2025-10-10T00:51:00.5720425Z parm: NVreg_EnableUserNUMAManagement:int 2025-10-10T00:51:00.5720771Z parm: NVreg_MemoryPoolSize:int 2025-10-10T00:51:00.5721094Z parm: NVreg_KMallocHeapMaxSize:int 2025-10-10T00:51:00.5721431Z parm: NVreg_VMallocHeapMaxSize:int 2025-10-10T00:51:00.5721770Z parm: NVreg_IgnoreMMIOCheck:int 2025-10-10T00:51:00.5722091Z parm: NVreg_NvLinkDisable:int 2025-10-10T00:51:00.5722441Z parm: NVreg_EnablePCIERelaxedOrderingMode:int 2025-10-10T00:51:00.5722815Z parm: NVreg_RegisterPCIDriver:int 2025-10-10T00:51:00.5723175Z parm: NVreg_RegisterPlatformDeviceDriver:int 
2025-10-10T00:51:00.5723538Z parm: NVreg_EnableResizableBar:int 2025-10-10T00:51:00.5723873Z parm: NVreg_EnableDbgBreakpoint:int 2025-10-10T00:51:00.5724224Z parm: NVreg_EnableNonblockingOpen:int 2025-10-10T00:51:00.5724585Z parm: NVreg_CoherentGPUMemoryMode:charp 2025-10-10T00:51:00.5724935Z parm: NVreg_RegistryDwords:charp 2025-10-10T00:51:00.5725272Z parm: NVreg_RegistryDwordsPerDevice:charp 2025-10-10T00:51:00.5725605Z parm: NVreg_RmMsg:charp 2025-10-10T00:51:00.5725898Z parm: NVreg_GpuBlacklist:charp 2025-10-10T00:51:00.5726235Z parm: NVreg_TemporaryFilePath:charp 2025-10-10T00:51:00.5726691Z parm: NVreg_ExcludedGpus:charp 2025-10-10T00:51:00.5727147Z parm: NVreg_DmaRemapPeerMmio:int 2025-10-10T00:51:00.5727483Z parm: NVreg_RmNvlinkBandwidth:charp 2025-10-10T00:51:00.5727849Z parm: NVreg_RmNvlinkBandwidthLinkCount:int 2025-10-10T00:51:00.5728206Z parm: NVreg_ImexChannelCount:int 2025-10-10T00:51:00.5728533Z parm: NVreg_CreateImexChannel0:int 2025-10-10T00:51:00.5728890Z parm: NVreg_GrdmaPciTopoCheckOverride:int 2025-10-10T00:51:00.5729237Z parm: rm_firmware_active:charp 2025-10-10T00:51:00.5729534Z + set +e 2025-10-10T00:51:00.5729728Z + nvidia-smi 2025-10-10T00:51:02.0270682Z Fri Oct 10 00:51:02 2025 2025-10-10T00:51:02.0271231Z +-----------------------------------------------------------------------------------------+ 2025-10-10T00:51:02.0271829Z | NVIDIA-SMI 580.82.07 Driver Version: 580.82.07 CUDA Version: 13.0 | 2025-10-10T00:51:02.0272346Z +-----------------------------------------+------------------------+----------------------+ 2025-10-10T00:51:02.0272864Z | GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC | 2025-10-10T00:51:02.0273395Z | Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. | 2025-10-10T00:51:02.0273843Z | | | MIG M. 
| 2025-10-10T00:51:02.0274185Z |=========================================+========================+======================| 2025-10-10T00:51:02.0358458Z | 0 NVIDIA A10G Off | 00000000:00:1E.0 Off | 0 | 2025-10-10T00:51:02.0359095Z | 0% 26C P0 60W / 300W | 0MiB / 23028MiB | 4% Default | 2025-10-10T00:51:02.0359619Z | | | N/A | 2025-10-10T00:51:02.0361131Z +-----------------------------------------+------------------------+----------------------+ 2025-10-10T00:51:02.0361944Z 2025-10-10T00:51:02.0362429Z +-----------------------------------------------------------------------------------------+ 2025-10-10T00:51:02.0363454Z | Processes: | 2025-10-10T00:51:02.0364614Z | GPU GI CI PID Type Process name GPU Memory | 2025-10-10T00:51:02.0365732Z | ID ID Usage | 2025-10-10T00:51:02.0366610Z |=========================================================================================| 2025-10-10T00:51:02.0367614Z | No running processes found | 2025-10-10T00:51:02.0368537Z +-----------------------------------------------------------------------------------------+ 2025-10-10T00:51:02.4627745Z + nvidia-smi --query-gpu=gpu_name --format=csv,noheader --id=0 2025-10-10T00:51:03.9183045Z NVIDIA A10G 2025-10-10T00:51:04.1915467Z + NVIDIA_SMI_STATUS=0 2025-10-10T00:51:04.1915861Z + '[' 0 -eq 0 ']' 2025-10-10T00:51:04.1916208Z + echo 'INFO: Ignoring allowed status 0' 2025-10-10T00:51:04.1916558Z + set -e 2025-10-10T00:51:04.1916779Z INFO: Ignoring allowed status 0 2025-10-10T00:51:04.1925860Z == Installing nvidia container toolkit for amzn2023 == 2025-10-10T00:51:04.1930426Z + sudo yum install -y yum-utils 2025-10-10T00:51:04.6443837Z Last metadata expiration check: 0:08:28 ago on Fri Oct 10 00:42:36 2025. 2025-10-10T00:51:04.6711939Z Package dnf-utils-4.3.0-13.amzn2023.0.5.noarch is already installed. 2025-10-10T00:51:04.7267558Z Dependencies resolved. 2025-10-10T00:51:04.7572812Z Nothing to do. 2025-10-10T00:51:04.7573148Z Complete! 
2025-10-10T00:51:05.0704852Z + [[ amzn2023 == \a\m\z\n\2\0\2\3 ]] 2025-10-10T00:51:05.0705627Z + YUM_REPO_URL=https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo 2025-10-10T00:51:05.0706662Z + sudo yum-config-manager --add-repo https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo 2025-10-10T00:51:05.4513790Z Adding repo from: https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo 2025-10-10T00:51:05.5055038Z + sudo yum install -y nvidia-container-toolkit-1.17.8 libnvidia-container-tools-1.17.8 libnvidia-container1-1.17.8 nvidia-container-toolkit-base-1.17.8 2025-10-10T00:51:06.0923923Z nvidia-container-toolkit 16 kB/s | 833 B 00:00 2025-10-10T00:51:06.1190194Z Package nvidia-container-toolkit-1.17.8-1.x86_64 is already installed. 2025-10-10T00:51:06.1196129Z Package libnvidia-container-tools-1.17.8-1.x86_64 is already installed. 2025-10-10T00:51:06.1201725Z Package libnvidia-container1-1.17.8-1.x86_64 is already installed. 2025-10-10T00:51:06.1209919Z Package nvidia-container-toolkit-base-1.17.8-1.x86_64 is already installed. 2025-10-10T00:51:06.1792062Z Dependencies resolved. 2025-10-10T00:51:06.2082782Z Nothing to do. 2025-10-10T00:51:06.2083178Z Complete! 2025-10-10T00:51:06.3007570Z + sudo systemctl restart docker 2025-10-10T00:51:46.9161340Z Fri Oct 10 00:51:46 2025 2025-10-10T00:51:46.9162370Z +-----------------------------------------------------------------------------------------+ 2025-10-10T00:51:46.9163290Z | NVIDIA-SMI 580.82.07 Driver Version: 580.82.07 CUDA Version: 13.0 | 2025-10-10T00:51:46.9163786Z +-----------------------------------------+------------------------+----------------------+ 2025-10-10T00:51:46.9164293Z | GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC | 2025-10-10T00:51:46.9164826Z | Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. | 2025-10-10T00:51:46.9165274Z | | | MIG M. 
| 2025-10-10T00:51:46.9165620Z |=========================================+========================+======================| 2025-10-10T00:51:46.9258807Z | 0 NVIDIA A10G On | 00000000:00:1E.0 Off | 0 | 2025-10-10T00:51:46.9259401Z | 0% 26C P0 61W / 300W | 0MiB / 23028MiB | 4% Default | 2025-10-10T00:51:46.9260020Z | | | N/A | 2025-10-10T00:51:46.9260586Z +-----------------------------------------+------------------------+----------------------+ 2025-10-10T00:51:46.9260990Z 2025-10-10T00:51:46.9261227Z +-----------------------------------------------------------------------------------------+ 2025-10-10T00:51:46.9261706Z | Processes: | 2025-10-10T00:51:46.9262151Z | GPU GI CI PID Type Process name GPU Memory | 2025-10-10T00:51:46.9262642Z | ID ID Usage | 2025-10-10T00:51:46.9263177Z |=========================================================================================| 2025-10-10T00:51:46.9263970Z | No running processes found | 2025-10-10T00:51:46.9264455Z +-----------------------------------------------------------------------------------------+ 2025-10-10T00:51:47.1013619Z Unable to find image 'public.ecr.aws/docker/library/python:3.13' locally 2025-10-10T00:51:47.3434576Z 3.13: Pulling from docker/library/python 2025-10-10T00:51:47.4248271Z cae3b572364a: Pulling fs layer 2025-10-10T00:51:47.4248622Z bd090f42c4b7: Pulling fs layer 2025-10-10T00:51:47.4248902Z f0c9d6d993ac: Pulling fs layer 2025-10-10T00:51:47.4249185Z a2ade626d67a: Pulling fs layer 2025-10-10T00:51:47.4249464Z 7f924d696c9c: Pulling fs layer 2025-10-10T00:51:47.4249773Z 12e6ee790ad5: Pulling fs layer 2025-10-10T00:51:47.4250144Z 54ea66483f67: Pulling fs layer 2025-10-10T00:51:47.4250471Z 7f924d696c9c: Waiting 2025-10-10T00:51:47.4250800Z 12e6ee790ad5: Waiting 2025-10-10T00:51:47.4251077Z 54ea66483f67: Waiting 2025-10-10T00:51:47.4251732Z a2ade626d67a: Waiting 2025-10-10T00:51:47.5665108Z bd090f42c4b7: Verifying Checksum 2025-10-10T00:51:47.5665541Z bd090f42c4b7: Download complete 
2025-10-10T00:51:47.6181655Z cae3b572364a: Verifying Checksum 2025-10-10T00:51:47.6182119Z cae3b572364a: Download complete 2025-10-10T00:51:47.7074229Z 7f924d696c9c: Verifying Checksum 2025-10-10T00:51:47.7074641Z 7f924d696c9c: Download complete 2025-10-10T00:51:47.7220454Z f0c9d6d993ac: Verifying Checksum 2025-10-10T00:51:47.7220896Z f0c9d6d993ac: Download complete 2025-10-10T00:51:47.7844879Z 54ea66483f67: Verifying Checksum 2025-10-10T00:51:47.7845238Z 54ea66483f67: Download complete 2025-10-10T00:51:47.8680902Z 12e6ee790ad5: Verifying Checksum 2025-10-10T00:51:47.8681264Z 12e6ee790ad5: Download complete 2025-10-10T00:51:48.4088793Z a2ade626d67a: Verifying Checksum 2025-10-10T00:51:48.4089157Z a2ade626d67a: Download complete 2025-10-10T00:51:49.4427736Z cae3b572364a: Pull complete 2025-10-10T00:51:50.1568825Z bd090f42c4b7: Pull complete 2025-10-10T00:51:52.6608531Z f0c9d6d993ac: Pull complete 2025-10-10T00:51:59.4287397Z a2ade626d67a: Pull complete 2025-10-10T00:51:59.7171022Z 7f924d696c9c: Pull complete 2025-10-10T00:52:00.4975352Z 12e6ee790ad5: Pull complete 2025-10-10T00:52:00.5205412Z 54ea66483f67: Pull complete 2025-10-10T00:52:00.5340284Z Digest: sha256:4889af0e45f04b7c5dd741421a1280919499d38d3125d714b69fa86b23b1052a 2025-10-10T00:52:00.5382883Z Status: Downloaded newer image for public.ecr.aws/docker/library/python:3.13 2025-10-10T00:52:07.6962121Z Fri Oct 10 00:52:07 2025 2025-10-10T00:52:07.6962568Z +-----------------------------------------------------------------------------------------+ 2025-10-10T00:52:07.6963082Z | NVIDIA-SMI 580.82.07 Driver Version: 580.82.07 CUDA Version: 13.0 | 2025-10-10T00:52:07.6963572Z +-----------------------------------------+------------------------+----------------------+ 2025-10-10T00:52:07.6964432Z | GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC | 2025-10-10T00:52:07.6964997Z | Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. | 2025-10-10T00:52:07.6965437Z | | | MIG M. 
| 2025-10-10T00:52:07.6965771Z |=========================================+========================+======================| 2025-10-10T00:52:07.7124262Z | 0 NVIDIA A10G On | 00000000:00:1E.0 Off | 0 | 2025-10-10T00:52:07.7125431Z | 0% 24C P8 10W / 300W | 0MiB / 23028MiB | 0% Default | 2025-10-10T00:52:07.7126354Z | | | N/A | 2025-10-10T00:52:07.7127490Z +-----------------------------------------+------------------------+----------------------+ 2025-10-10T00:52:07.7128105Z 2025-10-10T00:52:07.7128620Z +-----------------------------------------------------------------------------------------+ 2025-10-10T00:52:07.7129376Z | Processes: | 2025-10-10T00:52:07.7129835Z | GPU GI CI PID Type Process name GPU Memory | 2025-10-10T00:52:07.7130329Z | ID ID Usage | 2025-10-10T00:52:07.7130686Z |=========================================================================================| 2025-10-10T00:52:07.7135376Z | No running processes found | 2025-10-10T00:52:07.7135893Z +-----------------------------------------------------------------------------------------+ 2025-10-10T00:52:08.8849356Z Command completed after 1 attempt(s). 
2025-10-10T00:52:08.8951131Z Prepare all required actions 2025-10-10T00:52:08.8979508Z ##[group]Run ./.github/actions/get-workflow-job-id 2025-10-10T00:52:08.8979849Z with: 2025-10-10T00:52:08.8980456Z github-token: *** 2025-10-10T00:52:08.8980877Z env: 2025-10-10T00:52:08.8981089Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:52:08.8981416Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-10-10T00:52:08.8981764Z ##[endgroup] 2025-10-10T00:52:08.9019853Z ##[group]Run set -eux 2025-10-10T00:52:08.9020113Z set -eux 2025-10-10T00:52:08.9020538Z python3 .github/scripts/get_workflow_job_id.py "${GITHUB_RUN_ID}" "${RUNNER_NAME}" 2025-10-10T00:52:08.9035795Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T00:52:08.9036161Z env: 2025-10-10T00:52:08.9036374Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:52:08.9036708Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-10-10T00:52:08.9037190Z GITHUB_TOKEN: *** 2025-10-10T00:52:08.9037423Z ##[endgroup] 2025-10-10T00:52:08.9080450Z + python3 .github/scripts/get_workflow_job_id.py 18392306083 i-088ba17e0301f2c3f 2025-10-10T00:52:09.5453511Z Setting output job-id=52406799277 2025-10-10T00:52:09.5454098Z Setting output job-name=linux-jammy-cuda12.8-py3.10-gcc11-sm86 / test (slow, 2, 3, linux.g5.4xlarge.nvidia.gpu) 2025-10-10T00:52:09.5578065Z ##[group]Run python3 -m pip install psutil==5.9.8 dataclasses_json==0.6.7 nvidia-ml-py==11.525.84 2025-10-10T00:52:09.5578768Z python3 -m pip install psutil==5.9.8 dataclasses_json==0.6.7 nvidia-ml-py==11.525.84 2025-10-10T00:52:09.5579652Z python3 -m tools.stats.monitor --log-interval "$MONITOR_LOG_INTERVAL" --data-collect-interval "$MONITOR_DATA_COLLECT_INTERVAL" > usage_log.txt 2>&1 & 2025-10-10T00:52:09.5580480Z echo "monitor-script-pid=${!}" >> "${GITHUB_OUTPUT}" 2025-10-10T00:52:09.5590000Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T00:52:09.5590380Z env: 2025-10-10T00:52:09.5590593Z GIT_DEFAULT_BRANCH: main 
2025-10-10T00:52:09.5590923Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-10-10T00:52:09.5591249Z JOB_ID: 52406799277 2025-10-10T00:52:09.5591699Z JOB_NAME: linux-jammy-cuda12.8-py3.10-gcc11-sm86 / test (slow, 2, 3, linux.g5.4xlarge.nvidia.gpu) 2025-10-10T00:52:09.5592236Z WORKFLOW_NAME: slow 2025-10-10T00:52:09.5592484Z WORKFLOW_RUN_ID: 18392306083 2025-10-10T00:52:09.5592748Z MONITOR_LOG_INTERVAL: 5 2025-10-10T00:52:09.5593011Z MONITOR_DATA_COLLECT_INTERVAL: 1 2025-10-10T00:52:09.5593286Z ##[endgroup] 2025-10-10T00:52:09.8503064Z Defaulting to user installation because normal site-packages is not writeable 2025-10-10T00:52:10.2401760Z Collecting psutil==5.9.8 2025-10-10T00:52:10.2572345Z Downloading psutil-5.9.8-cp36-abi3-manylinux_2_12_x86_64.manylinux2010_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (288 kB) 2025-10-10T00:52:10.3357816Z Collecting dataclasses_json==0.6.7 2025-10-10T00:52:10.3395516Z Downloading dataclasses_json-0.6.7-py3-none-any.whl (28 kB) 2025-10-10T00:52:10.3700714Z Collecting nvidia-ml-py==11.525.84 2025-10-10T00:52:10.3736949Z Downloading nvidia_ml_py-11.525.84-py3-none-any.whl (34 kB) 2025-10-10T00:52:10.4077628Z Collecting typing-inspect<1,>=0.4.0 2025-10-10T00:52:10.4112486Z Downloading typing_inspect-0.9.0-py3-none-any.whl (8.8 kB) 2025-10-10T00:52:10.5312407Z Collecting marshmallow<4.0.0,>=3.18.0 2025-10-10T00:52:10.5349490Z Downloading marshmallow-3.26.1-py3-none-any.whl (50 kB) 2025-10-10T00:52:10.5940112Z Collecting packaging>=17.0 2025-10-10T00:52:10.5974253Z Downloading packaging-25.0-py3-none-any.whl (66 kB) 2025-10-10T00:52:10.6534862Z Collecting typing-extensions>=3.7.4 2025-10-10T00:52:10.6570772Z Downloading typing_extensions-4.15.0-py3-none-any.whl (44 kB) 2025-10-10T00:52:10.6778110Z Collecting mypy-extensions>=0.3.0 2025-10-10T00:52:10.6818037Z Downloading mypy_extensions-1.1.0-py3-none-any.whl (5.0 kB) 2025-10-10T00:52:10.7728544Z Installing collected packages: typing-extensions, packaging, 
mypy-extensions, typing-inspect, marshmallow, psutil, nvidia-ml-py, dataclasses-json 2025-10-10T00:52:11.0475290Z Successfully installed dataclasses-json-0.6.7 marshmallow-3.26.1 mypy-extensions-1.1.0 nvidia-ml-py-11.525.84 packaging-25.0 psutil-5.9.8 typing-extensions-4.15.0 typing-inspect-0.9.0 2025-10-10T00:52:11.2531023Z Prepare all required actions 2025-10-10T00:52:11.2531391Z Getting action download info 2025-10-10T00:52:11.4086526Z Download action repository 'seemethere/download-artifact-s3@v4' (SHA:1da556a7aa0a088e3153970611f6c432d58e80e6) 2025-10-10T00:52:11.6403140Z Download action repository 'actions/download-artifact@v4' (SHA:d3f86a106a0bac45b974a628896c90dbdf5c8093) 2025-10-10T00:52:12.0302918Z ##[group]Run ./.github/actions/download-build-artifacts 2025-10-10T00:52:12.0303329Z with: 2025-10-10T00:52:12.0303595Z name: linux-jammy-cuda12.8-py3.10-gcc11-sm86 2025-10-10T00:52:12.0303930Z s3-bucket: gha-artifacts 2025-10-10T00:52:12.0304175Z env: 2025-10-10T00:52:12.0304382Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:52:12.0304702Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-10-10T00:52:12.0305045Z ##[endgroup] 2025-10-10T00:52:12.0345622Z ##[group]Run seemethere/download-artifact-s3@v4 2025-10-10T00:52:12.0345953Z with: 2025-10-10T00:52:12.0346254Z name: linux-jammy-cuda12.8-py3.10-gcc11-sm86 2025-10-10T00:52:12.0346594Z s3-bucket: gha-artifacts 2025-10-10T00:52:12.0346857Z region: us-east-1 2025-10-10T00:52:12.0347084Z env: 2025-10-10T00:52:12.0347298Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:52:12.0347623Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-10-10T00:52:12.0347965Z ##[endgroup] 2025-10-10T00:52:12.5209285Z (node:59460) NOTE: We are formalizing our plans to enter AWS SDK for JavaScript (v2) into maintenance mode in 2023. 2025-10-10T00:52:12.5209744Z 2025-10-10T00:52:12.5209940Z Please migrate your code to use AWS SDK for JavaScript (v3). 
2025-10-10T00:52:12.5210456Z For more information, check the migration guide at https://a.co/7PzMCcy 2025-10-10T00:52:12.5210981Z (Use `node --trace-warnings ...` to show where the warning was created) 2025-10-10T00:52:12.8068773Z Found 1 objects with prefix pytorch/pytorch/18392306083/linux-jammy-cuda12.8-py3.10-gcc11-sm86/ 2025-10-10T00:52:12.8069510Z Starting download (1/1): /home/ec2-user/actions-runner/_work/pytorch/pytorch/artifacts.zip 2025-10-10T00:52:20.4727605Z Finished download (1/1): /home/ec2-user/actions-runner/_work/pytorch/pytorch/artifacts.zip 2025-10-10T00:52:20.4732982Z Artifact download has finished successfully 2025-10-10T00:52:20.5099958Z ##[group]Run unzip -o artifacts.zip 2025-10-10T00:52:20.5100292Z unzip -o artifacts.zip 2025-10-10T00:52:20.5110561Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T00:52:20.5110924Z env: 2025-10-10T00:52:20.5111140Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:52:20.5111459Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-10-10T00:52:20.5111801Z ##[endgroup] 2025-10-10T00:52:20.5188467Z Archive: artifacts.zip 2025-10-10T00:52:20.5190270Z creating: dist/ 2025-10-10T00:52:22.6375893Z inflating: dist/torch-2.10.0a0+git344e636-cp310-cp310-linux_x86_64.whl 2025-10-10T00:52:22.6513950Z inflating: dist/.ninja_log 2025-10-10T00:52:22.6514475Z creating: build/custom_test_artifacts/ 2025-10-10T00:52:22.6515027Z creating: build/custom_test_artifacts/custom-op-build/ 2025-10-10T00:52:22.6515568Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/ 2025-10-10T00:52:22.6516153Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/pkgRedirects/ 2025-10-10T00:52:22.6524920Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeConfigureLog.yaml 2025-10-10T00:52:22.6525561Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/ 2025-10-10T00:52:22.6526401Z inflating: 
build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CMakeSystem.cmake 2025-10-10T00:52:22.6527143Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdC/ 2025-10-10T00:52:22.6527796Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdC/tmp/ 2025-10-10T00:52:22.6530242Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdC/CMakeCCompilerId.c 2025-10-10T00:52:22.6531954Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdC/a.out 2025-10-10T00:52:22.6532806Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CMakeCCompiler.cmake 2025-10-10T00:52:22.6533500Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCXX/ 2025-10-10T00:52:22.6534220Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCXX/tmp/ 2025-10-10T00:52:22.6537202Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCXX/CMakeCXXCompilerId.cpp 2025-10-10T00:52:22.6538505Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCXX/a.out 2025-10-10T00:52:22.6539829Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CMakeCXXCompiler.cmake 2025-10-10T00:52:22.6542023Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CMakeDetermineCompilerABI_C.bin 2025-10-10T00:52:22.6544238Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CMakeDetermineCompilerABI_CXX.bin 2025-10-10T00:52:22.6545003Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCUDA/ 2025-10-10T00:52:22.6545678Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/ 2025-10-10T00:52:22.6605598Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp4.ii 2025-10-10T00:52:22.6666493Z 
inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.cpp 2025-10-10T00:52:22.6667482Z extracting: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.module_id 2025-10-10T00:52:22.6733377Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp1.ii 2025-10-10T00:52:22.6734505Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.c 2025-10-10T00:52:22.6735471Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.gpu 2025-10-10T00:52:22.6736458Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.stub.c 2025-10-10T00:52:22.6737418Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.ptx 2025-10-10T00:52:22.6738359Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.sm_52.cubin 2025-10-10T00:52:22.6739308Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin 2025-10-10T00:52:22.6740251Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin.c 2025-10-10T00:52:22.6741341Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.o 2025-10-10T00:52:22.6742312Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/a_dlink.sm_52.cubin 2025-10-10T00:52:22.6743300Z extracting: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/a_dlink.reg.c 2025-10-10T00:52:22.6744325Z inflating: 
build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/a_dlink.fatbin 2025-10-10T00:52:22.6745488Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/a_dlink.fatbin.c 2025-10-10T00:52:22.6746654Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/a_dlink.o 2025-10-10T00:52:22.6749772Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCUDA/CMakeCUDACompilerId.cu 2025-10-10T00:52:22.6825021Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCUDA/a.out 2025-10-10T00:52:22.6825769Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CMakeCUDACompiler.cmake 2025-10-10T00:52:22.6901103Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CMakeDetermineCompilerABI_CUDA.bin 2025-10-10T00:52:22.6901851Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeScratch/ 2025-10-10T00:52:22.6902443Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeTmp/ 2025-10-10T00:52:22.6903051Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/cmake.check_cache 2025-10-10T00:52:22.6903675Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/ 2025-10-10T00:52:22.6904365Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/compiler_depend.ts 2025-10-10T00:52:22.6905160Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/compiler_depend.make 2025-10-10T00:52:22.6905912Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/depend.make 2025-10-10T00:52:22.6906621Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/link.txt 2025-10-10T00:52:22.6907360Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/cmake_clean.cmake 2025-10-10T00:52:22.6908586Z inflating: 
build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/build.make 2025-10-10T00:52:22.6909489Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/DependInfo.cmake 2025-10-10T00:52:22.6910367Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/flags.make 2025-10-10T00:52:22.6911449Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/progress.make 2025-10-10T00:52:22.6933284Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/op.cpp.o.d 2025-10-10T00:52:22.7136333Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/op.cpp.o 2025-10-10T00:52:22.7136999Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/ 2025-10-10T00:52:22.7137732Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/compiler_depend.ts 2025-10-10T00:52:22.7138550Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/compiler_depend.make 2025-10-10T00:52:22.7139332Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/depend.make 2025-10-10T00:52:22.7140345Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/link.txt 2025-10-10T00:52:22.7141140Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/cmake_clean.cmake 2025-10-10T00:52:22.7141953Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/build.make 2025-10-10T00:52:22.7142926Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/DependInfo.cmake 2025-10-10T00:52:22.7143796Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/flags.make 2025-10-10T00:52:22.7144866Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/progress.make 
2025-10-10T00:52:22.7166673Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/test_custom_ops.cpp.o.d 2025-10-10T00:52:22.7248320Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/test_custom_ops.cpp.o 2025-10-10T00:52:22.7249143Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeDirectoryInformation.cmake 2025-10-10T00:52:22.7250277Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/TargetDirectories.txt 2025-10-10T00:52:22.7251092Z extracting: build/custom_test_artifacts/custom-op-build/CMakeFiles/progress.marks 2025-10-10T00:52:22.7251909Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/Makefile2 2025-10-10T00:52:22.7253835Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/Makefile.cmake 2025-10-10T00:52:22.7254457Z inflating: build/custom_test_artifacts/custom-op-build/detect_cuda_version.cc 2025-10-10T00:52:22.7257455Z inflating: build/custom_test_artifacts/custom-op-build/CMakeCache.txt 2025-10-10T00:52:22.7258316Z inflating: build/custom_test_artifacts/custom-op-build/Makefile 2025-10-10T00:52:22.7259288Z inflating: build/custom_test_artifacts/custom-op-build/cmake_install.cmake 2025-10-10T00:52:22.7434646Z inflating: build/custom_test_artifacts/custom-op-build/libcustom_ops.so 2025-10-10T00:52:22.7490967Z inflating: build/custom_test_artifacts/custom-op-build/test_custom_ops 2025-10-10T00:52:22.7491491Z creating: build/custom_test_artifacts/jit-hook-build/ 2025-10-10T00:52:22.7491944Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/ 2025-10-10T00:52:22.7492480Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/pkgRedirects/ 2025-10-10T00:52:22.7500894Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeConfigureLog.yaml 2025-10-10T00:52:22.7501513Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/ 2025-10-10T00:52:22.7502131Z inflating: 
build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CMakeSystem.cmake 2025-10-10T00:52:22.7502790Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdC/ 2025-10-10T00:52:22.7503430Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdC/tmp/ 2025-10-10T00:52:22.7506066Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdC/CMakeCCompilerId.c 2025-10-10T00:52:22.7507471Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdC/a.out 2025-10-10T00:52:22.7508550Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CMakeCCompiler.cmake 2025-10-10T00:52:22.7509220Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCXX/ 2025-10-10T00:52:22.7509867Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCXX/tmp/ 2025-10-10T00:52:22.7512696Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCXX/CMakeCXXCompilerId.cpp 2025-10-10T00:52:22.7514124Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCXX/a.out 2025-10-10T00:52:22.7515790Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CMakeCXXCompiler.cmake 2025-10-10T00:52:22.7517338Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CMakeDetermineCompilerABI_C.bin 2025-10-10T00:52:22.7519613Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CMakeDetermineCompilerABI_CXX.bin 2025-10-10T00:52:22.7520633Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCUDA/ 2025-10-10T00:52:22.7521397Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/ 2025-10-10T00:52:22.7580921Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp4.ii 2025-10-10T00:52:22.7642783Z inflating: 
build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.cpp 2025-10-10T00:52:22.7644305Z extracting: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.module_id 2025-10-10T00:52:22.7707941Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp1.ii 2025-10-10T00:52:22.7709254Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.c 2025-10-10T00:52:22.7710421Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.gpu 2025-10-10T00:52:22.7711557Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.stub.c 2025-10-10T00:52:22.7712613Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.ptx 2025-10-10T00:52:22.7713658Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.sm_52.cubin 2025-10-10T00:52:22.7714751Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin 2025-10-10T00:52:22.7715797Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin.c 2025-10-10T00:52:22.7716776Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.o 2025-10-10T00:52:22.7729762Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/a_dlink.sm_52.cubin 2025-10-10T00:52:22.7730614Z extracting: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/a_dlink.reg.c 2025-10-10T00:52:22.7731443Z inflating: 
build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/a_dlink.fatbin 2025-10-10T00:52:22.7732269Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/a_dlink.fatbin.c 2025-10-10T00:52:22.7733068Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/a_dlink.o 2025-10-10T00:52:22.7733884Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCUDA/CMakeCUDACompilerId.cu 2025-10-10T00:52:22.7798948Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCUDA/a.out 2025-10-10T00:52:22.7800112Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CMakeCUDACompiler.cmake 2025-10-10T00:52:22.7876770Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CMakeDetermineCompilerABI_CUDA.bin 2025-10-10T00:52:22.7877917Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeScratch/ 2025-10-10T00:52:22.7878803Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeTmp/ 2025-10-10T00:52:22.7879737Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/cmake.check_cache 2025-10-10T00:52:22.7880724Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/ 2025-10-10T00:52:22.7881835Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/compiler_depend.ts 2025-10-10T00:52:22.7883112Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/compiler_depend.make 2025-10-10T00:52:22.7884373Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/depend.make 2025-10-10T00:52:22.7885471Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/link.txt 2025-10-10T00:52:22.7886624Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/cmake_clean.cmake 2025-10-10T00:52:22.7887877Z 
inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/build.make 2025-10-10T00:52:22.7889039Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/DependInfo.cmake 2025-10-10T00:52:22.7890185Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/flags.make 2025-10-10T00:52:22.7891325Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/progress.make 2025-10-10T00:52:22.7909429Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/test_jit_hooks.cpp.o.d 2025-10-10T00:52:22.7973368Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/test_jit_hooks.cpp.o 2025-10-10T00:52:22.7974625Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeDirectoryInformation.cmake 2025-10-10T00:52:22.7975752Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/TargetDirectories.txt 2025-10-10T00:52:22.7976770Z extracting: build/custom_test_artifacts/jit-hook-build/CMakeFiles/progress.marks 2025-10-10T00:52:22.7977713Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/Makefile2 2025-10-10T00:52:22.7978658Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/Makefile.cmake 2025-10-10T00:52:22.7979603Z inflating: build/custom_test_artifacts/jit-hook-build/detect_cuda_version.cc 2025-10-10T00:52:22.7982366Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeCache.txt 2025-10-10T00:52:22.7983198Z inflating: build/custom_test_artifacts/jit-hook-build/Makefile 2025-10-10T00:52:22.7984162Z inflating: build/custom_test_artifacts/jit-hook-build/cmake_install.cmake 2025-10-10T00:52:22.8023538Z inflating: build/custom_test_artifacts/jit-hook-build/test_jit_hooks 2025-10-10T00:52:22.8024065Z creating: build/custom_test_artifacts/custom-backend-build/ 2025-10-10T00:52:22.8024571Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/ 
2025-10-10T00:52:22.8025169Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/pkgRedirects/ 2025-10-10T00:52:22.8033114Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeConfigureLog.yaml 2025-10-10T00:52:22.8033790Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/ 2025-10-10T00:52:22.8034452Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CMakeSystem.cmake 2025-10-10T00:52:22.8035166Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdC/ 2025-10-10T00:52:22.8035870Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdC/tmp/ 2025-10-10T00:52:22.8038331Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdC/CMakeCCompilerId.c 2025-10-10T00:52:22.8039699Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdC/a.out 2025-10-10T00:52:22.8040763Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CMakeCCompiler.cmake 2025-10-10T00:52:22.8041501Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCXX/ 2025-10-10T00:52:22.8042288Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCXX/tmp/ 2025-10-10T00:52:22.8045402Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCXX/CMakeCXXCompilerId.cpp 2025-10-10T00:52:22.8046550Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCXX/a.out 2025-10-10T00:52:22.8047989Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CMakeCXXCompiler.cmake 2025-10-10T00:52:22.8050299Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CMakeDetermineCompilerABI_C.bin 2025-10-10T00:52:22.8051921Z inflating: 
build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CMakeDetermineCompilerABI_CXX.bin 2025-10-10T00:52:22.8052720Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCUDA/ 2025-10-10T00:52:22.8053438Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/ 2025-10-10T00:52:22.8113621Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp4.ii 2025-10-10T00:52:22.8174909Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.cpp 2025-10-10T00:52:22.8176058Z extracting: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.module_id 2025-10-10T00:52:22.8240832Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp1.ii 2025-10-10T00:52:22.8241854Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.c 2025-10-10T00:52:22.8242933Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.gpu 2025-10-10T00:52:22.8244215Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.stub.c 2025-10-10T00:52:22.8245288Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.ptx 2025-10-10T00:52:22.8246335Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.sm_52.cubin 2025-10-10T00:52:22.8247425Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin 2025-10-10T00:52:22.8248449Z inflating: 
build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin.c 2025-10-10T00:52:22.8249759Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.o 2025-10-10T00:52:22.8250713Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/a_dlink.sm_52.cubin 2025-10-10T00:52:22.8251647Z extracting: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/a_dlink.reg.c 2025-10-10T00:52:22.8252576Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/a_dlink.fatbin 2025-10-10T00:52:22.8253843Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/a_dlink.fatbin.c 2025-10-10T00:52:22.8255041Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/a_dlink.o 2025-10-10T00:52:22.8258173Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCUDA/CMakeCUDACompilerId.cu 2025-10-10T00:52:22.8333206Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCUDA/a.out 2025-10-10T00:52:22.8334097Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CMakeCUDACompiler.cmake 2025-10-10T00:52:22.8409439Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CMakeDetermineCompilerABI_CUDA.bin 2025-10-10T00:52:22.8410230Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeScratch/ 2025-10-10T00:52:22.8410867Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeTmp/ 2025-10-10T00:52:22.8411511Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/cmake.check_cache 2025-10-10T00:52:22.8412199Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/ 
2025-10-10T00:52:22.8412973Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/compiler_depend.ts 2025-10-10T00:52:22.8413843Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/compiler_depend.make 2025-10-10T00:52:22.8414684Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/depend.make 2025-10-10T00:52:22.8415466Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/link.txt 2025-10-10T00:52:22.8416275Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/cmake_clean.cmake 2025-10-10T00:52:22.8417402Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/build.make 2025-10-10T00:52:22.8418227Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/DependInfo.cmake 2025-10-10T00:52:22.8419043Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/flags.make 2025-10-10T00:52:22.8419851Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/progress.make 2025-10-10T00:52:22.8424566Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/custom_backend.cpp.o.d 2025-10-10T00:52:22.8546185Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/custom_backend.cpp.o 2025-10-10T00:52:22.8546995Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/ 2025-10-10T00:52:22.8547811Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/compiler_depend.ts 2025-10-10T00:52:22.8548724Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/compiler_depend.make 2025-10-10T00:52:22.8549606Z inflating: 
build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/depend.make 2025-10-10T00:52:22.8550621Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/link.txt 2025-10-10T00:52:22.8551506Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/cmake_clean.cmake 2025-10-10T00:52:22.8552390Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/build.make 2025-10-10T00:52:22.8553333Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/DependInfo.cmake 2025-10-10T00:52:22.8554213Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/flags.make 2025-10-10T00:52:22.8555322Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/progress.make 2025-10-10T00:52:22.8577233Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/test_custom_backend.cpp.o.d 2025-10-10T00:52:22.8632719Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/test_custom_backend.cpp.o 2025-10-10T00:52:22.8633631Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeDirectoryInformation.cmake 2025-10-10T00:52:22.8634479Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/TargetDirectories.txt 2025-10-10T00:52:22.8635260Z extracting: build/custom_test_artifacts/custom-backend-build/CMakeFiles/progress.marks 2025-10-10T00:52:22.8636344Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/Makefile2 2025-10-10T00:52:22.8638494Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/Makefile.cmake 2025-10-10T00:52:22.8639157Z inflating: build/custom_test_artifacts/custom-backend-build/detect_cuda_version.cc 2025-10-10T00:52:22.8641985Z inflating: 
build/custom_test_artifacts/custom-backend-build/CMakeCache.txt 2025-10-10T00:52:22.8642870Z inflating: build/custom_test_artifacts/custom-backend-build/Makefile 2025-10-10T00:52:22.8643901Z inflating: build/custom_test_artifacts/custom-backend-build/cmake_install.cmake 2025-10-10T00:52:22.8746407Z inflating: build/custom_test_artifacts/custom-backend-build/libcustom_backend.so 2025-10-10T00:52:22.8785945Z inflating: build/custom_test_artifacts/custom-backend-build/test_custom_backend 2025-10-10T00:52:22.8786415Z creating: build/lib/ 2025-10-10T00:52:22.8869871Z inflating: build/lib/libprotobuf-lite.a 2025-10-10T00:52:22.9314644Z inflating: build/lib/libprotobuf.a 2025-10-10T00:52:22.9811260Z inflating: build/lib/libprotoc.a 2025-10-10T00:52:22.9821276Z inflating: build/lib/libpthreadpool.a 2025-10-10T00:52:22.9829723Z inflating: build/lib/libcpuinfo.a 2025-10-10T00:52:22.9837714Z inflating: build/lib/libcpuinfo_internals.a 2025-10-10T00:52:22.9838621Z inflating: build/lib/libclog.a 2025-10-10T00:52:22.9858305Z inflating: build/lib/libpytorch_qnnpack.a 2025-10-10T00:52:22.9860995Z inflating: build/lib/libnnpack_reference_layers.a 2025-10-10T00:52:22.9879121Z inflating: build/lib/libnnpack.a 2025-10-10T00:52:23.0066743Z inflating: build/lib/libmicrokernels-prod.a 2025-10-10T00:52:23.0947245Z inflating: build/lib/libmicrokernels-all.a 2025-10-10T00:52:23.1017726Z inflating: build/lib/libgtest.a 2025-10-10T00:52:23.1035067Z inflating: build/lib/libgmock.a 2025-10-10T00:52:23.1035666Z inflating: build/lib/libgtest_main.a 2025-10-10T00:52:23.1127083Z inflating: build/lib/libXNNPACK.a 2025-10-10T00:52:23.1127676Z inflating: build/lib/libgmock_main.a 2025-10-10T00:52:23.1204353Z inflating: build/lib/libbenchmark.a 2025-10-10T00:52:23.1204906Z inflating: build/lib/libbenchmark_main.a 2025-10-10T00:52:23.1213624Z inflating: build/lib/libittnotify.a 2025-10-10T00:52:23.1279436Z inflating: build/lib/libasmjit.a 2025-10-10T00:52:23.1280133Z inflating: 
build/lib/libjitprofiling.a 2025-10-10T00:52:23.2478015Z inflating: build/lib/libfbgemm.a 2025-10-10T00:52:23.2509433Z inflating: build/lib/libtensorpipe_uv.a 2025-10-10T00:52:23.3064214Z inflating: build/lib/libtensorpipe.a 2025-10-10T00:52:23.3312793Z inflating: build/lib/libtensorpipe_cuda.a 2025-10-10T00:52:23.3448063Z inflating: build/lib/libgloo.a 2025-10-10T00:52:23.3495138Z inflating: build/lib/libonnx_proto.a 2025-10-10T00:52:23.3941710Z inflating: build/lib/libgloo_cuda.a 2025-10-10T00:52:23.4660775Z inflating: build/lib/libonnx.a 2025-10-10T00:52:24.4957372Z inflating: build/lib/libdnnl.a 2025-10-10T00:52:24.4976897Z inflating: build/lib/libfmt.a 2025-10-10T00:52:24.5452983Z inflating: build/lib/libkineto.a 2025-10-10T00:52:24.5568380Z inflating: build/lib/libc10.so 2025-10-10T00:52:24.5628689Z inflating: build/lib/libc10_cuda.so 2025-10-10T00:52:24.5630269Z inflating: build/lib/libcaffe2_nvrtc.so 2025-10-10T00:52:24.5632064Z inflating: build/lib/libtorch_global_deps.so 2025-10-10T00:52:27.6035336Z inflating: build/lib/libtorch_cpu.so 2025-10-10T00:52:27.6803192Z inflating: build/lib/libtorch_nvshmem.so 2025-10-10T00:52:29.6237943Z inflating: build/lib/libtorch_cuda.so 2025-10-10T00:52:29.6239314Z inflating: build/lib/libtorch.so 2025-10-10T00:52:29.6290768Z inflating: build/lib/libtorch_cuda_linalg.so 2025-10-10T00:52:29.6309457Z inflating: build/lib/libjitbackend_test.so 2025-10-10T00:52:29.6380975Z inflating: build/lib/libtorchbind_test.so 2025-10-10T00:52:29.6406476Z inflating: build/lib/libbackend_with_compiler.so 2025-10-10T00:52:29.6433425Z inflating: build/lib/libaoti_custom_ops.so 2025-10-10T00:52:29.6436306Z inflating: build/lib/libc10d_cuda_test.so 2025-10-10T00:52:29.6441077Z inflating: build/lib/libshm.so 2025-10-10T00:52:29.8739834Z inflating: build/lib/libtorch_python.so 2025-10-10T00:52:29.8776057Z inflating: build/lib/libnnapi_backend.so 2025-10-10T00:52:29.8776392Z creating: build/bin/ 2025-10-10T00:52:29.9235539Z inflating: 
build/bin/protoc-3.13.0.0 2025-10-10T00:52:29.9693276Z inflating: build/bin/protoc 2025-10-10T00:52:29.9752753Z inflating: build/bin/c10_AllocatorConfig_test 2025-10-10T00:52:29.9808332Z inflating: build/bin/c10_CompileTimeFunctionPointer_test 2025-10-10T00:52:29.9865590Z inflating: build/bin/c10_DeviceGuard_test 2025-10-10T00:52:29.9923475Z inflating: build/bin/c10_Device_test 2025-10-10T00:52:29.9989545Z inflating: build/bin/c10_DispatchKeySet_test 2025-10-10T00:52:30.0050065Z inflating: build/bin/c10_Scalar_test 2025-10-10T00:52:30.0110617Z inflating: build/bin/c10_InlineDeviceGuard_test 2025-10-10T00:52:30.0165567Z inflating: build/bin/c10_StreamGuard_test 2025-10-10T00:52:30.0229834Z inflating: build/bin/c10_SymInt_test 2025-10-10T00:52:30.0284752Z inflating: build/bin/c10_ConstexprCrc_test 2025-10-10T00:52:30.0346707Z inflating: build/bin/c10_InlineStreamGuard_test 2025-10-10T00:52:30.0405866Z inflating: build/bin/c10_Bitset_test 2025-10-10T00:52:30.0461236Z inflating: build/bin/c10_ArrayRef_test 2025-10-10T00:52:30.0523694Z inflating: build/bin/c10_SizesAndStrides_test 2025-10-10T00:52:30.0600133Z inflating: build/bin/c10_cow_test 2025-10-10T00:52:30.0655749Z inflating: build/bin/c10_DeadlockDetection_test 2025-10-10T00:52:30.0719126Z inflating: build/bin/c10_Enumerate_test 2025-10-10T00:52:30.0777686Z inflating: build/bin/c10_IntrusiveList_test 2025-10-10T00:52:30.0839636Z inflating: build/bin/c10_Metaprogramming_test 2025-10-10T00:52:30.0901911Z inflating: build/bin/c10_LeftRight_test 2025-10-10T00:52:30.0958071Z inflating: build/bin/c10_Half_test 2025-10-10T00:52:30.1014172Z inflating: build/bin/c10_Semaphore_test 2025-10-10T00:52:30.1073266Z inflating: build/bin/c10_NetworkFlow_test 2025-10-10T00:52:30.1135026Z inflating: build/bin/c10_ThreadLocal_test 2025-10-10T00:52:30.1190683Z inflating: build/bin/c10_Synchronized_test 2025-10-10T00:52:30.1248643Z inflating: build/bin/c10_TypeIndex_test 2025-10-10T00:52:30.1305548Z inflating: 
build/bin/c10_TypeList_test 2025-10-10T00:52:30.1360352Z inflating: build/bin/c10_TypeTraits_test 2025-10-10T00:52:30.1418251Z inflating: build/bin/c10_accumulate_test 2025-10-10T00:52:30.1480699Z inflating: build/bin/c10_bfloat16_test 2025-10-10T00:52:30.1541752Z inflating: build/bin/c10_complex_test 2025-10-10T00:52:30.1597931Z inflating: build/bin/c10_bit_cast_test 2025-10-10T00:52:30.1660937Z inflating: build/bin/c10_complex_math_test 2025-10-10T00:52:30.1716548Z inflating: build/bin/c10_error_test 2025-10-10T00:52:30.1774904Z inflating: build/bin/c10_exception_test 2025-10-10T00:52:30.1831469Z inflating: build/bin/c10_flags_test 2025-10-10T00:52:30.1887174Z inflating: build/bin/c10_generic_math_test 2025-10-10T00:52:30.2064465Z inflating: build/bin/c10_intrusive_ptr_test 2025-10-10T00:52:30.2120809Z inflating: build/bin/c10_irange_test 2025-10-10T00:52:30.2181788Z inflating: build/bin/c10_lazy_test 2025-10-10T00:52:30.2245483Z inflating: build/bin/c10_logging_test 2025-10-10T00:52:30.2328321Z inflating: build/bin/c10_optional_test 2025-10-10T00:52:30.2387175Z inflating: build/bin/c10_registry_test 2025-10-10T00:52:30.2455337Z inflating: build/bin/c10_ordered_preserving_dict_test 2025-10-10T00:52:30.2620646Z inflating: build/bin/c10_small_vector_test 2025-10-10T00:52:30.2683314Z inflating: build/bin/c10_string_util_test 2025-10-10T00:52:30.2740722Z inflating: build/bin/c10_ssize_test 2025-10-10T00:52:30.2795524Z inflating: build/bin/c10_string_view_test 2025-10-10T00:52:30.2851475Z inflating: build/bin/c10_tempfile_test 2025-10-10T00:52:30.2900506Z inflating: build/bin/c10_intrusive_ptr_benchmark 2025-10-10T00:52:30.2962765Z inflating: build/bin/c10_typeid_test 2025-10-10T00:52:30.3021742Z inflating: build/bin/c10_cuda_CUDAAssertionsTest_catches_stream 2025-10-10T00:52:30.3080767Z inflating: build/bin/c10_cuda_CUDAAssertionsTest_catches_thread_and_block_and_device 2025-10-10T00:52:30.3138990Z inflating: build/bin/c10_cuda_CUDAAssertionsTest_1_var_test 
2025-10-10T00:52:30.3198074Z inflating: build/bin/c10_cuda_CUDAAssertionsTest_multiple_writes_from_blocks_and_threads 2025-10-10T00:52:30.3255802Z inflating: build/bin/c10_cuda_CUDAAssertionsTest_from_2_processes 2025-10-10T00:52:30.3311137Z inflating: build/bin/c10_cuda_CUDATest 2025-10-10T00:52:30.3370598Z inflating: build/bin/c10_cuda_CUDAAssertionsTest_multiple_writes_from_multiple_blocks 2025-10-10T00:52:30.3428580Z inflating: build/bin/c10_cuda_CUDAAssertionsTest_multiple_writes_from_same_block 2025-10-10T00:52:30.4040646Z inflating: build/bin/vec_test_all_types_DEFAULT 2025-10-10T00:52:30.4670966Z inflating: build/bin/vec_test_all_types_AVX512 2025-10-10T00:52:30.5306905Z inflating: build/bin/vec_test_all_types_AVX2 2025-10-10T00:52:30.5387788Z inflating: build/bin/Dict_test 2025-10-10T00:52:30.5446919Z inflating: build/bin/Dimname_test 2025-10-10T00:52:30.5519533Z inflating: build/bin/MaybeOwned_test 2025-10-10T00:52:30.5581785Z inflating: build/bin/NamedTensor_test 2025-10-10T00:52:30.5647141Z inflating: build/bin/apply_utils_test 2025-10-10T00:52:30.5712874Z inflating: build/bin/atest 2025-10-10T00:52:30.5783530Z inflating: build/bin/basic 2025-10-10T00:52:30.5845207Z inflating: build/bin/broadcast_test 2025-10-10T00:52:30.5902179Z inflating: build/bin/cpu_allocator_test 2025-10-10T00:52:30.5966540Z inflating: build/bin/cpu_generator_test 2025-10-10T00:52:30.6025942Z inflating: build/bin/cpu_profiling_allocator_test 2025-10-10T00:52:30.6126475Z inflating: build/bin/cpu_rng_test 2025-10-10T00:52:30.6182704Z inflating: build/bin/dlconvertor_test 2025-10-10T00:52:30.6247144Z inflating: build/bin/extension_backend_test 2025-10-10T00:52:30.6351486Z inflating: build/bin/ivalue_test 2025-10-10T00:52:30.6412683Z inflating: build/bin/half_test 2025-10-10T00:52:30.6468271Z inflating: build/bin/lazy_tensor_test 2025-10-10T00:52:30.6528653Z inflating: build/bin/math_kernel_test 2025-10-10T00:52:30.6588495Z inflating: build/bin/memory_format_test 
2025-10-10T00:52:30.6648093Z inflating: build/bin/memory_overlapping_test 2025-10-10T00:52:30.6707534Z inflating: build/bin/mobile_memory_cleanup 2025-10-10T00:52:30.6769719Z inflating: build/bin/native_test 2025-10-10T00:52:30.6826313Z inflating: build/bin/operator_name_test 2025-10-10T00:52:30.6883696Z inflating: build/bin/operators_test 2025-10-10T00:52:30.6941376Z inflating: build/bin/packedtensoraccessor_test 2025-10-10T00:52:30.7015770Z inflating: build/bin/pow_test 2025-10-10T00:52:30.7079728Z inflating: build/bin/quantized_test 2025-10-10T00:52:30.7135180Z inflating: build/bin/reduce_ops_test 2025-10-10T00:52:30.7192374Z inflating: build/bin/reportMemoryUsage_test 2025-10-10T00:52:30.7255176Z inflating: build/bin/scalar_tensor_test 2025-10-10T00:52:30.7320959Z inflating: build/bin/scalar_test 2025-10-10T00:52:30.7377627Z inflating: build/bin/StorageUtils_test 2025-10-10T00:52:30.7435748Z inflating: build/bin/stride_properties_test 2025-10-10T00:52:30.7523092Z inflating: build/bin/tensor_iterator_test 2025-10-10T00:52:30.7583796Z inflating: build/bin/test_parallel 2025-10-10T00:52:30.7640768Z inflating: build/bin/thread_init_test 2025-10-10T00:52:30.7702082Z inflating: build/bin/type_ptr_test 2025-10-10T00:52:30.7767448Z inflating: build/bin/type_test 2025-10-10T00:52:30.7826229Z inflating: build/bin/undefined_tensor_test 2025-10-10T00:52:30.7882065Z inflating: build/bin/verify_api_visibility 2025-10-10T00:52:30.7958603Z inflating: build/bin/legacy_vmap_test 2025-10-10T00:52:30.8016392Z inflating: build/bin/weakref_test 2025-10-10T00:52:30.8074045Z inflating: build/bin/wrapdim_test 2025-10-10T00:52:30.8131352Z inflating: build/bin/xla_tensor_test 2025-10-10T00:52:30.8197138Z inflating: build/bin/IListRef_test 2025-10-10T00:52:30.8311801Z inflating: build/bin/List_test 2025-10-10T00:52:30.8384661Z inflating: build/bin/KernelFunction_test 2025-10-10T00:52:30.8513595Z inflating: build/bin/kernel_function_legacy_test 2025-10-10T00:52:30.8616748Z inflating: 
build/bin/kernel_function_test 2025-10-10T00:52:30.8751165Z inflating: build/bin/kernel_lambda_legacy_test 2025-10-10T00:52:30.8861544Z inflating: build/bin/kernel_lambda_test 2025-10-10T00:52:30.8928675Z inflating: build/bin/kernel_stackbased_test 2025-10-10T00:52:30.9031501Z inflating: build/bin/make_boxed_from_unboxed_functor_test 2025-10-10T00:52:30.9087846Z inflating: build/bin/CppSignature_test 2025-10-10T00:52:30.9149968Z inflating: build/bin/backend_fallback_test 2025-10-10T00:52:30.9204190Z inflating: build/bin/op_allowlist_test 2025-10-10T00:52:30.9531432Z inflating: build/bin/op_registration_test 2025-10-10T00:52:30.9604550Z inflating: build/bin/inline_container_test 2025-10-10T00:52:30.9664697Z inflating: build/bin/cuda_allocator_test 2025-10-10T00:52:30.9723611Z inflating: build/bin/cuda_apply_test 2025-10-10T00:52:30.9789772Z inflating: build/bin/cuda_atomic_ops_test 2025-10-10T00:52:30.9852392Z inflating: build/bin/cuda_caching_host_allocator_test 2025-10-10T00:52:30.9929252Z inflating: build/bin/cuda_complex_math_test 2025-10-10T00:52:30.9995169Z inflating: build/bin/cuda_complex_test 2025-10-10T00:52:31.0065103Z inflating: build/bin/cuda_cub_test 2025-10-10T00:52:31.0121093Z inflating: build/bin/cuda_device_test 2025-10-10T00:52:31.0192616Z inflating: build/bin/cuda_distributions_test 2025-10-10T00:52:31.0248441Z inflating: build/bin/cuda_exchange_device_test 2025-10-10T00:52:31.0306007Z inflating: build/bin/cuda_dlconvertor_test 2025-10-10T00:52:31.0364523Z inflating: build/bin/cuda_reportMemoryUsage_test 2025-10-10T00:52:31.0420592Z inflating: build/bin/cuda_allocatorTraceTracker_test 2025-10-10T00:52:31.0477559Z inflating: build/bin/cuda_integer_divider_test 2025-10-10T00:52:31.0545038Z inflating: build/bin/cuda_stream_test 2025-10-10T00:52:31.0600605Z inflating: build/bin/cuda_cudnn_test 2025-10-10T00:52:31.0664128Z inflating: build/bin/cuda_generator_test 2025-10-10T00:52:31.0720553Z inflating: build/bin/cuda_half_test 
2025-10-10T00:52:31.0775952Z inflating: build/bin/cuda_optional_test 2025-10-10T00:52:31.0833945Z inflating: build/bin/cuda_packedtensoraccessor_test 2025-10-10T00:52:31.0892602Z inflating: build/bin/cuda_vectorized_test 2025-10-10T00:52:31.0951614Z inflating: build/bin/BackoffTest 2025-10-10T00:52:31.1011094Z inflating: build/bin/FileStoreTest 2025-10-10T00:52:31.1074469Z inflating: build/bin/TCPStoreTest 2025-10-10T00:52:31.2213248Z inflating: build/bin/test_jit 2025-10-10T00:52:31.2272914Z inflating: build/bin/HashStoreTest 2025-10-10T00:52:31.2287217Z inflating: build/bin/ProcessGroupMPITest 2025-10-10T00:52:31.2290598Z inflating: build/bin/example_allreduce 2025-10-10T00:52:31.2352361Z inflating: build/bin/test_dist_autograd 2025-10-10T00:52:31.2425408Z inflating: build/bin/ProcessGroupGlooTest 2025-10-10T00:52:31.2488415Z inflating: build/bin/ProcessGroupGlooAsyncTest 2025-10-10T00:52:31.2559406Z inflating: build/bin/ProcessGroupNCCLTest 2025-10-10T00:52:31.2626460Z inflating: build/bin/ProcessGroupNCCLErrorsTest 2025-10-10T00:52:31.2701115Z inflating: build/bin/test_cpp_rpc 2025-10-10T00:52:31.3909454Z inflating: build/bin/test_api 2025-10-10T00:52:31.3912092Z inflating: build/bin/parallel_benchmark 2025-10-10T00:52:31.4274469Z inflating: build/bin/test_lazy 2025-10-10T00:52:31.4278801Z inflating: build/bin/torch_shm_manager 2025-10-10T00:52:31.4279454Z creating: .additional_ci_files/ 2025-10-10T00:52:31.4351801Z inflating: .additional_ci_files/test-times.json 2025-10-10T00:52:31.4618938Z inflating: .additional_ci_files/test-class-times.json 2025-10-10T00:52:31.4780293Z ##[group]Run rm artifacts.zip 2025-10-10T00:52:31.4780599Z rm artifacts.zip 2025-10-10T00:52:31.4791140Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T00:52:31.4791495Z env: 2025-10-10T00:52:31.4791707Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:52:31.4792029Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-10-10T00:52:31.4792358Z ##[endgroup] 
2025-10-10T00:52:31.6251977Z ##[group]Run df -H 2025-10-10T00:52:31.6252231Z df -H 2025-10-10T00:52:31.6261239Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T00:52:31.6261596Z env: 2025-10-10T00:52:31.6261812Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:52:31.6262361Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-10-10T00:52:31.6262880Z ##[endgroup] 2025-10-10T00:52:31.6319834Z Filesystem Size Used Avail Use% Mounted on 2025-10-10T00:52:31.6320232Z devtmpfs 4.2M 0 4.2M 0% /dev 2025-10-10T00:52:31.6320559Z tmpfs 34G 0 34G 0% /dev/shm 2025-10-10T00:52:31.6320891Z tmpfs 14G 553k 14G 1% /run 2025-10-10T00:52:31.6321218Z /dev/nvme0n1p1 161G 54G 108G 34% / 2025-10-10T00:52:31.6321538Z tmpfs 34G 13k 34G 1% /tmp 2025-10-10T00:52:31.6321876Z /dev/nvme0n1p128 11M 1.4M 9.2M 13% /boot/efi 2025-10-10T00:52:31.6322220Z tmpfs 6.7G 0 6.7G 0% /run/user/0 2025-10-10T00:52:31.6356310Z Prepare all required actions 2025-10-10T00:52:31.6357376Z Getting action download info 2025-10-10T00:52:31.7995463Z ##[group]Run ./.github/actions/download-td-artifacts 2025-10-10T00:52:31.7995799Z with: 2025-10-10T00:52:31.7996000Z env: 2025-10-10T00:52:31.7996220Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:52:31.7996558Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-10-10T00:52:31.7996905Z ##[endgroup] 2025-10-10T00:52:31.8148877Z ##[group]Run seemethere/download-artifact-s3@v4 2025-10-10T00:52:31.8149208Z with: 2025-10-10T00:52:31.8149417Z name: td_results 2025-10-10T00:52:31.8149661Z s3-bucket: gha-artifacts 2025-10-10T00:52:31.8149933Z region: us-east-1 2025-10-10T00:52:31.8150157Z env: 2025-10-10T00:52:31.8150367Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:52:31.8150696Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-10-10T00:52:31.8151039Z ##[endgroup] 2025-10-10T00:52:32.5733642Z (node:59481) NOTE: We are formalizing our plans to enter AWS SDK for JavaScript (v2) into maintenance mode in 2023. 
2025-10-10T00:52:32.5734257Z 2025-10-10T00:52:32.5734521Z Please migrate your code to use AWS SDK for JavaScript (v3). 2025-10-10T00:52:32.5735122Z For more information, check the migration guide at https://a.co/7PzMCcy 2025-10-10T00:52:32.5735649Z (Use `node --trace-warnings ...` to show where the warning was created) 2025-10-10T00:52:32.6789061Z Found 1 objects with prefix pytorch/pytorch/18392306083/td_results/ 2025-10-10T00:52:32.6789883Z Starting download (1/1): /home/ec2-user/actions-runner/_work/pytorch/pytorch/td_results.json 2025-10-10T00:52:32.7600270Z Finished download (1/1): /home/ec2-user/actions-runner/_work/pytorch/pytorch/td_results.json 2025-10-10T00:52:32.7601066Z Artifact download has finished successfully 2025-10-10T00:52:32.8056687Z ##[group]Run mkdir -p .additional_ci_files 2025-10-10T00:52:32.8057062Z mkdir -p .additional_ci_files 2025-10-10T00:52:32.8057508Z mv td_results.json .additional_ci_files/td_results.json || true 2025-10-10T00:52:32.8067355Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T00:52:32.8067720Z env: 2025-10-10T00:52:32.8067942Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:52:32.8068273Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-10-10T00:52:32.8068614Z ##[endgroup] 2025-10-10T00:52:32.8274605Z ##[group]Run .github/scripts/parse_ref.py 2025-10-10T00:52:32.8274984Z .github/scripts/parse_ref.py 2025-10-10T00:52:32.8284036Z shell: /usr/bin/bash -e {0} 2025-10-10T00:52:32.8284307Z env: 2025-10-10T00:52:32.8284520Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:52:32.8284846Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-10-10T00:52:32.8285189Z ##[endgroup] 2025-10-10T00:52:32.8737896Z Setting output branch=main 2025-10-10T00:52:32.8857714Z Prepare all required actions 2025-10-10T00:52:32.8858081Z Getting action download info 2025-10-10T00:52:33.0005482Z ##[group]Run ./.github/actions/filter-test-configs 2025-10-10T00:52:33.0005812Z with: 2025-10-10T00:52:33.0006253Z github-token: *** 
2025-10-10T00:52:33.0007248Z test-matrix: {"include": [{"config": "slow", "shard": 1, "num_shards": 3, "runner": "linux.g5.4xlarge.nvidia.gpu"}, {"config": "slow", "shard": 2, "num_shards": 3, "runner": "linux.g5.4xlarge.nvidia.gpu"}, {"config": "slow", "shard": 3, "num_shards": 3, "runner": "linux.g5.4xlarge.nvidia.gpu"}]} 2025-10-10T00:52:33.0008657Z job-name: linux-jammy-cuda12.8-py3.10-gcc11-sm86 / test (slow, 2, 3, linux.g5.4xlarge.nvidia.gpu) 2025-10-10T00:52:33.0009151Z env: 2025-10-10T00:52:33.0009365Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:52:33.0009684Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-10-10T00:52:33.0010014Z ##[endgroup] 2025-10-10T00:52:33.0104131Z ##[group]Run nick-fields/retry@v3.0.0 2025-10-10T00:52:33.0104457Z with: 2025-10-10T00:52:33.0104666Z shell: bash 2025-10-10T00:52:33.0104907Z timeout_minutes: 10 2025-10-10T00:52:33.0105160Z max_attempts: 5 2025-10-10T00:52:33.0105400Z retry_wait_seconds: 30 2025-10-10T00:52:33.0106142Z command: set -eux # PyYAML 6.0 doesn't work with MacOS x86 anymore # This must run on Python-3.7 (AmazonLinux2) so can't use request=3.32.2 python3 -m pip install requests==2.27.1 pyyaml==6.0.2 2025-10-10T00:52:33.0106917Z polling_interval_seconds: 1 2025-10-10T00:52:33.0107207Z warning_on_retry: true 2025-10-10T00:52:33.0107492Z continue_on_error: false 2025-10-10T00:52:33.0107739Z env: 2025-10-10T00:52:33.0107954Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:52:33.0108278Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-10-10T00:52:33.0108856Z GITHUB_TOKEN: *** 2025-10-10T00:52:33.0109089Z ##[endgroup] 2025-10-10T00:52:33.2096224Z + python3 -m pip install requests==2.27.1 pyyaml==6.0.2 2025-10-10T00:52:33.4451213Z Defaulting to user installation because normal site-packages is not writeable 2025-10-10T00:52:33.6061901Z Collecting requests==2.27.1 2025-10-10T00:52:33.6260448Z Downloading requests-2.27.1-py2.py3-none-any.whl (63 kB) 2025-10-10T00:52:33.8267928Z Collecting pyyaml==6.0.2 
2025-10-10T00:52:33.8317336Z Downloading PyYAML-6.0.2-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (737 kB) 2025-10-10T00:52:34.2558438Z Collecting charset-normalizer~=2.0.0 2025-10-10T00:52:34.2625141Z Downloading charset_normalizer-2.0.12-py3-none-any.whl (39 kB) 2025-10-10T00:52:34.2684638Z Requirement already satisfied: idna<4,>=2.5 in /usr/lib/python3.9/site-packages (from requests==2.27.1) (2.10) 2025-10-10T00:52:34.3197392Z Collecting certifi>=2017.4.17 2025-10-10T00:52:34.3246195Z Downloading certifi-2025.10.5-py3-none-any.whl (163 kB) 2025-10-10T00:52:34.3308013Z Requirement already satisfied: urllib3<1.27,>=1.21.1 in /usr/lib/python3.9/site-packages (from requests==2.27.1) (1.25.10) 2025-10-10T00:52:34.4128328Z Installing collected packages: charset-normalizer, certifi, requests, pyyaml 2025-10-10T00:52:34.5345825Z Successfully installed certifi-2025.10.5 charset-normalizer-2.0.12 pyyaml-6.0.2 requests-2.27.1 2025-10-10T00:52:35.1840502Z Command completed after 1 attempt(s). 
2025-10-10T00:52:35.1918857Z ##[group]Run set -x 2025-10-10T00:52:35.1919127Z set -x 2025-10-10T00:52:35.1919355Z  2025-10-10T00:52:35.1919714Z # Use relative path here as this could be checked out anywhere, not necessarily 2025-10-10T00:52:35.1920171Z # in runner workspace 2025-10-10T00:52:35.1920567Z python3 "${GITHUB_ACTION_PATH}/../../scripts/parse_ref.py" 2025-10-10T00:52:35.1930631Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T00:52:35.1930999Z env: 2025-10-10T00:52:35.1931218Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:52:35.1931554Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-10-10T00:52:35.1931908Z ##[endgroup] 2025-10-10T00:52:35.1964490Z + python3 /home/ec2-user/actions-runner/_work/pytorch/pytorch/./.github/actions/filter-test-configs/../../scripts/parse_ref.py 2025-10-10T00:52:35.2150433Z Setting output branch=main 2025-10-10T00:52:35.2226551Z ##[group]Run echo "Workflow: ${GITHUB_WORKFLOW}" 2025-10-10T00:52:35.2227279Z echo "Workflow: ${GITHUB_WORKFLOW}" 2025-10-10T00:52:35.2227791Z echo "Job name: ${JOB_NAME}" 2025-10-10T00:52:35.2228215Z  2025-10-10T00:52:35.2228761Z # Use relative path here as this could be checked out anywhere, not necessarily 2025-10-10T00:52:35.2229699Z # in runner workspace 2025-10-10T00:52:35.2230328Z python3 "${GITHUB_ACTION_PATH}/../../scripts/filter_test_configs.py" \ 2025-10-10T00:52:35.2231026Z  --workflow "${GITHUB_WORKFLOW}" \ 2025-10-10T00:52:35.2231522Z  --job-name "${JOB_NAME}" \ 2025-10-10T00:52:35.2233080Z  --test-matrix "{"include": [{"config": "slow", "shard": 1, "num_shards": 3, "runner": "linux.g5.4xlarge.nvidia.gpu"}, {"config": "slow", "shard": 2, "num_shards": 3, "runner": "linux.g5.4xlarge.nvidia.gpu"}, {"config": "slow", "shard": 3, "num_shards": 3, "runner": "linux.g5.4xlarge.nvidia.gpu"}]}" \ 2025-10-10T00:52:35.2234659Z  --selected-test-configs "" \ 2025-10-10T00:52:35.2235156Z  --pr-number "${PR_NUMBER}" \ 2025-10-10T00:52:35.2235616Z  --tag "${TAG}" \ 
2025-10-10T00:52:35.2236060Z  --event-name "${EVENT_NAME}" \ 2025-10-10T00:52:35.2236531Z  --schedule "${SCHEDULE}" \ 2025-10-10T00:52:35.2236990Z  --branch "${HEAD_BRANCH}" 2025-10-10T00:52:35.2249033Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T00:52:35.2249573Z env: 2025-10-10T00:52:35.2249891Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:52:35.2250386Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-10-10T00:52:35.2251513Z GITHUB_TOKEN: *** 2025-10-10T00:52:35.2252220Z JOB_NAME: linux-jammy-cuda12.8-py3.10-gcc11-sm86 / test (slow, 2, 3, linux.g5.4xlarge.nvidia.gpu) 2025-10-10T00:52:35.2252983Z PR_NUMBER: 2025-10-10T00:52:35.2253303Z TAG: 2025-10-10T00:52:35.2253589Z EVENT_NAME: push 2025-10-10T00:52:35.2253935Z SCHEDULE: 2025-10-10T00:52:35.2254251Z HEAD_BRANCH: main 2025-10-10T00:52:35.2254584Z ##[endgroup] 2025-10-10T00:52:35.2292596Z Workflow: slow 2025-10-10T00:52:35.2293288Z Job name: linux-jammy-cuda12.8-py3.10-gcc11-sm86 / test (slow, 2, 3, linux.g5.4xlarge.nvidia.gpu) 2025-10-10T00:52:35.4289232Z Setting output keep-going=True 2025-10-10T00:52:35.4289590Z Setting output ci-verbose-test-logs=False 2025-10-10T00:52:35.4289957Z Setting output ci-test-showlocals=False 2025-10-10T00:52:35.4290291Z Setting output ci-no-test-timeout=False 2025-10-10T00:52:35.4290605Z Setting output ci-no-td=False 2025-10-10T00:52:35.4290901Z Setting output ci-td-distributed=False 2025-10-10T00:52:35.4291220Z Setting output is-unstable=False 2025-10-10T00:52:35.4291523Z Setting output reenabled-issues= 2025-10-10T00:52:35.4292563Z Setting output test-matrix={"include": [{"config": "slow", "shard": 1, "num_shards": 3, "runner": "linux.g5.4xlarge.nvidia.gpu"}, {"config": "slow", "shard": 2, "num_shards": 3, "runner": "linux.g5.4xlarge.nvidia.gpu"}, {"config": "slow", "shard": 3, "num_shards": 3, "runner": "linux.g5.4xlarge.nvidia.gpu"}]} 2025-10-10T00:52:35.4293613Z Setting output is-test-matrix-empty=False 2025-10-10T00:52:35.4427840Z ##[group]Run 
echo "Filtered matrix:" 2025-10-10T00:52:35.4428172Z echo "Filtered matrix:" 2025-10-10T00:52:35.4429171Z echo "{"include": [{"config": "slow", "shard": 1, "num_shards": 3, "runner": "linux.g5.4xlarge.nvidia.gpu"}, {"config": "slow", "shard": 2, "num_shards": 3, "runner": "linux.g5.4xlarge.nvidia.gpu"}, {"config": "slow", "shard": 3, "num_shards": 3, "runner": "linux.g5.4xlarge.nvidia.gpu"}]}" 2025-10-10T00:52:35.4430147Z  2025-10-10T00:52:35.4430358Z echo 2025-10-10T00:52:35.4430626Z echo "Is the current job unstable? False" 2025-10-10T00:52:35.4430949Z  2025-10-10T00:52:35.4431358Z echo 2025-10-10T00:52:35.4431614Z echo "Is keep-going label set? True" 2025-10-10T00:52:35.4431921Z  2025-10-10T00:52:35.4432124Z echo 2025-10-10T00:52:35.4432350Z echo "Reenabled issues? " 2025-10-10T00:52:35.4441518Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T00:52:35.4441871Z env: 2025-10-10T00:52:35.4442077Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:52:35.4442393Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-10-10T00:52:35.4442915Z ##[endgroup] 2025-10-10T00:52:35.4472621Z Filtered matrix: 2025-10-10T00:52:35.4474100Z {include: [{config: slow, shard: 1, num_shards: 3, runner: linux.g5.4xlarge.nvidia.gpu}, {config: slow, shard: 2, num_shards: 3, runner: linux.g5.4xlarge.nvidia.gpu}, {config: slow, shard: 3, num_shards: 3, runner: linux.g5.4xlarge.nvidia.gpu}]} 2025-10-10T00:52:35.4475290Z 2025-10-10T00:52:35.4475501Z Is the current job unstable? False 2025-10-10T00:52:35.4475778Z 2025-10-10T00:52:35.4475904Z Is keep-going label set? True 2025-10-10T00:52:35.4476097Z 2025-10-10T00:52:35.4476191Z Reenabled issues? 
2025-10-10T00:52:35.4538245Z ##[group]Run echo "timeout=$((JOB_TIMEOUT-30))" >> "${GITHUB_OUTPUT}" 2025-10-10T00:52:35.4538774Z echo "timeout=$((JOB_TIMEOUT-30))" >> "${GITHUB_OUTPUT}" 2025-10-10T00:52:35.4547851Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T00:52:35.4548205Z env: 2025-10-10T00:52:35.4548421Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:52:35.4548748Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-10-10T00:52:35.4549108Z JOB_TIMEOUT: 240 2025-10-10T00:52:35.4549332Z ##[endgroup] 2025-10-10T00:52:35.4615357Z ##[group]Run env | grep '^GITHUB' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2025-10-10T00:52:35.4615865Z env | grep '^GITHUB' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2025-10-10T00:52:35.4616300Z env | grep '^CI' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2025-10-10T00:52:35.4625035Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T00:52:35.4625392Z env: 2025-10-10T00:52:35.4625606Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:52:35.4625930Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-10-10T00:52:35.4626262Z ##[endgroup] 2025-10-10T00:52:35.4745593Z ##[group]Run set -x 2025-10-10T00:52:35.4745925Z set -x 2025-10-10T00:52:35.4746154Z  2025-10-10T00:52:35.4746406Z if [[ $TEST_CONFIG == 'multigpu' ]]; then 2025-10-10T00:52:35.4746799Z  TEST_COMMAND=.ci/pytorch/multigpu-test.sh 2025-10-10T00:52:35.4747204Z elif [[ $BUILD_ENVIRONMENT == *onnx* ]]; then 2025-10-10T00:52:35.4747560Z  TEST_COMMAND=.ci/onnx/test.sh 2025-10-10T00:52:35.4747847Z else 2025-10-10T00:52:35.4748111Z  TEST_COMMAND=.ci/pytorch/test.sh 2025-10-10T00:52:35.4748405Z fi 2025-10-10T00:52:35.4748618Z  2025-10-10T00:52:35.4748869Z # Leaving 1GB for the runner and other things 2025-10-10T00:52:35.4749408Z TOTAL_AVAILABLE_MEMORY_IN_GB=$(awk '/MemTotal/ { printf "%.3f \n", $2/1024/1024 - 1 }' /proc/meminfo) 2025-10-10T00:52:35.4750207Z # https://docs.docker.com/engine/containers/resource_constraints/#--memory-swap-details, the 3GB swap 
2025-10-10T00:52:35.4750864Z # comes from https://github.com/pytorch/test-infra/pull/6058 2025-10-10T00:52:35.4751358Z TOTAL_MEMORY_WITH_SWAP=$(("${TOTAL_AVAILABLE_MEMORY_IN_GB%.*}" + 3)) 2025-10-10T00:52:35.4751754Z  2025-10-10T00:52:35.4752019Z if [[ ${BUILD_ENVIRONMENT} == *"s390x"* ]]; then 2025-10-10T00:52:35.4752362Z  SHM_OPTS= 2025-10-10T00:52:35.4752599Z  JENKINS_USER= 2025-10-10T00:52:35.4752937Z  # ensure that docker container cleanly exits in 12 hours 2025-10-10T00:52:35.4753392Z  # if for some reason cleanup action doesn't stop container 2025-10-10T00:52:35.4753776Z  # when job is cancelled 2025-10-10T00:52:35.4754076Z  DOCKER_SHELL_CMD="sleep 12h" 2025-10-10T00:52:35.4754355Z else 2025-10-10T00:52:35.4754606Z  SHM_OPTS="--shm-size=${SHM_SIZE}" 2025-10-10T00:52:35.4754934Z  JENKINS_USER="--user jenkins" 2025-10-10T00:52:35.4755246Z  DOCKER_SHELL_CMD= 2025-10-10T00:52:35.4755502Z fi 2025-10-10T00:52:35.4755708Z  2025-10-10T00:52:35.4756037Z # detached container should get cleaned up by teardown_ec2_linux 2025-10-10T00:52:35.4756542Z # TODO: Stop building test binaries as part of the build phase 2025-10-10T00:52:35.4757276Z # Used for GPU_FLAG, SHM_OPTS, JENKINS_USER and DOCKER_SHELL_CMD since that doesn't play nice 2025-10-10T00:52:35.4757782Z # shellcheck disable=SC2086,SC2090 2025-10-10T00:52:35.4758155Z container_name=$(docker run \ 2025-10-10T00:52:35.4758459Z  ${GPU_FLAG:-} \ 2025-10-10T00:52:35.4758754Z  ${SCCACHE_SERVER_PORT_DOCKER_FLAG:-} \ 2025-10-10T00:52:35.4759075Z  -e BUILD_ENVIRONMENT \ 2025-10-10T00:52:35.4759363Z  -e PR_NUMBER \ 2025-10-10T00:52:35.4759633Z  -e GITHUB_ACTIONS \ 2025-10-10T00:52:35.4759916Z  -e GITHUB_REPOSITORY \ 2025-10-10T00:52:35.4760209Z  -e GITHUB_WORKFLOW \ 2025-10-10T00:52:35.4760491Z  -e GITHUB_JOB \ 2025-10-10T00:52:35.4760751Z  -e GITHUB_RUN_ID \ 2025-10-10T00:52:35.4761027Z  -e GITHUB_RUN_NUMBER \ 2025-10-10T00:52:35.4761310Z  -e GITHUB_RUN_ATTEMPT \ 2025-10-10T00:52:35.4761604Z  -e JOB_ID \ 
2025-10-10T00:52:35.4761857Z  -e JOB_NAME \ 2025-10-10T00:52:35.4762118Z  -e BASE_SHA \ 2025-10-10T00:52:35.4762358Z  -e BRANCH \ 2025-10-10T00:52:35.4762600Z  -e SHA1 \ 2025-10-10T00:52:35.4762848Z  -e AWS_DEFAULT_REGION \ 2025-10-10T00:52:35.4763135Z  -e IN_WHEEL_TEST \ 2025-10-10T00:52:35.4763398Z  -e SHARD_NUMBER \ 2025-10-10T00:52:35.4763660Z  -e TEST_CONFIG \ 2025-10-10T00:52:35.4763935Z  -e NUM_TEST_SHARDS \ 2025-10-10T00:52:35.4764217Z  -e REENABLED_ISSUES \ 2025-10-10T00:52:35.4764505Z  -e CONTINUE_THROUGH_ERROR \ 2025-10-10T00:52:35.4764914Z  -e VERBOSE_TEST_LOGS \ 2025-10-10T00:52:35.4765204Z  -e TEST_SHOWLOCALS \ 2025-10-10T00:52:35.4765483Z  -e NO_TEST_TIMEOUT \ 2025-10-10T00:52:35.4765749Z  -e NO_TD \ 2025-10-10T00:52:35.4765996Z  -e TD_DISTRIBUTED \ 2025-10-10T00:52:35.4766292Z  -e PR_LABELS \ 2025-10-10T00:52:35.4766586Z  -e MAX_JOBS="$(nproc --ignore=2)" \ 2025-10-10T00:52:35.4766904Z  -e SCCACHE_BUCKET \ 2025-10-10T00:52:35.4767280Z  -e SCCACHE_REGION \ 2025-10-10T00:52:35.4767551Z  -e XLA_CUDA \ 2025-10-10T00:52:35.4767838Z  -e XLA_CLANG_CACHE_S3_BUCKET_NAME \ 2025-10-10T00:52:35.4768201Z  -e PYTORCH_TEST_CUDA_MEM_LEAK_CHECK \ 2025-10-10T00:52:35.4768549Z  -e PYTORCH_TEST_RERUN_DISABLED_TESTS \ 2025-10-10T00:52:35.4768910Z  -e SKIP_SCCACHE_INITIALIZATION=1 \ 2025-10-10T00:52:35.4769245Z  -e HUGGING_FACE_HUB_TOKEN \ 2025-10-10T00:52:35.4769573Z  -e VLLM_TEST_HUGGING_FACE_TOKEN \ 2025-10-10T00:52:35.4769907Z  -e SCRIBE_GRAPHQL_ACCESS_TOKEN \ 2025-10-10T00:52:35.4770221Z  -e DASHBOARD_TAG \ 2025-10-10T00:52:35.4770514Z  -e ARTIFACTS_FILE_SUFFIX \ 2025-10-10T00:52:35.4770885Z  --memory="${TOTAL_AVAILABLE_MEMORY_IN_GB%.*}g" \ 2025-10-10T00:52:35.4771284Z  --memory-swap="${TOTAL_MEMORY_WITH_SWAP}g" \ 2025-10-10T00:52:35.4771689Z  --env-file="/tmp/github_env_${GITHUB_RUN_ID}" \ 2025-10-10T00:52:35.4772076Z  --security-opt seccomp=unconfined \ 2025-10-10T00:52:35.4772409Z  --cap-add=SYS_PTRACE \ 2025-10-10T00:52:35.4772705Z  --ipc=host \ 
2025-10-10T00:52:35.4772950Z  ${SHM_OPTS} \ 2025-10-10T00:52:35.4773208Z  --tty \ 2025-10-10T00:52:35.4773441Z  --detach \ 2025-10-10T00:52:35.4773703Z  --name="${container_name}" \ 2025-10-10T00:52:35.4774001Z  ${JENKINS_USER} \ 2025-10-10T00:52:35.4774346Z  -v "${GITHUB_WORKSPACE}:/var/lib/jenkins/workspace" \ 2025-10-10T00:52:35.4774739Z  -w /var/lib/jenkins/workspace \ 2025-10-10T00:52:35.4775055Z  "${DOCKER_IMAGE}" \ 2025-10-10T00:52:35.4775325Z  ${DOCKER_SHELL_CMD} 2025-10-10T00:52:35.4775583Z ) 2025-10-10T00:52:35.4775994Z echo "DOCKER_CONTAINER_ID=${container_name}" >> "${GITHUB_ENV}" 2025-10-10T00:52:35.4776372Z  2025-10-10T00:52:35.4776623Z if [[ ${BUILD_ENVIRONMENT} == *"s390x"* ]]; then 2025-10-10T00:52:35.4777171Z  docker exec -t "${container_name}" sh -c "python3 -m pip install -r .ci/docker/requirements-ci.txt" 2025-10-10T00:52:35.4777678Z fi 2025-10-10T00:52:35.4777905Z  2025-10-10T00:52:35.4778367Z docker exec -t "${container_name}" sh -c "python3 -m pip install $(echo dist/*.whl)[opt-einsum] && ${TEST_COMMAND}" 2025-10-10T00:52:35.4786974Z shell: /usr/bin/bash -e {0} 2025-10-10T00:52:35.4787238Z env: 2025-10-10T00:52:35.4787474Z GIT_DEFAULT_BRANCH: main 2025-10-10T00:52:35.4787805Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-10-10T00:52:35.4788248Z BUILD_ENVIRONMENT: linux-jammy-cuda12.8-py3.10-gcc11-sm86 2025-10-10T00:52:35.4788616Z PR_NUMBER: 2025-10-10T00:52:35.4788874Z GITHUB_REPOSITORY: pytorch/pytorch 2025-10-10T00:52:35.4789189Z GITHUB_WORKFLOW: slow 2025-10-10T00:52:35.4789432Z GITHUB_JOB: test 2025-10-10T00:52:35.4789674Z GITHUB_RUN_ID: 18392306083 2025-10-10T00:52:35.4789952Z GITHUB_RUN_NUMBER: 18872 2025-10-10T00:52:35.4790221Z GITHUB_RUN_ATTEMPT: 1 2025-10-10T00:52:35.4790464Z JOB_ID: 52406799277 2025-10-10T00:52:35.4790937Z JOB_NAME: linux-jammy-cuda12.8-py3.10-gcc11-sm86 / test (slow, 2, 3, linux.g5.4xlarge.nvidia.gpu) 2025-10-10T00:52:35.4791447Z BRANCH: main 2025-10-10T00:52:35.4791720Z SHA1: 
344e6365a0068c2d2847fcec0c55dd53291d475e 2025-10-10T00:52:35.4792083Z BASE_SHA: 344e6365a0068c2d2847fcec0c55dd53291d475e 2025-10-10T00:52:35.4792407Z TEST_CONFIG: slow 2025-10-10T00:52:35.4792758Z SHARD_NUMBER: 2 2025-10-10T00:52:35.4792999Z NUM_TEST_SHARDS: 3 2025-10-10T00:52:35.4793228Z EXTRA_FLAGS: 2025-10-10T00:52:35.4793456Z OP_BENCHMARK_TESTS: 2025-10-10T00:52:35.4793705Z REENABLED_ISSUES: 2025-10-10T00:52:35.4793959Z CONTINUE_THROUGH_ERROR: True 2025-10-10T00:52:35.4794236Z VERBOSE_TEST_LOGS: False 2025-10-10T00:52:35.4794501Z TEST_SHOWLOCALS: False 2025-10-10T00:52:35.4794756Z NO_TEST_TIMEOUT: False 2025-10-10T00:52:35.4795005Z NO_TD: False 2025-10-10T00:52:35.4795229Z TD_DISTRIBUTED: False 2025-10-10T00:52:35.4795537Z SCCACHE_BUCKET: ossci-compiler-cache-circleci-v2 2025-10-10T00:52:35.4795883Z SCCACHE_REGION: us-east-1 2025-10-10T00:52:35.4796142Z SHM_SIZE: 2g 2025-10-10T00:52:35.4796913Z DOCKER_IMAGE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-d8be0384e085f551506bd739678109fa0f5ee7ac 2025-10-10T00:52:35.4797976Z XLA_CUDA: 2025-10-10T00:52:35.4798692Z XLA_CLANG_CACHE_S3_BUCKET_NAME: ossci-compiler-clang-cache-circleci-xla 2025-10-10T00:52:35.4799339Z PYTORCH_TEST_CUDA_MEM_LEAK_CHECK: 0 2025-10-10T00:52:35.4799667Z PYTORCH_TEST_RERUN_DISABLED_TESTS: 0 2025-10-10T00:52:35.4799945Z DASHBOARD_TAG: 2025-10-10T00:52:35.4800361Z VLLM_TEST_HUGGING_FACE_TOKEN: *** 2025-10-10T00:52:35.4800807Z HUGGING_FACE_HUB_TOKEN: *** 2025-10-10T00:52:35.4801234Z SCRIBE_GRAPHQL_ACCESS_TOKEN: *** 2025-10-10T00:52:35.4801663Z ARTIFACTS_FILE_SUFFIX: test-slow-2-3-linux.g5.4xlarge.nvidia.gpu_52406799277 2025-10-10T00:52:35.4802105Z ##[endgroup] 2025-10-10T00:52:35.4830986Z + [[ slow == \m\u\l\t\i\g\p\u ]] 2025-10-10T00:52:35.4831695Z + [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 == *onnx* ]] 2025-10-10T00:52:35.4832232Z + TEST_COMMAND=.ci/pytorch/test.sh 2025-10-10T00:52:35.4835770Z ++ awk '/MemTotal/ { printf "%.3f \n", 
$2/1024/1024 - 1 }' /proc/meminfo 2025-10-10T00:52:35.4862447Z + TOTAL_AVAILABLE_MEMORY_IN_GB='61.094 ' 2025-10-10T00:52:35.4862815Z + TOTAL_MEMORY_WITH_SWAP=64 2025-10-10T00:52:35.4863278Z + [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 == *\s\3\9\0\x* ]] 2025-10-10T00:52:35.4863667Z + SHM_OPTS=--shm-size=2g 2025-10-10T00:52:35.4863932Z + JENKINS_USER='--user jenkins' 2025-10-10T00:52:35.4864208Z + DOCKER_SHELL_CMD= 2025-10-10T00:52:35.4873483Z +++ nproc --ignore=2 2025-10-10T00:52:35.4901581Z ++ docker run --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all -e BUILD_ENVIRONMENT -e PR_NUMBER -e GITHUB_ACTIONS -e GITHUB_REPOSITORY -e GITHUB_WORKFLOW -e GITHUB_JOB -e GITHUB_RUN_ID -e GITHUB_RUN_NUMBER -e GITHUB_RUN_ATTEMPT -e JOB_ID -e JOB_NAME -e BASE_SHA -e BRANCH -e SHA1 -e AWS_DEFAULT_REGION -e IN_WHEEL_TEST -e SHARD_NUMBER -e TEST_CONFIG -e NUM_TEST_SHARDS -e REENABLED_ISSUES -e CONTINUE_THROUGH_ERROR -e VERBOSE_TEST_LOGS -e TEST_SHOWLOCALS -e NO_TEST_TIMEOUT -e NO_TD -e TD_DISTRIBUTED -e PR_LABELS -e MAX_JOBS=14 -e SCCACHE_BUCKET -e SCCACHE_REGION -e XLA_CUDA -e XLA_CLANG_CACHE_S3_BUCKET_NAME -e PYTORCH_TEST_CUDA_MEM_LEAK_CHECK -e PYTORCH_TEST_RERUN_DISABLED_TESTS -e SKIP_SCCACHE_INITIALIZATION=1 -e HUGGING_FACE_HUB_TOKEN -e VLLM_TEST_HUGGING_FACE_TOKEN -e SCRIBE_GRAPHQL_ACCESS_TOKEN -e DASHBOARD_TAG -e ARTIFACTS_FILE_SUFFIX --memory=61g --memory-swap=64g --env-file=/tmp/github_env_18392306083 --security-opt seccomp=unconfined --cap-add=SYS_PTRACE --ipc=host --shm-size=2g --tty --detach --name= --user jenkins -v /home/ec2-user/actions-runner/_work/pytorch/pytorch:/var/lib/jenkins/workspace -w /var/lib/jenkins/workspace 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-d8be0384e085f551506bd739678109fa0f5ee7ac 2025-10-10T00:52:42.3436971Z + container_name=0d479bf7aa1028c1efe5abc00aba7c77fea2d669ee48fe0051d50c10c6eea1cb 2025-10-10T00:52:42.3437693Z + echo 
DOCKER_CONTAINER_ID=0d479bf7aa1028c1efe5abc00aba7c77fea2d669ee48fe0051d50c10c6eea1cb 2025-10-10T00:52:42.3440471Z + [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 == *\s\3\9\0\x* ]] 2025-10-10T00:52:42.3445869Z ++ echo dist/torch-2.10.0a0+git344e636-cp310-cp310-linux_x86_64.whl 2025-10-10T00:52:42.3449725Z + docker exec -t 0d479bf7aa1028c1efe5abc00aba7c77fea2d669ee48fe0051d50c10c6eea1cb sh -c 'python3 -m pip install dist/torch-2.10.0a0+git344e636-cp310-cp310-linux_x86_64.whl[opt-einsum] && .ci/pytorch/test.sh' 2025-10-10T00:52:42.8392397Z Processing ./dist/torch-2.10.0a0+git344e636-cp310-cp310-linux_x86_64.whl (from torch==2.10.0a0+git344e636) 2025-10-10T00:52:43.1551496Z Requirement already satisfied: filelock in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.10.0a0+git344e636->torch==2.10.0a0+git344e636) (3.18.0) 2025-10-10T00:52:43.1555628Z Requirement already satisfied: typing-extensions>=4.10.0 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.10.0a0+git344e636->torch==2.10.0a0+git344e636) (4.12.2) 2025-10-10T00:52:43.1559695Z Requirement already satisfied: sympy>=1.13.3 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.10.0a0+git344e636->torch==2.10.0a0+git344e636) (1.13.3) 2025-10-10T00:52:43.1564269Z Requirement already satisfied: networkx>=2.5.1 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.10.0a0+git344e636->torch==2.10.0a0+git344e636) (2.8.8) 2025-10-10T00:52:43.1568139Z Requirement already satisfied: jinja2 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.10.0a0+git344e636->torch==2.10.0a0+git344e636) (3.1.6) 2025-10-10T00:52:43.1572581Z Requirement already satisfied: fsspec>=0.8.5 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.10.0a0+git344e636->torch==2.10.0a0+git344e636) (2025.9.0) 2025-10-10T00:52:43.1585578Z Requirement already satisfied: opt-einsum>=3.3 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from 
torch==2.10.0a0+git344e636->torch==2.10.0a0+git344e636) (3.3.0) 2025-10-10T00:52:43.1978677Z Requirement already satisfied: numpy>=1.7 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from opt-einsum>=3.3->torch==2.10.0a0+git344e636->torch==2.10.0a0+git344e636) (1.22.4) 2025-10-10T00:52:43.1997725Z Requirement already satisfied: mpmath<1.4,>=1.1.0 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from sympy>=1.13.3->torch==2.10.0a0+git344e636->torch==2.10.0a0+git344e636) (1.3.0) 2025-10-10T00:52:43.2055012Z Requirement already satisfied: MarkupSafe>=2.0 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from jinja2->torch==2.10.0a0+git344e636->torch==2.10.0a0+git344e636) (3.0.3) 2025-10-10T00:52:43.5872589Z Installing collected packages: torch 2025-10-10T00:52:55.0784849Z Successfully installed torch-2.10.0a0+git344e636 2025-10-10T00:52:55.1526826Z + export TERM=vt100 2025-10-10T00:52:55.1527338Z + TERM=vt100 2025-10-10T00:52:55.1527693Z ++ dirname .ci/pytorch/test.sh 2025-10-10T00:52:55.1543520Z + source .ci/pytorch/common.sh 2025-10-10T00:52:55.1550520Z +++ dirname .ci/pytorch/common.sh 2025-10-10T00:52:55.1660966Z ++ source .ci/pytorch/common_utils.sh 2025-10-10T00:52:55.1661308Z +++ declare -f -t trap_add 2025-10-10T00:52:55.1665446Z ++ set -ex -o pipefail 2025-10-10T00:52:55.1665817Z ++ [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 == *rocm* ]] 2025-10-10T00:52:55.1666194Z ++ BUILD_TEST_LIBTORCH=0 2025-10-10T00:52:55.1780245Z ++ dirname .ci/pytorch/test.sh 2025-10-10T00:52:55.1790622Z + source .ci/pytorch/common-build.sh 2025-10-10T00:52:55.1792815Z ++ [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 != *win-* ]] 2025-10-10T00:52:55.1799190Z ++++ dirname .ci/pytorch/common-build.sh 2025-10-10T00:52:55.1808933Z +++ cd .ci/pytorch 2025-10-10T00:52:55.1809188Z +++ pwd -P 2025-10-10T00:52:55.1812327Z ++ script_dir=/var/lib/jenkins/workspace/.ci/pytorch 2025-10-10T00:52:55.1812790Z ++ [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 == *-pch* ]] 
2025-10-10T00:52:55.1813153Z ++ which sccache 2025-10-10T00:52:55.1904400Z ++ [[ -z ossci-compiler-cache-circleci-v2 ]] 2025-10-10T00:52:55.1904758Z ++ sccache --stop-server 2025-10-10T00:52:55.1937627Z ++ true 2025-10-10T00:52:55.1938273Z ++ rm -f /var/lib/jenkins/sccache_error.log 2025-10-10T00:52:55.1951030Z ++ trap_add sccache_epilogue EXIT 2025-10-10T00:52:55.1951687Z ++ trap_add_cmd=sccache_epilogue 2025-10-10T00:52:55.1951976Z ++ shift 2025-10-10T00:52:55.1952213Z ++ for trap_add_name in "$@" 2025-10-10T00:52:55.1958978Z ++++ trap -p EXIT 2025-10-10T00:52:55.1962408Z +++ eval 'extract_trap_cmd ' 2025-10-10T00:52:55.1962986Z ++++ extract_trap_cmd 2025-10-10T00:52:55.1963314Z ++++ printf '%s\n' '' 2025-10-10T00:52:55.1963602Z +++ printf '%s\n' sccache_epilogue 2025-10-10T00:52:55.1966230Z ++ trap -- ' 2025-10-10T00:52:55.1966489Z sccache_epilogue' EXIT 2025-10-10T00:52:55.1966737Z ++ [[ -n 1 ]] 2025-10-10T00:52:55.1967194Z ++ echo 'Skipping sccache server initialization, setting environment variables' 2025-10-10T00:52:55.1967763Z Skipping sccache server initialization, setting environment variables 2025-10-10T00:52:55.1968194Z ++ export SCCACHE_IDLE_TIMEOUT=0 2025-10-10T00:52:55.1968494Z ++ SCCACHE_IDLE_TIMEOUT=0 2025-10-10T00:52:55.1968840Z ++ export SCCACHE_ERROR_LOG=/var/lib/jenkins/sccache_error.log 2025-10-10T00:52:55.1969267Z ++ SCCACHE_ERROR_LOG=/var/lib/jenkins/sccache_error.log 2025-10-10T00:52:55.1975762Z ++ export RUST_LOG=sccache::server=error 2025-10-10T00:52:55.1976103Z ++ RUST_LOG=sccache::server=error 2025-10-10T00:52:55.1976401Z ++ sccache --zero-stats 2025-10-10T00:52:55.4814024Z Statistics zeroed. 
2025-10-10T00:52:55.4822626Z ++ which ccache 2025-10-10T00:52:55.4880101Z + [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 != *rocm* ]] 2025-10-10T00:52:55.4880648Z + [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 != *s390x* ]] 2025-10-10T00:52:55.4881028Z + [[ -d /var/lib/jenkins/workspace ]] 2025-10-10T00:52:55.4883605Z ++ stat -c %u /var/lib/jenkins/workspace 2025-10-10T00:52:55.4901923Z + WORKSPACE_ORIGINAL_OWNER_ID=1000 2025-10-10T00:52:55.4902329Z + trap_add cleanup_workspace EXIT 2025-10-10T00:52:55.4902634Z + trap_add_cmd=cleanup_workspace 2025-10-10T00:52:55.4902901Z + shift 2025-10-10T00:52:55.4903124Z + for trap_add_name in "$@" 2025-10-10T00:52:55.4910886Z +++ trap -p EXIT 2025-10-10T00:52:55.4914406Z ++ eval 'extract_trap_cmd trap -- '\'' 2025-10-10T00:52:55.4914874Z sccache_epilogue'\'' EXIT' 2025-10-10T00:52:55.4915261Z +++ extract_trap_cmd trap -- ' 2025-10-10T00:52:55.4915574Z sccache_epilogue' EXIT 2025-10-10T00:52:55.4915819Z +++ printf '%s\n' ' 2025-10-10T00:52:55.4916051Z sccache_epilogue' 2025-10-10T00:52:55.4916304Z ++ printf '%s\n' cleanup_workspace 2025-10-10T00:52:55.4918779Z + trap -- ' 2025-10-10T00:52:55.4919165Z sccache_epilogue 2025-10-10T00:52:55.4919493Z cleanup_workspace' EXIT 2025-10-10T00:52:55.4919809Z + sudo chown -R jenkins /var/lib/jenkins/workspace 2025-10-10T00:52:56.5419186Z + git config --global --add safe.directory /var/lib/jenkins/workspace 2025-10-10T00:52:56.5444717Z + [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 == *cuda* ]] 2025-10-10T00:52:56.5448291Z ++ python -c 'import os;import numba.cuda; print(os.path.dirname(numba.cuda.__file__))' 2025-10-10T00:52:56.9605694Z + NUMBA_CUDA_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/numba/cuda 2025-10-10T00:52:56.9606543Z + '[' -n /opt/conda/envs/py_3.10/lib/python3.10/site-packages/numba/cuda ']' 2025-10-10T00:52:56.9614071Z +++ realpath .ci/pytorch/test.sh 2025-10-10T00:52:56.9628122Z ++ dirname /var/lib/jenkins/workspace/.ci/pytorch/test.sh 2025-10-10T00:52:56.9638937Z + 
NUMBA_PATCH=/var/lib/jenkins/workspace/.ci/pytorch/numba-cuda-13.patch 2025-10-10T00:52:56.9639495Z + pushd /opt/conda/envs/py_3.10/lib/python3.10/site-packages/numba/cuda 2025-10-10T00:52:56.9640069Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/numba/cuda ~/workspace 2025-10-10T00:52:56.9640607Z + patch -p4 2025-10-10T00:52:56.9655574Z patching file cudadrv/driver.py 2025-10-10T00:52:56.9655892Z Hunk #1 succeeded at 357 (offset -8 lines). 2025-10-10T00:52:56.9686761Z + popd 2025-10-10T00:52:56.9687090Z ~/workspace 2025-10-10T00:52:56.9687423Z + echo 'Environment variables:' 2025-10-10T00:52:56.9687768Z Environment variables: 2025-10-10T00:52:56.9688045Z + env 2025-10-10T00:52:56.9698842Z GITHUB_WORKSPACE=/home/ec2-user/actions-runner/_work/pytorch/pytorch 2025-10-10T00:52:56.9699418Z CONTINUE_THROUGH_ERROR=True 2025-10-10T00:52:56.9700222Z BUILD_ENVIRONMENT=linux-jammy-cuda12.8-py3.10-gcc11-sm86 2025-10-10T00:52:56.9700841Z VLLM_TEST_HUGGING_FACE_TOKEN=*** 2025-10-10T00:52:56.9701121Z HOSTNAME=0d479bf7aa10 2025-10-10T00:52:56.9701688Z GITHUB_PATH=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/add_path_4e89ceb4-5358-45fa-a946-adf2005c4f96 2025-10-10T00:52:56.9702351Z GITHUB_ACTION=__run_2 2025-10-10T00:52:56.9702682Z PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=0 2025-10-10T00:52:56.9702976Z GITHUB_RUN_NUMBER=18872 2025-10-10T00:52:56.9703224Z TEST_CONFIG=slow 2025-10-10T00:52:56.9703475Z GITHUB_REPOSITORY_OWNER_ID=21003710 2025-10-10T00:52:56.9703803Z TORCH_NVCC_FLAGS=-Xfatbin -compress-all 2025-10-10T00:52:56.9704108Z SCCACHE_IDLE_TIMEOUT=0 2025-10-10T00:52:56.9704489Z SCRIBE_GRAPHQL_ACCESS_TOKEN=*** 2025-10-10T00:52:56.9704855Z GITHUB_TRIGGERING_ACTOR=pytorchmergebot 2025-10-10T00:52:56.9705168Z GITHUB_REF_TYPE=branch 2025-10-10T00:52:56.9705506Z BASE_SHA=344e6365a0068c2d2847fcec0c55dd53291d475e 2025-10-10T00:52:56.9705843Z XLA_CUDA= 2025-10-10T00:52:56.9706071Z NCCL_LIB_DIR=/usr/local/cuda/lib64/ 2025-10-10T00:52:56.9706467Z 
HUGGING_FACE_HUB_TOKEN=*** 2025-10-10T00:52:56.9707147Z *** 2025-10-10T00:52:56.9707450Z GITHUB_REPOSITORY_ID=65600975 2025-10-10T00:52:56.9707808Z GITHUB_ACTIONS=true 2025-10-10T00:52:56.9708148Z NVIDIA_DRIVER_CAPABILITIES=all 2025-10-10T00:52:56.9708606Z SCCACHE_ERROR_LOG=/var/lib/jenkins/sccache_error.log 2025-10-10T00:52:56.9709155Z SHA1=344e6365a0068c2d2847fcec0c55dd53291d475e 2025-10-10T00:52:56.9709684Z GITHUB_SHA=344e6365a0068c2d2847fcec0c55dd53291d475e 2025-10-10T00:52:56.9710436Z GITHUB_WORKFLOW_REF=pytorch/pytorch/.github/workflows/slow.yml@refs/heads/main 2025-10-10T00:52:56.9711118Z UCC_HOME=/usr 2025-10-10T00:52:56.9711467Z VERBOSE_TEST_LOGS=False 2025-10-10T00:52:56.9711881Z GITHUB_REF=refs/heads/main 2025-10-10T00:52:56.9712290Z SHARD_NUMBER=2 2025-10-10T00:52:56.9712622Z GITHUB_REF_PROTECTED=true 2025-10-10T00:52:56.9712991Z HOME=/var/lib/jenkins 2025-10-10T00:52:56.9713415Z GITHUB_API_URL=https://api.github.com 2025-10-10T00:52:56.9713880Z PYTORCH_TEST_RERUN_DISABLED_TESTS=0 2025-10-10T00:52:56.9714370Z UCX_COMMIT=7836b165abdbe468a2f607e7254011c07d788152 2025-10-10T00:52:56.9714840Z USE_SYSTEM_NCCL=1 2025-10-10T00:52:56.9715191Z NUM_TEST_SHARDS=3 2025-10-10T00:52:56.9715505Z UCX_HOME=/usr 2025-10-10T00:52:56.9716688Z GITHUB_STATE=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/save_state_4e89ceb4-5358-45fa-a946-adf2005c4f96 2025-10-10T00:52:56.9717926Z JOB_NAME=linux-jammy-cuda12.8-py3.10-gcc11-sm86 / test (slow, 2, 3, linux.g5.4xlarge.nvidia.gpu) 2025-10-10T00:52:56.9719148Z GITHUB_ENV=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/set_env_4e89ceb4-5358-45fa-a946-adf2005c4f96 2025-10-10T00:52:56.9720199Z GITHUB_EVENT_PATH=/home/ec2-user/actions-runner/_work/_temp/_github_workflow/event.json 2025-10-10T00:52:56.9720966Z GITHUB_EVENT_NAME=push 2025-10-10T00:52:56.9721349Z DASHBOARD_TAG= 2025-10-10T00:52:56.9721713Z GITHUB_RUN_ID=18392306083 2025-10-10T00:52:56.9722140Z INSTALLED_OPENBLAS= 
2025-10-10T00:52:56.9723065Z GITHUB_STEP_SUMMARY=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/step_summary_4e89ceb4-5358-45fa-a946-adf2005c4f96 2025-10-10T00:52:56.9723837Z GITHUB_ACTOR=pytorchmergebot 2025-10-10T00:52:56.9724106Z PR_NUMBER= 2025-10-10T00:52:56.9724326Z DESIRED_CUDA=12.8.1 2025-10-10T00:52:56.9724557Z GITHUB_RUN_ATTEMPT=1 2025-10-10T00:52:56.9724802Z ANACONDA_PYTHON_VERSION=3.10 2025-10-10T00:52:56.9725132Z GITHUB_GRAPHQL_URL=https://api.github.com/graphql 2025-10-10T00:52:56.9725462Z TERM=vt100 2025-10-10T00:52:56.9725675Z INSTALLED_VISION=yes 2025-10-10T00:52:56.9725901Z BRANCH=main 2025-10-10T00:52:56.9726116Z SCCACHE_REGION=us-east-1 2025-10-10T00:52:56.9726383Z OPENSSL_ROOT_DIR=/opt/openssl 2025-10-10T00:52:56.9726655Z CUDA_PATH=/usr/local/cuda 2025-10-10T00:52:56.9727280Z GITHUB_ACTION_PATH=/home/ec2-user/actions-runner/_work/pytorch/pytorch/./.github/actions/setup-linux 2025-10-10T00:52:56.9727840Z GITHUB_SERVER_URL=https://github.com 2025-10-10T00:52:56.9728340Z UCC_COMMIT=430e241bf5d38cbc73fc7a6b89155397232e3f96 2025-10-10T00:52:56.9728676Z REENABLED_ISSUES= 2025-10-10T00:52:56.9728887Z DOCS= 2025-10-10T00:52:56.9729088Z SHLVL=1 2025-10-10T00:52:56.9729285Z MAX_JOBS=14 2025-10-10T00:52:56.9729498Z GITHUB_ACTOR_ID=97764156 2025-10-10T00:52:56.9729829Z GITHUB_WORKFLOW_SHA=344e6365a0068c2d2847fcec0c55dd53291d475e 2025-10-10T00:52:56.9730192Z GITHUB_REF_NAME=main 2025-10-10T00:52:56.9730562Z XLA_CLANG_CACHE_S3_BUCKET_NAME=ossci-compiler-clang-cache-circleci-xla 2025-10-10T00:52:56.9730970Z GITHUB_JOB=test 2025-10-10T00:52:56.9731199Z NO_TEST_TIMEOUT=False 2025-10-10T00:52:56.9731442Z TD_DISTRIBUTED=False 2025-10-10T00:52:56.9731703Z GITHUB_REPOSITORY=pytorch/pytorch 2025-10-10T00:52:56.9731993Z GITHUB_RETENTION_DAYS=90 2025-10-10T00:52:56.9732245Z OPENSSL_DIR=/opt/openssl 2025-10-10T00:52:56.9732508Z GITHUB_ACTION_REPOSITORY= 2025-10-10T00:52:56.9733235Z 
PATH=/opt/cache/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/opt/conda/envs/py_3.10/bin:/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2025-10-10T00:52:56.9733974Z GITHUB_BASE_REF= 2025-10-10T00:52:56.9734193Z INSTALLED_ACL= 2025-10-10T00:52:56.9734568Z ARTIFACTS_FILE_SUFFIX=test-slow-2-3-linux.g5.4xlarge.nvidia.gpu_52406799277 2025-10-10T00:52:56.9734996Z CI=true 2025-10-10T00:52:56.9735212Z GITHUB_REPOSITORY_OWNER=pytorch 2025-10-10T00:52:56.9735518Z RUST_LOG=sccache::server=error 2025-10-10T00:52:56.9735788Z JOB_ID=52406799277 2025-10-10T00:52:56.9736008Z GITHUB_HEAD_REF= 2025-10-10T00:52:56.9736231Z GITHUB_ACTION_REF= 2025-10-10T00:52:56.9736507Z SCCACHE_BUCKET=ossci-compiler-cache-circleci-v2 2025-10-10T00:52:56.9736844Z TEST_SHOWLOCALS=False 2025-10-10T00:52:56.9737085Z GITHUB_WORKFLOW=slow 2025-10-10T00:52:56.9737336Z DEBIAN_FRONTEND=noninteractive 2025-10-10T00:52:56.9737918Z GITHUB_OUTPUT=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/set_output_4e89ceb4-5358-45fa-a946-adf2005c4f96 2025-10-10T00:52:56.9738510Z NO_TD=False 2025-10-10T00:52:56.9738745Z SKIP_SCCACHE_INITIALIZATION=1 2025-10-10T00:52:56.9739044Z NCCL_INCLUDE_DIR=/usr/local/cuda/include/ 2025-10-10T00:52:56.9739335Z _=/usr/bin/env 2025-10-10T00:52:56.9739681Z OLDPWD=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/numba/cuda 2025-10-10T00:52:56.9740176Z ++ python -c 'import site; print(site.getsitepackages()[0])' 2025-10-10T00:52:56.9848716Z + TORCH_INSTALL_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch 2025-10-10T00:52:56.9849321Z + TORCH_BIN_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin 2025-10-10T00:52:56.9849966Z + TORCH_LIB_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib 2025-10-10T00:52:56.9850551Z + TORCH_TEST_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/test 2025-10-10T00:52:56.9850969Z + BUILD_DIR=build 2025-10-10T00:52:56.9851213Z + 
BUILD_RENAMED_DIR=build_renamed 2025-10-10T00:52:56.9851489Z + BUILD_BIN_DIR=build/bin 2025-10-10T00:52:56.9851734Z + SHARD_NUMBER=2 2025-10-10T00:52:56.9851980Z + NUM_TEST_SHARDS=3 2025-10-10T00:52:56.9852231Z + export TORCH_SERIALIZATION_DEBUG=1 2025-10-10T00:52:56.9852527Z + TORCH_SERIALIZATION_DEBUG=1 2025-10-10T00:52:56.9852795Z + export VALGRIND=ON 2025-10-10T00:52:56.9853029Z + VALGRIND=ON 2025-10-10T00:52:56.9853334Z + [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 == *clang9* ]] 2025-10-10T00:52:56.9853762Z + [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 == *xpu* ]] 2025-10-10T00:52:56.9854099Z + detect_cuda_arch 2025-10-10T00:52:56.9854391Z + [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 == *cuda* ]] 2025-10-10T00:52:56.9854747Z + command -v nvidia-smi 2025-10-10T00:52:56.9854984Z /usr/bin/nvidia-smi 2025-10-10T00:52:56.9861516Z ++ nvidia-smi --query-gpu=compute_cap --format=csv 2025-10-10T00:52:56.9862001Z ++ tail -n 1 2025-10-10T00:52:57.0186604Z + TORCH_CUDA_ARCH_LIST=8.6 2025-10-10T00:52:57.0186949Z + export TORCH_CUDA_ARCH_LIST 2025-10-10T00:52:57.0187308Z + [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 == *s390x* ]] 2025-10-10T00:52:57.0187650Z + [[ 0 == \1 ]] 2025-10-10T00:52:57.0188170Z + [[ True == \1 ]] 2025-10-10T00:52:57.0188478Z + [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 != *bazel* ]] 2025-10-10T00:52:57.0191443Z ++ realpath build/custom_test_artifacts 2025-10-10T00:52:57.0303481Z + CUSTOM_TEST_ARTIFACT_BUILD_DIR=/var/lib/jenkins/workspace/build/custom_test_artifacts 2025-10-10T00:52:57.0303979Z + [[ -n '' ]] 2025-10-10T00:52:57.0304225Z + echo 'Environment variables' 2025-10-10T00:52:57.0304509Z Environment variables 2025-10-10T00:52:57.0304744Z + env 2025-10-10T00:52:57.0436159Z GITHUB_WORKSPACE=/home/ec2-user/actions-runner/_work/pytorch/pytorch 2025-10-10T00:52:57.0436794Z CONTINUE_THROUGH_ERROR=True 2025-10-10T00:52:57.0437264Z BUILD_ENVIRONMENT=linux-jammy-cuda12.8-py3.10-gcc11-sm86 2025-10-10T00:52:57.0437891Z VLLM_TEST_HUGGING_FACE_TOKEN=*** 
2025-10-10T00:52:57.0438204Z HOSTNAME=0d479bf7aa10 2025-10-10T00:52:57.0438902Z GITHUB_PATH=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/add_path_4e89ceb4-5358-45fa-a946-adf2005c4f96 2025-10-10T00:52:57.0439690Z GITHUB_ACTION=__run_2 2025-10-10T00:52:57.0439957Z PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=0 2025-10-10T00:52:57.0440244Z GITHUB_RUN_NUMBER=18872 2025-10-10T00:52:57.0440493Z TEST_CONFIG=slow 2025-10-10T00:52:57.0440740Z GITHUB_REPOSITORY_OWNER_ID=21003710 2025-10-10T00:52:57.0441090Z TORCH_NVCC_FLAGS=-Xfatbin -compress-all 2025-10-10T00:52:57.0441399Z SCCACHE_IDLE_TIMEOUT=0 2025-10-10T00:52:57.0441783Z SCRIBE_GRAPHQL_ACCESS_TOKEN=*** 2025-10-10T00:52:57.0442077Z GITHUB_TRIGGERING_ACTOR=pytorchmergebot 2025-10-10T00:52:57.0442382Z GITHUB_REF_TYPE=branch 2025-10-10T00:52:57.0442628Z TORCH_CUDA_ARCH_LIST=8.6 2025-10-10T00:52:57.0442921Z BASE_SHA=344e6365a0068c2d2847fcec0c55dd53291d475e 2025-10-10T00:52:57.0443230Z XLA_CUDA= 2025-10-10T00:52:57.0443453Z NCCL_LIB_DIR=/usr/local/cuda/lib64/ 2025-10-10T00:52:57.0444187Z HUGGING_FACE_HUB_TOKEN=*** 2025-10-10T00:52:57.0444597Z *** 2025-10-10T00:52:57.0444904Z GITHUB_REPOSITORY_ID=65600975 2025-10-10T00:52:57.0445294Z GITHUB_ACTIONS=true 2025-10-10T00:52:57.0445655Z NVIDIA_DRIVER_CAPABILITIES=all 2025-10-10T00:52:57.0446073Z SCCACHE_ERROR_LOG=/var/lib/jenkins/sccache_error.log 2025-10-10T00:52:57.0446605Z SHA1=344e6365a0068c2d2847fcec0c55dd53291d475e 2025-10-10T00:52:57.0447152Z GITHUB_SHA=344e6365a0068c2d2847fcec0c55dd53291d475e 2025-10-10T00:52:57.0447652Z GITHUB_WORKFLOW_REF=pytorch/pytorch/.github/workflows/slow.yml@refs/heads/main 2025-10-10T00:52:57.0448413Z UCC_HOME=/usr 2025-10-10T00:52:57.0448651Z TORCH_SERIALIZATION_DEBUG=1 2025-10-10T00:52:57.0448917Z VERBOSE_TEST_LOGS=False 2025-10-10T00:52:57.0449175Z GITHUB_REF=refs/heads/main 2025-10-10T00:52:57.0449433Z SHARD_NUMBER=2 2025-10-10T00:52:57.0449675Z GITHUB_REF_PROTECTED=true 2025-10-10T00:52:57.0449938Z HOME=/var/lib/jenkins 
2025-10-10T00:52:57.0450217Z GITHUB_API_URL=https://api.github.com 2025-10-10T00:52:57.0450548Z PYTORCH_TEST_RERUN_DISABLED_TESTS=0 2025-10-10T00:52:57.0450888Z UCX_COMMIT=7836b165abdbe468a2f607e7254011c07d788152 2025-10-10T00:52:57.0451211Z USE_SYSTEM_NCCL=1 2025-10-10T00:52:57.0451452Z NUM_TEST_SHARDS=3 2025-10-10T00:52:57.0451675Z UCX_HOME=/usr 2025-10-10T00:52:57.0452209Z GITHUB_STATE=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/save_state_4e89ceb4-5358-45fa-a946-adf2005c4f96 2025-10-10T00:52:57.0453009Z JOB_NAME=linux-jammy-cuda12.8-py3.10-gcc11-sm86 / test (slow, 2, 3, linux.g5.4xlarge.nvidia.gpu) 2025-10-10T00:52:57.0453802Z GITHUB_ENV=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/set_env_4e89ceb4-5358-45fa-a946-adf2005c4f96 2025-10-10T00:52:57.0454540Z GITHUB_EVENT_PATH=/home/ec2-user/actions-runner/_work/_temp/_github_workflow/event.json 2025-10-10T00:52:57.0455003Z GITHUB_EVENT_NAME=push 2025-10-10T00:52:57.0455239Z DASHBOARD_TAG= 2025-10-10T00:52:57.0455465Z GITHUB_RUN_ID=18392306083 2025-10-10T00:52:57.0455714Z INSTALLED_OPENBLAS= 2025-10-10T00:52:57.0456285Z GITHUB_STEP_SUMMARY=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/step_summary_4e89ceb4-5358-45fa-a946-adf2005c4f96 2025-10-10T00:52:57.0457070Z GITHUB_ACTOR=pytorchmergebot 2025-10-10T00:52:57.0457341Z PR_NUMBER= 2025-10-10T00:52:57.0457542Z DESIRED_CUDA=12.8.1 2025-10-10T00:52:57.0457777Z GITHUB_RUN_ATTEMPT=1 2025-10-10T00:52:57.0458005Z VALGRIND=ON 2025-10-10T00:52:57.0458236Z ANACONDA_PYTHON_VERSION=3.10 2025-10-10T00:52:57.0458561Z GITHUB_GRAPHQL_URL=https://api.github.com/graphql 2025-10-10T00:52:57.0458897Z TERM=vt100 2025-10-10T00:52:57.0459101Z INSTALLED_VISION=yes 2025-10-10T00:52:57.0459332Z BRANCH=main 2025-10-10T00:52:57.0459550Z SCCACHE_REGION=us-east-1 2025-10-10T00:52:57.0459813Z OPENSSL_ROOT_DIR=/opt/openssl 2025-10-10T00:52:57.0460081Z CUDA_PATH=/usr/local/cuda 2025-10-10T00:52:57.0460567Z 
GITHUB_ACTION_PATH=/home/ec2-user/actions-runner/_work/pytorch/pytorch/./.github/actions/setup-linux 2025-10-10T00:52:57.0461106Z GITHUB_SERVER_URL=https://github.com 2025-10-10T00:52:57.0461452Z UCC_COMMIT=430e241bf5d38cbc73fc7a6b89155397232e3f96 2025-10-10T00:52:57.0461769Z REENABLED_ISSUES= 2025-10-10T00:52:57.0461986Z DOCS= 2025-10-10T00:52:57.0462181Z SHLVL=1 2025-10-10T00:52:57.0462383Z MAX_JOBS=14 2025-10-10T00:52:57.0462589Z GITHUB_ACTOR_ID=97764156 2025-10-10T00:52:57.0462918Z GITHUB_WORKFLOW_SHA=344e6365a0068c2d2847fcec0c55dd53291d475e 2025-10-10T00:52:57.0474924Z GITHUB_REF_NAME=main 2025-10-10T00:52:57.0475313Z XLA_CLANG_CACHE_S3_BUCKET_NAME=ossci-compiler-clang-cache-circleci-xla 2025-10-10T00:52:57.0475735Z GITHUB_JOB=test 2025-10-10T00:52:57.0475968Z NO_TEST_TIMEOUT=False 2025-10-10T00:52:57.0476219Z TD_DISTRIBUTED=False 2025-10-10T00:52:57.0476481Z GITHUB_REPOSITORY=pytorch/pytorch 2025-10-10T00:52:57.0476768Z GITHUB_RETENTION_DAYS=90 2025-10-10T00:52:57.0477019Z OPENSSL_DIR=/opt/openssl 2025-10-10T00:52:57.0477275Z GITHUB_ACTION_REPOSITORY= 2025-10-10T00:52:57.0477993Z PATH=/opt/cache/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/opt/conda/envs/py_3.10/bin:/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2025-10-10T00:52:57.0478729Z GITHUB_BASE_REF= 2025-10-10T00:52:57.0478959Z INSTALLED_ACL= 2025-10-10T00:52:57.0479340Z ARTIFACTS_FILE_SUFFIX=test-slow-2-3-linux.g5.4xlarge.nvidia.gpu_52406799277 2025-10-10T00:52:57.0479768Z CI=true 2025-10-10T00:52:57.0479990Z GITHUB_REPOSITORY_OWNER=pytorch 2025-10-10T00:52:57.0480312Z RUST_LOG=sccache::server=error 2025-10-10T00:52:57.0480583Z JOB_ID=52406799277 2025-10-10T00:52:57.0480921Z GITHUB_HEAD_REF= 2025-10-10T00:52:57.0481149Z GITHUB_ACTION_REF= 2025-10-10T00:52:57.0481439Z SCCACHE_BUCKET=ossci-compiler-cache-circleci-v2 2025-10-10T00:52:57.0481788Z TEST_SHOWLOCALS=False 2025-10-10T00:52:57.0482030Z GITHUB_WORKFLOW=slow 2025-10-10T00:52:57.0482290Z 
DEBIAN_FRONTEND=noninteractive 2025-10-10T00:52:57.0482886Z GITHUB_OUTPUT=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/set_output_4e89ceb4-5358-45fa-a946-adf2005c4f96 2025-10-10T00:52:57.0483473Z NO_TD=False 2025-10-10T00:52:57.0483700Z SKIP_SCCACHE_INITIALIZATION=1 2025-10-10T00:52:57.0484053Z NCCL_INCLUDE_DIR=/usr/local/cuda/include/ 2025-10-10T00:52:57.0484490Z OLDPWD=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/numba/cuda 2025-10-10T00:52:57.0484895Z _=/usr/bin/env 2025-10-10T00:52:57.0485113Z + echo 'Testing pytorch' 2025-10-10T00:52:57.0485366Z Testing pytorch 2025-10-10T00:52:57.0485598Z + export LANG=C.UTF-8 2025-10-10T00:52:57.0485836Z + LANG=C.UTF-8 2025-10-10T00:52:57.0486059Z + PR_NUMBER= 2025-10-10T00:52:57.0486293Z + [[ slow == \d\e\f\a\u\l\t ]] 2025-10-10T00:52:57.0486568Z + [[ slow == \d\i\s\t\r\i\b\u\t\e\d ]] 2025-10-10T00:52:57.0486857Z + [[ slow == \s\l\o\w ]] 2025-10-10T00:52:57.0487232Z + export PYTORCH_TEST_WITH_SLOW=1 2025-10-10T00:52:57.0487535Z + PYTORCH_TEST_WITH_SLOW=1 2025-10-10T00:52:57.0487811Z + export PYTORCH_TEST_SKIP_FAST=1 2025-10-10T00:52:57.0488108Z + PYTORCH_TEST_SKIP_FAST=1 2025-10-10T00:52:57.0488474Z + [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 == *slow-gradcheck* ]] 2025-10-10T00:52:57.0488933Z + [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 == *cuda* ]] 2025-10-10T00:52:57.0489322Z + export PYTORCH_TESTING_DEVICE_ONLY_FOR=cuda 2025-10-10T00:52:57.0489761Z + PYTORCH_TESTING_DEVICE_ONLY_FOR=cuda 2025-10-10T00:52:57.0490057Z + [[ slow == *crossref* ]] 2025-10-10T00:52:57.0490386Z + [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 == *rocm* ]] 2025-10-10T00:52:57.0490797Z + [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 == *xpu* ]] 2025-10-10T00:52:57.0491210Z + [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 != *-bazel-* ]] 2025-10-10T00:52:57.0491572Z + pip_install ninja==1.10.2 2025-10-10T00:52:57.0491926Z + pip_install_pkg='python3 -m pip install --progress-bar off' 2025-10-10T00:52:57.0492369Z + python3 -m pip install 
--progress-bar off ninja==1.10.2 2025-10-10T00:52:57.4432311Z Collecting ninja==1.10.2 2025-10-10T00:52:57.4624281Z Downloading ninja-1.10.2-py2.py3-none-manylinux_2_5_x86_64.manylinux1_x86_64.whl.metadata (5.0 kB) 2025-10-10T00:52:57.4739671Z Downloading ninja-1.10.2-py2.py3-none-manylinux_2_5_x86_64.manylinux1_x86_64.whl (108 kB) 2025-10-10T00:52:57.8530987Z Installing collected packages: ninja 2025-10-10T00:52:57.8531689Z Attempting uninstall: ninja 2025-10-10T00:52:57.8538626Z Found existing installation: ninja 1.11.1.4 2025-10-10T00:52:57.8563355Z Uninstalling ninja-1.11.1.4: 2025-10-10T00:52:57.8649342Z Successfully uninstalled ninja-1.11.1.4 2025-10-10T00:52:57.9060604Z Successfully installed ninja-1.10.2 2025-10-10T00:52:57.9668021Z + export PATH=/var/lib/jenkins/.local/bin:/opt/cache/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/opt/conda/envs/py_3.10/bin:/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2025-10-10T00:52:57.9669504Z + PATH=/var/lib/jenkins/.local/bin:/opt/cache/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/opt/conda/envs/py_3.10/bin:/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2025-10-10T00:52:57.9670405Z + [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 == *aarch64* ]] 2025-10-10T00:52:57.9670834Z + [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 == *asan* ]] 2025-10-10T00:52:57.9671245Z + [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 == *-debug* ]] 2025-10-10T00:52:57.9671679Z + [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 != *-bazel-* ]] 2025-10-10T00:52:57.9672245Z + echo 'We are not in debug mode: linux-jammy-cuda12.8-py3.10-gcc11-sm86. Expect the assertion to pass' 2025-10-10T00:52:57.9672922Z We are not in debug mode: linux-jammy-cuda12.8-py3.10-gcc11-sm86. 
Expect the assertion to pass 2025-10-10T00:52:57.9673781Z + cd test 2025-10-10T00:52:57.9674133Z + python -c 'import torch; torch._C._crash_if_debug_asserts_fail(424242)' 2025-10-10T00:52:59.6305224Z + [[ slow == \n\o\g\p\u\_\N\O\_\A\V\X\2 ]] 2025-10-10T00:52:59.6305606Z + [[ slow == \n\o\g\p\u\_\A\V\X\5\1\2 ]] 2025-10-10T00:52:59.6305938Z + [[ slow == \l\e\g\a\c\y\_\n\v\i\d\i\a\_\d\r\i\v\e\r ]] 2025-10-10T00:52:59.6309431Z + DYNAMO_BENCHMARK_FLAGS=() 2025-10-10T00:52:59.6309738Z + [[ slow == *pr_time_benchmarks* ]] 2025-10-10T00:52:59.6310047Z + [[ slow == *dynamo_eager* ]] 2025-10-10T00:52:59.6310331Z + [[ slow == *aot_eager* ]] 2025-10-10T00:52:59.6310613Z + [[ slow == *aot_inductor* ]] 2025-10-10T00:52:59.6310903Z + [[ slow == *max_autotune_inductor* ]] 2025-10-10T00:52:59.6311237Z + [[ slow == *inductor* ]] 2025-10-10T00:52:59.6311504Z + [[ slow == *dynamic* ]] 2025-10-10T00:52:59.6311763Z + [[ slow == *cpu* ]] 2025-10-10T00:52:59.6312054Z + DYNAMO_BENCHMARK_FLAGS+=(--device cuda) 2025-10-10T00:52:59.6343039Z + [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 == *libtorch* ]] 2025-10-10T00:52:59.6343511Z + [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 == *-bazel-* ]] 2025-10-10T00:52:59.6345883Z + cd test 2025-10-10T00:52:59.6346476Z + python -c 'import torch; print(torch.__config__.show())' 2025-10-10T00:53:01.3000471Z PyTorch built with: 2025-10-10T00:53:01.3000773Z - GCC 11.4 2025-10-10T00:53:01.3001011Z - C++ Version: 201703 2025-10-10T00:53:01.3001552Z - Intel(R) oneAPI Math Kernel Library Version 2024.2-Product Build 20240605 for Intel(R) 64 architecture applications 2025-10-10T00:53:01.3002225Z - Intel(R) MKL-DNN v3.7.1 (Git Hash 8d263e693366ef8db40acc569cc7d8edf644556d) 2025-10-10T00:53:01.3002664Z - OpenMP 201511 (a.k.a. 
OpenMP 4.5) 2025-10-10T00:53:01.3003384Z - LAPACK is enabled (usually provided by MKL) 2025-10-10T00:53:01.3003718Z - NNPACK is enabled 2025-10-10T00:53:01.3003981Z - CPU capability usage: AVX2 2025-10-10T00:53:01.3004269Z - CUDA Runtime 12.8 2025-10-10T00:53:01.3004612Z - NVCC architecture flags: -gencode;arch=compute_86,code=sm_86 2025-10-10T00:53:01.3005000Z - CuDNN 90.8 2025-10-10T00:53:01.3009306Z - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, COMMIT_SHA=344e6365a0068c2d2847fcec0c55dd53291d475e, CUDA_VERSION=12.8, CUDNN_VERSION=9.8.0, CXX_COMPILER=/opt/cache/bin/c++, CXX_FLAGS= -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOROCTRACER -DLIBKINETO_NOXPUPTI=ON -DUSE_FBGEMM -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -DC10_NODEPRECATED -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=range-loop-construct -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-unknown-pragmas -Wno-unused-parameter -Wno-strict-overflow -Wno-strict-aliasing -Wno-stringop-overflow -Wsuggest-override -Wno-psabi -Wno-error=old-style-cast -faligned-new -Werror -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, FORCE_FALLBACK_CUDA_MPI=1, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, TORCH_VERSION=2.10.0, USE_CUDA=ON, USE_CUDNN=ON, USE_CUSPARSELT=ON, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_GLOO=ON, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=ON, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, USE_ROCM_KERNEL_ASSERT=OFF, USE_XCCL=OFF, USE_XPU=OFF, 2025-10-10T00:53:01.3013657Z 2025-10-10T00:53:01.6760827Z + cd test 2025-10-10T00:53:01.6761189Z + python -c 'import torch; print(torch.__config__.parallel_info())' 2025-10-10T00:53:03.0139707Z ATen/Parallel: 2025-10-10T00:53:03.0140058Z at::get_num_threads() : 8 2025-10-10T00:53:03.0140367Z at::get_num_interop_threads() : 16 2025-10-10T00:53:03.0140686Z OpenMP 201511 (a.k.a. 
OpenMP 4.5) 2025-10-10T00:53:03.0140983Z omp_get_max_threads() : 8 2025-10-10T00:53:03.0141545Z Intel(R) oneAPI Math Kernel Library Version 2024.2-Product Build 20240605 for Intel(R) 64 architecture applications 2025-10-10T00:53:03.0142201Z mkl_get_max_threads() : 8 2025-10-10T00:53:03.0142617Z Intel(R) MKL-DNN v3.7.1 (Git Hash 8d263e693366ef8db40acc569cc7d8edf644556d) 2025-10-10T00:53:03.0143420Z std::thread::hardware_concurrency() : 16 2025-10-10T00:53:03.0143751Z Environment variables: 2025-10-10T00:53:03.0144018Z OMP_NUM_THREADS : [not set] 2025-10-10T00:53:03.0144303Z MKL_NUM_THREADS : [not set] 2025-10-10T00:53:03.0144588Z ATen parallel backend: OpenMP 2025-10-10T00:53:03.0144773Z 2025-10-10T00:53:03.3394197Z + [[ slow == *numpy_2* ]] 2025-10-10T00:53:03.3394677Z + [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 == *aarch64* ]] 2025-10-10T00:53:03.3395050Z + [[ slow == *backward* ]] 2025-10-10T00:53:03.3395304Z + [[ slow == *xla* ]] 2025-10-10T00:53:03.3395566Z + [[ slow == *vllm* ]] 2025-10-10T00:53:03.3395841Z + [[ slow == *executorch* ]] 2025-10-10T00:53:03.3396133Z + [[ slow == \j\i\t\_\l\e\g\a\c\y ]] 2025-10-10T00:53:03.3396430Z + [[ slow == \q\u\a\n\t\i\z\a\t\i\o\n ]] 2025-10-10T00:53:03.3396806Z + [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 == *libtorch* ]] 2025-10-10T00:53:03.3397199Z + [[ slow == distributed ]] 2025-10-10T00:53:03.3397471Z + [[ slow == *operator_benchmark* ]] 2025-10-10T00:53:03.3397809Z + [[ slow == *operator_microbenchmark* ]] 2025-10-10T00:53:03.3398141Z + [[ slow == *inductor_distributed* ]] 2025-10-10T00:53:03.3398747Z + [[ slow == *inductor-halide* ]] 2025-10-10T00:53:03.3399063Z + [[ slow == *inductor-triton-cpu* ]] 2025-10-10T00:53:03.3399401Z + [[ slow == *inductor-micro-benchmark* ]] 2025-10-10T00:53:03.3399728Z + [[ slow == *huggingface* ]] 2025-10-10T00:53:03.3400003Z + [[ slow == *timm* ]] 2025-10-10T00:53:03.3400244Z + [[ slow == cachebench ]] 2025-10-10T00:53:03.3400551Z + [[ slow == verify_cachebench ]] 
2025-10-10T00:53:03.3400882Z + [[ slow == *torchbench* ]] 2025-10-10T00:53:03.3401235Z + [[ slow == *inductor_cpp_wrapper* ]] 2025-10-10T00:53:03.3401897Z + [[ slow == *inductor* ]] 2025-10-10T00:53:03.3402169Z + [[ slow == *einops* ]] 2025-10-10T00:53:03.3402439Z + [[ slow == *dynamo_wrapped* ]] 2025-10-10T00:53:03.3402783Z + [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 == *rocm* ]] 2025-10-10T00:53:03.3403119Z + [[ 2 == 1 ]] 2025-10-10T00:53:03.3403343Z + [[ 2 == 2 ]] 2025-10-10T00:53:03.3403563Z + [[ 3 -gt 1 ]] 2025-10-10T00:53:03.3403792Z + install_torchvision 2025-10-10T00:53:03.3404032Z + local orig_preload 2025-10-10T00:53:03.3404272Z + local commit 2025-10-10T00:53:03.3404509Z ++ get_pinned_commit vision 2025-10-10T00:53:03.3404803Z ++ cat .github/ci_commit_pins/vision.txt 2025-10-10T00:53:03.3419475Z + commit=966da7e46f65d6d49df3e31214470a4fe5cc8e66 2025-10-10T00:53:03.3419942Z + orig_preload= 2025-10-10T00:53:03.3420256Z + '[' -n '' ']' 2025-10-10T00:53:03.3420680Z + [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 == *cuda* ]] 2025-10-10T00:53:03.3421033Z + export FORCE_CUDA=1 2025-10-10T00:53:03.3421275Z + FORCE_CUDA=1 2025-10-10T00:53:03.3421508Z + export WITH_CUDA=1 2025-10-10T00:53:03.3421745Z + WITH_CUDA=1 2025-10-10T00:53:03.3422300Z + pip_build_and_install git+https://github.com/pytorch/vision.git@966da7e46f65d6d49df3e31214470a4fe5cc8e66 dist/vision 2025-10-10T00:53:03.3423132Z + local build_target=git+https://github.com/pytorch/vision.git@966da7e46f65d6d49df3e31214470a4fe5cc8e66 2025-10-10T00:53:03.3423677Z + local wheel_dir=dist/vision 2025-10-10T00:53:03.3423954Z + local found_whl=0 2025-10-10T00:53:03.3424208Z + for file in "${wheel_dir}"/*.whl 2025-10-10T00:53:03.3424509Z + [[ -f dist/vision/*.whl ]] 2025-10-10T00:53:03.3424767Z + '[' 0 == 0 ']' 2025-10-10T00:53:03.3425456Z + python3 -m pip wheel --no-build-isolation --no-deps --no-use-pep517 -w dist/vision git+https://github.com/pytorch/vision.git@966da7e46f65d6d49df3e31214470a4fe5cc8e66 
2025-10-10T00:53:03.6731813Z Collecting git+https://github.com/pytorch/vision.git@966da7e46f65d6d49df3e31214470a4fe5cc8e66 2025-10-10T00:53:03.6736015Z Cloning https://github.com/pytorch/vision.git (to revision 966da7e46f65d6d49df3e31214470a4fe5cc8e66) to /tmp/pip-req-build-jzf6vlkq 2025-10-10T00:53:03.6900746Z Running command git clone --filter=blob:none --quiet https://github.com/pytorch/vision.git /tmp/pip-req-build-jzf6vlkq 2025-10-10T00:53:05.4331643Z Running command git rev-parse -q --verify 'sha^966da7e46f65d6d49df3e31214470a4fe5cc8e66' 2025-10-10T00:53:05.4365563Z Running command git fetch -q https://github.com/pytorch/vision.git 966da7e46f65d6d49df3e31214470a4fe5cc8e66 2025-10-10T00:53:05.5801680Z Running command git checkout -q 966da7e46f65d6d49df3e31214470a4fe5cc8e66 2025-10-10T00:53:05.9264341Z Resolved https://github.com/pytorch/vision.git to commit 966da7e46f65d6d49df3e31214470a4fe5cc8e66 2025-10-10T00:53:08.0460742Z Preparing metadata (setup.py) ... done 2025-10-10T00:53:08.0496778Z Building wheels for collected packages: torchvision 2025-10-10T00:53:08.0564237Z DEPRECATION: Building 'torchvision' using the legacy setup.py bdist_wheel mechanism, which will be removed in a future version. pip 25.3 will enforce this behaviour change. A possible replacement is to use the standardized build interface by setting the `--use-pep517` option, (possibly combined with `--no-build-isolation`), or adding a `pyproject.toml` file to the source tree of 'torchvision'. Discussion can be found at https://github.com/pypa/pip/issues/6334 2025-10-10T00:54:24.6823258Z Building wheel for torchvision (setup.py) ... 
done 2025-10-10T00:54:24.6852475Z Created wheel for torchvision: filename=torchvision-0.22.0a0+966da7e-cp310-cp310-linux_x86_64.whl size=1791092 sha256=f6c2f907f3b5467e103360fe9be01600c49623ec247a99eb86da7288bcfb143c 2025-10-10T00:54:24.6854919Z Stored in directory: /var/lib/jenkins/.cache/pip/wheels/9c/9d/3e/42fa2d5ac6ba44a90363f8fff0fa9e712e24d4f977637c81cb 2025-10-10T00:54:24.6889884Z Successfully built torchvision 2025-10-10T00:54:24.8007193Z + for file in "${wheel_dir}"/*.whl 2025-10-10T00:54:24.8008238Z + pip_install_whl dist/vision/torchvision-0.22.0a0+966da7e-cp310-cp310-linux_x86_64.whl 2025-10-10T00:54:24.8009367Z + args=('dist/vision/torchvision-0.22.0a0+966da7e-cp310-cp310-linux_x86_64.whl') 2025-10-10T00:54:24.8009872Z + local args 2025-10-10T00:54:24.8010271Z + [[ dist/vision/torchvision-0.22.0a0+966da7e-cp310-cp310-linux_x86_64.whl == *\ * ]] 2025-10-10T00:54:24.8010738Z + for path in "${args[@]}" 2025-10-10T00:54:24.8011198Z + echo 'Installing dist/vision/torchvision-0.22.0a0+966da7e-cp310-cp310-linux_x86_64.whl' 2025-10-10T00:54:24.8011857Z Installing dist/vision/torchvision-0.22.0a0+966da7e-cp310-cp310-linux_x86_64.whl 2025-10-10T00:54:24.8012614Z + python3 -mpip install --no-index --no-deps dist/vision/torchvision-0.22.0a0+966da7e-cp310-cp310-linux_x86_64.whl 2025-10-10T00:54:25.1430699Z Processing ./dist/vision/torchvision-0.22.0a0+966da7e-cp310-cp310-linux_x86_64.whl 2025-10-10T00:54:25.1529366Z Installing collected packages: torchvision 2025-10-10T00:54:25.6131784Z Successfully installed torchvision-0.22.0a0+966da7e 2025-10-10T00:54:25.6531556Z + '[' -n '' ']' 2025-10-10T00:54:25.6531836Z + test_python_shard 2 2025-10-10T00:54:25.6532124Z + [[ -z 3 ]] 2025-10-10T00:54:25.6532851Z + python test/run_test.py --exclude-jit-executor --exclude-distributed-tests --exclude-quantization-tests --shard 2 3 --verbose --upload-artifacts-while-running 
2025-10-10T00:54:30.6201879Z Downloading https://ossci-metrics.s3.amazonaws.com/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2025-10-10T00:54:30.6811860Z Ignoring disabled issues: [''] 2025-10-10T00:54:30.6923487Z Found test times from artifacts 2025-10-10T00:54:30.7383079Z Found test times from artifacts 2025-10-10T00:54:30.7397210Z Running all tests 2025-10-10T00:54:30.7917708Z Running parallel tests on 3 processes 2025-10-10T00:54:30.7937498Z Name: tests to run (est. time: 63.44min) 2025-10-10T00:54:30.7937823Z Serial tests (0): 2025-10-10T00:54:30.7938080Z Parallel tests (297): 2025-10-10T00:54:30.7938383Z inductor/test_dependencies 1/1 2025-10-10T00:54:30.7938666Z test_ops 1/1 2025-10-10T00:54:30.7938917Z test_torchfuzz_repros 1/1 2025-10-10T00:54:30.7939636Z test_opaque_obj 1/1 2025-10-10T00:54:30.7939881Z test_testing 1/1 2025-10-10T00:54:30.7940127Z test_public_bindings 1/1 2025-10-10T00:54:30.7940418Z inductor/test_aot_inductor 1/1 2025-10-10T00:54:30.7940722Z inductor/test_torchinductor 1/1 2025-10-10T00:54:30.7941059Z inductor/test_torchinductor_opinfo 2/11 2025-10-10T00:54:30.7941401Z inductor/test_torchinductor_opinfo 5/11 2025-10-10T00:54:30.7941734Z inductor/test_torchinductor_opinfo 6/11 2025-10-10T00:54:30.7942066Z inductor/test_torchinductor_opinfo 8/11 2025-10-10T00:54:30.7942399Z inductor/test_torchinductor_opinfo 11/11 2025-10-10T00:54:30.7942721Z inductor/test_static_cuda_launcher 1/1 2025-10-10T00:54:30.7943058Z inductor/test_cooperative_reductions 1/1 2025-10-10T00:54:30.7943377Z inductor/test_async_compile 1/1 2025-10-10T00:54:30.7943678Z inductor/test_kernel_benchmark 1/1 2025-10-10T00:54:30.7943979Z inductor/test_cuda_repro 1/1 2025-10-10T00:54:30.7944258Z dynamo/test_callback 1/1 2025-10-10T00:54:30.7944528Z inductor/test_fp8 1/1 2025-10-10T00:54:30.7944826Z inductor/test_torchinductor_dynamic_shapes 1/2 2025-10-10T00:54:30.7945164Z inductor/test_analysis 1/1 2025-10-10T00:54:30.7945444Z 
inductor/test_triton_syntax 1/1 2025-10-10T00:54:30.7945759Z inductor/test_triton_extension_backend 1/1 2025-10-10T00:54:30.7946074Z inductor/test_utils 1/1 2025-10-10T00:54:30.7946360Z inductor/test_coordinate_descent_tuner 1/1 2025-10-10T00:54:30.7946682Z inductor/test_inplace_padding 1/1 2025-10-10T00:54:30.7947014Z inductor/test_template_heuristics_registry 1/1 2025-10-10T00:54:30.7947361Z inductor/test_select_algorithm 1/1 2025-10-10T00:54:30.7947827Z inductor/test_extension_backend 1/1 2025-10-10T00:54:30.7948141Z inductor/test_inductor_scheduler 1/1 2025-10-10T00:54:30.7948445Z inductor/test_padding 1/1 2025-10-10T00:54:30.7948726Z inductor/test_codegen_triton 1/1 2025-10-10T00:54:30.7949079Z inductor/test_torchinductor_codegen_dynamic_shapes 1/2 2025-10-10T00:54:30.7949488Z export/test_export_training_ir_to_run_decomp 1/1 2025-10-10T00:54:30.7949834Z inductor/test_indexing 1/1 2025-10-10T00:54:30.7950106Z inductor/test_minifier 1/1 2025-10-10T00:54:30.7950382Z inductor/test_perf 1/1 2025-10-10T00:54:30.7950638Z inductor/test_pad_mm 1/1 2025-10-10T00:54:30.7950933Z inductor/test_inductor_annotations 1/1 2025-10-10T00:54:30.7951254Z inductor/test_ck_backend 1/1 2025-10-10T00:54:30.7951606Z inductor/test_inductor_utils 1/1 2025-10-10T00:54:30.7951904Z inductor/test_op_completeness 1/1 2025-10-10T00:54:30.7952212Z inductor/test_multi_kernel 1/1 2025-10-10T00:54:30.7952513Z inductor/test_autoheuristic 1/1 2025-10-10T00:54:30.7952808Z export/test_serdes 1/1 2025-10-10T00:54:30.7953076Z dynamo/test_deque_reconstruct 1/1 2025-10-10T00:54:30.7953395Z inductor/test_cuda_select_algorithm 1/1 2025-10-10T00:54:30.7953720Z export/test_strict_export_v2 1/1 2025-10-10T00:54:30.7954035Z inductor/test_deterministic 1/1 2025-10-10T00:54:30.7954327Z inductor/test_flex_decoding 1/1 2025-10-10T00:54:30.7954636Z export/test_unflatten_training_ir 1/1 2025-10-10T00:54:30.7954965Z inductor/test_aot_inductor_arrayref 1/1 2025-10-10T00:54:30.7955291Z dynamo/test_fx_passes_pre_grad 1/1 
2025-10-10T00:54:30.7955599Z inductor/test_aot_inductor_windows 1/1 2025-10-10T00:54:30.7955920Z inductor/test_compiled_autograd 1/2 2025-10-10T00:54:30.7956233Z inductor/test_metrics 1/1 2025-10-10T00:54:30.7956532Z inductor/test_custom_post_grad_passes 1/1 2025-10-10T00:54:30.7956856Z inductor/test_aot_inductor_package 1/1 2025-10-10T00:54:30.7957188Z inductor/test_provenance_tracing 1/1 2025-10-10T00:54:30.7957498Z inductor/test_fx_fusion 1/1 2025-10-10T00:54:30.7957789Z inductor/test_loop_ordering 1/1 2025-10-10T00:54:30.7958108Z export/test_functionalized_assertions 1/1 2025-10-10T00:54:30.7958435Z inductor/test_segmented_tree 1/1 2025-10-10T00:54:30.7958834Z inductor/test_compiled_optimizers 1/1 2025-10-10T00:54:30.7959169Z inductor/test_decompose_mem_bound_mm 1/1 2025-10-10T00:54:30.7959478Z dynamo/test_base_output 1/1 2025-10-10T00:54:30.7959754Z dynamo/test_backends 1/1 2025-10-10T00:54:30.7960038Z dynamo/test_fx_graph_runnable 1/1 2025-10-10T00:54:30.7960347Z inductor/test_compile_worker 1/1 2025-10-10T00:54:30.7960660Z inductor/test_move_constructors_to_cuda 1/1 2025-10-10T00:54:30.7961002Z inductor/test_subgraph_choice 1/1 2025-10-10T00:54:30.7961307Z export/test_export_strict 1/1 2025-10-10T00:54:30.7961606Z inductor/test_cutedsl_template 1/1 2025-10-10T00:54:30.7961916Z dynamo/test_inline_and_install 1/1 2025-10-10T00:54:30.7962210Z export/test_tree_utils 1/1 2025-10-10T00:54:30.7962488Z dynamo/test_recompiles 1/1 2025-10-10T00:54:30.7962764Z dynamo/test_einops 1/1 2025-10-10T00:54:30.7963016Z inductor/test_foreach 1/1 2025-10-10T00:54:30.7963295Z inductor/test_minifier_utils 1/1 2025-10-10T00:54:30.7963600Z dynamo/test_sdpa 1/1 2025-10-10T00:54:30.7963869Z inductor/test_compile_subprocess 1/1 2025-10-10T00:54:30.7964171Z export/test_cpp_serdes 1/1 2025-10-10T00:54:30.7964449Z inductor/test_debug_trace 1/1 2025-10-10T00:54:30.7964734Z inductor/test_memory 1/1 2025-10-10T00:54:30.7965012Z dynamo/test_frame_init 1/1 2025-10-10T00:54:30.7965310Z 
inductor/test_kernel_optimization 1/1 2025-10-10T00:54:30.7965631Z inductor/test_combo_kernels 1/1 2025-10-10T00:54:30.7965923Z inductor/test_inplacing_pass 1/1 2025-10-10T00:54:30.7966235Z dynamo/test_skip_non_tensor 1/1 2025-10-10T00:54:30.7966546Z inductor/test_op_dtype_prop 1/1 2025-10-10T00:54:30.7966933Z dynamo/test_reconstruct 1/1 2025-10-10T00:54:30.7967300Z export/test_dynamic_shapes 1/1 2025-10-10T00:54:30.7967598Z inductor/test_remote_cache 1/1 2025-10-10T00:54:30.7967884Z dynamo/test_interop 1/1 2025-10-10T00:54:30.7968148Z inductor/test_device_assert 1/1 2025-10-10T00:54:30.7968450Z inductor/test_smoke 1/1 2025-10-10T00:54:30.7968729Z dynamo/test_skip_guard_eval_unsafe 1/1 2025-10-10T00:54:30.7969037Z export/test_tools 1/1 2025-10-10T00:54:30.7969305Z inductor/test_gpu_cpp_wrapper 1/1 2025-10-10T00:54:30.7969645Z export/test_export_with_inline_and_install 1/1 2025-10-10T00:54:30.7969988Z export/test_serialize 1/1 2025-10-10T00:54:30.7970267Z dynamo/test_functions 1/1 2025-10-10T00:54:30.7970545Z inductor/test_benchmarking 1/1 2025-10-10T00:54:30.7970850Z inductor/test_quantization 1/1 2025-10-10T00:54:30.7971168Z inductor/test_aot_inductor_custom_ops 1/1 2025-10-10T00:54:30.7971510Z inductor/test_scatter_optimization 1/1 2025-10-10T00:54:30.7971878Z inductor/test_group_batch_fusion 1/1 2025-10-10T00:54:30.7972545Z inductor/test_split_cat_fx_passes 1/1 2025-10-10T00:54:30.7972902Z dynamo/test_view 1/1 2025-10-10T00:54:30.7984559Z dynamo/test_fx_annotate 1/1 2025-10-10T00:54:30.7984859Z inductor/test_control_deps 1/1 2025-10-10T00:54:30.7985178Z dynamo/test_pre_dispatch 1/1 2025-10-10T00:54:30.7985460Z dynamo/test_subgraphs 1/1 2025-10-10T00:54:30.7985763Z inductor/test_mkldnn_pattern_matcher 1/1 2025-10-10T00:54:30.7986091Z dynamo/test_decorators 1/1 2025-10-10T00:54:30.7986367Z dynamo/test_pgo 1/1 2025-10-10T00:54:30.7986622Z inductor/test_cutlass_evt 1/1 2025-10-10T00:54:30.7986919Z dynamo/test_buffers_override 1/1 2025-10-10T00:54:30.7987223Z 
inductor/test_online_softmax 1/1 2025-10-10T00:54:30.7987536Z test_model_exports_to_core_aten 1/1 2025-10-10T00:54:30.7987842Z inductor/test_helion_kernels 1/1 2025-10-10T00:54:30.7988157Z inductor/test_aot_inductor_utils 1/1 2025-10-10T00:54:30.7988471Z export/test_package 1/1 2025-10-10T00:54:30.7988748Z dynamo/test_ctx_manager 1/1 2025-10-10T00:54:30.7989035Z inductor/test_cudagraph_trees 1/1 2025-10-10T00:54:30.7989345Z inductor/test_block_analysis 1/1 2025-10-10T00:54:30.7989656Z dynamo/test_autograd_function 1/1 2025-10-10T00:54:30.7990080Z dynamo/test_nops 1/1 2025-10-10T00:54:30.7990334Z dynamo/test_config 1/1 2025-10-10T00:54:30.7990610Z inductor/test_control_flow 1/1 2025-10-10T00:54:30.7990901Z export/test_db 1/1 2025-10-10T00:54:30.7991176Z inductor/test_unbacked_symints 1/1 2025-10-10T00:54:30.7991525Z inductor/test_fused_attention 1/1 2025-10-10T00:54:30.7991833Z dynamo/test_export_mutations 1/1 2025-10-10T00:54:30.7992129Z inductor/test_config 1/1 2025-10-10T00:54:30.7992418Z dynamo/test_guard_serialization 1/1 2025-10-10T00:54:30.7992746Z inductor/test_graph_transform_observer 1/1 2025-10-10T00:54:30.7993074Z dynamo/test_unittest 1/1 2025-10-10T00:54:30.7993352Z inductor/test_cache 1/1 2025-10-10T00:54:30.7993622Z dynamo/test_after_aot 1/1 2025-10-10T00:54:30.7993895Z inductor/test_compile 1/1 2025-10-10T00:54:30.7994182Z export/test_export_opinfo 1/1 2025-10-10T00:54:30.7994481Z inductor/test_custom_lowering 1/1 2025-10-10T00:54:30.7994799Z dynamo/test_graph_region_tracker 1/1 2025-10-10T00:54:30.7995099Z dynamo/test_dicts 1/1 2025-10-10T00:54:30.7995364Z inductor/test_fuzzer 1/1 2025-10-10T00:54:30.7995634Z dynamo/test_modules 1/1 2025-10-10T00:54:30.7995907Z dynamo/test_metrics_context 1/1 2025-10-10T00:54:30.7996203Z dynamo/test_install_free_tensors 1/1 2025-10-10T00:54:30.7996519Z inductor/test_memory_planning 1/1 2025-10-10T00:54:30.7996824Z inductor/test_ordered_set 1/1 2025-10-10T00:54:30.7997137Z inductor/test_split_cat_fx_aten_passes 1/1 
2025-10-10T00:54:30.7997482Z dynamo/test_activation_checkpointing 1/1 2025-10-10T00:54:30.7997813Z dynamo/test_compiler_bisector 1/1 2025-10-10T00:54:30.7998214Z dynamo/test_aot_compile 1/1 2025-10-10T00:54:30.7998941Z dynamo/test_modes 1/1 2025-10-10T00:54:30.7999306Z inductor/test_auto_functionalize 1/1 2025-10-10T00:54:30.7999815Z inductor/test_torchinductor_codegen_config_overrides 1/1 2025-10-10T00:54:30.8000318Z dynamo/test_profiler 1/1 2025-10-10T00:54:30.8000691Z dynamo/test_global 1/1 2025-10-10T00:54:30.8001043Z inductor/test_inductor_freezing 1/1 2025-10-10T00:54:30.8001491Z dynamo/test_model_output 1/1 2025-10-10T00:54:30.8001858Z export/test_torchbind 1/1 2025-10-10T00:54:30.8002175Z dynamo/test_nested_graph_breaks 1/1 2025-10-10T00:54:30.8002492Z dynamo/test_backward_higher_order_ops 1/1 2025-10-10T00:54:30.8002811Z export/test_passes 1/1 2025-10-10T00:54:30.8003084Z inductor/test_torchbind 1/1 2025-10-10T00:54:30.8003391Z inductor/test_custom_partitioner_fn 1/1 2025-10-10T00:54:30.8003701Z inductor/test_alignment 1/1 2025-10-10T00:54:30.8003981Z dynamo/test_sources 1/1 2025-10-10T00:54:30.8004255Z dynamo/test_resume 1/1 2025-10-10T00:54:30.8004524Z dynamo/test_debug_utils 1/1 2025-10-10T00:54:30.8004787Z export/test_swap 1/1 2025-10-10T00:54:30.8005063Z dynamo/test_aot_autograd_cache 1/1 2025-10-10T00:54:30.8005377Z inductor/test_binary_folding 1/1 2025-10-10T00:54:30.8005669Z dynamo/test_base_hop 1/1 2025-10-10T00:54:30.8005922Z dynamo/test_list 1/1 2025-10-10T00:54:30.8006176Z export/test_unflatten 1/1 2025-10-10T00:54:30.8006461Z inductor/test_needs_exact_strides 1/1 2025-10-10T00:54:30.8006777Z dynamo/test_verify_correctness 1/1 2025-10-10T00:54:30.8007134Z export/test_export 1/1 2025-10-10T00:54:30.8007406Z inductor/test_minifier_isolate 1/1 2025-10-10T00:54:30.8007700Z dynamo/test_logging 1/1 2025-10-10T00:54:30.8007967Z dynamo/test_deviceguard 1/1 2025-10-10T00:54:30.8008238Z dynamo/test_aot_autograd 1/1 2025-10-10T00:54:30.8008539Z 
inductor/test_augmented_graph_helper 1/1 2025-10-10T00:54:30.8008873Z dynamo/test_cudagraphs 1/1 2025-10-10T00:54:30.8009145Z inductor/test_caching 1/1 2025-10-10T00:54:30.8009409Z export/test_upgrader 1/1 2025-10-10T00:54:30.8009671Z dynamo/test_sets 1/1 2025-10-10T00:54:30.8009924Z dynamo/test_unspec 1/1 2025-10-10T00:54:30.8010191Z dynamo/test_python_dispatcher 1/1 2025-10-10T00:54:30.8010660Z dynamo/test_optimizers 1/1 2025-10-10T00:54:30.8010942Z dynamo/test_flat_apply 1/1 2025-10-10T00:54:30.8011219Z dynamo/test_higher_order_ops 1/1 2025-10-10T00:54:30.8011509Z export/test_nativert 1/1 2025-10-10T00:54:30.8011769Z inductor/test_cpu_repro 1/1 2025-10-10T00:54:30.8012063Z dynamo/test_graph_deduplication 1/1 2025-10-10T00:54:30.8012366Z dynamo/test_export 1/1 2025-10-10T00:54:30.8012636Z dynamo/test_error_messages 1/1 2025-10-10T00:54:30.8012906Z export/test_hop 1/1 2025-10-10T00:54:30.8013208Z dynamo/test_cudagraphs_expandable_segments 1/1 2025-10-10T00:54:30.8013555Z dynamo/test_recompile_ux 1/1 2025-10-10T00:54:30.8013847Z inductor/test_mmdecomp 1/1 2025-10-10T00:54:30.8014133Z dynamo/test_precompile_context 1/1 2025-10-10T00:54:30.8014442Z dynamo/test_bytecode_utils 1/1 2025-10-10T00:54:30.8014735Z export/test_pass_infra 1/1 2025-10-10T00:54:30.8015014Z dynamo/test_guard_manager 1/1 2025-10-10T00:54:30.8015294Z dynamo/test_minifier 1/1 2025-10-10T00:54:30.8015564Z export/test_converter 1/1 2025-10-10T00:54:30.8015848Z export/test_experimental 1/1 2025-10-10T00:54:30.8016146Z dynamo/test_input_attr_tracking 1/1 2025-10-10T00:54:30.8016429Z dynamo/test_exc 1/1 2025-10-10T00:54:30.8016678Z dynamo/test_hooks 1/1 2025-10-10T00:54:30.8016942Z dynamo/test_trace_rules 1/1 2025-10-10T00:54:30.8017218Z dynamo/test_exceptions 1/1 2025-10-10T00:54:30.8017486Z export/test_schema 1/1 2025-10-10T00:54:30.8017749Z inductor/test_mps_basic 1/1 2025-10-10T00:54:30.8018088Z inductor/test_cudagraph_trees_expandable_segments 1/1 2025-10-10T00:54:30.8018594Z dynamo/test_subclasses 
1/1 2025-10-10T00:54:30.8018863Z dynamo/test_repros 1/1 2025-10-10T00:54:30.8019127Z dynamo/test_reorder_logs 1/1 2025-10-10T00:54:30.8019408Z dynamo/test_generator 1/1 2025-10-10T00:54:30.8019694Z export/test_lift_unlift 1/1 2025-10-10T00:54:30.8019969Z export/test_verifier 1/1 2025-10-10T00:54:30.8020254Z profiler/test_profiler 1/1 2025-10-10T00:54:30.8020518Z dynamo/test_misc 1/1 2025-10-10T00:54:30.8020779Z export/test_draft_export 1/1 2025-10-10T00:54:30.8021062Z export/test_sparse 1/1 2025-10-10T00:54:30.8021329Z dynamo/test_comptime 1/1 2025-10-10T00:54:30.8021608Z dynamo/test_python_autograd 1/1 2025-10-10T00:54:30.8021905Z functorch/test_rearrange 1/1 2025-10-10T00:54:30.8022181Z functorch/test_parsing 1/1 2025-10-10T00:54:30.8022445Z test_package 1/1 2025-10-10T00:54:30.8022697Z test_comparison_utils 1/1 2025-10-10T00:54:30.8022965Z test_mkl_verbose 1/1 2025-10-10T00:54:30.8023220Z functorch/test_ac_logging 1/1 2025-10-10T00:54:30.8023511Z test_mkldnn_verbose 1/1 2025-10-10T00:54:30.8023780Z profiler/test_kineto 1/1 2025-10-10T00:54:30.8024050Z test_matmul_cuda 1/1 2025-10-10T00:54:30.8024292Z test_transformers 1/1 2025-10-10T00:54:30.8024543Z test_meta 1/1 2025-10-10T00:54:30.8024775Z test_license 1/1 2025-10-10T00:54:30.8025039Z test_utils_config_module 1/1 2025-10-10T00:54:30.8025299Z test_decomp 1/16 2025-10-10T00:54:30.8025530Z test_decomp 6/16 2025-10-10T00:54:30.8025761Z test_decomp 7/16 2025-10-10T00:54:30.8025990Z test_decomp 10/16 2025-10-10T00:54:30.8026238Z test_decomp 15/16 2025-10-10T00:54:30.8026474Z test_decomp 16/16 2025-10-10T00:54:30.8026716Z xpu/test_conv 1/1 2025-10-10T00:54:30.8026960Z functorch/test_ops 2/2 2025-10-10T00:54:30.8027224Z test_datapipe 1/1 2025-10-10T00:54:30.8027470Z lazy/test_generator 1/1 2025-10-10T00:54:30.8027770Z torch_np/numpy_tests/lib/test_type_check 1/1 2025-10-10T00:54:30.8028090Z lazy/test_debug_util 1/1 2025-10-10T00:54:30.8028370Z test_jit_llga_fuser 1/1 2025-10-10T00:54:30.8028631Z test_numa_binding 1/1 
2025-10-10T00:54:30.8028926Z torch_np/numpy_tests/lib/test_histograms 1/1 2025-10-10T00:54:30.8029267Z benchmark_utils/test_benchmark_utils 1/1 2025-10-10T00:54:30.8029628Z torch_np/numpy_tests/core/test_scalarmath 1/1 2025-10-10T00:54:30.8030046Z test_indexing 1/1 2025-10-10T00:54:30.8030306Z profiler/test_torch_tidy 1/1 2025-10-10T00:54:30.8030584Z nn/test_module_hooks 1/1 2025-10-10T00:54:30.8030871Z functorch/test_aotdispatch 1/1 2025-10-10T00:54:30.8031171Z nn/test_load_state_dict 1/1 2025-10-10T00:54:30.8031481Z torch_np/numpy_tests/linalg/test_linalg 1/1 2025-10-10T00:54:30.8031806Z test_shape_ops 1/1 2025-10-10T00:54:30.8032107Z torch_np/numpy_tests/core/test_shape_base 1/1 2025-10-10T00:54:30.8032471Z torch_np/numpy_tests/core/test_dtype 1/1 2025-10-10T00:54:30.8032785Z test_unary_ufuncs 1/1 2025-10-10T00:54:30.8033035Z optim/test_optim 1/1 2025-10-10T00:54:30.8033295Z test_sparse_csr 1/2 2025-10-10T00:54:30.8033557Z test_serialization 1/1 2025-10-10T00:54:30.8033855Z torch_np/numpy_tests/lib/test_twodim_base 1/1 2025-10-10T00:54:30.8034179Z test_function_schema 1/1 2025-10-10T00:54:30.8034456Z functorch/test_vmap 1/1 2025-10-10T00:54:30.8034769Z torch_np/numpy_tests/lib/test_shape_base_ 1/1 2025-10-10T00:54:30.8035135Z torch_np/numpy_tests/fft/test_pocketfft 1/1 2025-10-10T00:54:30.8035457Z test_scatter_gather_ops 1/1 2025-10-10T00:54:30.8035746Z torch_np/test_ndarray_methods 1/1 2025-10-10T00:54:30.8036034Z test_view_ops 1/1 2025-10-10T00:54:30.8036310Z torch_np/numpy_tests/core/test_dlpack 1/1 2025-10-10T00:54:30.8036653Z torch_np/numpy_tests/core/test_getlimits 1/1 2025-10-10T00:54:30.8036979Z test_accelerator 1/1 2025-10-10T00:54:30.8037241Z lazy/test_reuse_ir 1/1 2025-10-10T00:54:30.8037537Z torch_np/numpy_tests/lib/test_index_tricks 1/1 2025-10-10T00:54:30.8037965Z nn/test_init 1/1 2025-10-10T00:54:30.8038255Z torch_np/numpy_tests/core/test_numerictypes 1/1 2025-10-10T00:54:30.8038582Z test_type_promotion 1/1 2025-10-10T00:54:30.8038885Z 
torch_np/numpy_tests/core/test_scalar_methods 1/1 2025-10-10T00:54:30.8039237Z torch_np/numpy_tests/fft/test_helper 1/1 2025-10-10T00:54:30.8039555Z torch_np/test_function_base 1/1 2025-10-10T00:54:30.8039847Z profiler/test_profiler_tree 1/1 2025-10-10T00:54:30.8040148Z functorch/test_eager_transforms 1/1 2025-10-10T00:54:30.8040431Z test_sparse 1/1 2025-10-10T00:54:30.8040668Z Name: excluded (est. time: 0.0min) 2025-10-10T00:54:30.8040934Z Serial tests (0): 2025-10-10T00:54:30.8041161Z Parallel tests (0): 2025-10-10T00:54:30.8043435Z Running inductor/test_dependencies 1/1 ... [2025-10-10 00:54:30.804019] 2025-10-10T00:54:30.8043868Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T00:54:30.8046771Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_dependencies.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 00:54:30.804362] 2025-10-10T00:54:39.1839467Z 2025-10-10T00:54:39.1840506Z inductor/test_dependencies 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_dependencies_1.1_86844299a4665816_.log 2025-10-10T00:54:39.1841294Z Running 0 items in this shard: 2025-10-10T00:54:39.1841485Z 2025-10-10T00:54:39.4706451Z Uploading artifacts took 0.29 seconds 2025-10-10T00:54:39.4710093Z Running test_ops 1/1 ... [2025-10-10 00:54:39.470721] 2025-10-10T00:54:39.4710465Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T00:54:39.4714297Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_ops.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 00:54:39.471094] 2025-10-10T00:54:54.8638884Z 2025-10-10T00:54:54.8639543Z test_ops 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_ops_1.1_a290887c7b969568_.log 2025-10-10T00:54:54.8640147Z Running 0 items in this shard: 2025-10-10T00:54:54.8640333Z 2025-10-10T00:54:54.8643842Z Running test_torchfuzz_repros 1/1 ... [2025-10-10 00:54:54.864020] 2025-10-10T00:54:54.8644265Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T00:54:54.8648030Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_torchfuzz_repros.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 00:54:54.864445] 2025-10-10T00:54:58.0856920Z 2025-10-10T00:54:58.0858486Z test_torchfuzz_repros 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_torchfuzz_repros_1.1_2bc72c91a6a54132_.log 2025-10-10T00:54:58.0859765Z Running 0 items in this shard: 2025-10-10T00:54:58.0860081Z 2025-10-10T00:54:58.0860541Z Running test_opaque_obj 1/1 ... [2025-10-10 00:54:58.085766] 2025-10-10T00:54:58.0860927Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T00:54:58.0865137Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_opaque_obj.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 00:54:58.086132] 2025-10-10T00:55:01.2568625Z 2025-10-10T00:55:01.2569571Z test_opaque_obj 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_opaque_obj_1.1_8b7b66c8590c136e_.log 2025-10-10T00:55:01.2570237Z Running 0 items in this shard: 2025-10-10T00:55:01.2570423Z 2025-10-10T00:55:01.2573429Z Running test_testing 1/1 ... 
[2025-10-10 00:55:01.257015] 2025-10-10T00:55:01.2573818Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T00:55:01.2577268Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_testing.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 00:55:01.257392] 2025-10-10T00:55:06.4819218Z 2025-10-10T00:55:06.4820076Z test_testing 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_testing_1.1_9d31ddad2b6fecdc_.log 2025-10-10T00:55:06.4820758Z Running 0 items in this shard: 2025-10-10T00:55:06.4820941Z 2025-10-10T00:55:06.4823574Z Running test_public_bindings 1/1 ... [2025-10-10 00:55:06.482032] 2025-10-10T00:55:06.4823980Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T00:55:06.4827305Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_public_bindings.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 00:55:06.482411] 2025-10-10T00:55:09.7030725Z 2025-10-10T00:55:09.7032040Z test_public_bindings 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_public_bindings_1.1_531123992d7ab0b9_.log 2025-10-10T00:55:09.7033128Z Running 0 items in this shard: 2025-10-10T00:55:09.7033319Z 2025-10-10T00:55:09.7033975Z Running inductor/test_aot_inductor 1/1 ... [2025-10-10 00:55:09.703120] 2025-10-10T00:55:09.7034406Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T00:55:09.7038086Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_aot_inductor.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 00:55:09.703488] 2025-10-10T00:55:17.3324551Z 2025-10-10T00:55:17.3325573Z inductor/test_aot_inductor 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_aot_inductor_1.1_ab2d17034a493267_.log 2025-10-10T00:55:17.3326346Z Running 0 items in this shard: 2025-10-10T00:55:17.3326529Z 2025-10-10T00:55:17.3328282Z Running inductor/test_torchinductor 1/1 ... [2025-10-10 00:55:17.332530] 2025-10-10T00:55:17.3328728Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T00:55:17.3333048Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 00:55:17.332947] 2025-10-10T00:55:24.8611773Z 2025-10-10T00:55:24.8612748Z inductor/test_torchinductor 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_1.1_2ef4abe840cd09c4_.log 2025-10-10T00:55:24.8613778Z Running 1 items in this shard: test/inductor/test_torchinductor.py::GPUTests::test_large_block_sizes_cuda 2025-10-10T00:55:24.8614217Z 2025-10-10T00:55:24.8615225Z Running inductor/test_torchinductor_opinfo 2/11 ... [2025-10-10 00:55:24.861248] 2025-10-10T00:55:24.8615687Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T00:55:24.8619781Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_opinfo.py', '-m', 'serial', '--shard-id=2', '--num-shards=11', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 00:55:24.861634] 2025-10-10T00:55:33.8932842Z 2025-10-10T00:55:33.8933738Z inductor/test_torchinductor_opinfo 2/11 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_opinfo_2.11_27d27947181c8c8b_.log 2025-10-10T00:55:33.8934549Z Running 0 items in this shard: 2025-10-10T00:55:33.8934732Z 2025-10-10T00:55:33.8937021Z Running inductor/test_torchinductor_opinfo 5/11 ... [2025-10-10 00:55:33.893410] 2025-10-10T00:55:33.8937483Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T00:55:33.8942046Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_opinfo.py', '-m', 'serial', '--shard-id=5', '--num-shards=11', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 00:55:33.893772] 2025-10-10T00:55:42.9249422Z 2025-10-10T00:55:42.9250410Z inductor/test_torchinductor_opinfo 5/11 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_opinfo_5.11_32aafc5ce9df02d7_.log 2025-10-10T00:55:42.9251408Z Running 0 items in this shard: 2025-10-10T00:55:42.9251605Z 2025-10-10T00:55:42.9254043Z Running inductor/test_torchinductor_opinfo 6/11 ... [2025-10-10 00:55:42.925059] 2025-10-10T00:55:42.9254515Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T00:55:42.9258747Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_opinfo.py', '-m', 'serial', '--shard-id=6', '--num-shards=11', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 00:55:42.925505] 2025-10-10T00:55:51.9068044Z 2025-10-10T00:55:51.9069138Z inductor/test_torchinductor_opinfo 6/11 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_opinfo_6.11_ca27bac7090278d9_.log 2025-10-10T00:55:51.9070156Z Running 0 items in this shard: 2025-10-10T00:55:51.9070362Z 2025-10-10T00:55:51.9072393Z Running inductor/test_torchinductor_opinfo 8/11 ... [2025-10-10 00:55:51.906914] 2025-10-10T00:55:51.9072855Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T00:55:51.9076784Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_opinfo.py', '-m', 'serial', '--shard-id=8', '--num-shards=11', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 00:55:51.907284] 2025-10-10T00:56:00.9377202Z 2025-10-10T00:56:00.9378356Z inductor/test_torchinductor_opinfo 8/11 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_opinfo_8.11_eee00d2276de4801_.log 2025-10-10T00:56:00.9379164Z Running 0 items in this shard: 2025-10-10T00:56:00.9379345Z 2025-10-10T00:56:00.9382108Z Running inductor/test_torchinductor_opinfo 11/11 ... [2025-10-10 00:56:00.937834] 2025-10-10T00:56:00.9383063Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T00:56:00.9386770Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_opinfo.py', '-m', 'serial', '--shard-id=11', '--num-shards=11', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 00:56:00.938272] 2025-10-10T00:56:09.9690021Z 2025-10-10T00:56:09.9690911Z inductor/test_torchinductor_opinfo 11/11 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_opinfo_11.11_a2dda46409c7457c_.log 2025-10-10T00:56:09.9691735Z Running 0 items in this shard: 2025-10-10T00:56:09.9691964Z 2025-10-10T00:56:09.9694791Z Running inductor/test_static_cuda_launcher 1/1 ... [2025-10-10 00:56:09.969164] 2025-10-10T00:56:09.9695255Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T00:56:09.9699376Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_static_cuda_launcher.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 00:56:09.969556] 2025-10-10T00:56:16.6969115Z 2025-10-10T00:56:16.6970070Z inductor/test_static_cuda_launcher 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_static_cuda_launcher_1.1_ac8e94e2359fb972_.log 2025-10-10T00:56:16.6970883Z Running 0 items in this shard: 2025-10-10T00:56:16.6971072Z 2025-10-10T00:56:16.6973298Z Running inductor/test_cooperative_reductions 1/1 ... [2025-10-10 00:56:16.697013] 2025-10-10T00:56:16.6974216Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T00:56:16.6977501Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_cooperative_reductions.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 00:56:16.697391] 2025-10-10T00:56:23.5246462Z 2025-10-10T00:56:23.5247637Z inductor/test_cooperative_reductions 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_cooperative_reductions_1.1_a72977059c179a93_.log 2025-10-10T00:56:23.5248462Z Running 0 items in this shard: 2025-10-10T00:56:23.5248646Z 2025-10-10T00:56:23.5250914Z Running inductor/test_async_compile 1/1 ... [2025-10-10 00:56:23.524759] 2025-10-10T00:56:23.5251338Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T00:56:23.5254973Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_async_compile.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 00:56:23.525153] 2025-10-10T00:56:30.2024658Z 2025-10-10T00:56:30.2025805Z inductor/test_async_compile 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_async_compile_1.1_28bead571da43287_.log 2025-10-10T00:56:30.2026759Z Running 0 items in this shard: 2025-10-10T00:56:30.2027019Z 2025-10-10T00:56:30.2031341Z Running inductor/test_kernel_benchmark 1/1 ... [2025-10-10 00:56:30.202790] 2025-10-10T00:56:30.2031806Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T00:56:30.2035008Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_kernel_benchmark.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 00:56:30.203127] 2025-10-10T00:56:36.8809650Z 2025-10-10T00:56:36.8810651Z inductor/test_kernel_benchmark 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_kernel_benchmark_1.1_e20112867c7299ff_.log 2025-10-10T00:56:36.8811429Z Running 0 items in this shard: 2025-10-10T00:56:36.8811615Z 2025-10-10T00:56:36.8813319Z Running inductor/test_cuda_repro 1/1 ... [2025-10-10 00:56:36.881038] 2025-10-10T00:56:36.8814065Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T00:56:36.8816890Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_cuda_repro.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 00:56:36.881369] 2025-10-10T00:56:44.2600073Z 2025-10-10T00:56:44.2601089Z inductor/test_cuda_repro 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_cuda_repro_1.1_1c4a1bb51e1f44f1_.log 2025-10-10T00:56:44.2601817Z Running 0 items in this shard: 2025-10-10T00:56:44.2602031Z 2025-10-10T00:56:44.2604292Z Running dynamo/test_callback 1/1 ... [2025-10-10 00:56:44.260130] 2025-10-10T00:56:44.2604699Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T00:56:44.2608487Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_callback.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 00:56:44.260499] 2025-10-10T00:56:51.0877784Z 2025-10-10T00:56:51.0878600Z dynamo/test_callback 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_callback_1.1_ec30c8e2a2630ac1_.log 2025-10-10T00:56:51.0879302Z Running 0 items in this shard: 2025-10-10T00:56:51.0879492Z 2025-10-10T00:56:51.0881815Z Running inductor/test_fp8 1/1 ... 
[2025-10-10 00:56:51.087906] 2025-10-10T00:56:51.0882214Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T00:56:51.0886109Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_fp8.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 00:56:51.088261] 2025-10-10T00:56:57.9161932Z 2025-10-10T00:56:57.9162890Z inductor/test_fp8 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_fp8_1.1_dccbc2d4b2904e41_.log 2025-10-10T00:56:57.9163563Z Running 0 items in this shard: 2025-10-10T00:56:57.9166055Z 2025-10-10T00:56:57.9169678Z Running inductor/test_torchinductor_dynamic_shapes 1/2 ... [2025-10-10 00:56:57.916519] 2025-10-10T00:56:57.9170427Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T00:56:57.9173020Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_dynamic_shapes.py', '-m', 'serial', '--shard-id=1', '--num-shards=2', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 00:56:57.916938] 2025-10-10T00:57:05.5961766Z 2025-10-10T00:57:05.5962782Z inductor/test_torchinductor_dynamic_shapes 1/2 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_dynamic_shapes_1.2_3153119e2a46dd4e_.log 2025-10-10T00:57:05.5964669Z Running 2 items in this shard: test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_large_block_sizes_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_large_block_sizes_dynamic_shapes_cuda 2025-10-10T00:57:05.5976680Z 2025-10-10T00:57:05.5976995Z Running inductor/test_analysis 1/1 ... 
[2025-10-10 00:57:05.596247] 2025-10-10T00:57:05.5977456Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T00:57:05.5978547Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_analysis.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 00:57:05.596611] 2025-10-10T00:57:12.6234679Z 2025-10-10T00:57:12.6235512Z inductor/test_analysis 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_analysis_1.1_6c0f12a0794524ba_.log 2025-10-10T00:57:12.6237362Z Running 0 items in this shard: 2025-10-10T00:57:12.6237557Z 2025-10-10T00:57:12.6238706Z Running inductor/test_triton_syntax 1/1 ... [2025-10-10 00:57:12.623580] 2025-10-10T00:57:12.6239140Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T00:57:12.6242702Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_triton_syntax.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 00:57:12.623931] 2025-10-10T00:57:19.4004009Z 2025-10-10T00:57:19.4005019Z inductor/test_triton_syntax 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_triton_syntax_1.1_12d372c47ed78b3f_.log 2025-10-10T00:57:19.4005767Z Running 0 items in this shard: 2025-10-10T00:57:19.4005953Z 2025-10-10T00:57:19.4008609Z Running inductor/test_triton_extension_backend 1/1 ... 
[2025-10-10 00:57:19.400556] 2025-10-10T00:57:19.4009100Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T00:57:19.4012785Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_triton_extension_backend.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 00:57:19.400935] 2025-10-10T00:57:26.7797384Z 2025-10-10T00:57:26.7798269Z inductor/test_triton_extension_backend 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_triton_extension_backend_1.1_99d06730dc11f751_.log 2025-10-10T00:57:26.7799332Z Running 0 items in this shard: 2025-10-10T00:57:26.7799870Z 2025-10-10T00:57:26.7802683Z Running inductor/test_utils 1/1 ... [2025-10-10 00:57:26.779792] 2025-10-10T00:57:26.7803127Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T00:57:26.7805377Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_utils.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 00:57:26.780160] 2025-10-10T00:57:30.3013361Z 2025-10-10T00:57:30.3014163Z inductor/test_utils 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_utils_1.1_bc93c486a0357eda_.log 2025-10-10T00:57:30.3015026Z Running 0 items in this shard: 2025-10-10T00:57:30.3015280Z 2025-10-10T00:57:30.3017514Z Running inductor/test_coordinate_descent_tuner 1/1 ... 
[2025-10-10 00:57:30.301419] 2025-10-10T00:57:30.3018120Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T00:57:30.3021114Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_coordinate_descent_tuner.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 00:57:30.301749] 2025-10-10T00:57:37.0790405Z 2025-10-10T00:57:37.0791400Z inductor/test_coordinate_descent_tuner 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_coordinate_descent_tuner_1.1_2bd1307ad4f49e17_.log 2025-10-10T00:57:37.0792250Z Running 0 items in this shard: 2025-10-10T00:57:37.0792440Z 2025-10-10T00:57:37.0793022Z Running inductor/test_inplace_padding 1/1 ... [2025-10-10 00:57:37.079017] 2025-10-10T00:57:37.0793451Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T00:57:37.0797253Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_inplace_padding.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 00:57:37.079388] 2025-10-10T00:57:44.4080802Z 2025-10-10T00:57:44.4081706Z inductor/test_inplace_padding 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_inplace_padding_1.1_f0da5bd278d946af_.log 2025-10-10T00:57:44.4083066Z Running 1 items in this shard: test/inductor/test_inplace_padding.py::InplacePaddingTest::test_linear_and_cel 2025-10-10T00:57:44.4083510Z 2025-10-10T00:57:44.4084451Z Running inductor/test_template_heuristics_registry 1/1 ... 
[2025-10-10 00:57:44.408180] 2025-10-10T00:57:44.4087898Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T00:57:44.4088971Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_template_heuristics_registry.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 00:57:44.408512] 2025-10-10T00:57:49.0312994Z 2025-10-10T00:57:49.0314005Z inductor/test_template_heuristics_registry 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_template_heuristics_registry_1.1_d5ca3c0d864dab48_.log 2025-10-10T00:57:49.0314885Z Running 0 items in this shard: 2025-10-10T00:57:49.0315092Z 2025-10-10T00:57:49.0316992Z Running inductor/test_select_algorithm 1/1 ... [2025-10-10 00:57:49.031394] 2025-10-10T00:57:49.0317533Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T00:57:49.0320524Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_select_algorithm.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 00:57:49.031722] 2025-10-10T00:57:55.7091847Z 2025-10-10T00:57:55.7093530Z inductor/test_select_algorithm 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_select_algorithm_1.1_83e526ed83f06358_.log 2025-10-10T00:57:55.7094782Z Running 0 items in this shard: 2025-10-10T00:57:55.7095082Z 2025-10-10T00:57:55.7095422Z Running inductor/test_extension_backend 1/1 ... 
[2025-10-10 00:57:55.709176] 2025-10-10T00:57:55.7096027Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T00:57:55.7100156Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_extension_backend.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 00:57:55.709619] 2025-10-10T00:58:03.0381362Z 2025-10-10T00:58:03.0383526Z inductor/test_extension_backend 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_extension_backend_1.1_255ea3527a1f2060_.log 2025-10-10T00:58:03.0384433Z Running 0 items in this shard: 2025-10-10T00:58:03.0384699Z 2025-10-10T00:58:03.0385060Z Running inductor/test_inductor_scheduler 1/1 ... [2025-10-10 00:58:03.037955] 2025-10-10T00:58:03.0385536Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T00:58:03.0386847Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_inductor_scheduler.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 00:58:03.038322] 2025-10-10T00:58:10.0179336Z 2025-10-10T00:58:10.0180451Z inductor/test_inductor_scheduler 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_inductor_scheduler_1.1_3756edb8ff2c895c_.log 2025-10-10T00:58:10.0181242Z Running 0 items in this shard: 2025-10-10T00:58:10.0181432Z 2025-10-10T00:58:10.0183251Z Running inductor/test_padding 1/1 ... 
[2025-10-10 00:58:10.018011] 2025-10-10T00:58:10.0183676Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T00:58:10.0186799Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_padding.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 00:58:10.018355] 2025-10-10T00:58:16.8456521Z 2025-10-10T00:58:16.8457566Z inductor/test_padding 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_padding_1.1_58bb0e4694895dda_.log 2025-10-10T00:58:16.8458942Z Running 1 items in this shard: test/inductor/test_padding.py::PaddingTest::test_nobias_LinearAndSoftmax_codegen 2025-10-10T00:58:16.8459492Z 2025-10-10T00:58:16.8462766Z Running inductor/test_codegen_triton 1/1 ... [2025-10-10 00:58:16.845969] 2025-10-10T00:58:16.8463208Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T00:58:16.8466848Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_codegen_triton.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 00:58:16.846318] 2025-10-10T00:58:23.4736927Z 2025-10-10T00:58:23.4738478Z inductor/test_codegen_triton 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_codegen_triton_1.1_48a804e657ff6a4c_.log 2025-10-10T00:58:23.4739595Z Running 0 items in this shard: 2025-10-10T00:58:23.4739779Z 2025-10-10T00:58:23.4740400Z Running inductor/test_torchinductor_codegen_dynamic_shapes 1/2 ... 
[2025-10-10 00:58:23.473747] 2025-10-10T00:58:23.4743890Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T00:58:23.4744995Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_codegen_dynamic_shapes.py', '-m', 'serial', '--shard-id=1', '--num-shards=2', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 00:58:23.474099] 2025-10-10T00:58:31.4033397Z 2025-10-10T00:58:31.4034713Z inductor/test_torchinductor_codegen_dynamic_shapes 1/2 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_codegen_dynamic_shapes_1.2_3a27e00f8710b930_.log 2025-10-10T00:58:31.4035648Z Running 0 items in this shard: 2025-10-10T00:58:31.4035836Z 2025-10-10T00:58:31.4038021Z Running export/test_export_training_ir_to_run_decomp 1/1 ... [2025-10-10 00:58:31.403478] 2025-10-10T00:58:31.4038519Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T00:58:31.4041914Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_export_training_ir_to_run_decomp.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 00:58:31.403823] 2025-10-10T00:58:38.4316186Z 2025-10-10T00:58:38.4317738Z export/test_export_training_ir_to_run_decomp 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_export_training_ir_to_run_decomp_1.1_34018bf7d12e1028_.log 2025-10-10T00:58:38.4319227Z Running 0 items in this shard: 2025-10-10T00:58:38.4319527Z 2025-10-10T00:58:38.4321502Z Running inductor/test_indexing 1/1 ... 
[2025-10-10 00:58:38.431741] 2025-10-10T00:58:38.4322300Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T00:58:38.4326057Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_indexing.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 00:58:38.432184] 2025-10-10T00:58:45.1094470Z 2025-10-10T00:58:45.1095244Z inductor/test_indexing 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_indexing_1.1_928d9225831fb74b_.log 2025-10-10T00:58:45.1096068Z Running 0 items in this shard: 2025-10-10T00:58:45.1096280Z 2025-10-10T00:58:45.1098117Z Running inductor/test_minifier 1/1 ... [2025-10-10 00:58:45.109471] 2025-10-10T00:58:45.1098767Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T00:58:45.1102483Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_minifier.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 00:58:45.109896] 2025-10-10T00:58:51.9371032Z 2025-10-10T00:58:51.9371993Z inductor/test_minifier 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_minifier_1.1_21f38f15d7c8d51c_.log 2025-10-10T00:58:51.9372730Z Running 0 items in this shard: 2025-10-10T00:58:51.9372915Z 2025-10-10T00:58:51.9375458Z Running inductor/test_perf 1/1 ... [2025-10-10 00:58:51.937231] 2025-10-10T00:58:51.9375869Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T00:58:51.9378984Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_perf.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 00:58:51.937561] 2025-10-10T00:58:58.7655298Z 2025-10-10T00:58:58.7656482Z inductor/test_perf 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_perf_1.1_f99cd6efcb248e75_.log 2025-10-10T00:58:58.7657210Z Running 0 items in this shard: 2025-10-10T00:58:58.7657407Z 2025-10-10T00:58:58.7658457Z Running inductor/test_pad_mm 1/1 ... [2025-10-10 00:58:58.765552] 2025-10-10T00:58:58.7659001Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T00:58:58.7663186Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_pad_mm.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 00:58:58.765905] 2025-10-10T00:59:05.5447870Z 2025-10-10T00:59:05.5448943Z inductor/test_pad_mm 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_pad_mm_1.1_8bd42738a715a2f4_.log 2025-10-10T00:59:05.5449645Z Running 0 items in this shard: 2025-10-10T00:59:05.5449840Z 2025-10-10T00:59:05.5452244Z Running inductor/test_inductor_annotations 1/1 ... [2025-10-10 00:59:05.544879] 2025-10-10T00:59:05.5452730Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T00:59:05.5455967Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_inductor_annotations.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 00:59:05.545233] 2025-10-10T00:59:12.3726263Z 2025-10-10T00:59:12.3727240Z inductor/test_inductor_annotations 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_inductor_annotations_1.1_5e8539a8796a1261_.log 2025-10-10T00:59:12.3728035Z Running 0 items in this shard: 2025-10-10T00:59:12.3728224Z 2025-10-10T00:59:12.3728759Z Running inductor/test_ck_backend 1/1 ... [2025-10-10 00:59:12.372472] 2025-10-10T00:59:12.3729171Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T00:59:12.3731986Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_ck_backend.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 00:59:12.372859] 2025-10-10T00:59:19.2001709Z 2025-10-10T00:59:19.2002633Z inductor/test_ck_backend 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_ck_backend_1.1_a9a51f6c723d1e69_.log 2025-10-10T00:59:19.2003358Z Running 0 items in this shard: 2025-10-10T00:59:19.2003553Z 2025-10-10T00:59:19.2008056Z Running inductor/test_inductor_utils 1/1 ... [2025-10-10 00:59:19.200325] 2025-10-10T00:59:19.2008512Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T00:59:19.2010805Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_inductor_utils.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 00:59:19.200737] 2025-10-10T00:59:22.3710927Z 2025-10-10T00:59:22.3712218Z inductor/test_inductor_utils 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_inductor_utils_1.1_9e536dd58eb7f3fe_.log 2025-10-10T00:59:22.3713098Z Running 0 items in this shard: 2025-10-10T00:59:22.3713299Z 2025-10-10T00:59:22.3715045Z Running inductor/test_op_completeness 1/1 ... [2025-10-10 00:59:22.371203] 2025-10-10T00:59:22.3715573Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T00:59:22.3718536Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_op_completeness.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 00:59:22.371533] 2025-10-10T00:59:25.9425711Z 2025-10-10T00:59:25.9426529Z inductor/test_op_completeness 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_op_completeness_1.1_dda0816a3578d294_.log 2025-10-10T00:59:25.9427340Z Running 0 items in this shard: 2025-10-10T00:59:25.9427528Z 2025-10-10T00:59:25.9429720Z Running inductor/test_multi_kernel 1/1 ... [2025-10-10 00:59:25.942689] 2025-10-10T00:59:25.9430159Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T00:59:25.9433988Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_multi_kernel.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 00:59:25.943022] 2025-10-10T00:59:32.7200145Z 2025-10-10T00:59:32.7202162Z inductor/test_multi_kernel 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_multi_kernel_1.1_0d87f3b493366d85_.log 2025-10-10T00:59:32.7203424Z Running 0 items in this shard: 2025-10-10T00:59:32.7203717Z 2025-10-10T00:59:32.7204043Z Running inductor/test_autoheuristic 1/1 ... [2025-10-10 00:59:32.720070] 2025-10-10T00:59:32.7204653Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T00:59:32.7209075Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_autoheuristic.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 00:59:32.720514] 2025-10-10T00:59:39.8987032Z 2025-10-10T00:59:39.8988458Z inductor/test_autoheuristic 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_autoheuristic_1.1_9740c07f37095603_.log 2025-10-10T00:59:39.8989946Z Running 0 items in this shard: 2025-10-10T00:59:39.8990379Z 2025-10-10T00:59:39.8991300Z Running export/test_serdes 1/1 ... [2025-10-10 00:59:39.898679] 2025-10-10T00:59:39.8991852Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T00:59:39.8994528Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_serdes.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 00:59:39.899043] 2025-10-10T00:59:46.8771865Z 2025-10-10T00:59:46.8772963Z export/test_serdes 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_serdes_1.1_38a488be1c228665_.log 2025-10-10T00:59:46.8773653Z Running 0 items in this shard: 2025-10-10T00:59:46.8773840Z 2025-10-10T00:59:46.8774780Z Running dynamo/test_deque_reconstruct 1/1 ... 
[2025-10-10 00:59:46.877151] 2025-10-10T00:59:46.8775259Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T00:59:46.8778542Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_deque_reconstruct.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 00:59:46.877485] 2025-10-10T00:59:50.0482522Z 2025-10-10T00:59:50.0484148Z dynamo/test_deque_reconstruct 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_deque_reconstruct_1.1_c8d5fe5ab5327b72_.log 2025-10-10T00:59:50.0485589Z Running 0 items in this shard: 2025-10-10T00:59:50.0485775Z 2025-10-10T00:59:50.0488149Z Running inductor/test_cuda_select_algorithm 1/1 ... [2025-10-10 00:59:50.048509] 2025-10-10T00:59:50.0488614Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T00:59:50.0491989Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_cuda_select_algorithm.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 00:59:50.048840] 2025-10-10T00:59:57.4276804Z 2025-10-10T00:59:57.4277794Z inductor/test_cuda_select_algorithm 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_cuda_select_algorithm_1.1_753ed3b429a73664_.log 2025-10-10T00:59:57.4278607Z Running 0 items in this shard: 2025-10-10T00:59:57.4278813Z 2025-10-10T00:59:57.4282797Z Running export/test_strict_export_v2 1/1 ... 
[2025-10-10 00:59:57.427869] 2025-10-10T00:59:57.4283348Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T00:59:57.4286094Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_strict_export_v2.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 00:59:57.428206] 2025-10-10T01:00:04.3065621Z 2025-10-10T01:00:04.3066817Z export/test_strict_export_v2 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_strict_export_v2_1.1_91b992c3521eba99_.log 2025-10-10T01:00:04.3067571Z Running 0 items in this shard: 2025-10-10T01:00:04.3067764Z 2025-10-10T01:00:04.3070013Z Running inductor/test_deterministic 1/1 ... [2025-10-10 01:00:04.306653] 2025-10-10T01:00:04.3070453Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:00:04.3074043Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_deterministic.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:00:04.307040] 2025-10-10T01:00:11.0843322Z 2025-10-10T01:00:11.0844159Z inductor/test_deterministic 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_deterministic_1.1_58a0a0d408e8c482_.log 2025-10-10T01:00:11.0845217Z Running 0 items in this shard: 2025-10-10T01:00:11.0845457Z 2025-10-10T01:00:11.0845680Z Running inductor/test_flex_decoding 1/1 ... 
[2025-10-10 01:00:11.083996] 2025-10-10T01:00:11.0846194Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:00:11.0847540Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_flex_decoding.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:00:11.084351] 2025-10-10T01:00:18.2126035Z 2025-10-10T01:00:18.2126941Z inductor/test_flex_decoding 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_flex_decoding_1.1_f91f8e38d95283e7_.log 2025-10-10T01:00:18.2129489Z Running 4 items in this shard: test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_non_pow_2_headdim_head_dim_121_float16_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_non_pow_2_headdim_head_dim_17_float16_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_non_pow_2_headdim_head_dim_24_float16_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_non_pow_2_headdim_head_dim_94_float16_cuda_float16 2025-10-10T01:00:18.2131351Z 2025-10-10T01:00:18.2131600Z Running export/test_unflatten_training_ir 1/1 ... [2025-10-10 01:00:18.212490] 2025-10-10T01:00:18.2132044Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:00:18.2133454Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_unflatten_training_ir.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:00:18.212879] 2025-10-10T01:00:21.4336677Z 2025-10-10T01:00:21.4337735Z export/test_unflatten_training_ir 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_unflatten_training_ir_1.1_a765755a6e7d8f4a_.log 2025-10-10T01:00:21.4338717Z Running 0 items in this shard: 2025-10-10T01:00:21.4338929Z 2025-10-10T01:00:21.4341826Z Running inductor/test_aot_inductor_arrayref 1/1 ... [2025-10-10 01:00:21.433749] 2025-10-10T01:00:21.4342297Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:00:21.4344743Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_aot_inductor_arrayref.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:00:21.434095] 2025-10-10T01:00:28.9134685Z 2025-10-10T01:00:28.9135853Z inductor/test_aot_inductor_arrayref 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_aot_inductor_arrayref_1.1_5de7dff66346f4c6_.log 2025-10-10T01:00:28.9136667Z Running 0 items in this shard: 2025-10-10T01:00:28.9136864Z 2025-10-10T01:00:28.9138343Z Running dynamo/test_fx_passes_pre_grad 1/1 ... [2025-10-10 01:00:28.913525] 2025-10-10T01:00:28.9138914Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:00:28.9142968Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_fx_passes_pre_grad.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:00:28.913854] 2025-10-10T01:00:32.0849872Z 2025-10-10T01:00:32.0851057Z dynamo/test_fx_passes_pre_grad 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_fx_passes_pre_grad_1.1_853ee5fd75a2e1ff_.log 2025-10-10T01:00:32.0852023Z Running 0 items in this shard: 2025-10-10T01:00:32.0852213Z 2025-10-10T01:00:32.0853976Z Running inductor/test_aot_inductor_windows 1/1 ... [2025-10-10 01:00:32.085076] 2025-10-10T01:00:32.0854544Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:00:32.0857724Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_aot_inductor_windows.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:00:32.085422] 2025-10-10T01:00:38.9135001Z 2025-10-10T01:00:38.9136069Z inductor/test_aot_inductor_windows 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_aot_inductor_windows_1.1_292693d894285bbe_.log 2025-10-10T01:00:38.9136898Z Running 0 items in this shard: 2025-10-10T01:00:38.9137083Z 2025-10-10T01:00:38.9142449Z Running inductor/test_compiled_autograd 1/2 ... [2025-10-10 01:00:38.913902] 2025-10-10T01:00:38.9142899Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:00:38.9146260Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_compiled_autograd.py', '-m', 'serial', '--shard-id=1', '--num-shards=2', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:00:38.914278] 2025-10-10T01:00:47.4943637Z 2025-10-10T01:00:47.4944752Z inductor/test_compiled_autograd 1/2 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_compiled_autograd_1.2_c61db975b7b0ad5e_.log 2025-10-10T01:00:47.4945546Z Running 0 items in this shard: 2025-10-10T01:00:47.4945736Z 2025-10-10T01:00:47.4947494Z Running inductor/test_metrics 1/1 ... [2025-10-10 01:00:47.494431] 2025-10-10T01:00:47.4948278Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:00:47.4951335Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_metrics.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:00:47.494789] 2025-10-10T01:00:54.3241999Z 2025-10-10T01:00:54.3243196Z inductor/test_metrics 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_metrics_1.1_368586a87e605888_.log 2025-10-10T01:00:54.3244086Z Running 0 items in this shard: 2025-10-10T01:00:54.3244272Z 2025-10-10T01:00:54.3244581Z Running inductor/test_custom_post_grad_passes 1/1 ... [2025-10-10 01:00:54.324043] 2025-10-10T01:00:54.3245204Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:00:54.3246879Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_custom_post_grad_passes.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:00:54.324393] 2025-10-10T01:01:01.1021417Z 2025-10-10T01:01:01.1022326Z inductor/test_custom_post_grad_passes 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_custom_post_grad_passes_1.1_6392b39a1bd376f3_.log 2025-10-10T01:01:01.1023306Z Running 0 items in this shard: 2025-10-10T01:01:01.1023496Z 2025-10-10T01:01:01.1025881Z Running inductor/test_aot_inductor_package 1/1 ... [2025-10-10 01:01:01.102221] 2025-10-10T01:01:01.1026405Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:01:01.1030294Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_aot_inductor_package.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:01:01.102587] 2025-10-10T01:01:07.9298755Z 2025-10-10T01:01:07.9299416Z inductor/test_aot_inductor_package 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_aot_inductor_package_1.1_19fa980ee4863f37_.log 2025-10-10T01:01:07.9300213Z Running 0 items in this shard: 2025-10-10T01:01:07.9300403Z 2025-10-10T01:01:07.9304187Z Running inductor/test_provenance_tracing 1/1 ... [2025-10-10 01:01:07.930051] 2025-10-10T01:01:07.9304766Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:01:07.9307711Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_provenance_tracing.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:01:07.930389] 2025-10-10T01:01:14.7583830Z 2025-10-10T01:01:14.7584699Z inductor/test_provenance_tracing 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_provenance_tracing_1.1_d9075b58c4ec6bbc_.log 2025-10-10T01:01:14.7585557Z Running 0 items in this shard: 2025-10-10T01:01:14.7585822Z 2025-10-10T01:01:14.7587579Z Running inductor/test_fx_fusion 1/1 ... [2025-10-10 01:01:14.758485] 2025-10-10T01:01:14.7588009Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:01:14.7591589Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_fx_fusion.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:01:14.758838] 2025-10-10T01:01:19.3822880Z 2025-10-10T01:01:19.3823752Z inductor/test_fx_fusion 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_fx_fusion_1.1_b67642a4a0de600d_.log 2025-10-10T01:01:19.3824483Z Running 0 items in this shard: 2025-10-10T01:01:19.3824676Z 2025-10-10T01:01:19.3826704Z Running inductor/test_loop_ordering 1/1 ... [2025-10-10 01:01:19.382381] 2025-10-10T01:01:19.3827459Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:01:19.3830419Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_loop_ordering.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:01:19.382716] 2025-10-10T01:01:26.0600631Z 2025-10-10T01:01:26.0601967Z inductor/test_loop_ordering 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_loop_ordering_1.1_e7749674ccf3fa1e_.log 2025-10-10T01:01:26.0603252Z Running 0 items in this shard: 2025-10-10T01:01:26.0603572Z 2025-10-10T01:01:26.0605073Z Running export/test_functionalized_assertions 1/1 ... [2025-10-10 01:01:26.060176] 2025-10-10T01:01:26.0605748Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:01:26.0610085Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_functionalized_assertions.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:01:26.060610] 2025-10-10T01:01:29.2813199Z 2025-10-10T01:01:29.2814645Z export/test_functionalized_assertions 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_functionalized_assertions_1.1_6b07771784c40230_.log 2025-10-10T01:01:29.2816151Z Running 0 items in this shard: 2025-10-10T01:01:29.2816427Z 2025-10-10T01:01:29.2816817Z Running inductor/test_segmented_tree 1/1 ... [2025-10-10 01:01:29.281394] 2025-10-10T01:01:29.2817467Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:01:29.2822148Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_segmented_tree.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:01:29.281817] 2025-10-10T01:01:32.5027042Z 2025-10-10T01:01:32.5028376Z inductor/test_segmented_tree 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_segmented_tree_1.1_1db7f12bde08ea58_.log 2025-10-10T01:01:32.5029838Z Running 0 items in this shard: 2025-10-10T01:01:32.5030163Z 2025-10-10T01:01:32.5031424Z Running inductor/test_compiled_optimizers 1/1 ... [2025-10-10 01:01:32.502819] 2025-10-10T01:01:32.5032070Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:01:32.5036482Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_compiled_optimizers.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:01:32.503261] 2025-10-10T01:01:41.0349760Z 2025-10-10T01:01:41.0350780Z inductor/test_compiled_optimizers 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_compiled_optimizers_1.1_2eda2e723a1555d4_.log 2025-10-10T01:01:41.0351612Z Running 0 items in this shard: 2025-10-10T01:01:41.0351794Z 2025-10-10T01:01:41.0353700Z Running inductor/test_decompose_mem_bound_mm 1/1 ... [2025-10-10 01:01:41.035058] 2025-10-10T01:01:41.0354237Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:01:41.0357448Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_decompose_mem_bound_mm.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:01:41.035402] 2025-10-10T01:01:47.8637983Z 2025-10-10T01:01:47.8639638Z inductor/test_decompose_mem_bound_mm 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_decompose_mem_bound_mm_1.1_7dfe8ee45c6c8d04_.log 2025-10-10T01:01:47.8640456Z Running 0 items in this shard: 2025-10-10T01:01:47.8640660Z 2025-10-10T01:01:47.8641245Z Running dynamo/test_base_output 1/1 ... [2025-10-10 01:01:47.863792] 2025-10-10T01:01:47.8642120Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:01:47.8644874Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_base_output.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:01:47.864152] 2025-10-10T01:01:51.2353253Z 2025-10-10T01:01:51.2354303Z dynamo/test_base_output 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_base_output_1.1_d74408fd8046e317_.log 2025-10-10T01:01:51.2355030Z Running 0 items in this shard: 2025-10-10T01:01:51.2355254Z 2025-10-10T01:01:51.2357065Z Running dynamo/test_backends 1/1 ... [2025-10-10 01:01:51.235412] 2025-10-10T01:01:51.2357590Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:01:51.2361105Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_backends.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:01:51.235761] 2025-10-10T01:01:58.2631176Z 2025-10-10T01:01:58.2632029Z dynamo/test_backends 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_backends_1.1_fd5e0c4be04e2890_.log 2025-10-10T01:01:58.2632882Z Running 0 items in this shard: 2025-10-10T01:01:58.2633109Z 2025-10-10T01:01:58.2637148Z Running dynamo/test_fx_graph_runnable 1/1 ... [2025-10-10 01:01:58.263408] 2025-10-10T01:01:58.2637693Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:01:58.2641273Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_fx_graph_runnable.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:01:58.263736] 2025-10-10T01:02:05.1913608Z 2025-10-10T01:02:05.1914836Z dynamo/test_fx_graph_runnable 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_fx_graph_runnable_1.1_bd2f1fdee4aa4ab5_.log 2025-10-10T01:02:05.1915623Z Running 0 items in this shard: 2025-10-10T01:02:05.1915812Z 2025-10-10T01:02:05.1916644Z Running inductor/test_compile_worker 1/1 ... [2025-10-10 01:02:05.191385] 2025-10-10T01:02:05.1917184Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:02:05.1920795Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_compile_worker.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:02:05.191720] 2025-10-10T01:02:12.0189986Z 2025-10-10T01:02:12.0191295Z inductor/test_compile_worker 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_compile_worker_1.1_696a6569d36d6303_.log 2025-10-10T01:02:12.0192709Z Running 0 items in this shard: 2025-10-10T01:02:12.0192993Z 2025-10-10T01:02:12.0194047Z Running inductor/test_move_constructors_to_cuda 1/1 ... [2025-10-10 01:02:12.019087] 2025-10-10T01:02:12.0194723Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:02:12.0199802Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_move_constructors_to_cuda.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:02:12.019590] 2025-10-10T01:02:18.8474825Z 2025-10-10T01:02:18.8476154Z inductor/test_move_constructors_to_cuda 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_move_constructors_to_cuda_1.1_2230c6148ed3f6a3_.log 2025-10-10T01:02:18.8477686Z Running 0 items in this shard: 2025-10-10T01:02:18.8477983Z 2025-10-10T01:02:18.8478391Z Running inductor/test_subgraph_choice 1/1 ... [2025-10-10 01:02:18.847528] 2025-10-10T01:02:18.8479387Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:02:18.8483840Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_subgraph_choice.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:02:18.847985] 2025-10-10T01:02:25.4754081Z 2025-10-10T01:02:25.4755361Z inductor/test_subgraph_choice 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_subgraph_choice_1.1_c90408b39edde7b1_.log 2025-10-10T01:02:25.4756727Z Running 0 items in this shard: 2025-10-10T01:02:25.4757039Z 2025-10-10T01:02:25.4757847Z Running export/test_export_strict 1/1 ... [2025-10-10 01:02:25.475489] 2025-10-10T01:02:25.4758467Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:02:25.4763032Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_export_strict.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:02:25.475922] 2025-10-10T01:02:32.4042916Z 2025-10-10T01:02:32.4044356Z export/test_export_strict 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_export_strict_1.1_c706f27fec9b2874_.log 2025-10-10T01:02:32.4045651Z Running 0 items in this shard: 2025-10-10T01:02:32.4045960Z 2025-10-10T01:02:32.4046653Z Running inductor/test_cutedsl_template 1/1 ... [2025-10-10 01:02:32.404391] 2025-10-10T01:02:32.4047191Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:02:32.4052649Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_cutedsl_template.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:02:32.404827] 2025-10-10T01:02:39.1831258Z 2025-10-10T01:02:39.1832500Z inductor/test_cutedsl_template 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_cutedsl_template_1.1_52ff6371b196bb76_.log 2025-10-10T01:02:39.1833617Z Running 0 items in this shard: 2025-10-10T01:02:39.1833901Z 2025-10-10T01:02:39.1834265Z Running dynamo/test_inline_and_install 1/1 ... [2025-10-10 01:02:39.182957] 2025-10-10T01:02:39.1834973Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:02:39.1837317Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_inline_and_install.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:02:39.183354] 2025-10-10T01:02:43.1564150Z 2025-10-10T01:02:43.1564993Z dynamo/test_inline_and_install 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_inline_and_install_1.1_d898458fd215403d_.log 2025-10-10T01:02:43.1565808Z Running 0 items in this shard: 2025-10-10T01:02:43.1565996Z 2025-10-10T01:02:43.1569828Z Running export/test_tree_utils 1/1 ... [2025-10-10 01:02:43.156657] 2025-10-10T01:02:43.1570256Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:02:43.1573714Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_tree_utils.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:02:43.157016] 2025-10-10T01:02:46.3279746Z 2025-10-10T01:02:46.3280540Z export/test_tree_utils 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_tree_utils_1.1_7f35ac8ca390c8f7_.log 2025-10-10T01:02:46.3281252Z Running 0 items in this shard: 2025-10-10T01:02:46.3281444Z 2025-10-10T01:02:46.3284203Z Running dynamo/test_recompiles 1/1 ... [2025-10-10 01:02:46.328127] 2025-10-10T01:02:46.3284904Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:02:46.3288275Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_recompiles.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:02:46.328486] 2025-10-10T01:02:49.6993772Z 2025-10-10T01:02:49.6994744Z dynamo/test_recompiles 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_recompiles_1.1_602bfd9804f296f2_.log 2025-10-10T01:02:49.6995519Z Running 0 items in this shard: 2025-10-10T01:02:49.6995772Z 2025-10-10T01:02:49.6998111Z Running dynamo/test_einops 1/1 ... [2025-10-10 01:02:49.699493] 2025-10-10T01:02:49.6998788Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:02:49.7003042Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_einops.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:02:49.699894] 2025-10-10T01:02:52.8703622Z 2025-10-10T01:02:52.8704602Z dynamo/test_einops 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_einops_1.1_f716b5119420a8f3_.log 2025-10-10T01:02:52.8705285Z Running 0 items in this shard: 2025-10-10T01:02:52.8705476Z 2025-10-10T01:02:52.8707257Z Running inductor/test_foreach 1/1 ... 
[2025-10-10 01:02:52.870393] 2025-10-10T01:02:52.8707772Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:02:52.8711153Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_foreach.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:02:52.870721] 2025-10-10T01:03:00.2992343Z 2025-10-10T01:03:00.2993494Z inductor/test_foreach 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_foreach_1.1_a5086df914c27c80_.log 2025-10-10T01:03:00.2994224Z Running 0 items in this shard: 2025-10-10T01:03:00.2994416Z 2025-10-10T01:03:00.2996815Z Running inductor/test_minifier_utils 1/1 ... [2025-10-10 01:03:00.299228] 2025-10-10T01:03:00.2997260Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:03:00.2999880Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_minifier_utils.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:03:00.299600] 2025-10-10T01:03:03.6707459Z 2025-10-10T01:03:03.6708436Z inductor/test_minifier_utils 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_minifier_utils_1.1_bbab12902e300986_.log 2025-10-10T01:03:03.6709238Z Running 0 items in this shard: 2025-10-10T01:03:03.6709425Z 2025-10-10T01:03:03.6711768Z Running dynamo/test_sdpa 1/1 ... [2025-10-10 01:03:03.670781] 2025-10-10T01:03:03.6712252Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:03:03.6715332Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_sdpa.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:03:03.671127] 2025-10-10T01:03:07.0421527Z 2025-10-10T01:03:07.0422479Z dynamo/test_sdpa 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_sdpa_1.1_1001b4ea05f1d684_.log 2025-10-10T01:03:07.0423149Z Running 0 items in this shard: 2025-10-10T01:03:07.0423366Z 2025-10-10T01:03:07.0425503Z Running inductor/test_compile_subprocess 1/1 ... [2025-10-10 01:03:07.042244] 2025-10-10T01:03:07.0425973Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:03:07.0429078Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_compile_subprocess.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:03:07.042568] 2025-10-10T01:03:14.5222713Z 2025-10-10T01:03:14.5223756Z inductor/test_compile_subprocess 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_compile_subprocess_1.1_d4ac5c4eb9ab9c0c_.log 2025-10-10T01:03:14.5224955Z Running 1 items in this shard: test/inductor/test_compile_subprocess.py::GPUTests::test_large_block_sizes_cuda 2025-10-10T01:03:14.5225431Z 2025-10-10T01:03:14.5225693Z Running export/test_cpp_serdes 1/1 ... [2025-10-10 01:03:14.522299] 2025-10-10T01:03:14.5226127Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:03:14.5230175Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_cpp_serdes.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:03:14.522658] 2025-10-10T01:03:21.5510389Z 2025-10-10T01:03:21.5511204Z export/test_cpp_serdes 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_cpp_serdes_1.1_86a1441da7c7f0aa_.log 2025-10-10T01:03:21.5511910Z Running 0 items in this shard: 2025-10-10T01:03:21.5512103Z 2025-10-10T01:03:21.5514628Z Running inductor/test_debug_trace 1/1 ... [2025-10-10 01:03:21.551151] 2025-10-10T01:03:21.5515066Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:03:21.5518805Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_debug_trace.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:03:21.551494] 2025-10-10T01:03:28.8809891Z 2025-10-10T01:03:28.8810704Z inductor/test_debug_trace 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_debug_trace_1.1_2e9e05d1909b21ee_.log 2025-10-10T01:03:28.8811469Z Running 0 items in this shard: 2025-10-10T01:03:28.8811657Z 2025-10-10T01:03:28.8814701Z Running inductor/test_memory 1/1 ... [2025-10-10 01:03:28.881169] 2025-10-10T01:03:28.8815118Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:03:28.8818455Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_memory.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:03:28.881511] 2025-10-10T01:03:35.5086997Z 2025-10-10T01:03:35.5088809Z inductor/test_memory 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_memory_1.1_8a3ccfe8335f2942_.log 2025-10-10T01:03:35.5090375Z Running 1 items in this shard: test/inductor/test_memory.py::TestOperatorReorderForPeakMemory::test_fusion_acc_large_reads 2025-10-10T01:03:35.5090887Z 2025-10-10T01:03:35.5091095Z Running dynamo/test_frame_init 1/1 ... [2025-10-10 01:03:35.508847] 2025-10-10T01:03:35.5091486Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:03:35.5095557Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_frame_init.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:03:35.509246] 2025-10-10T01:03:38.6798672Z 2025-10-10T01:03:38.6799573Z dynamo/test_frame_init 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_frame_init_1.1_a259ebe95855bee9_.log 2025-10-10T01:03:38.6800302Z Running 0 items in this shard: 2025-10-10T01:03:38.6800485Z 2025-10-10T01:03:38.6802131Z Running inductor/test_kernel_optimization 1/1 ... [2025-10-10 01:03:38.679814] 2025-10-10T01:03:38.6802614Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:03:38.6805438Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_kernel_optimization.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:03:38.680189] 2025-10-10T01:03:45.5073848Z 2025-10-10T01:03:45.5075587Z inductor/test_kernel_optimization 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_kernel_optimization_1.1_875b521802e1a907_.log 2025-10-10T01:03:45.5077422Z Running 1 items in this shard: test/inductor/test_kernel_optimization.py::TestKernelOptimization::test_einsum_to_pointwise 2025-10-10T01:03:45.5078339Z 2025-10-10T01:03:45.5079009Z Running inductor/test_combo_kernels 1/1 ... [2025-10-10 01:03:45.507638] 2025-10-10T01:03:45.5079655Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:03:45.5084291Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_combo_kernels.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:03:45.508070] 2025-10-10T01:03:52.8372782Z 2025-10-10T01:03:52.8373706Z inductor/test_combo_kernels 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_combo_kernels_1.1_6591cab9fa30e71f_.log 2025-10-10T01:03:52.8374452Z Running 0 items in this shard: 2025-10-10T01:03:52.8374635Z 2025-10-10T01:03:52.8376359Z Running inductor/test_inplacing_pass 1/1 ... [2025-10-10 01:03:52.837324] 2025-10-10T01:03:52.8376815Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:03:52.8380141Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_inplacing_pass.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:03:52.837658] 2025-10-10T01:03:59.4662228Z 2025-10-10T01:03:59.4663303Z inductor/test_inplacing_pass 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_inplacing_pass_1.1_28ade21d154ae588_.log 2025-10-10T01:03:59.4664061Z Running 0 items in this shard: 2025-10-10T01:03:59.4664252Z 2025-10-10T01:03:59.4665898Z Running dynamo/test_skip_non_tensor 1/1 ... [2025-10-10 01:03:59.466273] 2025-10-10T01:03:59.4666323Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:03:59.4669481Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_skip_non_tensor.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:03:59.466608] 2025-10-10T01:04:02.8378356Z 2025-10-10T01:04:02.8379513Z dynamo/test_skip_non_tensor 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_skip_non_tensor_1.1_c910ca93aaa343e7_.log 2025-10-10T01:04:02.8380265Z Running 0 items in this shard: 2025-10-10T01:04:02.8380481Z 2025-10-10T01:04:02.8380690Z Running inductor/test_op_dtype_prop 1/1 ... [2025-10-10 01:04:02.837764] 2025-10-10T01:04:02.8381114Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:04:02.8384372Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_op_dtype_prop.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:04:02.838103] 2025-10-10T01:04:10.6179411Z 2025-10-10T01:04:10.6180469Z inductor/test_op_dtype_prop 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_op_dtype_prop_1.1_f48fbff8ae399887_.log 2025-10-10T01:04:10.6181239Z Running 0 items in this shard: 2025-10-10T01:04:10.6181435Z 2025-10-10T01:04:10.6183119Z Running dynamo/test_reconstruct 1/1 ... [2025-10-10 01:04:10.618039] 2025-10-10T01:04:10.6183609Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:04:10.6186747Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_reconstruct.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:04:10.618367] 2025-10-10T01:04:17.4464608Z 2025-10-10T01:04:17.4465360Z dynamo/test_reconstruct 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_reconstruct_1.1_bf0a71b14c1b6d65_.log 2025-10-10T01:04:17.4466073Z Running 0 items in this shard: 2025-10-10T01:04:17.4466268Z 2025-10-10T01:04:17.4468367Z Running export/test_dynamic_shapes 1/1 ... [2025-10-10 01:04:17.446532] 2025-10-10T01:04:17.4468803Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:04:17.4472078Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_dynamic_shapes.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:04:17.446862] 2025-10-10T01:04:20.6177293Z 2025-10-10T01:04:20.6178305Z export/test_dynamic_shapes 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_dynamic_shapes_1.1_2eca05b0a663d698_.log 2025-10-10T01:04:20.6179119Z Running 0 items in this shard: 2025-10-10T01:04:20.6179305Z 2025-10-10T01:04:20.6180525Z Running inductor/test_remote_cache 1/1 ... [2025-10-10 01:04:20.617752] 2025-10-10T01:04:20.6180953Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:04:20.6185515Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_remote_cache.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:04:20.618079] 2025-10-10T01:04:23.8393773Z 2025-10-10T01:04:23.8394829Z inductor/test_remote_cache 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_remote_cache_1.1_7fb455d90679ab6b_.log 2025-10-10T01:04:23.8395704Z Running 0 items in this shard: 2025-10-10T01:04:23.8395907Z 2025-10-10T01:04:23.8396354Z Running dynamo/test_interop 1/1 ... [2025-10-10 01:04:23.839354] 2025-10-10T01:04:23.8396754Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:04:23.8400351Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_interop.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:04:23.839675] 2025-10-10T01:04:27.2112823Z 2025-10-10T01:04:27.2113778Z dynamo/test_interop 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_interop_1.1_7d4750a428fbe66e_.log 2025-10-10T01:04:27.2114547Z Running 0 items in this shard: 2025-10-10T01:04:27.2114763Z 2025-10-10T01:04:27.2116699Z Running inductor/test_device_assert 1/1 ... 
[2025-10-10 01:04:27.211347] 2025-10-10T01:04:27.2117193Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:04:27.2120524Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_device_assert.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:04:27.211681] 2025-10-10T01:04:34.0397398Z 2025-10-10T01:04:34.0398954Z inductor/test_device_assert 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_device_assert_1.1_5f5c8031fe7fd613_.log 2025-10-10T01:04:34.0399835Z Running 0 items in this shard: 2025-10-10T01:04:34.0400030Z 2025-10-10T01:04:34.0400212Z Running inductor/test_smoke 1/1 ... [2025-10-10 01:04:34.039666] 2025-10-10T01:04:34.0400710Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:04:34.0404111Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_smoke.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:04:34.040013] 2025-10-10T01:04:40.6175323Z 2025-10-10T01:04:40.6176144Z inductor/test_smoke 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_smoke_1.1_36c7ed0ea59e0c17_.log 2025-10-10T01:04:40.6176867Z 2025-10-10T01:04:40.6179401Z Running dynamo/test_skip_guard_eval_unsafe 1/1 ... [2025-10-10 01:04:40.617552] 2025-10-10T01:04:40.6179929Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:04:40.6182816Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_skip_guard_eval_unsafe.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:04:40.617886] 2025-10-10T01:04:43.9888534Z 2025-10-10T01:04:43.9889614Z dynamo/test_skip_guard_eval_unsafe 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_skip_guard_eval_unsafe_1.1_5ad321e463e6138c_.log 2025-10-10T01:04:43.9890560Z Running 0 items in this shard: 2025-10-10T01:04:43.9890791Z 2025-10-10T01:04:43.9892698Z Running export/test_tools 1/1 ... [2025-10-10 01:04:43.988944] 2025-10-10T01:04:43.9893117Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:04:43.9896894Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_tools.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:04:43.989300] 2025-10-10T01:04:47.3604620Z 2025-10-10T01:04:47.3605585Z export/test_tools 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_tools_1.1_674db0ccef8f5279_.log 2025-10-10T01:04:47.3606275Z Running 0 items in this shard: 2025-10-10T01:04:47.3606518Z 2025-10-10T01:04:47.3608495Z Running inductor/test_gpu_cpp_wrapper 1/1 ... [2025-10-10 01:04:47.360561] 2025-10-10T01:04:47.3608941Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:04:47.3612274Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_gpu_cpp_wrapper.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:04:47.360894] 2025-10-10T01:04:55.0901134Z 2025-10-10T01:04:55.0902179Z inductor/test_gpu_cpp_wrapper 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_gpu_cpp_wrapper_1.1_a5d4f1de56f24715_.log 2025-10-10T01:04:55.0903022Z Running 0 items in this shard: 2025-10-10T01:04:55.0903208Z 2025-10-10T01:04:55.0906057Z Running export/test_export_with_inline_and_install 1/1 ... [2025-10-10 01:04:55.090211] 2025-10-10T01:04:55.0906554Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:04:55.0910236Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_export_with_inline_and_install.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:04:55.090548] 2025-10-10T01:05:02.0692357Z 2025-10-10T01:05:02.0693445Z export/test_export_with_inline_and_install 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_export_with_inline_and_install_1.1_34da667e96b6ae57_.log 2025-10-10T01:05:02.0694299Z Running 0 items in this shard: 2025-10-10T01:05:02.0694495Z 2025-10-10T01:05:02.0696543Z Running export/test_serialize 1/1 ... [2025-10-10 01:05:02.069367] 2025-10-10T01:05:02.0697013Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:05:02.0701511Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_serialize.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:05:02.069742] 2025-10-10T01:05:08.8976452Z 2025-10-10T01:05:08.8977699Z export/test_serialize 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_serialize_1.1_f9d33d6cf6abfd6d_.log 2025-10-10T01:05:08.8978417Z Running 0 items in this shard: 2025-10-10T01:05:08.8978600Z 2025-10-10T01:05:08.8982982Z Running dynamo/test_functions 1/1 ... [2025-10-10 01:05:08.897931] 2025-10-10T01:05:08.8983549Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:05:08.8986976Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_functions.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:05:08.898307] 2025-10-10T01:05:15.8256753Z 2025-10-10T01:05:15.8257768Z dynamo/test_functions 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_functions_1.1_4f466802dc375aa7_.log 2025-10-10T01:05:15.8258489Z Running 0 items in this shard: 2025-10-10T01:05:15.8258672Z 2025-10-10T01:05:15.8259835Z Running inductor/test_benchmarking 1/1 ... [2025-10-10 01:05:15.825706] 2025-10-10T01:05:15.8260331Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:05:15.8264039Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_benchmarking.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:05:15.826040] 2025-10-10T01:05:22.6033040Z 2025-10-10T01:05:22.6034710Z inductor/test_benchmarking 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_benchmarking_1.1_fffb1721d5aea822_.log 2025-10-10T01:05:22.6035479Z Running 0 items in this shard: 2025-10-10T01:05:22.6035663Z 2025-10-10T01:05:22.6036401Z Running inductor/test_quantization 1/1 ... 
[2025-10-10 01:05:22.603366] 2025-10-10T01:05:22.6036982Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:05:22.6041403Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_quantization.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:05:22.603715] 2025-10-10T01:05:29.3820447Z 2025-10-10T01:05:29.3821652Z inductor/test_quantization 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_quantization_1.1_e36eaddf111f1674_.log 2025-10-10T01:05:29.3822426Z Running 0 items in this shard: 2025-10-10T01:05:29.3822615Z 2025-10-10T01:05:29.3825904Z Running inductor/test_aot_inductor_custom_ops 1/1 ... [2025-10-10 01:05:29.382133] 2025-10-10T01:05:29.3826409Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:05:29.3829076Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_aot_inductor_custom_ops.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:05:29.382500] 2025-10-10T01:05:36.7623575Z 2025-10-10T01:05:36.7624573Z inductor/test_aot_inductor_custom_ops 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_aot_inductor_custom_ops_1.1_a1b0b9e83de4eaa8_.log 2025-10-10T01:05:36.7625565Z Running 0 items in this shard: 2025-10-10T01:05:36.7625765Z 2025-10-10T01:05:36.7628015Z Running inductor/test_scatter_optimization 1/1 ... 
[2025-10-10 01:05:36.762444] 2025-10-10T01:05:36.7628571Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:05:36.7632071Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_scatter_optimization.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:05:36.762828] 2025-10-10T01:05:43.5922841Z 2025-10-10T01:05:43.5923898Z inductor/test_scatter_optimization 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_scatter_optimization_1.1_4f5fbb2443ce81e7_.log 2025-10-10T01:05:43.5924707Z Running 0 items in this shard: 2025-10-10T01:05:43.5924895Z 2025-10-10T01:05:43.5931695Z Running inductor/test_group_batch_fusion 1/1 ... [2025-10-10 01:05:43.592715] 2025-10-10T01:05:43.5932308Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:05:43.5937596Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_group_batch_fusion.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:05:43.593116] 2025-10-10T01:05:50.5230867Z 2025-10-10T01:05:50.5231934Z inductor/test_group_batch_fusion 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_group_batch_fusion_1.1_617d3421dd6ff8fc_.log 2025-10-10T01:05:50.5232746Z Running 0 items in this shard: 2025-10-10T01:05:50.5232940Z 2025-10-10T01:05:50.5235748Z Running inductor/test_split_cat_fx_passes 1/1 ... 
[2025-10-10 01:05:50.523180] 2025-10-10T01:05:50.5236219Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:05:50.5239841Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_split_cat_fx_passes.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:05:50.523549] 2025-10-10T01:05:57.3533884Z 2025-10-10T01:05:57.3534931Z inductor/test_split_cat_fx_passes 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_split_cat_fx_passes_1.1_432d448a30c72404_.log 2025-10-10T01:05:57.3535697Z Running 0 items in this shard: 2025-10-10T01:05:57.3535910Z 2025-10-10T01:05:57.3539078Z Running dynamo/test_view 1/1 ... [2025-10-10 01:05:57.353556] 2025-10-10T01:05:57.3539476Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:05:57.3542948Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_view.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:05:57.353933] 2025-10-10T01:06:00.5242722Z 2025-10-10T01:06:00.5243602Z dynamo/test_view 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_view_1.1_08f3df60b96c4425_.log 2025-10-10T01:06:00.5244295Z Running 0 items in this shard: 2025-10-10T01:06:00.5244483Z 2025-10-10T01:06:00.5247491Z Running dynamo/test_fx_annotate 1/1 ... [2025-10-10 01:06:00.524400] 2025-10-10T01:06:00.5248099Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:06:00.5251990Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_fx_annotate.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:06:00.524797] 2025-10-10T01:06:07.3538976Z 2025-10-10T01:06:07.3539936Z dynamo/test_fx_annotate 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_fx_annotate_1.1_fda785a6cccff9bd_.log 2025-10-10T01:06:07.3540661Z Running 0 items in this shard: 2025-10-10T01:06:07.3540855Z 2025-10-10T01:06:07.3543400Z Running inductor/test_control_deps 1/1 ... [2025-10-10 01:06:07.354006] 2025-10-10T01:06:07.3543866Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:06:07.3547402Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_control_deps.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:06:07.354363] 2025-10-10T01:06:14.1317526Z 2025-10-10T01:06:14.1318734Z inductor/test_control_deps 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_control_deps_1.1_f992cb451f92b502_.log 2025-10-10T01:06:14.1319940Z Running 0 items in this shard: 2025-10-10T01:06:14.1320207Z 2025-10-10T01:06:14.1322758Z Running dynamo/test_pre_dispatch 1/1 ... [2025-10-10 01:06:14.131966] 2025-10-10T01:06:14.1323346Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:06:14.1327555Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_pre_dispatch.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:06:14.132392] 2025-10-10T01:06:17.3035742Z 2025-10-10T01:06:17.3036986Z dynamo/test_pre_dispatch 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_pre_dispatch_1.1_b28ca5840659f754_.log 2025-10-10T01:06:17.3037925Z Running 0 items in this shard: 2025-10-10T01:06:17.3038112Z 2025-10-10T01:06:17.3041735Z Running dynamo/test_subgraphs 1/1 ... 
[2025-10-10 01:06:17.303855] 2025-10-10T01:06:17.3053524Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:06:17.3054530Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_subgraphs.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:06:17.304221] 2025-10-10T01:06:20.6752311Z 2025-10-10T01:06:20.6753878Z dynamo/test_subgraphs 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_subgraphs_1.1_b700ddb9c915a228_.log 2025-10-10T01:06:20.6754995Z Running 0 items in this shard: 2025-10-10T01:06:20.6755317Z 2025-10-10T01:06:20.6757407Z Running inductor/test_mkldnn_pattern_matcher 1/1 ... [2025-10-10 01:06:20.675386] 2025-10-10T01:06:20.6758072Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:06:20.6762077Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_mkldnn_pattern_matcher.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:06:20.675829] 2025-10-10T01:06:27.7542862Z 2025-10-10T01:06:27.7543893Z inductor/test_mkldnn_pattern_matcher 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_mkldnn_pattern_matcher_1.1_061658ea68b99485_.log 2025-10-10T01:06:27.7544721Z Running 0 items in this shard: 2025-10-10T01:06:27.7544907Z 2025-10-10T01:06:27.7546607Z Running dynamo/test_decorators 1/1 ... 
[2025-10-10 01:06:27.754361] 2025-10-10T01:06:27.7547026Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:06:27.7550779Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_decorators.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:06:27.754732] 2025-10-10T01:06:31.1770472Z 2025-10-10T01:06:31.1771500Z dynamo/test_decorators 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_decorators_1.1_a5f9b702d0f35d3d_.log 2025-10-10T01:06:31.1772226Z Running 0 items in this shard: 2025-10-10T01:06:31.1772415Z 2025-10-10T01:06:31.1774683Z Running dynamo/test_pgo 1/1 ... [2025-10-10 01:06:31.177129] 2025-10-10T01:06:31.1775184Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:06:31.1778658Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_pgo.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:06:31.177498] 2025-10-10T01:06:34.5488039Z 2025-10-10T01:06:34.5489788Z dynamo/test_pgo 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_pgo_1.1_266517fadd492868_.log 2025-10-10T01:06:34.5491644Z Running 0 items in this shard: 2025-10-10T01:06:34.5491950Z 2025-10-10T01:06:34.5492188Z Running inductor/test_cutlass_evt 1/1 ... [2025-10-10 01:06:34.548844] 2025-10-10T01:06:34.5492633Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:06:34.5495975Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_cutlass_evt.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:06:34.549236] 2025-10-10T01:06:41.3759116Z 2025-10-10T01:06:41.3760016Z inductor/test_cutlass_evt 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_cutlass_evt_1.1_6014a37fda186289_.log 2025-10-10T01:06:41.3760821Z Running 0 items in this shard: 2025-10-10T01:06:41.3761108Z 2025-10-10T01:06:41.3764066Z Running dynamo/test_buffers_override 1/1 ... [2025-10-10 01:06:41.376007] 2025-10-10T01:06:41.3764609Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:06:41.3767797Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_buffers_override.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:06:41.376385] 2025-10-10T01:06:44.5467523Z 2025-10-10T01:06:44.5468942Z dynamo/test_buffers_override 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_buffers_override_1.1_a875e247e8fe79d6_.log 2025-10-10T01:06:44.5469694Z Running 0 items in this shard: 2025-10-10T01:06:44.5469874Z 2025-10-10T01:06:44.5471574Z Running inductor/test_online_softmax 1/1 ... [2025-10-10 01:06:44.546853] 2025-10-10T01:06:44.5472010Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:06:44.5475574Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_online_softmax.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:06:44.547224] 2025-10-10T01:06:51.3746725Z 2025-10-10T01:06:51.3747781Z inductor/test_online_softmax 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_online_softmax_1.1_f92e4c9d93f8dff8_.log 2025-10-10T01:06:51.3748551Z Running 0 items in this shard: 2025-10-10T01:06:51.3748746Z 2025-10-10T01:06:51.3750754Z Running test_model_exports_to_core_aten 1/1 ... [2025-10-10 01:06:51.374762] 2025-10-10T01:06:51.3751226Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:06:51.3754963Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_model_exports_to_core_aten.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:06:51.375120] 2025-10-10T01:06:54.8962959Z 2025-10-10T01:06:54.8963984Z test_model_exports_to_core_aten 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_model_exports_to_core_aten_1.1_5eee626c1d181ea9_.log 2025-10-10T01:06:54.8964757Z Running 0 items in this shard: 2025-10-10T01:06:54.8964943Z 2025-10-10T01:06:54.8966627Z Running inductor/test_helion_kernels 1/1 ... [2025-10-10 01:06:54.896376] 2025-10-10T01:06:54.8967154Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:06:54.8970889Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_helion_kernels.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:06:54.896738] 2025-10-10T01:07:01.6748865Z 2025-10-10T01:07:01.6749796Z inductor/test_helion_kernels 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_helion_kernels_1.1_50b9ef0405d03e5b_.log 2025-10-10T01:07:01.6750986Z Running 0 items in this shard: 2025-10-10T01:07:01.6751173Z 2025-10-10T01:07:01.6752962Z Running inductor/test_aot_inductor_utils 1/1 ... [2025-10-10 01:07:01.674969] 2025-10-10T01:07:01.6753547Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:07:01.6757239Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_aot_inductor_utils.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:07:01.675319] 2025-10-10T01:07:08.4537536Z 2025-10-10T01:07:08.4538553Z inductor/test_aot_inductor_utils 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_aot_inductor_utils_1.1_bbc230082af08e51_.log 2025-10-10T01:07:08.4539346Z Running 0 items in this shard: 2025-10-10T01:07:08.4539548Z 2025-10-10T01:07:08.4540352Z Running export/test_package 1/1 ... [2025-10-10 01:07:08.453726] 2025-10-10T01:07:08.4541311Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:07:08.4544323Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_package.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:07:08.454075] 2025-10-10T01:07:11.6251072Z 2025-10-10T01:07:11.6252190Z export/test_package 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_package_1.1_f570fed3cdc48302_.log 2025-10-10T01:07:11.6253793Z Running 0 items in this shard: 2025-10-10T01:07:11.6254082Z 2025-10-10T01:07:11.6257830Z Running dynamo/test_ctx_manager 1/1 ... [2025-10-10 01:07:11.625459] 2025-10-10T01:07:11.6258259Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:07:11.6262527Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_ctx_manager.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:07:11.625896] 2025-10-10T01:07:18.5543321Z 2025-10-10T01:07:18.5544275Z dynamo/test_ctx_manager 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_ctx_manager_1.1_8af0268dcaa93e75_.log 2025-10-10T01:07:18.5545006Z Running 0 items in this shard: 2025-10-10T01:07:18.5545198Z 2025-10-10T01:07:18.5547581Z Running inductor/test_cudagraph_trees 1/1 ... [2025-10-10 01:07:18.554422] 2025-10-10T01:07:18.5548038Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:07:18.5551501Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_cudagraph_trees.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:07:18.554798] 2025-10-10T01:07:25.3834569Z 2025-10-10T01:07:25.3835411Z inductor/test_cudagraph_trees 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_cudagraph_trees_1.1_36197e1b3d795a8a_.log 2025-10-10T01:07:25.3836186Z Running 0 items in this shard: 2025-10-10T01:07:25.3836379Z 2025-10-10T01:07:25.3840514Z Running inductor/test_block_analysis 1/1 ... [2025-10-10 01:07:25.383763] 2025-10-10T01:07:25.3840964Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:07:25.3844757Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_block_analysis.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:07:25.384147] 2025-10-10T01:07:32.2114962Z 2025-10-10T01:07:32.2115977Z inductor/test_block_analysis 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_block_analysis_1.1_8cf98248b21bf3e8_.log 2025-10-10T01:07:32.2117158Z Running 0 items in this shard: 2025-10-10T01:07:32.2117355Z 2025-10-10T01:07:32.2118307Z Running dynamo/test_autograd_function 1/1 ... [2025-10-10 01:07:32.211571] 2025-10-10T01:07:32.2118753Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:07:32.2122973Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_autograd_function.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:07:32.211958] 2025-10-10T01:07:39.0388471Z 2025-10-10T01:07:39.0389452Z dynamo/test_autograd_function 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_autograd_function_1.1_cca3560f02bb7ebc_.log 2025-10-10T01:07:39.0390224Z Running 0 items in this shard: 2025-10-10T01:07:39.0390428Z 2025-10-10T01:07:39.0395439Z Running dynamo/test_nops 1/1 ... [2025-10-10 01:07:39.039207] 2025-10-10T01:07:39.0395849Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:07:39.0399406Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_nops.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:07:39.039574] 2025-10-10T01:07:42.4104594Z 2025-10-10T01:07:42.4105610Z dynamo/test_nops 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_nops_1.1_8a38f8b5813cdc19_.log 2025-10-10T01:07:42.4106307Z Running 0 items in this shard: 2025-10-10T01:07:42.4106496Z 2025-10-10T01:07:42.4109377Z Running dynamo/test_config 1/1 ... [2025-10-10 01:07:42.410585] 2025-10-10T01:07:42.4109904Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:07:42.4113459Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_config.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:07:42.410968] 2025-10-10T01:07:45.7829260Z 2025-10-10T01:07:45.7830469Z dynamo/test_config 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_config_1.1_7222ccd924774084_.log 2025-10-10T01:07:45.7831147Z Running 0 items in this shard: 2025-10-10T01:07:45.7831340Z 2025-10-10T01:07:45.7836239Z Running inductor/test_control_flow 1/1 ... 
[2025-10-10 01:07:45.783293] 2025-10-10T01:07:45.7836836Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:07:45.7840400Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_control_flow.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:07:45.783660] 2025-10-10T01:07:52.7127615Z 2025-10-10T01:07:52.7128631Z inductor/test_control_flow 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_control_flow_1.1_68ec3f826aff6336_.log 2025-10-10T01:07:52.7129413Z Running 0 items in this shard: 2025-10-10T01:07:52.7129615Z 2025-10-10T01:07:52.7131866Z Running export/test_db 1/1 ... [2025-10-10 01:07:52.712856] 2025-10-10T01:07:52.7132274Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:07:52.7135748Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_db.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:07:52.713217] 2025-10-10T01:07:55.8834039Z 2025-10-10T01:07:55.8834965Z export/test_db 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_db_1.1_d0b2865c718ec5bb_.log 2025-10-10T01:07:55.8835637Z Running 0 items in this shard: 2025-10-10T01:07:55.8835824Z 2025-10-10T01:07:55.8838473Z Running inductor/test_unbacked_symints 1/1 ... [2025-10-10 01:07:55.883535] 2025-10-10T01:07:55.8839347Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:07:55.8842150Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_unbacked_symints.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:07:55.883893] 2025-10-10T01:08:02.9115572Z 2025-10-10T01:08:02.9116630Z inductor/test_unbacked_symints 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_unbacked_symints_1.1_64bff71131014153_.log 2025-10-10T01:08:02.9117451Z Running 0 items in this shard: 2025-10-10T01:08:02.9117660Z 2025-10-10T01:08:02.9119434Z Running inductor/test_fused_attention 1/1 ... [2025-10-10 01:08:02.911624] 2025-10-10T01:08:02.9119945Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:08:02.9123574Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_fused_attention.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:08:02.911991] 2025-10-10T01:08:09.7395687Z 2025-10-10T01:08:09.7399650Z inductor/test_fused_attention 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_fused_attention_1.1_9fa22b5ab214ad5c_.log 2025-10-10T01:08:09.7400576Z Running 0 items in this shard: 2025-10-10T01:08:09.7400822Z 2025-10-10T01:08:09.7401154Z Running dynamo/test_export_mutations 1/1 ... [2025-10-10 01:08:09.739553] 2025-10-10T01:08:09.7402155Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:08:09.7403470Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_export_mutations.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:08:09.739934] 2025-10-10T01:08:13.1108011Z 2025-10-10T01:08:13.1109046Z dynamo/test_export_mutations 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_export_mutations_1.1_24946587e6c48654_.log 2025-10-10T01:08:13.1109809Z Running 0 items in this shard: 2025-10-10T01:08:13.1109997Z 2025-10-10T01:08:13.1112356Z Running inductor/test_config 1/1 ... [2025-10-10 01:08:13.110920] 2025-10-10T01:08:13.1112915Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:08:13.1116566Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_config.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:08:13.111271] 2025-10-10T01:08:19.8888236Z 2025-10-10T01:08:19.8889378Z inductor/test_config 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_config_1.1_afa4ffd6000fd00a_.log 2025-10-10T01:08:19.8890193Z Running 0 items in this shard: 2025-10-10T01:08:19.8890473Z 2025-10-10T01:08:19.8890780Z Running dynamo/test_guard_serialization 1/1 ... [2025-10-10 01:08:19.888729] 2025-10-10T01:08:19.8891375Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:08:19.8894707Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_guard_serialization.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:08:19.889096] 2025-10-10T01:08:26.7172708Z 2025-10-10T01:08:26.7173824Z dynamo/test_guard_serialization 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_guard_serialization_1.1_029a3c89ee56da9f_.log 2025-10-10T01:08:26.7174609Z Running 0 items in this shard: 2025-10-10T01:08:26.7174796Z 2025-10-10T01:08:26.7175711Z Running inductor/test_graph_transform_observer 1/1 ... [2025-10-10 01:08:26.717266] 2025-10-10T01:08:26.7176607Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:08:26.7179625Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_graph_transform_observer.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:08:26.717621] 2025-10-10T01:08:33.5462066Z 2025-10-10T01:08:33.5463203Z inductor/test_graph_transform_observer 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_graph_transform_observer_1.1_5cc53b82a0adea3d_.log 2025-10-10T01:08:33.5464078Z Running 0 items in this shard: 2025-10-10T01:08:33.5464273Z 2025-10-10T01:08:33.5465900Z Running dynamo/test_unittest 1/1 ... [2025-10-10 01:08:33.546304] 2025-10-10T01:08:33.5466311Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:08:33.5469886Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_unittest.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:08:33.546667] 2025-10-10T01:08:36.9173118Z 2025-10-10T01:08:36.9174087Z dynamo/test_unittest 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_unittest_1.1_ea31b9b08c93d789_.log 2025-10-10T01:08:36.9174797Z Running 0 items in this shard: 2025-10-10T01:08:36.9174984Z 2025-10-10T01:08:36.9177296Z Running inductor/test_cache 1/1 ... [2025-10-10 01:08:36.917449] 2025-10-10T01:08:36.9177712Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:08:36.9181904Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_cache.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:08:36.917803] 2025-10-10T01:08:40.1884056Z 2025-10-10T01:08:40.1885804Z inductor/test_cache 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_cache_1.1_63f053c389aa26d9_.log 2025-10-10T01:08:40.1887267Z Running 0 items in this shard: 2025-10-10T01:08:40.1887651Z 2025-10-10T01:08:40.1888575Z Running dynamo/test_after_aot 1/1 ... [2025-10-10 01:08:40.188518] 2025-10-10T01:08:40.1889028Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:08:40.1892269Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_after_aot.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:08:40.188884] 2025-10-10T01:08:43.5594236Z 2025-10-10T01:08:43.5594978Z dynamo/test_after_aot 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_after_aot_1.1_c155ea426d214603_.log 2025-10-10T01:08:43.5595685Z Running 0 items in this shard: 2025-10-10T01:08:43.5595896Z 2025-10-10T01:08:43.5599123Z Running inductor/test_compile 1/1 ... 
[2025-10-10 01:08:43.559580] 2025-10-10T01:08:43.5599548Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:08:43.5603508Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_compile.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:08:43.559992] 2025-10-10T01:08:50.2371487Z 2025-10-10T01:08:50.2372456Z inductor/test_compile 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_compile_1.1_8ea83f9bfd18d05a_.log 2025-10-10T01:08:50.2373170Z Running 0 items in this shard: 2025-10-10T01:08:50.2373354Z 2025-10-10T01:08:50.2373598Z Running export/test_export_opinfo 1/1 ... [2025-10-10 01:08:50.237086] 2025-10-10T01:08:50.2374015Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:08:50.2378078Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_export_opinfo.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:08:50.237447] 2025-10-10T01:08:54.9108791Z 2025-10-10T01:08:54.9109745Z export/test_export_opinfo 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_export_opinfo_1.1_e16b3c6cd7888b49_.log 2025-10-10T01:08:54.9110490Z Running 0 items in this shard: 2025-10-10T01:08:54.9110679Z 2025-10-10T01:08:54.9112850Z Running inductor/test_custom_lowering 1/1 ... [2025-10-10 01:08:54.910967] 2025-10-10T01:08:54.9113322Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:08:54.9117114Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_custom_lowering.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:08:54.911348] 2025-10-10T01:09:01.6904319Z 2025-10-10T01:09:01.6905273Z inductor/test_custom_lowering 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_custom_lowering_1.1_fcd0bceaa7572fd4_.log 2025-10-10T01:09:01.6906030Z Running 0 items in this shard: 2025-10-10T01:09:01.6906219Z 2025-10-10T01:09:01.6908606Z Running dynamo/test_graph_region_tracker 1/1 ... [2025-10-10 01:09:01.690522] 2025-10-10T01:09:01.6909061Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:09:01.6912971Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_graph_region_tracker.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:09:01.690893] 2025-10-10T01:09:05.0626094Z 2025-10-10T01:09:05.0627132Z dynamo/test_graph_region_tracker 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_graph_region_tracker_1.1_64b79f2ed725064b_.log 2025-10-10T01:09:05.0627943Z Running 0 items in this shard: 2025-10-10T01:09:05.0628131Z 2025-10-10T01:09:05.0629851Z Running dynamo/test_dicts 1/1 ... [2025-10-10 01:09:05.062687] 2025-10-10T01:09:05.0630255Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:09:05.0633970Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_dicts.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:09:05.063055] 2025-10-10T01:09:08.5846172Z 2025-10-10T01:09:08.5847213Z dynamo/test_dicts 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_dicts_1.1_37a8d4e5c2055746_.log 2025-10-10T01:09:08.5847864Z Running 0 items in this shard: 2025-10-10T01:09:08.5848049Z 2025-10-10T01:09:08.5850734Z Running inductor/test_fuzzer 1/1 ... [2025-10-10 01:09:08.584760] 2025-10-10T01:09:08.5851164Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:09:08.5854826Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_fuzzer.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:09:08.585135] 2025-10-10T01:09:15.3129602Z 2025-10-10T01:09:15.3130505Z inductor/test_fuzzer 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_fuzzer_1.1_ff0415bf194a3291_.log 2025-10-10T01:09:15.3131218Z Running 0 items in this shard: 2025-10-10T01:09:15.3131413Z 2025-10-10T01:09:15.3135159Z Running dynamo/test_modules 1/1 ... [2025-10-10 01:09:15.313200] 2025-10-10T01:09:15.3135560Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:09:15.3139916Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_modules.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:09:15.313628] 2025-10-10T01:09:22.4415000Z 2025-10-10T01:09:22.4415764Z dynamo/test_modules 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_modules_1.1_91040889e26f427c_.log 2025-10-10T01:09:22.4416445Z Running 0 items in this shard: 2025-10-10T01:09:22.4416640Z 2025-10-10T01:09:22.4419500Z Running dynamo/test_metrics_context 1/1 ... 
[2025-10-10 01:09:22.441606] 2025-10-10T01:09:22.4419924Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:09:22.4423416Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_metrics_context.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:09:22.441979] 2025-10-10T01:09:25.6124502Z 2025-10-10T01:09:25.6125464Z dynamo/test_metrics_context 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_metrics_context_1.1_d7a7ab83aa180c96_.log 2025-10-10T01:09:25.6126264Z Running 0 items in this shard: 2025-10-10T01:09:25.6126459Z 2025-10-10T01:09:25.6128891Z Running dynamo/test_install_free_tensors 1/1 ... [2025-10-10 01:09:25.612573] 2025-10-10T01:09:25.6129366Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:09:25.6133797Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_install_free_tensors.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:09:25.612953] 2025-10-10T01:09:28.9841427Z 2025-10-10T01:09:28.9842506Z dynamo/test_install_free_tensors 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_install_free_tensors_1.1_353e80a51212289c_.log 2025-10-10T01:09:28.9843357Z Running 0 items in this shard: 2025-10-10T01:09:28.9843548Z 2025-10-10T01:09:28.9844484Z Running inductor/test_memory_planning 1/1 ... 
[2025-10-10 01:09:28.984130] 2025-10-10T01:09:28.9844985Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:09:28.9848908Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_memory_planning.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:09:28.984524] 2025-10-10T01:09:36.3619673Z 2025-10-10T01:09:36.3620814Z inductor/test_memory_planning 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_memory_planning_1.1_11d097bb98f41602_.log 2025-10-10T01:09:36.3621586Z Running 0 items in this shard: 2025-10-10T01:09:36.3621777Z 2025-10-10T01:09:36.3623444Z Running inductor/test_ordered_set 1/1 ... [2025-10-10 01:09:36.362036] 2025-10-10T01:09:36.3623865Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:09:36.3627912Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_ordered_set.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:09:36.362440] 2025-10-10T01:09:39.6828162Z 2025-10-10T01:09:39.6829175Z inductor/test_ordered_set 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_ordered_set_1.1_8b17c9f23a4ad726_.log 2025-10-10T01:09:39.6829909Z Running 0 items in this shard: 2025-10-10T01:09:39.6830096Z 2025-10-10T01:09:39.6832490Z Running inductor/test_split_cat_fx_aten_passes 1/1 ... 
[2025-10-10 01:09:39.682967] 2025-10-10T01:09:39.6832994Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:09:39.6836779Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_split_cat_fx_aten_passes.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:09:39.683346] 2025-10-10T01:09:46.5104696Z 2025-10-10T01:09:46.5106025Z inductor/test_split_cat_fx_aten_passes 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_split_cat_fx_aten_passes_1.1_995a123916a0dee9_.log 2025-10-10T01:09:46.5106950Z Running 0 items in this shard: 2025-10-10T01:09:46.5107145Z 2025-10-10T01:09:46.5109361Z Running dynamo/test_activation_checkpointing 1/1 ... [2025-10-10 01:09:46.510625] 2025-10-10T01:09:46.5109982Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:09:46.5114540Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_activation_checkpointing.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:09:46.511020] 2025-10-10T01:09:53.5391030Z 2025-10-10T01:09:53.5392381Z dynamo/test_activation_checkpointing 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_activation_checkpointing_1.1_33e757f29d78aa3e_.log 2025-10-10T01:09:53.5393207Z Running 0 items in this shard: 2025-10-10T01:09:53.5393395Z 2025-10-10T01:09:53.5394991Z Running dynamo/test_compiler_bisector 1/1 ... 
[2025-10-10 01:09:53.539202] 2025-10-10T01:09:53.5395568Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:09:53.5400379Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_compiler_bisector.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:09:53.539575] 2025-10-10T01:10:00.3670097Z 2025-10-10T01:10:00.3671178Z dynamo/test_compiler_bisector 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_compiler_bisector_1.1_d811bb94219d5c18_.log 2025-10-10T01:10:00.3672173Z Running 0 items in this shard: 2025-10-10T01:10:00.3672354Z 2025-10-10T01:10:00.3674084Z Running dynamo/test_aot_compile 1/1 ... [2025-10-10 01:10:00.367067] 2025-10-10T01:10:00.3674503Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:10:00.3678066Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_aot_compile.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:10:00.367446] 2025-10-10T01:10:03.7389403Z 2025-10-10T01:10:03.7390471Z dynamo/test_aot_compile 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_aot_compile_1.1_707a03b1c52226d9_.log 2025-10-10T01:10:03.7391190Z Running 0 items in this shard: 2025-10-10T01:10:03.7391390Z 2025-10-10T01:10:03.7391852Z Running dynamo/test_modes 1/1 ... [2025-10-10 01:10:03.738766] 2025-10-10T01:10:03.7392236Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:10:03.7396036Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_modes.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:10:03.739134] 2025-10-10T01:10:10.6179459Z 2025-10-10T01:10:10.6180294Z dynamo/test_modes 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_modes_1.1_97b8a2c162a9bae1_.log 2025-10-10T01:10:10.6180962Z Running 0 items in this shard: 2025-10-10T01:10:10.6181150Z 2025-10-10T01:10:10.6183573Z Running inductor/test_auto_functionalize 1/1 ... [2025-10-10 01:10:10.618050] 2025-10-10T01:10:10.6184035Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:10:10.6187860Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_auto_functionalize.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:10:10.618439] 2025-10-10T01:10:14.0410005Z 2025-10-10T01:10:14.0411341Z inductor/test_auto_functionalize 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_auto_functionalize_1.1_e064c39f6c122394_.log 2025-10-10T01:10:14.0412144Z Running 0 items in this shard: 2025-10-10T01:10:14.0412328Z 2025-10-10T01:10:14.0413564Z Running inductor/test_torchinductor_codegen_config_overrides 1/1 ... [2025-10-10 01:10:14.041074] 2025-10-10T01:10:14.0414279Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:10:14.0418375Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_codegen_config_overrides.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:10:14.041438] 2025-10-10T01:10:20.8199974Z 2025-10-10T01:10:20.8201247Z inductor/test_torchinductor_codegen_config_overrides 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_codegen_config_overrides_1.1_96747b8b0bb4f6b1_.log 2025-10-10T01:10:20.8202191Z Running 0 items in this shard: 2025-10-10T01:10:20.8202383Z 2025-10-10T01:10:20.8204939Z Running dynamo/test_profiler 1/1 ... [2025-10-10 01:10:20.820189] 2025-10-10T01:10:20.8205406Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:10:20.8209661Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_profiler.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:10:20.820562] 2025-10-10T01:10:24.1914550Z 2025-10-10T01:10:24.1915591Z dynamo/test_profiler 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_profiler_1.1_0aeeb6348edbd2f6_.log 2025-10-10T01:10:24.1916324Z Running 0 items in this shard: 2025-10-10T01:10:24.1916514Z 2025-10-10T01:10:24.1918548Z Running dynamo/test_global 1/1 ... [2025-10-10 01:10:24.191563] 2025-10-10T01:10:24.1919023Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:10:24.1923164Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_global.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:10:24.191952] 2025-10-10T01:10:27.6132956Z 2025-10-10T01:10:27.6133881Z dynamo/test_global 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_global_1.1_55c8d690278f14ce_.log 2025-10-10T01:10:27.6134590Z Running 0 items in this shard: 2025-10-10T01:10:27.6134782Z 2025-10-10T01:10:27.6138145Z Running inductor/test_inductor_freezing 1/1 ... [2025-10-10 01:10:27.613452] 2025-10-10T01:10:27.6138711Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:10:27.6141936Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_inductor_freezing.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:10:27.613816] 2025-10-10T01:10:34.9934816Z 2025-10-10T01:10:34.9945363Z inductor/test_inductor_freezing 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_inductor_freezing_1.1_af7fe6b4124975f9_.log 2025-10-10T01:10:34.9946196Z Running 0 items in this shard: 2025-10-10T01:10:34.9946390Z 2025-10-10T01:10:34.9946626Z Running dynamo/test_model_output 1/1 ... [2025-10-10 01:10:34.993616] 2025-10-10T01:10:34.9947035Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:10:34.9948020Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_model_output.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:10:34.993983] 2025-10-10T01:10:38.9155309Z 2025-10-10T01:10:38.9156100Z dynamo/test_model_output 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_model_output_1.1_8b0b27e549cfe40b_.log 2025-10-10T01:10:38.9156831Z Running 0 items in this shard: 2025-10-10T01:10:38.9157025Z 2025-10-10T01:10:38.9158570Z Running export/test_torchbind 1/1 ... [2025-10-10 01:10:38.915553] 2025-10-10T01:10:38.9158970Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:10:38.9163755Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_torchbind.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:10:38.915930] 2025-10-10T01:10:45.7438609Z 2025-10-10T01:10:45.7439596Z export/test_torchbind 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_torchbind_1.1_29905478f8b2fa76_.log 2025-10-10T01:10:45.7440355Z Running 0 items in this shard: 2025-10-10T01:10:45.7440549Z 2025-10-10T01:10:45.7442391Z Running dynamo/test_nested_graph_breaks 1/1 ... [2025-10-10 01:10:45.743924] 2025-10-10T01:10:45.7443011Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:10:45.7446766Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_nested_graph_breaks.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:10:45.744295] 2025-10-10T01:10:49.1655604Z 2025-10-10T01:10:49.1657015Z dynamo/test_nested_graph_breaks 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_nested_graph_breaks_1.1_4f9ce98bfa3c98ad_.log 2025-10-10T01:10:49.1657807Z Running 0 items in this shard: 2025-10-10T01:10:49.1657992Z 2025-10-10T01:10:49.1658640Z Running dynamo/test_backward_higher_order_ops 1/1 ... [2025-10-10 01:10:49.165569] 2025-10-10T01:10:49.1659100Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:10:49.1663240Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_backward_higher_order_ops.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:10:49.165952] 2025-10-10T01:10:52.5366021Z 2025-10-10T01:10:52.5366996Z dynamo/test_backward_higher_order_ops 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_backward_higher_order_ops_1.1_1ee262ea4923d2bc_.log 2025-10-10T01:10:52.5367886Z Running 0 items in this shard: 2025-10-10T01:10:52.5369828Z 2025-10-10T01:10:52.5370036Z Running export/test_passes 1/1 ... [2025-10-10 01:10:52.536716] 2025-10-10T01:10:52.5370438Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:10:52.5374009Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_passes.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:10:52.537075] 2025-10-10T01:10:57.4613641Z 2025-10-10T01:10:57.4614342Z export/test_passes 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_passes_1.1_cdd42d00b3e5d75e_.log 2025-10-10T01:10:57.4615028Z Running 0 items in this shard: 2025-10-10T01:10:57.4615230Z 2025-10-10T01:10:57.4617701Z Running inductor/test_torchbind 1/1 ... [2025-10-10 01:10:57.461430] 2025-10-10T01:10:57.4618138Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:10:57.4621289Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchbind.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:10:57.461796] 2025-10-10T01:11:04.2400807Z 2025-10-10T01:11:04.2401810Z inductor/test_torchbind 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchbind_1.1_12ea036a9ef561ae_.log 2025-10-10T01:11:04.2402536Z Running 0 items in this shard: 2025-10-10T01:11:04.2402723Z 2025-10-10T01:11:04.2404527Z Running inductor/test_custom_partitioner_fn 1/1 ... [2025-10-10 01:11:04.240155] 2025-10-10T01:11:04.2404979Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:11:04.2408533Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_custom_partitioner_fn.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:11:04.240498] 2025-10-10T01:11:11.0679547Z 2025-10-10T01:11:11.0680452Z inductor/test_custom_partitioner_fn 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_custom_partitioner_fn_1.1_32e3a895042458ce_.log 2025-10-10T01:11:11.0681304Z Running 0 items in this shard: 2025-10-10T01:11:11.0681487Z 2025-10-10T01:11:11.0682962Z Running inductor/test_alignment 1/1 ... [2025-10-10 01:11:11.067949] 2025-10-10T01:11:11.0683424Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:11:11.0686153Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_alignment.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:11:11.068290] 2025-10-10T01:11:18.3974744Z 2025-10-10T01:11:18.3976022Z inductor/test_alignment 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_alignment_1.1_294af1c25d8ef692_.log 2025-10-10T01:11:18.3976779Z Running 0 items in this shard: 2025-10-10T01:11:18.3976985Z 2025-10-10T01:11:18.3978658Z Running dynamo/test_sources 1/1 ... [2025-10-10 01:11:18.397459] 2025-10-10T01:11:18.3979083Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:11:18.3982514Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_sources.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:11:18.397862] 2025-10-10T01:11:21.5688308Z 2025-10-10T01:11:21.5689125Z dynamo/test_sources 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_sources_1.1_f0a2850bee915e22_.log 2025-10-10T01:11:21.5689827Z Running 0 items in this shard: 2025-10-10T01:11:21.5690014Z 2025-10-10T01:11:21.5692836Z Running dynamo/test_resume 1/1 ... 
[2025-10-10 01:11:21.568974] 2025-10-10T01:11:21.5693232Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:11:21.5696976Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_resume.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:11:21.569348] 2025-10-10T01:11:24.7404703Z 2025-10-10T01:11:24.7405721Z dynamo/test_resume 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_resume_1.1_e95342750787047f_.log 2025-10-10T01:11:24.7406398Z Running 0 items in this shard: 2025-10-10T01:11:24.7406602Z 2025-10-10T01:11:24.7408250Z Running dynamo/test_debug_utils 1/1 ... [2025-10-10 01:11:24.740424] 2025-10-10T01:11:24.7408789Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:11:24.7411591Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_debug_utils.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:11:24.740786] 2025-10-10T01:11:28.6637392Z 2025-10-10T01:11:28.6638221Z dynamo/test_debug_utils 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_debug_utils_1.1_149e5bb7f0d00d3c_.log 2025-10-10T01:11:28.6639425Z Running 0 items in this shard: 2025-10-10T01:11:28.6639618Z 2025-10-10T01:11:28.6643342Z Running export/test_swap 1/1 ... [2025-10-10 01:11:28.663999] 2025-10-10T01:11:28.6643757Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:11:28.6647086Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_swap.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:11:28.664386] 2025-10-10T01:11:31.8356082Z 2025-10-10T01:11:31.8358062Z export/test_swap 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_swap_1.1_eae475bae73f52d8_.log 2025-10-10T01:11:31.8359056Z Running 0 items in this shard: 2025-10-10T01:11:31.8359249Z 2025-10-10T01:11:31.8359673Z Running dynamo/test_aot_autograd_cache 1/1 ... [2025-10-10 01:11:31.835644] 2025-10-10T01:11:31.8360099Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:11:31.8363751Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_aot_autograd_cache.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:11:31.836007] 2025-10-10T01:11:38.7141834Z 2025-10-10T01:11:38.7142832Z dynamo/test_aot_autograd_cache 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_aot_autograd_cache_1.1_c9fc592fb1e9cfca_.log 2025-10-10T01:11:38.7144237Z Running 0 items in this shard: 2025-10-10T01:11:38.7144443Z 2025-10-10T01:11:38.7145336Z Running inductor/test_binary_folding 1/1 ... [2025-10-10 01:11:38.714242] 2025-10-10T01:11:38.7145766Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:11:38.7149830Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_binary_folding.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:11:38.714626] 2025-10-10T01:11:46.1430087Z 2025-10-10T01:11:46.1431060Z inductor/test_binary_folding 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_binary_folding_1.1_50a87f671e977c21_.log 2025-10-10T01:11:46.1431827Z Running 0 items in this shard: 2025-10-10T01:11:46.1432013Z 2025-10-10T01:11:46.1434256Z Running dynamo/test_base_hop 1/1 ... [2025-10-10 01:11:46.143080] 2025-10-10T01:11:46.1437347Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:11:46.1438336Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_base_hop.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:11:46.143455] 2025-10-10T01:11:49.5146201Z 2025-10-10T01:11:49.5147273Z dynamo/test_base_hop 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_base_hop_1.1_c59e586c1ba8f88d_.log 2025-10-10T01:11:49.5147969Z Running 0 items in this shard: 2025-10-10T01:11:49.5148152Z 2025-10-10T01:11:49.5150380Z Running dynamo/test_list 1/1 ... [2025-10-10 01:11:49.514749] 2025-10-10T01:11:49.5150847Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:11:49.5154455Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_list.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:11:49.515117] 2025-10-10T01:11:52.8860697Z 2025-10-10T01:11:52.8861814Z dynamo/test_list 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_list_1.1_565a77db26f170c2_.log 2025-10-10T01:11:52.8863344Z Running 0 items in this shard: 2025-10-10T01:11:52.8863543Z 2025-10-10T01:11:52.8865437Z Running export/test_unflatten 1/1 ... 
[2025-10-10 01:11:52.886165] 2025-10-10T01:11:52.8866012Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:11:52.8869599Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_unflatten.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:11:52.886544] 2025-10-10T01:11:56.0572996Z 2025-10-10T01:11:56.0574171Z export/test_unflatten 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_unflatten_1.1_3699c48a5529a289_.log 2025-10-10T01:11:56.0574898Z Running 0 items in this shard: 2025-10-10T01:11:56.0575086Z 2025-10-10T01:11:56.0576483Z Running inductor/test_needs_exact_strides 1/1 ... [2025-10-10 01:11:56.057348] 2025-10-10T01:11:56.0577107Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:11:56.0581030Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_needs_exact_strides.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:11:56.057720] 2025-10-10T01:12:02.8363228Z 2025-10-10T01:12:02.8364675Z inductor/test_needs_exact_strides 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_needs_exact_strides_1.1_e12db4474a6455f2_.log 2025-10-10T01:12:02.8365496Z Running 0 items in this shard: 2025-10-10T01:12:02.8365683Z 2025-10-10T01:12:02.8367688Z Running dynamo/test_verify_correctness 1/1 ... 
[2025-10-10 01:12:02.836406] 2025-10-10T01:12:02.8368278Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:12:02.8371884Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_verify_correctness.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:12:02.836792] 2025-10-10T01:12:06.2077622Z 2025-10-10T01:12:06.2078651Z dynamo/test_verify_correctness 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_verify_correctness_1.1_602a54c0909bb888_.log 2025-10-10T01:12:06.2079431Z Running 0 items in this shard: 2025-10-10T01:12:06.2079616Z 2025-10-10T01:12:06.2081367Z Running export/test_export 1/1 ... [2025-10-10 01:12:06.207854] 2025-10-10T01:12:06.2084664Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:12:06.2085654Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_export.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:12:06.208220] 2025-10-10T01:12:13.2367170Z 2025-10-10T01:12:13.2368097Z export/test_export 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_export_1.1_724dc9b2e00543d9_.log 2025-10-10T01:12:13.2368809Z Running 0 items in this shard: 2025-10-10T01:12:13.2368996Z 2025-10-10T01:12:13.2371348Z Running inductor/test_minifier_isolate 1/1 ... [2025-10-10 01:12:13.236786] 2025-10-10T01:12:13.2371809Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:12:13.2375232Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_minifier_isolate.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:12:13.237160] 2025-10-10T01:12:20.0650477Z 2025-10-10T01:12:20.0651574Z inductor/test_minifier_isolate 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_minifier_isolate_1.1_7ae75bb0ee2fb4b8_.log 2025-10-10T01:12:20.0652430Z Running 0 items in this shard: 2025-10-10T01:12:20.0653095Z 2025-10-10T01:12:20.0657393Z Running dynamo/test_logging 1/1 ... [2025-10-10 01:12:20.065432] 2025-10-10T01:12:20.0657810Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:12:20.0661507Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_logging.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:12:20.065812] 2025-10-10T01:12:26.9443500Z 2025-10-10T01:12:26.9444472Z dynamo/test_logging 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_logging_1.1_39f690b2466cf642_.log 2025-10-10T01:12:26.9445246Z Running 0 items in this shard: 2025-10-10T01:12:26.9445436Z 2025-10-10T01:12:26.9449182Z Running dynamo/test_deviceguard 1/1 ... [2025-10-10 01:12:26.944581] 2025-10-10T01:12:26.9449620Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:12:26.9453293Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_deviceguard.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:12:26.944964] 2025-10-10T01:12:30.6666748Z 2025-10-10T01:12:30.6667719Z dynamo/test_deviceguard 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_deviceguard_1.1_9c4ce377a52b32cc_.log 2025-10-10T01:12:30.6668447Z Running 0 items in this shard: 2025-10-10T01:12:30.6668645Z 2025-10-10T01:12:30.6671430Z Running dynamo/test_aot_autograd 1/1 ... 
[2025-10-10 01:12:30.666785] 2025-10-10T01:12:30.6672265Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:12:30.6675030Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_aot_autograd.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:12:30.667147] 2025-10-10T01:12:34.0885370Z 2025-10-10T01:12:34.0886571Z dynamo/test_aot_autograd 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_aot_autograd_1.1_ffd5f025d73e0db2_.log 2025-10-10T01:12:34.0887520Z Running 0 items in this shard: 2025-10-10T01:12:34.0887759Z 2025-10-10T01:12:34.0892255Z Running inductor/test_augmented_graph_helper 1/1 ... [2025-10-10 01:12:34.088883] 2025-10-10T01:12:34.0892762Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:12:34.0896460Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_augmented_graph_helper.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:12:34.089263] 2025-10-10T01:12:37.3108006Z 2025-10-10T01:12:37.3109150Z inductor/test_augmented_graph_helper 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_augmented_graph_helper_1.1_83972a4b6596f628_.log 2025-10-10T01:12:37.3109999Z Running 0 items in this shard: 2025-10-10T01:12:37.3110187Z 2025-10-10T01:12:37.3112454Z Running dynamo/test_cudagraphs 1/1 ... 
[2025-10-10 01:12:37.310935] 2025-10-10T01:12:37.3112872Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:12:37.3116797Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_cudagraphs.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:12:37.311314] 2025-10-10T01:12:40.6826500Z 2025-10-10T01:12:40.6827495Z dynamo/test_cudagraphs 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_cudagraphs_1.1_3c21fc8a1903df8c_.log 2025-10-10T01:12:40.6828227Z Running 0 items in this shard: 2025-10-10T01:12:40.6828421Z 2025-10-10T01:12:40.6830504Z Running inductor/test_caching 1/1 ... [2025-10-10 01:12:40.682734] 2025-10-10T01:12:40.6831353Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:12:40.6834681Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_caching.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:12:40.683119] 2025-10-10T01:12:43.8538665Z 2025-10-10T01:12:43.8539702Z inductor/test_caching 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_caching_1.1_934c7ab345d15da7_.log 2025-10-10T01:12:43.8540411Z Running 0 items in this shard: 2025-10-10T01:12:43.8540598Z 2025-10-10T01:12:43.8542376Z Running export/test_upgrader 1/1 ... [2025-10-10 01:12:43.853959] 2025-10-10T01:12:43.8542793Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:12:43.8546362Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_upgrader.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:12:43.854325] 2025-10-10T01:12:47.0754000Z 2025-10-10T01:12:47.0754938Z export/test_upgrader 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_upgrader_1.1_a584e0b480bb7748_.log 2025-10-10T01:12:47.0755653Z Running 0 items in this shard: 2025-10-10T01:12:47.0755844Z 2025-10-10T01:12:47.0758236Z Running dynamo/test_sets 1/1 ... [2025-10-10 01:12:47.075510] 2025-10-10T01:12:47.0758638Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:12:47.0762793Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_sets.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:12:47.075912] 2025-10-10T01:12:50.5473662Z 2025-10-10T01:12:50.5474550Z dynamo/test_sets 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_sets_1.1_a22f3589f545c3b1_.log 2025-10-10T01:12:50.5475237Z Running 0 items in this shard: 2025-10-10T01:12:50.5475423Z 2025-10-10T01:12:50.5477769Z Running dynamo/test_unspec 1/1 ... [2025-10-10 01:12:50.547492] 2025-10-10T01:12:50.5478166Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:12:50.5481653Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_unspec.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:12:50.547856] 2025-10-10T01:12:54.4702391Z 2025-10-10T01:12:54.4703390Z dynamo/test_unspec 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_unspec_1.1_cc72ce2c42a7b79d_.log 2025-10-10T01:12:54.4704071Z Running 0 items in this shard: 2025-10-10T01:12:54.4704267Z 2025-10-10T01:12:54.4706563Z Running dynamo/test_python_dispatcher 1/1 ... 
[2025-10-10 01:12:54.470330] 2025-10-10T01:12:54.4707027Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:12:54.4710862Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_python_dispatcher.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:12:54.470739] 2025-10-10T01:12:58.1928652Z 2025-10-10T01:12:58.1929775Z dynamo/test_python_dispatcher 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_python_dispatcher_1.1_442c1b7022abc30b_.log 2025-10-10T01:12:58.1930575Z Running 0 items in this shard: 2025-10-10T01:12:58.1930759Z 2025-10-10T01:12:58.1932706Z Running dynamo/test_optimizers 1/1 ... [2025-10-10 01:12:58.192976] 2025-10-10T01:12:58.1933272Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:12:58.1936495Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_optimizers.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:12:58.193318] 2025-10-10T01:13:01.5646628Z 2025-10-10T01:13:01.5648003Z dynamo/test_optimizers 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_optimizers_1.1_74e86154988e6b15_.log 2025-10-10T01:13:01.5648818Z Running 0 items in this shard: 2025-10-10T01:13:01.5649062Z 2025-10-10T01:13:01.5651059Z Running dynamo/test_flat_apply 1/1 ... [2025-10-10 01:13:01.564808] 2025-10-10T01:13:01.5651485Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:13:01.5654566Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_flat_apply.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:13:01.565141] 2025-10-10T01:13:04.9364430Z 2025-10-10T01:13:04.9365451Z dynamo/test_flat_apply 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_flat_apply_1.1_69cc1488eab16041_.log 2025-10-10T01:13:04.9366257Z Running 0 items in this shard: 2025-10-10T01:13:04.9366456Z 2025-10-10T01:13:04.9368564Z Running dynamo/test_higher_order_ops 1/1 ... [2025-10-10 01:13:04.936553] 2025-10-10T01:13:04.9369023Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:13:04.9372645Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_higher_order_ops.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:13:04.936902] 2025-10-10T01:13:12.5664029Z 2025-10-10T01:13:12.5664967Z dynamo/test_higher_order_ops 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_higher_order_ops_1.1_fde87208f5bc3db9_.log 2025-10-10T01:13:12.5665756Z Running 0 items in this shard: 2025-10-10T01:13:12.5665955Z 2025-10-10T01:13:12.5671287Z Running export/test_nativert 1/1 ... [2025-10-10 01:13:12.566838] 2025-10-10T01:13:12.5671696Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:13:12.5676093Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_nativert.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:13:12.567217] 2025-10-10T01:13:19.7450769Z 2025-10-10T01:13:19.7451871Z export/test_nativert 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_nativert_1.1_2b7aabf5b8a15e01_.log 2025-10-10T01:13:19.7453007Z Running 0 items in this shard: 2025-10-10T01:13:19.7453292Z 2025-10-10T01:13:19.7455108Z Running inductor/test_cpu_repro 1/1 ... 
[2025-10-10 01:13:19.745157] 2025-10-10T01:13:19.7455748Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:13:19.7459628Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_cpu_repro.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:13:19.745573] 2025-10-10T01:13:27.2250247Z 2025-10-10T01:13:27.2251021Z inductor/test_cpu_repro 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_cpu_repro_1.1_4d201b196322547c_.log 2025-10-10T01:13:27.2251735Z Running 0 items in this shard: 2025-10-10T01:13:27.2251919Z 2025-10-10T01:13:27.2254041Z Running dynamo/test_graph_deduplication 1/1 ... [2025-10-10 01:13:27.224978] 2025-10-10T01:13:27.2254499Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:13:27.2257184Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_graph_deduplication.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:13:27.225342] 2025-10-10T01:13:30.5963926Z 2025-10-10T01:13:30.5964943Z dynamo/test_graph_deduplication 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_graph_deduplication_1.1_ac4062b893bd7e24_.log 2025-10-10T01:13:30.5965733Z Running 0 items in this shard: 2025-10-10T01:13:30.5965917Z 2025-10-10T01:13:30.5969038Z Running dynamo/test_export 1/1 ... [2025-10-10 01:13:30.596568] 2025-10-10T01:13:30.5969436Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:13:30.5973026Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_export.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:13:30.596942] 2025-10-10T01:13:34.6187705Z 2025-10-10T01:13:34.6188654Z dynamo/test_export 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_export_1.1_81862530a67f5ce9_.log 2025-10-10T01:13:34.6189391Z Running 0 items in this shard: 2025-10-10T01:13:34.6189598Z 2025-10-10T01:13:34.6191676Z Running dynamo/test_error_messages 1/1 ... [2025-10-10 01:13:34.618860] 2025-10-10T01:13:34.6192257Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:13:34.6196383Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_error_messages.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:13:34.619258] 2025-10-10T01:13:37.9901239Z 2025-10-10T01:13:37.9902790Z dynamo/test_error_messages 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_error_messages_1.1_b759ce749620e66c_.log 2025-10-10T01:13:37.9903644Z Running 0 items in this shard: 2025-10-10T01:13:37.9903892Z 2025-10-10T01:13:37.9906036Z Running export/test_hop 1/1 ... [2025-10-10 01:13:37.990256] 2025-10-10T01:13:37.9906502Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:13:37.9910019Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_hop.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:13:37.990642] 2025-10-10T01:13:42.4632659Z 2025-10-10T01:13:42.4633598Z export/test_hop 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_hop_1.1_4ba4ac4a2bfe39f3_.log 2025-10-10T01:13:42.4634306Z Running 0 items in this shard: 2025-10-10T01:13:42.4634564Z 2025-10-10T01:13:42.4636869Z Running dynamo/test_cudagraphs_expandable_segments 1/1 ... 
[2025-10-10 01:13:42.463378] 2025-10-10T01:13:42.4637546Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:13:42.4641394Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_cudagraphs_expandable_segments.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:13:42.463772] 2025-10-10T01:13:46.2356396Z 2025-10-10T01:13:46.2357481Z dynamo/test_cudagraphs_expandable_segments 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_cudagraphs_expandable_segments_1.1_97154a2c7fe6c4f3_.log 2025-10-10T01:13:46.2358342Z Running 0 items in this shard: 2025-10-10T01:13:46.2358522Z 2025-10-10T01:13:46.2362159Z Running dynamo/test_recompile_ux 1/1 ... [2025-10-10 01:13:46.235928] 2025-10-10T01:13:46.2362612Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:13:46.2366352Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_recompile_ux.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:13:46.236282] 2025-10-10T01:13:49.6069155Z 2025-10-10T01:13:49.6070165Z dynamo/test_recompile_ux 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_recompile_ux_1.1_9a97e2c3c96dcdb5_.log 2025-10-10T01:13:49.6070880Z Running 0 items in this shard: 2025-10-10T01:13:49.6071063Z 2025-10-10T01:13:49.6073210Z Running inductor/test_mmdecomp 1/1 ... 
[2025-10-10 01:13:49.607023] 2025-10-10T01:13:49.6073614Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:13:49.6076814Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_mmdecomp.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:13:49.607385] 2025-10-10T01:13:56.6359208Z 2025-10-10T01:13:56.6360733Z inductor/test_mmdecomp 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_mmdecomp_1.1_fee16a1a1a3043d0_.log 2025-10-10T01:13:56.6361569Z Running 0 items in this shard: 2025-10-10T01:13:56.6361822Z 2025-10-10T01:13:56.6363547Z Running dynamo/test_precompile_context 1/1 ... [2025-10-10 01:13:56.635989] 2025-10-10T01:13:56.6364127Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:13:56.6367864Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_precompile_context.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:13:56.636390] 2025-10-10T01:14:03.4639756Z 2025-10-10T01:14:03.4641105Z dynamo/test_precompile_context 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_precompile_context_1.1_fca9df25493b29ac_.log 2025-10-10T01:14:03.4642012Z Running 0 items in this shard: 2025-10-10T01:14:03.4642220Z 2025-10-10T01:14:03.4643135Z Running dynamo/test_bytecode_utils 1/1 ... 
[2025-10-10 01:14:03.464015] 2025-10-10T01:14:03.4643570Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:14:03.4647424Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_bytecode_utils.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:14:03.464389] 2025-10-10T01:14:06.8355641Z 2025-10-10T01:14:06.8356532Z dynamo/test_bytecode_utils 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_bytecode_utils_1.1_61bea430f17802bb_.log 2025-10-10T01:14:06.8357349Z Running 0 items in this shard: 2025-10-10T01:14:06.8357605Z 2025-10-10T01:14:06.8360022Z Running export/test_pass_infra 1/1 ... [2025-10-10 01:14:06.835683] 2025-10-10T01:14:06.8360445Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:14:06.8364414Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_pass_infra.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:14:06.836035] 2025-10-10T01:14:10.0068546Z 2025-10-10T01:14:10.0069613Z export/test_pass_infra 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_pass_infra_1.1_524fe1187e997278_.log 2025-10-10T01:14:10.0070373Z Running 0 items in this shard: 2025-10-10T01:14:10.0070559Z 2025-10-10T01:14:10.0072113Z Running dynamo/test_guard_manager 1/1 ... [2025-10-10 01:14:10.006922] 2025-10-10T01:14:10.0072694Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:14:10.0076358Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_guard_manager.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:14:10.007268] 2025-10-10T01:14:13.2281554Z 2025-10-10T01:14:13.2282787Z dynamo/test_guard_manager 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_guard_manager_1.1_b259dedf9d0333a0_.log 2025-10-10T01:14:13.2284817Z Running 0 items in this shard: 2025-10-10T01:14:13.2285036Z 2025-10-10T01:14:13.2285231Z Running dynamo/test_minifier 1/1 ... [2025-10-10 01:14:13.228235] 2025-10-10T01:14:13.2285678Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:14:13.2289482Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_minifier.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:14:13.228590] 2025-10-10T01:14:16.9503097Z 2025-10-10T01:14:16.9504123Z dynamo/test_minifier 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_minifier_1.1_b7d446ff20d6291d_.log 2025-10-10T01:14:16.9504816Z Running 0 items in this shard: 2025-10-10T01:14:16.9505033Z 2025-10-10T01:14:16.9507298Z Running export/test_converter 1/1 ... [2025-10-10 01:14:16.950403] 2025-10-10T01:14:16.9507715Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:14:16.9511101Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_converter.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:14:16.950770] 2025-10-10T01:14:20.4222500Z 2025-10-10T01:14:20.4223331Z export/test_converter 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_converter_1.1_65303ca9ac99b1a5_.log 2025-10-10T01:14:20.4224476Z Running 0 items in this shard: 2025-10-10T01:14:20.4224673Z 2025-10-10T01:14:20.4226811Z Running export/test_experimental 1/1 ... 
[2025-10-10 01:14:20.422355] 2025-10-10T01:14:20.4227254Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:14:20.4231003Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_experimental.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:14:20.422740] 2025-10-10T01:14:23.7942094Z 2025-10-10T01:14:23.7943059Z export/test_experimental 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_experimental_1.1_6db50cb38a50aa59_.log 2025-10-10T01:14:23.7943812Z Running 0 items in this shard: 2025-10-10T01:14:23.7944004Z 2025-10-10T01:14:23.7945522Z Running dynamo/test_input_attr_tracking 1/1 ... [2025-10-10 01:14:23.794248] 2025-10-10T01:14:23.7946003Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:14:23.7949471Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_input_attr_tracking.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:14:23.794599] 2025-10-10T01:14:27.2154631Z 2025-10-10T01:14:27.2155615Z dynamo/test_input_attr_tracking 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_input_attr_tracking_1.1_3e484bca3e8d2089_.log 2025-10-10T01:14:27.2156393Z Running 0 items in this shard: 2025-10-10T01:14:27.2156581Z 2025-10-10T01:14:27.2158943Z Running dynamo/test_exc 1/1 ... [2025-10-10 01:14:27.215586] 2025-10-10T01:14:27.2159388Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:14:27.2162914Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_exc.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:14:27.215938] 2025-10-10T01:14:30.7371930Z 2025-10-10T01:14:30.7372784Z dynamo/test_exc 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_exc_1.1_2bc6ab6e3f64bc31_.log 2025-10-10T01:14:30.7374021Z Running 0 items in this shard: 2025-10-10T01:14:30.7374210Z 2025-10-10T01:14:30.7375855Z Running dynamo/test_hooks 1/1 ... [2025-10-10 01:14:30.737258] 2025-10-10T01:14:30.7376300Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:14:30.7379879Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_hooks.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:14:30.737611] 2025-10-10T01:14:34.1582602Z 2025-10-10T01:14:34.1583367Z dynamo/test_hooks 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_hooks_1.1_733aca6bce54ec7d_.log 2025-10-10T01:14:34.1584049Z Running 0 items in this shard: 2025-10-10T01:14:34.1584238Z 2025-10-10T01:14:34.1584890Z Running dynamo/test_trace_rules 1/1 ... [2025-10-10 01:14:34.158214] 2025-10-10T01:14:34.1585306Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:14:34.1589424Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_trace_rules.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:14:34.158570] 2025-10-10T01:14:37.3295900Z 2025-10-10T01:14:37.3296981Z dynamo/test_trace_rules 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_trace_rules_1.1_6bc15433e41e8ed1_.log 2025-10-10T01:14:37.3297767Z Running 0 items in this shard: 2025-10-10T01:14:37.3297954Z 2025-10-10T01:14:37.3299852Z Running dynamo/test_exceptions 1/1 ... 
[2025-10-10 01:14:37.329660] 2025-10-10T01:14:37.3300279Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:14:37.3304602Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_exceptions.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:14:37.330064] 2025-10-10T01:14:40.7014662Z 2025-10-10T01:14:40.7015611Z dynamo/test_exceptions 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_exceptions_1.1_3d46ecb5ecb75b62_.log 2025-10-10T01:14:40.7016317Z Running 0 items in this shard: 2025-10-10T01:14:40.7016515Z 2025-10-10T01:14:41.0461720Z Uploading artifacts took 0.34 seconds 2025-10-10T01:14:41.0464744Z Running export/test_schema 1/1 ... [2025-10-10 01:14:41.046137] 2025-10-10T01:14:41.0465155Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:14:41.0468919Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_schema.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:14:41.046529] 2025-10-10T01:14:44.3172612Z 2025-10-10T01:14:44.3173538Z export/test_schema 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_schema_1.1_66d8fa029e329358_.log 2025-10-10T01:14:44.3174361Z Running 0 items in this shard: 2025-10-10T01:14:44.3174562Z 2025-10-10T01:14:44.3176638Z Running inductor/test_mps_basic 1/1 ... 
[2025-10-10 01:14:44.317335] 2025-10-10T01:14:44.3177048Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:14:44.3180895Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_mps_basic.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:14:44.317711] 2025-10-10T01:14:51.5959310Z 2025-10-10T01:14:51.5960083Z inductor/test_mps_basic 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_mps_basic_1.1_8119c91943ef6807_.log 2025-10-10T01:14:51.5960686Z 2025-10-10T01:14:51.5962439Z Running inductor/test_cudagraph_trees_expandable_segments 1/1 ... [2025-10-10 01:14:51.595916] 2025-10-10T01:14:51.5963373Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:14:51.5966830Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_cudagraph_trees_expandable_segments.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:14:51.596304] 2025-10-10T01:14:58.5244904Z 2025-10-10T01:14:58.5247468Z inductor/test_cudagraph_trees_expandable_segments 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_cudagraph_trees_expandable_segments_1.1_a7072ab6f4d3014d_.log 2025-10-10T01:14:58.5249007Z Running 0 items in this shard: 2025-10-10T01:14:58.5249191Z 2025-10-10T01:14:58.5249394Z Running dynamo/test_subclasses 1/1 ... 
[2025-10-10 01:14:58.524593] 2025-10-10T01:14:58.5249796Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:14:58.5253138Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_subclasses.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:14:58.524971] 2025-10-10T01:15:05.4034182Z 2025-10-10T01:15:05.4035088Z dynamo/test_subclasses 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_subclasses_1.1_c98fc60eaeb1ce49_.log 2025-10-10T01:15:05.4035839Z Running 0 items in this shard: 2025-10-10T01:15:05.4036036Z 2025-10-10T01:15:05.4038552Z Running dynamo/test_repros 1/1 ... [2025-10-10 01:15:05.403528] 2025-10-10T01:15:05.4039393Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:15:05.4043041Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_repros.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:15:05.403898] 2025-10-10T01:15:09.4264309Z 2025-10-10T01:15:09.4265202Z dynamo/test_repros 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_repros_1.1_1309813200270afc_.log 2025-10-10T01:15:09.4266293Z Running 2 items in this shard: test/dynamo/test_repros.py::ReproTests::test_dont_dce_rand, test/dynamo/test_repros.py::ReproTests::test_mem_leak_guards 2025-10-10T01:15:09.4266875Z 2025-10-10T01:15:09.4268432Z Running dynamo/test_reorder_logs 1/1 ... 
[2025-10-10 01:15:09.426526] 2025-10-10T01:15:09.4268840Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:15:09.4272646Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_reorder_logs.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:15:09.426897] 2025-10-10T01:15:12.7980369Z 2025-10-10T01:15:12.7981559Z dynamo/test_reorder_logs 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_reorder_logs_1.1_58b593a8ffd10ed0_.log 2025-10-10T01:15:12.7982306Z Running 0 items in this shard: 2025-10-10T01:15:12.7982509Z 2025-10-10T01:15:12.7984592Z Running dynamo/test_generator 1/1 ... [2025-10-10 01:15:12.798167] 2025-10-10T01:15:12.7985052Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:15:12.7988825Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_generator.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:15:12.798566] 2025-10-10T01:15:16.2703344Z 2025-10-10T01:15:16.2704433Z dynamo/test_generator 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_generator_1.1_7f94c085e85a6bdd_.log 2025-10-10T01:15:16.2705152Z Running 0 items in this shard: 2025-10-10T01:15:16.2705764Z 2025-10-10T01:15:16.2705973Z Running export/test_lift_unlift 1/1 ... [2025-10-10 01:15:16.269959] 2025-10-10T01:15:16.2706384Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:15:16.2707481Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_lift_unlift.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:15:16.270341] 2025-10-10T01:15:19.4413348Z 2025-10-10T01:15:19.4414349Z export/test_lift_unlift 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_lift_unlift_1.1_3e93da37ecca511d_.log 2025-10-10T01:15:19.4415154Z Running 0 items in this shard: 2025-10-10T01:15:19.4415344Z 2025-10-10T01:15:19.4417851Z Running export/test_verifier 1/1 ... [2025-10-10 01:15:19.441473] 2025-10-10T01:15:19.4418278Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:15:19.4422459Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_verifier.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:15:19.441876] 2025-10-10T01:15:22.6125721Z 2025-10-10T01:15:22.6126719Z export/test_verifier 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_verifier_1.1_9a11d0dc143e42b0_.log 2025-10-10T01:15:22.6127493Z Running 0 items in this shard: 2025-10-10T01:15:22.6127684Z 2025-10-10T01:15:22.6130688Z Running profiler/test_profiler 1/1 ... [2025-10-10 01:15:22.612735] 2025-10-10T01:15:22.6131534Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:15:22.6135175Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'profiler/test_profiler.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:15:22.613136] 2025-10-10T01:15:26.1850222Z 2025-10-10T01:15:26.1851217Z profiler/test_profiler 1/1 was successful, full logs can be found in artifacts with path test/test-reports/profiler.test_profiler_1.1_5bb9c96b84d7a8fd_.log 2025-10-10T01:15:26.1856334Z Running 10 items in this shard: test/profiler/test_profiler.py::TestProfiler::test_source_multithreaded_basic_work_in_main_thread_False, test/profiler/test_profiler.py::TestProfiler::test_source_multithreaded_basic_work_in_main_thread_True, test/profiler/test_profiler.py::TestProfiler::test_source_multithreaded_close_in_scope_work_in_main_thread_False, test/profiler/test_profiler.py::TestProfiler::test_source_multithreaded_close_in_scope_work_in_main_thread_True, test/profiler/test_profiler.py::TestProfiler::test_source_multithreaded_complex_work_in_main_thread_False, test/profiler/test_profiler.py::TestProfiler::test_source_multithreaded_complex_work_in_main_thread_True, test/profiler/test_profiler.py::TestProfiler::test_source_multithreaded_multiple_preexisting_work_in_main_thread_False, test/profiler/test_profiler.py::TestProfiler::test_source_multithreaded_multiple_preexisting_work_in_main_thread_True, test/profiler/test_profiler.py::TestProfiler::test_source_multithreaded_open_in_scope_work_in_main_thread_False, test/profiler/test_profiler.py::TestProfiler::test_source_multithreaded_open_in_scope_work_in_main_thread_True 2025-10-10T01:15:26.1860678Z 2025-10-10T01:15:26.1860877Z Running dynamo/test_misc 1/1 ... [2025-10-10 01:15:26.185132] 2025-10-10T01:15:26.1861265Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:15:26.1862227Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_misc.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:15:26.185481] 2025-10-10T01:15:31.4090689Z 2025-10-10T01:15:31.4091575Z dynamo/test_misc 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_misc_1.1_0fcf8f36cf60d109_.log 2025-10-10T01:15:31.4092625Z Running 0 items in this shard: 2025-10-10T01:15:31.4092820Z 2025-10-10T01:15:31.4095233Z Running export/test_draft_export 1/1 ... [2025-10-10 01:15:31.409204] 2025-10-10T01:15:31.4095663Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:15:31.4101249Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_draft_export.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:15:31.409561] 2025-10-10T01:15:34.8306855Z 2025-10-10T01:15:34.8307682Z export/test_draft_export 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_draft_export_1.1_ddf78549364c320a_.log 2025-10-10T01:15:34.8308408Z Running 0 items in this shard: 2025-10-10T01:15:34.8308598Z 2025-10-10T01:15:34.8311237Z Running export/test_sparse 1/1 ... [2025-10-10 01:15:34.830783] 2025-10-10T01:15:34.8311668Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:15:34.8315301Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_sparse.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:15:34.831159] 2025-10-10T01:15:38.0514931Z 2025-10-10T01:15:38.0515851Z export/test_sparse 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_sparse_1.1_c298a1c849959967_.log 2025-10-10T01:15:38.0516535Z Running 0 items in this shard: 2025-10-10T01:15:38.0516810Z 2025-10-10T01:15:38.0519473Z Running dynamo/test_comptime 1/1 ... 
[2025-10-10 01:15:38.051608] 2025-10-10T01:15:38.0519898Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:15:38.0523169Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_comptime.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:15:38.051978] 2025-10-10T01:15:41.4233894Z 2025-10-10T01:15:41.4234679Z dynamo/test_comptime 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_comptime_1.1_8834e9fe44cb9fd1_.log 2025-10-10T01:15:41.4235382Z Running 0 items in this shard: 2025-10-10T01:15:41.4235580Z 2025-10-10T01:15:41.4238610Z Running dynamo/test_python_autograd 1/1 ... [2025-10-10 01:15:41.423546] 2025-10-10T01:15:41.4239048Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:15:41.4242542Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_python_autograd.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:15:41.423916] 2025-10-10T01:15:44.7952664Z 2025-10-10T01:15:44.7953503Z dynamo/test_python_autograd 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_python_autograd_1.1_6398ff528c26342f_.log 2025-10-10T01:15:44.7954372Z Running 0 items in this shard: 2025-10-10T01:15:44.7954613Z 2025-10-10T01:15:44.7956594Z Running functorch/test_rearrange 1/1 ... [2025-10-10 01:15:44.795364] 2025-10-10T01:15:44.7957017Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:15:44.7960379Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'functorch/test_rearrange.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:15:44.795707] 2025-10-10T01:15:48.0166071Z 2025-10-10T01:15:48.0167083Z functorch/test_rearrange 1/1 was successful, full logs can be found in artifacts with path test/test-reports/functorch.test_rearrange_1.1_032e07978013a7d3_.log 2025-10-10T01:15:48.0167836Z Running 0 items in this shard: 2025-10-10T01:15:48.0168543Z 2025-10-10T01:15:48.0170351Z Running functorch/test_parsing 1/1 ... [2025-10-10 01:15:48.016728] 2025-10-10T01:15:48.0170790Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:15:48.0173822Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'functorch/test_parsing.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:15:48.017053] 2025-10-10T01:15:51.2373327Z 2025-10-10T01:15:51.2374117Z functorch/test_parsing 1/1 was successful, full logs can be found in artifacts with path test/test-reports/functorch.test_parsing_1.1_4bf25a31d01113af_.log 2025-10-10T01:15:51.2374926Z Running 0 items in this shard: 2025-10-10T01:15:51.2375124Z 2025-10-10T01:15:51.2377083Z Running test_package 1/1 ... [2025-10-10 01:15:51.237395] 2025-10-10T01:15:51.2377454Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:15:51.2380652Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_package.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:15:51.237723] 2025-10-10T01:15:54.6091674Z 2025-10-10T01:15:54.6092537Z test_package 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_package_1.1_aab5f1fc1d9ca256_.log 2025-10-10T01:15:54.6093192Z Running 0 items in this shard: 2025-10-10T01:15:54.6093425Z 2025-10-10T01:15:54.6095451Z Running test_comparison_utils 1/1 ... 
[2025-10-10 01:15:54.609257] 2025-10-10T01:15:54.6095879Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:15:54.6099813Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_comparison_utils.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:15:54.609604] 2025-10-10T01:15:57.8303636Z 2025-10-10T01:15:57.8304449Z test_comparison_utils 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_comparison_utils_1.1_711c3b9129d799fc_.log 2025-10-10T01:15:57.8305139Z Running 0 items in this shard: 2025-10-10T01:15:57.8305328Z 2025-10-10T01:15:57.8307742Z Running test_mkl_verbose 1/1 ... [2025-10-10 01:15:57.830399] 2025-10-10T01:15:57.8308139Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:15:57.8310591Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_mkl_verbose.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:15:57.830719] 2025-10-10T01:16:01.0514223Z 2025-10-10T01:16:01.0515135Z test_mkl_verbose 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_mkl_verbose_1.1_ae2a4d433f1b52f2_.log 2025-10-10T01:16:01.0515824Z Running 0 items in this shard: 2025-10-10T01:16:01.0516107Z 2025-10-10T01:16:01.0518153Z Running functorch/test_ac_logging 1/1 ... [2025-10-10 01:16:01.051508] 2025-10-10T01:16:01.0518591Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:16:01.0522098Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'functorch/test_ac_logging.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:16:01.051870] 2025-10-10T01:16:04.2728147Z 2025-10-10T01:16:04.2729252Z functorch/test_ac_logging 1/1 was successful, full logs can be found in artifacts with path test/test-reports/functorch.test_ac_logging_1.1_2db252e11b8c73ef_.log 2025-10-10T01:16:04.2729985Z Running 0 items in this shard: 2025-10-10T01:16:04.2730166Z 2025-10-10T01:16:04.2733798Z Running test_mkldnn_verbose 1/1 ... [2025-10-10 01:16:04.273069] 2025-10-10T01:16:04.2734200Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:16:04.2737833Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_mkldnn_verbose.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:16:04.273439] 2025-10-10T01:16:07.4942034Z 2025-10-10T01:16:07.4942862Z test_mkldnn_verbose 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_mkldnn_verbose_1.1_6666ce1ef0660778_.log 2025-10-10T01:16:07.4943536Z Running 0 items in this shard: 2025-10-10T01:16:07.4943797Z 2025-10-10T01:16:07.4946101Z Running profiler/test_kineto 1/1 ... [2025-10-10 01:16:07.494308] 2025-10-10T01:16:07.4946634Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:16:07.4949898Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'profiler/test_kineto.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:16:07.494654] 2025-10-10T01:16:10.7155719Z 2025-10-10T01:16:10.7157302Z profiler/test_kineto 1/1 was successful, full logs can be found in artifacts with path test/test-reports/profiler.test_kineto_1.1_7c2d1ad2b6df02a9_.log 2025-10-10T01:16:10.7158571Z Running 0 items in this shard: 2025-10-10T01:16:10.7158916Z 2025-10-10T01:16:10.7162681Z Running test_matmul_cuda 1/1 ... 
[2025-10-10 01:16:10.715848] 2025-10-10T01:16:10.7163079Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:16:10.7166553Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_matmul_cuda.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:16:10.716269] 2025-10-10T01:16:14.6885699Z 2025-10-10T01:16:14.6886505Z test_matmul_cuda 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_matmul_cuda_1.1_4b2ba523bc20eaa9_.log 2025-10-10T01:16:14.6887263Z Running 0 items in this shard: 2025-10-10T01:16:14.6887472Z 2025-10-10T01:16:14.6889522Z Running test_transformers 1/1 ... [2025-10-10 01:16:14.688658] 2025-10-10T01:16:14.6889920Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:16:14.6893871Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_transformers.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:16:14.689035] 2025-10-10T01:16:22.2168339Z 2025-10-10T01:16:22.2169265Z test_transformers 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_transformers_1.1_e9070de377699fe8_.log 2025-10-10T01:16:22.2169932Z Running 0 items in this shard: 2025-10-10T01:16:22.2170116Z 2025-10-10T01:16:22.2172956Z Running test_meta 1/1 ... [2025-10-10 01:16:22.216925] 2025-10-10T01:16:22.2173325Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:16:22.2176370Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_meta.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:16:22.217288] 2025-10-10T01:16:35.9088746Z 2025-10-10T01:16:35.9089579Z test_meta 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_meta_1.1_ae20740f2a153e4a_.log 2025-10-10T01:16:35.9090250Z Running 0 items in this shard: 2025-10-10T01:16:35.9090439Z 2025-10-10T01:16:35.9093336Z Running test_license 1/1 ... [2025-10-10 01:16:35.908913] 2025-10-10T01:16:35.9093825Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:16:35.9097645Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_license.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:16:35.909272] 2025-10-10T01:16:39.0806216Z 2025-10-10T01:16:39.0807238Z test_license 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_license_1.1_c7db97813d50b504_.log 2025-10-10T01:16:39.0807940Z Running 0 items in this shard: 2025-10-10T01:16:39.0808135Z 2025-10-10T01:16:39.0810120Z Running test_utils_config_module 1/1 ... [2025-10-10 01:16:39.080709] 2025-10-10T01:16:39.0810685Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:16:39.0814694Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_utils_config_module.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:16:39.081099] 2025-10-10T01:16:42.3018474Z 2025-10-10T01:16:42.3019490Z test_utils_config_module 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_utils_config_module_1.1_0d5baaa873f9c4f8_.log 2025-10-10T01:16:42.3020247Z Running 0 items in this shard: 2025-10-10T01:16:42.3020436Z 2025-10-10T01:16:42.3022394Z Running test_decomp 1/16 ... 
[2025-10-10 01:16:42.301960] 2025-10-10T01:16:42.3022839Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:16:42.3026524Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_decomp.py', '-m', 'serial', '--shard-id=1', '--num-shards=16', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:16:42.302321] 2025-10-10T01:16:49.1295167Z 2025-10-10T01:16:49.1296266Z test_decomp 1/16 was successful, full logs can be found in artifacts with path test/test-reports/test_decomp_1.16_bac203bb45e89c92_.log 2025-10-10T01:16:49.1296906Z Running 0 items in this shard: 2025-10-10T01:16:49.1297105Z 2025-10-10T01:16:49.1299288Z Running test_decomp 6/16 ... [2025-10-10 01:16:49.129588] 2025-10-10T01:16:49.1299676Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:16:49.1303594Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_decomp.py', '-m', 'serial', '--shard-id=6', '--num-shards=16', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:16:49.130006] 2025-10-10T01:16:55.9071450Z 2025-10-10T01:16:55.9072304Z test_decomp 6/16 was successful, full logs can be found in artifacts with path test/test-reports/test_decomp_6.16_899ef757ec9ac3b7_.log 2025-10-10T01:16:55.9072967Z Running 0 items in this shard: 2025-10-10T01:16:55.9073185Z 2025-10-10T01:16:55.9075602Z Running test_decomp 7/16 ... [2025-10-10 01:16:55.907234] 2025-10-10T01:16:55.9075988Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:16:55.9079503Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_decomp.py', '-m', 'serial', '--shard-id=7', '--num-shards=16', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:16:55.907592] 2025-10-10T01:17:02.6843155Z 2025-10-10T01:17:02.6843859Z test_decomp 7/16 was successful, full logs can be found in artifacts with path test/test-reports/test_decomp_7.16_b5d68edc381e6c1e_.log 2025-10-10T01:17:02.6844490Z Running 0 items in this shard: 2025-10-10T01:17:02.6844671Z 2025-10-10T01:17:02.6849771Z Running test_decomp 10/16 ... [2025-10-10 01:17:02.684662] 2025-10-10T01:17:02.6850141Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:17:02.6853934Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_decomp.py', '-m', 'serial', '--shard-id=10', '--num-shards=16', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:17:02.685036] 2025-10-10T01:17:09.5118346Z 2025-10-10T01:17:09.5119187Z test_decomp 10/16 was successful, full logs can be found in artifacts with path test/test-reports/test_decomp_10.16_ed0b201de6866abe_.log 2025-10-10T01:17:09.5119812Z Running 0 items in this shard: 2025-10-10T01:17:09.5119992Z 2025-10-10T01:17:09.5122379Z Running test_decomp 15/16 ... [2025-10-10 01:17:09.511909] 2025-10-10T01:17:09.5122768Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:17:09.5126572Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_decomp.py', '-m', 'serial', '--shard-id=15', '--num-shards=16', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:17:09.512290] 2025-10-10T01:17:16.3400647Z 2025-10-10T01:17:16.3401515Z test_decomp 15/16 was successful, full logs can be found in artifacts with path test/test-reports/test_decomp_15.16_c7290e8d28a49368_.log 2025-10-10T01:17:16.3402144Z Running 0 items in this shard: 2025-10-10T01:17:16.3402357Z 2025-10-10T01:17:16.3403981Z Running test_decomp 16/16 ... 
[2025-10-10 01:17:16.340065] 2025-10-10T01:17:16.3404371Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:17:16.3408693Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_decomp.py', '-m', 'serial', '--shard-id=16', '--num-shards=16', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:17:16.340497] 2025-10-10T01:17:23.1677831Z 2025-10-10T01:17:23.1678731Z test_decomp 16/16 was successful, full logs can be found in artifacts with path test/test-reports/test_decomp_16.16_2d7cab34934f216c_.log 2025-10-10T01:17:23.1679372Z Running 0 items in this shard: 2025-10-10T01:17:23.1679558Z 2025-10-10T01:17:23.1683257Z Running xpu/test_conv 1/1 ... [2025-10-10 01:17:23.167919] 2025-10-10T01:17:23.1683641Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:17:23.1687384Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'xpu/test_conv.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:17:23.168353] 2025-10-10T01:17:26.9900925Z 2025-10-10T01:17:26.9901808Z xpu/test_conv 1/1 was successful, full logs can be found in artifacts with path test/test-reports/xpu.test_conv_1.1_ddda2c623f8f88c8_.log 2025-10-10T01:17:26.9902488Z Running 0 items in this shard: 2025-10-10T01:17:26.9902687Z 2025-10-10T01:17:26.9904906Z Running functorch/test_ops 2/2 ... [2025-10-10 01:17:26.990154] 2025-10-10T01:17:26.9905325Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:17:26.9908777Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'functorch/test_ops.py', '-m', 'serial', '--shard-id=2', '--num-shards=2', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:17:26.990509] 2025-10-10T01:17:34.1674572Z 2025-10-10T01:17:34.1675601Z functorch/test_ops 2/2 was successful, full logs can be found in artifacts with path test/test-reports/functorch.test_ops_2.2_b412cbf486341313_.log 2025-10-10T01:17:34.1676304Z Running 0 items in this shard: 2025-10-10T01:17:34.1684109Z 2025-10-10T01:17:34.1684513Z Running test_datapipe 1/1 ... [2025-10-10 01:17:34.167583] 2025-10-10T01:17:34.1685002Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:17:34.1686010Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_datapipe.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:17:34.167963] 2025-10-10T01:17:37.3888445Z 2025-10-10T01:17:37.3889361Z test_datapipe 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_datapipe_1.1_31e17acc5475cd43_.log 2025-10-10T01:17:37.3889997Z Running 0 items in this shard: 2025-10-10T01:17:37.3890186Z 2025-10-10T01:17:37.3893475Z Running lazy/test_generator 1/1 ... [2025-10-10 01:17:37.389005] 2025-10-10T01:17:37.3893880Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:17:37.3897730Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'lazy/test_generator.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:17:37.389396] 2025-10-10T01:17:40.6095914Z 2025-10-10T01:17:40.6096849Z lazy/test_generator 1/1 was successful, full logs can be found in artifacts with path test/test-reports/lazy.test_generator_1.1_467765b11eb670d9_.log 2025-10-10T01:17:40.6097548Z Running 0 items in this shard: 2025-10-10T01:17:40.6097733Z 2025-10-10T01:17:40.6100105Z Running torch_np/numpy_tests/lib/test_type_check 1/1 ... 
[2025-10-10 01:17:40.609702] 2025-10-10T01:17:40.6100593Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:17:40.6104353Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'torch_np/numpy_tests/lib/test_type_check.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:17:40.610096] 2025-10-10T01:17:43.8305499Z 2025-10-10T01:17:43.8306619Z torch_np/numpy_tests/lib/test_type_check 1/1 was successful, full logs can be found in artifacts with path test/test-reports/torch_np.numpy_tests.lib.test_type_check_1.1_c92581c692d414a3_.log 2025-10-10T01:17:43.8307464Z Running 0 items in this shard: 2025-10-10T01:17:43.8307651Z 2025-10-10T01:17:43.8309687Z Running lazy/test_debug_util 1/1 ... [2025-10-10 01:17:43.830654] 2025-10-10T01:17:43.8310153Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:17:43.8314521Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'lazy/test_debug_util.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:17:43.831009] 2025-10-10T01:17:47.0515845Z 2025-10-10T01:17:47.0516937Z lazy/test_debug_util 1/1 was successful, full logs can be found in artifacts with path test/test-reports/lazy.test_debug_util_1.1_a82fed25b63a5da1_.log 2025-10-10T01:17:47.0517675Z Running 0 items in this shard: 2025-10-10T01:17:47.0517934Z 2025-10-10T01:17:47.0520350Z Running test_jit_llga_fuser 1/1 ... 
[2025-10-10 01:17:47.051716] 2025-10-10T01:17:47.0520904Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:17:47.0524789Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_jit_llga_fuser.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:17:47.052088] 2025-10-10T01:17:50.7734612Z 2025-10-10T01:17:50.7735640Z test_jit_llga_fuser 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_jit_llga_fuser_1.1_9952245cc515fc15_.log 2025-10-10T01:17:50.7736372Z Running 0 items in this shard: 2025-10-10T01:17:50.7736565Z 2025-10-10T01:17:50.7738997Z Running test_numa_binding 1/1 ... [2025-10-10 01:17:50.773567] 2025-10-10T01:17:50.7739545Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:17:50.7742920Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_numa_binding.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:17:50.773959] 2025-10-10T01:17:53.9951015Z 2025-10-10T01:17:53.9952030Z test_numa_binding 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_numa_binding_1.1_233808eeddf7a589_.log 2025-10-10T01:17:53.9952754Z Running 0 items in this shard: 2025-10-10T01:17:53.9952948Z 2025-10-10T01:17:53.9954018Z Running torch_np/numpy_tests/lib/test_histograms 1/1 ... [2025-10-10 01:17:53.995134] 2025-10-10T01:17:53.9954680Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:17:53.9958681Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'torch_np/numpy_tests/lib/test_histograms.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:17:53.995520] 2025-10-10T01:17:57.2166841Z 2025-10-10T01:17:57.2167826Z torch_np/numpy_tests/lib/test_histograms 1/1 was successful, full logs can be found in artifacts with path test/test-reports/torch_np.numpy_tests.lib.test_histograms_1.1_1622275eb060a396_.log 2025-10-10T01:17:57.2168674Z Running 0 items in this shard: 2025-10-10T01:17:57.2168858Z 2025-10-10T01:17:57.2169101Z Running benchmark_utils/test_benchmark_utils 1/1 ... [2025-10-10 01:17:57.216317] 2025-10-10T01:17:57.2169560Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:17:57.2170684Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'benchmark_utils/test_benchmark_utils.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:17:57.216718] 2025-10-10T01:18:00.4375183Z 2025-10-10T01:18:00.4376451Z benchmark_utils/test_benchmark_utils 1/1 was successful, full logs can be found in artifacts with path test/test-reports/benchmark_utils.test_benchmark_utils_1.1_b626e7c59322f5dc_.log 2025-10-10T01:18:00.4377292Z Running 0 items in this shard: 2025-10-10T01:18:00.4377499Z 2025-10-10T01:18:00.4379328Z Running torch_np/numpy_tests/core/test_scalarmath 1/1 ... [2025-10-10 01:18:00.437629] 2025-10-10T01:18:00.4380000Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:18:00.4384190Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'torch_np/numpy_tests/core/test_scalarmath.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:18:00.438007] 2025-10-10T01:18:03.6587497Z 2025-10-10T01:18:03.6588613Z torch_np/numpy_tests/core/test_scalarmath 1/1 was successful, full logs can be found in artifacts with path test/test-reports/torch_np.numpy_tests.core.test_scalarmath_1.1_661233fe8cec6b8b_.log 2025-10-10T01:18:03.6589486Z Running 0 items in this shard: 2025-10-10T01:18:03.6589682Z 2025-10-10T01:18:03.6591129Z Running test_indexing 1/1 ... [2025-10-10 01:18:03.658845] 2025-10-10T01:18:03.6591499Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:18:03.6595156Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_indexing.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:18:03.659202] 2025-10-10T01:18:07.4307005Z 2025-10-10T01:18:07.4307690Z test_indexing 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_indexing_1.1_5dac5f241f91bc60_.log 2025-10-10T01:18:07.4308671Z Running 1 items in this shard: test/test_indexing.py::TestIndexingCUDA::test_index_put_accumulate_large_tensor_cuda 2025-10-10T01:18:07.4309143Z 2025-10-10T01:18:07.4312090Z Running profiler/test_torch_tidy 1/1 ... [2025-10-10 01:18:07.430873] 2025-10-10T01:18:07.4312547Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:18:07.4315782Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'profiler/test_torch_tidy.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:18:07.431236] 2025-10-10T01:18:10.6515450Z 2025-10-10T01:18:10.6517330Z profiler/test_torch_tidy 1/1 was successful, full logs can be found in artifacts with path test/test-reports/profiler.test_torch_tidy_1.1_bad9600caf6d79ab_.log 2025-10-10T01:18:10.6518795Z Running 0 items in this shard: 2025-10-10T01:18:10.6519170Z 2025-10-10T01:18:10.6519581Z Running nn/test_module_hooks 1/1 ... [2025-10-10 01:18:10.651662] 2025-10-10T01:18:10.6519993Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:18:10.6524577Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'nn/test_module_hooks.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:18:10.652028] 2025-10-10T01:18:14.1232663Z 2025-10-10T01:18:14.1233597Z nn/test_module_hooks 1/1 was successful, full logs can be found in artifacts with path test/test-reports/nn.test_module_hooks_1.1_6e7195e2b954f228_.log 2025-10-10T01:18:14.1234279Z Running 0 items in this shard: 2025-10-10T01:18:14.1234467Z 2025-10-10T01:18:14.1237021Z Running functorch/test_aotdispatch 1/1 ... [2025-10-10 01:18:14.123331] 2025-10-10T01:18:14.1237462Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:18:14.1240508Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'functorch/test_aotdispatch.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:18:14.123689] 2025-10-10T01:18:19.6989171Z 2025-10-10T01:18:19.6991565Z functorch/test_aotdispatch 1/1 was successful, full logs can be found in artifacts with path test/test-reports/functorch.test_aotdispatch_1.1_f0abdf06aa543a1b_.log 2025-10-10T01:18:19.6993583Z Running 0 items in this shard: 2025-10-10T01:18:19.6993976Z 2025-10-10T01:18:19.6994367Z Running nn/test_load_state_dict 1/1 ... 
[2025-10-10 01:18:19.698845] 2025-10-10T01:18:19.6995474Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:18:19.6997546Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'nn/test_load_state_dict.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:18:19.699203] 2025-10-10T01:18:23.1709977Z 2025-10-10T01:18:23.1710995Z nn/test_load_state_dict 1/1 was successful, full logs can be found in artifacts with path test/test-reports/nn.test_load_state_dict_1.1_0123cd304723ba25_.log 2025-10-10T01:18:23.1711701Z Running 0 items in this shard: 2025-10-10T01:18:23.1711886Z 2025-10-10T01:18:23.1714914Z Running torch_np/numpy_tests/linalg/test_linalg 1/1 ... [2025-10-10 01:18:23.171149] 2025-10-10T01:18:23.1715564Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:18:23.1719050Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'torch_np/numpy_tests/linalg/test_linalg.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:18:23.171516] 2025-10-10T01:18:26.4925708Z 2025-10-10T01:18:26.4928324Z torch_np/numpy_tests/linalg/test_linalg 1/1 was successful, full logs can be found in artifacts with path test/test-reports/torch_np.numpy_tests.linalg.test_linalg_1.1_301a27dd55a2ff4c_.log 2025-10-10T01:18:26.4929919Z Running 0 items in this shard: 2025-10-10T01:18:26.4930123Z 2025-10-10T01:18:26.4932736Z Running test_shape_ops 1/1 ... 
[2025-10-10 01:18:26.492914] 2025-10-10T01:18:26.4933254Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:18:26.4936956Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_shape_ops.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:18:26.493327] 2025-10-10T01:18:30.2654418Z 2025-10-10T01:18:30.2655637Z test_shape_ops 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_shape_ops_1.1_86b9c16c33599f91_.log 2025-10-10T01:18:30.2656414Z Running 0 items in this shard: 2025-10-10T01:18:30.2656606Z 2025-10-10T01:18:30.2658978Z Running torch_np/numpy_tests/core/test_shape_base 1/1 ... [2025-10-10 01:18:30.265581] 2025-10-10T01:18:30.2659653Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:18:30.2663351Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'torch_np/numpy_tests/core/test_shape_base.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:18:30.265950] 2025-10-10T01:18:33.5369546Z 2025-10-10T01:18:33.5371419Z torch_np/numpy_tests/core/test_shape_base 1/1 was successful, full logs can be found in artifacts with path test/test-reports/torch_np.numpy_tests.core.test_shape_base_1.1_409da463f80a2b72_.log 2025-10-10T01:18:33.5372281Z Running 0 items in this shard: 2025-10-10T01:18:33.5372476Z 2025-10-10T01:18:33.5374229Z Running torch_np/numpy_tests/core/test_dtype 1/1 ... 
[2025-10-10 01:18:33.537105] 2025-10-10T01:18:33.5374694Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:18:33.5378373Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'torch_np/numpy_tests/core/test_dtype.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:18:33.537486] 2025-10-10T01:18:36.7575518Z 2025-10-10T01:18:36.7576652Z torch_np/numpy_tests/core/test_dtype 1/1 was successful, full logs can be found in artifacts with path test/test-reports/torch_np.numpy_tests.core.test_dtype_1.1_64c41a8cf03300ff_.log 2025-10-10T01:18:36.7577509Z Running 0 items in this shard: 2025-10-10T01:18:36.7577699Z 2025-10-10T01:18:36.7579365Z Running test_unary_ufuncs 1/1 ... [2025-10-10 01:18:36.757665] 2025-10-10T01:18:36.7579770Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:18:36.7583564Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_unary_ufuncs.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:18:36.758033] 2025-10-10T01:18:46.6900465Z 2025-10-10T01:18:46.6901283Z test_unary_ufuncs 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_unary_ufuncs_1.1_9717922c87452941_.log 2025-10-10T01:18:46.6901963Z Running 0 items in this shard: 2025-10-10T01:18:46.6902157Z 2025-10-10T01:18:46.6904009Z Running optim/test_optim 1/1 ... [2025-10-10 01:18:46.690110] 2025-10-10T01:18:46.6904404Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:18:46.6907639Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'optim/test_optim.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:18:46.690445] 2025-10-10T01:18:49.8109804Z 2025-10-10T01:18:49.8110703Z optim/test_optim 1/1 was successful, full logs can be found in artifacts with path test/test-reports/optim.test_optim_1.1_7ce4bd7a6b4c88ff_.log 2025-10-10T01:18:49.8111265Z 2025-10-10T01:18:49.8113871Z Running test_sparse_csr 1/2 ... [2025-10-10 01:18:49.811063] 2025-10-10T01:18:49.8114271Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:18:49.8117808Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_sparse_csr.py', '-m', 'serial', '--shard-id=1', '--num-shards=2', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:18:49.811438] 2025-10-10T01:18:55.9369238Z 2025-10-10T01:18:55.9369962Z test_sparse_csr 1/2 was successful, full logs can be found in artifacts with path test/test-reports/test_sparse_csr_1.2_b189d890dc16c7dd_.log 2025-10-10T01:18:55.9370613Z Running 0 items in this shard: 2025-10-10T01:18:55.9370795Z 2025-10-10T01:18:55.9372727Z Running test_serialization 1/1 ... [2025-10-10 01:18:55.936969] 2025-10-10T01:18:55.9373129Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:18:55.9376872Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_serialization.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:18:55.937350] 2025-10-10T01:18:59.9092525Z 2025-10-10T01:18:59.9093282Z test_serialization 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_serialization_1.1_8385150271c87539_.log 2025-10-10T01:18:59.9094919Z Running 2 items in this shard: test/test_serialization.py::TestSerialization::test_serialization_2gb_file, test/test_serialization.py::TestSerialization::test_serialization_4gb_file 2025-10-10T01:18:59.9095621Z 2025-10-10T01:18:59.9096945Z Running torch_np/numpy_tests/lib/test_twodim_base 1/1 ... [2025-10-10 01:18:59.909369] 2025-10-10T01:18:59.9097430Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:18:59.9101643Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'torch_np/numpy_tests/lib/test_twodim_base.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:18:59.909767] 2025-10-10T01:19:03.1312323Z 2025-10-10T01:19:03.1313397Z torch_np/numpy_tests/lib/test_twodim_base 1/1 was successful, full logs can be found in artifacts with path test/test-reports/torch_np.numpy_tests.lib.test_twodim_base_1.1_803ee025fe5c7886_.log 2025-10-10T01:19:03.1314296Z Running 0 items in this shard: 2025-10-10T01:19:03.1314492Z 2025-10-10T01:19:03.1316654Z Running test_function_schema 1/1 ... [2025-10-10 01:19:03.131364] 2025-10-10T01:19:03.1317190Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:19:03.1320624Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_function_schema.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:19:03.131726] 2025-10-10T01:19:06.3528942Z 2025-10-10T01:19:06.3530219Z test_function_schema 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_function_schema_1.1_6d9807b902f2a45b_.log 2025-10-10T01:19:06.3530929Z Running 0 items in this shard: 2025-10-10T01:19:06.3531120Z 2025-10-10T01:19:06.3535862Z Running functorch/test_vmap 1/1 ... [2025-10-10 01:19:06.353270] 2025-10-10T01:19:06.3536290Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:19:06.3539844Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'functorch/test_vmap.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:19:06.353652] 2025-10-10T01:19:11.7785103Z 2025-10-10T01:19:11.7786078Z functorch/test_vmap 1/1 was successful, full logs can be found in artifacts with path test/test-reports/functorch.test_vmap_1.1_13cbcd2562c0b9de_.log 2025-10-10T01:19:11.7786788Z Running 0 items in this shard: 2025-10-10T01:19:11.7786976Z 2025-10-10T01:19:11.7787272Z Running torch_np/numpy_tests/lib/test_shape_base_ 1/1 ... [2025-10-10 01:19:11.778326] 2025-10-10T01:19:11.7787739Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:19:11.7791518Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'torch_np/numpy_tests/lib/test_shape_base_.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:19:11.778739] 2025-10-10T01:19:15.0004287Z 2025-10-10T01:19:15.0005809Z torch_np/numpy_tests/lib/test_shape_base_ 1/1 was successful, full logs can be found in artifacts with path test/test-reports/torch_np.numpy_tests.lib.test_shape_base__1.1_19a060713c4e9774_.log 2025-10-10T01:19:15.0006666Z Running 0 items in this shard: 2025-10-10T01:19:15.0006860Z 2025-10-10T01:19:15.0008617Z Running torch_np/numpy_tests/fft/test_pocketfft 1/1 ... [2025-10-10 01:19:15.000544] 2025-10-10T01:19:15.0009124Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:19:15.0012915Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'torch_np/numpy_tests/fft/test_pocketfft.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:19:15.000945] 2025-10-10T01:19:18.2717940Z 2025-10-10T01:19:18.2719042Z torch_np/numpy_tests/fft/test_pocketfft 1/1 was successful, full logs can be found in artifacts with path test/test-reports/torch_np.numpy_tests.fft.test_pocketfft_1.1_7a3eb9285227325a_.log 2025-10-10T01:19:18.2719883Z Running 0 items in this shard: 2025-10-10T01:19:18.2720066Z 2025-10-10T01:19:18.2722037Z Running test_scatter_gather_ops 1/1 ... [2025-10-10 01:19:18.271873] 2025-10-10T01:19:18.2722450Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:19:18.2725958Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_scatter_gather_ops.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:19:18.272228] 2025-10-10T01:19:22.0443573Z 2025-10-10T01:19:22.0444499Z test_scatter_gather_ops 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_scatter_gather_ops_1.1_c34a081d5539f2ac_.log 2025-10-10T01:19:22.0445229Z Running 0 items in this shard: 2025-10-10T01:19:22.0445418Z 2025-10-10T01:19:22.0447424Z Running torch_np/test_ndarray_methods 1/1 ... [2025-10-10 01:19:22.044426] 2025-10-10T01:19:22.0447874Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:19:22.0451725Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'torch_np/test_ndarray_methods.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:19:22.044820] 2025-10-10T01:19:25.3155343Z 2025-10-10T01:19:25.3157196Z torch_np/test_ndarray_methods 1/1 was successful, full logs can be found in artifacts with path test/test-reports/torch_np.test_ndarray_methods_1.1_7fa620bd1dd39880_.log 2025-10-10T01:19:25.3157958Z Running 0 items in this shard: 2025-10-10T01:19:25.3158148Z 2025-10-10T01:19:25.3162917Z Running test_view_ops 1/1 ... [2025-10-10 01:19:25.315946] 2025-10-10T01:19:25.3163316Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:19:25.3166860Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_view_ops.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:19:25.316341] 2025-10-10T01:19:29.0883149Z 2025-10-10T01:19:29.0884040Z test_view_ops 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_view_ops_1.1_2f352a06f209e475_.log 2025-10-10T01:19:29.0884675Z Running 0 items in this shard: 2025-10-10T01:19:29.0884863Z 2025-10-10T01:19:29.0887609Z Running torch_np/numpy_tests/core/test_dlpack 1/1 ... 
[2025-10-10 01:19:29.088428] 2025-10-10T01:19:29.0888136Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:19:29.0891886Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'torch_np/numpy_tests/core/test_dlpack.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:19:29.088822] 2025-10-10T01:19:32.3096288Z 2025-10-10T01:19:32.3097267Z torch_np/numpy_tests/core/test_dlpack 1/1 was successful, full logs can be found in artifacts with path test/test-reports/torch_np.numpy_tests.core.test_dlpack_1.1_63140e98a93e5810_.log 2025-10-10T01:19:32.3098084Z Running 0 items in this shard: 2025-10-10T01:19:32.3098277Z 2025-10-10T01:19:32.3100576Z Running torch_np/numpy_tests/core/test_getlimits 1/1 ... [2025-10-10 01:19:32.309746] 2025-10-10T01:19:32.3101053Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:19:32.3104748Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'torch_np/numpy_tests/core/test_getlimits.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:19:32.310116] 2025-10-10T01:19:35.5308812Z 2025-10-10T01:19:35.5309938Z torch_np/numpy_tests/core/test_getlimits 1/1 was successful, full logs can be found in artifacts with path test/test-reports/torch_np.numpy_tests.core.test_getlimits_1.1_ed6e61d089606395_.log 2025-10-10T01:19:35.5310789Z Running 0 items in this shard: 2025-10-10T01:19:35.5310981Z 2025-10-10T01:19:35.5312917Z Running test_accelerator 1/1 ... 
[2025-10-10 01:19:35.530991] 2025-10-10T01:19:35.5313321Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:19:35.5316992Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_accelerator.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:19:35.531355] 2025-10-10T01:19:38.7522084Z 2025-10-10T01:19:38.7522948Z test_accelerator 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_accelerator_1.1_54ab8aa03a1c43a1_.log 2025-10-10T01:19:38.7523643Z Running 0 items in this shard: 2025-10-10T01:19:38.7523841Z 2025-10-10T01:19:38.7526244Z Running lazy/test_reuse_ir 1/1 ... [2025-10-10 01:19:38.752346] 2025-10-10T01:19:38.7526639Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:19:38.7530707Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'lazy/test_reuse_ir.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:19:38.752743] 2025-10-10T01:19:41.9734679Z 2025-10-10T01:19:41.9736041Z lazy/test_reuse_ir 1/1 was successful, full logs can be found in artifacts with path test/test-reports/lazy.test_reuse_ir_1.1_99fff47802e98d89_.log 2025-10-10T01:19:41.9736720Z Running 0 items in this shard: 2025-10-10T01:19:41.9736903Z 2025-10-10T01:19:41.9738974Z Running torch_np/numpy_tests/lib/test_index_tricks 1/1 ... [2025-10-10 01:19:41.973579] 2025-10-10T01:19:41.9739469Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:19:41.9743389Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'torch_np/numpy_tests/lib/test_index_tricks.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:19:41.973964] 2025-10-10T01:19:45.1951343Z 2025-10-10T01:19:45.1954877Z torch_np/numpy_tests/lib/test_index_tricks 1/1 was successful, full logs can be found in artifacts with path test/test-reports/torch_np.numpy_tests.lib.test_index_tricks_1.1_73c04ebe9ceec85c_.log 2025-10-10T01:19:45.1955729Z Running 0 items in this shard: 2025-10-10T01:19:45.1955912Z 2025-10-10T01:19:45.1956105Z Running nn/test_init 1/1 ... [2025-10-10 01:19:45.195233] 2025-10-10T01:19:45.1956477Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:19:45.1959563Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'nn/test_init.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:19:45.195608] 2025-10-10T01:19:48.6664298Z 2025-10-10T01:19:48.6664996Z nn/test_init 1/1 was successful, full logs can be found in artifacts with path test/test-reports/nn.test_init_1.1_b542e5b73dcd92c9_.log 2025-10-10T01:19:48.6665653Z Running 0 items in this shard: 2025-10-10T01:19:48.6665840Z 2025-10-10T01:19:48.6668686Z Running torch_np/numpy_tests/core/test_numerictypes 1/1 ... [2025-10-10 01:19:48.666563] 2025-10-10T01:19:48.6669179Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:19:48.6672918Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'torch_np/numpy_tests/core/test_numerictypes.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:19:48.666925] 2025-10-10T01:19:51.8876448Z 2025-10-10T01:19:51.8878752Z torch_np/numpy_tests/core/test_numerictypes 1/1 was successful, full logs can be found in artifacts with path test/test-reports/torch_np.numpy_tests.core.test_numerictypes_1.1_d08cc253be27ed78_.log 2025-10-10T01:19:51.8879626Z Running 0 items in this shard: 2025-10-10T01:19:51.8879812Z 2025-10-10T01:19:51.8880617Z Running test_type_promotion 1/1 ... [2025-10-10 01:19:51.887771] 2025-10-10T01:19:51.8881000Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:19:51.8884723Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_type_promotion.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:19:51.888134] 2025-10-10T01:19:55.7101151Z 2025-10-10T01:19:55.7102145Z test_type_promotion 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_type_promotion_1.1_18dcd77cea18637f_.log 2025-10-10T01:19:55.7102828Z Running 0 items in this shard: 2025-10-10T01:19:55.7103039Z 2025-10-10T01:19:55.7105608Z Running torch_np/numpy_tests/core/test_scalar_methods 1/1 ... [2025-10-10 01:19:55.710217] 2025-10-10T01:19:55.7106107Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:19:55.7109833Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'torch_np/numpy_tests/core/test_scalar_methods.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:19:55.710603] 2025-10-10T01:19:58.9313233Z 2025-10-10T01:19:58.9314747Z torch_np/numpy_tests/core/test_scalar_methods 1/1 was successful, full logs can be found in artifacts with path test/test-reports/torch_np.numpy_tests.core.test_scalar_methods_1.1_ab42ef4cbf8efd10_.log 2025-10-10T01:19:58.9315625Z Running 0 items in this shard: 2025-10-10T01:19:58.9315808Z 2025-10-10T01:19:58.9317464Z Running torch_np/numpy_tests/fft/test_helper 1/1 ... [2025-10-10 01:19:58.931440] 2025-10-10T01:19:58.9317949Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:19:58.9321523Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'torch_np/numpy_tests/fft/test_helper.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:19:58.931813] 2025-10-10T01:20:02.1525875Z 2025-10-10T01:20:02.1527694Z torch_np/numpy_tests/fft/test_helper 1/1 was successful, full logs can be found in artifacts with path test/test-reports/torch_np.numpy_tests.fft.test_helper_1.1_ac41356c39097f3f_.log 2025-10-10T01:20:02.1528666Z Running 0 items in this shard: 2025-10-10T01:20:02.1528885Z 2025-10-10T01:20:02.1531902Z Running torch_np/test_function_base 1/1 ... [2025-10-10 01:20:02.152862] 2025-10-10T01:20:02.1532332Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:20:02.1535664Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'torch_np/test_function_base.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:20:02.153228] 2025-10-10T01:20:05.3738420Z 2025-10-10T01:20:05.3739386Z torch_np/test_function_base 1/1 was successful, full logs can be found in artifacts with path test/test-reports/torch_np.test_function_base_1.1_f5ab374d2797df7d_.log 2025-10-10T01:20:05.3740147Z Running 0 items in this shard: 2025-10-10T01:20:05.3740346Z 2025-10-10T01:20:05.3741196Z Running profiler/test_profiler_tree 1/1 ... [2025-10-10 01:20:05.373831] 2025-10-10T01:20:05.3741720Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:20:05.3745537Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'profiler/test_profiler_tree.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:20:05.374180] 2025-10-10T01:20:08.5955530Z 2025-10-10T01:20:08.5956518Z profiler/test_profiler_tree 1/1 was successful, full logs can be found in artifacts with path test/test-reports/profiler.test_profiler_tree_1.1_235d4b965cf216e8_.log 2025-10-10T01:20:08.5957266Z Running 0 items in this shard: 2025-10-10T01:20:08.5959954Z 2025-10-10T01:20:08.5960220Z Running functorch/test_eager_transforms 1/1 ... [2025-10-10 01:20:08.595631] 2025-10-10T01:20:08.5960771Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:20:08.5963859Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'functorch/test_eager_transforms.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:20:08.595991] 2025-10-10T01:20:13.6198734Z 2025-10-10T01:20:13.6199831Z functorch/test_eager_transforms 1/1 was successful, full logs can be found in artifacts with path test/test-reports/functorch.test_eager_transforms_1.1_5fba5476cfdee925_.log 2025-10-10T01:20:13.6200775Z Running 0 items in this shard: 2025-10-10T01:20:13.6200968Z 2025-10-10T01:20:13.6203269Z Running test_sparse 1/1 ... [2025-10-10 01:20:13.619985] 2025-10-10T01:20:13.6203769Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:20:13.6207954Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_sparse.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:20:13.620363] 2025-10-10T01:20:19.0954277Z 2025-10-10T01:20:19.0955860Z test_sparse 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_sparse_1.1_4deaf8f100a8d8d0_.log 2025-10-10T01:20:19.0956604Z Running 0 items in this shard: 2025-10-10T01:20:19.0956857Z 2025-10-10T01:20:22.3075784Z Running inductor/test_dependencies 1/1 ... [2025-10-10 01:20:22.306971] 2025-10-10T01:20:22.3076391Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:20:22.3077722Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_dependencies.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:20:22.307422] 2025-10-10T01:20:22.3237461Z Running test_ops 1/1 ... 
[2025-10-10 01:20:22.323211] 2025-10-10T01:20:22.3237941Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:20:22.3239651Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_ops.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:20:22.323622] 2025-10-10T01:20:22.3427998Z Running test_torchfuzz_repros 1/1 ... [2025-10-10 01:20:22.342297] 2025-10-10T01:20:22.3428569Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:20:22.3430109Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_torchfuzz_repros.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:20:22.342666] 2025-10-10T01:20:29.5867694Z 2025-10-10T01:20:29.5869100Z inductor/test_dependencies 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_dependencies_1.1_9d5bdfe4dc267a99_.log 2025-10-10T01:20:29.5871493Z Running 5 items in this shard: test/inductor/test_dependencies.py::TestDependencies::test_bucketize_dependencies_no_sorter, test/inductor/test_dependencies.py::TestDependencies::test_bucketize_dependencies_sorter, test/inductor/test_dependencies.py::TestDependencies::test_get_offset, test/inductor/test_dependencies.py::TestDependencies::test_normalize_with_stride_order_equal, test/inductor/test_dependencies.py::TestDependencies::test_normalize_with_stride_order_unequal 2025-10-10T01:20:29.5873271Z 2025-10-10T01:20:33.4110544Z Running test_opaque_obj 1/1 ... 
[2025-10-10 01:20:33.410538] 2025-10-10T01:20:33.4111098Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:20:33.4113100Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_opaque_obj.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:20:33.410928] 2025-10-10T01:20:48.2624744Z 2025-10-10T01:20:48.2625966Z test_torchfuzz_repros 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_torchfuzz_repros_1.1_e455325673a5a69c_.log 2025-10-10T01:20:48.2634216Z Running 15 items in this shard: test/test_torchfuzz_repros.py::TestFuzzerCompileIssues::test_fuzzer_issue_163674, test/test_torchfuzz_repros.py::TestFuzzerCompileIssues::test_fuzzer_issue_163876, test/test_torchfuzz_repros.py::TestFuzzerCompileIssues::test_fuzzer_issue_163877, test/test_torchfuzz_repros.py::TestFuzzerCompileIssues::test_fuzzer_issue_163894, test/test_torchfuzz_repros.py::TestFuzzerCompileIssues::test_fuzzer_issue_163971, test/test_torchfuzz_repros.py::TestFuzzerCompileIssues::test_fuzzer_issue_164059, test/test_torchfuzz_repros.py::TestFuzzerCompileIssues::test_fuzzer_issue_164086, test/test_torchfuzz_repros.py::TestFuzzerCompileIssues::test_fuzzer_issue_164088, test/test_torchfuzz_repros.py::TestFuzzerCompileIssues::test_fuzzer_issue_164157, test/test_torchfuzz_repros.py::TestFuzzerCompileIssues::test_fuzzer_issue_164185, test/test_torchfuzz_repros.py::TestFuzzerCompileIssues::test_fuzzer_issue_164186, test/test_torchfuzz_repros.py::TestFuzzerCompileIssues::test_fuzzer_issue_164428, test/test_torchfuzz_repros.py::TestFuzzerCompileIssues::test_fuzzer_issue_164428_already_exists, test/test_torchfuzz_repros.py::TestFuzzerCompileIssues::test_fuzzer_issue_164484, test/test_torchfuzz_repros.py::TestFuzzerCompileIssues::test_fuzzer_issue_164486 2025-10-10T01:20:48.2641899Z 2025-10-10T01:20:52.1627971Z Running 
test_testing 1/1 ... [2025-10-10 01:20:52.162222] 2025-10-10T01:20:52.1628623Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:20:52.1630181Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_testing.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:20:52.162596] 2025-10-10T01:21:00.7787856Z 2025-10-10T01:21:00.7788722Z PRINTING LOG FILE of test_opaque_obj 1/1 (test/test-reports/test_opaque_obj_1.1_83a0045b8ec005b5_.log) 2025-10-10T01:21:00.7789569Z Test results will be stored in test-reports/python-pytest/test_opaque_obj/test_opaque_obj-9737bcf04884f0b7.xml 2025-10-10T01:21:00.7800473Z ============================= test session starts ============================== 2025-10-10T01:21:00.7801246Z platform linux -- Python 3.10.18, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-10-10T01:21:00.7801852Z cachedir: .pytest_cache 2025-10-10T01:21:00.7802441Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-10-10T01:21:00.7803288Z rootdir: /var/lib/jenkins/workspace 2025-10-10T01:21:00.7803630Z configfile: pytest.ini 2025-10-10T01:21:00.7804316Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-10-10T01:21:00.7805185Z collecting ... 
collected 7 items 2025-10-10T01:21:00.7805637Z stepcurrent: Cannot find last run test, not skipping 2025-10-10T01:21:00.7807636Z Running 7 items in this shard: test/test_opaque_obj.py::TestOpaqueObject::test_bad_fake, test/test_opaque_obj.py::TestOpaqueObject::test_creation, test/test_opaque_obj.py::TestOpaqueObject::test_deepcopy, test/test_opaque_obj.py::TestOpaqueObject::test_eq, test/test_opaque_obj.py::TestOpaqueObject::test_make_fx_make_fx_tracing_mode_fake, test/test_opaque_obj.py::TestOpaqueObject::test_make_fx_make_fx_tracing_mode_symbolic, test/test_opaque_obj.py::TestOpaqueObject::test_ops 2025-10-10T01:21:00.7809579Z 2025-10-10T01:21:00.7809996Z test_opaque_obj.py::TestOpaqueObject::test_bad_fake SKIPPED [0.2701s] (test is fast; we disabled it with PYTORCH_TEST_SKIP_FAST) [ 14%] 2025-10-10T01:21:00.7810781Z test_opaque_obj.py::TestOpaqueObject::test_creation ('RERUN', {'yellow': True}) [0.0007s] [ 28%] 2025-10-10T01:21:00.7811440Z test_opaque_obj.py::TestOpaqueObject::test_creation ('RERUN', {'yellow': True}) [0.0003s] [ 28%] 2025-10-10T01:21:00.7812058Z test_opaque_obj.py::TestOpaqueObject::test_creation FAILED [0.0003s] [ 28%] 2025-10-10T01:21:00.7812397Z 2025-10-10T01:21:00.7812531Z ==================================== RERUNS ==================================== 2025-10-10T01:21:00.7812966Z ________________________ TestOpaqueObject.test_creation ________________________ 2025-10-10T01:21:00.7813379Z Traceback (most recent call last): 2025-10-10T01:21:00.7813820Z File "/var/lib/jenkins/workspace/test/test_opaque_obj.py", line 59, in setUp 2025-10-10T01:21:00.7814313Z torch.library.define( 2025-10-10T01:21:00.7814747Z File "/opt/conda/envs/py_3.10/lib/python3.10/functools.py", line 889, in wrapper 2025-10-10T01:21:00.7815232Z return dispatch(args[0].__class__)(*args, **kw) 2025-10-10T01:21:00.7815765Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py", line 536, in define 2025-10-10T01:21:00.7816326Z lib.define(name + schema, 
alias_analysis="", tags=tags) 2025-10-10T01:21:00.7816881Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py", line 163, in define 2025-10-10T01:21:00.7817457Z result = self.m.define(schema, alias_analysis, tuple(tags)) 2025-10-10T01:21:00.7819411Z RuntimeError: Tried to register an operator (_TestOpaqueObject::queue_push(__torch__.torch.classes.aten.OpaqueObject a, Tensor b) -> ()) with the same name and overload name multiple times. Each overload's schema should only be registered with a single call to def(). Duplicate registration: registered at /var/lib/jenkins/workspace/test/test_opaque_obj.py:57. Original registration: registered at /var/lib/jenkins/workspace/test/test_opaque_obj.py:57 2025-10-10T01:21:00.7821206Z ________________________ TestOpaqueObject.test_creation ________________________ 2025-10-10T01:21:00.7821614Z Traceback (most recent call last): 2025-10-10T01:21:00.7822053Z File "/var/lib/jenkins/workspace/test/test_opaque_obj.py", line 59, in setUp 2025-10-10T01:21:00.7822501Z torch.library.define( 2025-10-10T01:21:00.7822915Z File "/opt/conda/envs/py_3.10/lib/python3.10/functools.py", line 889, in wrapper 2025-10-10T01:21:00.7823393Z return dispatch(args[0].__class__)(*args, **kw) 2025-10-10T01:21:00.7823925Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py", line 536, in define 2025-10-10T01:21:00.7824485Z lib.define(name + schema, alias_analysis="", tags=tags) 2025-10-10T01:21:00.7825040Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py", line 163, in define 2025-10-10T01:21:00.7825619Z result = self.m.define(schema, alias_analysis, tuple(tags)) 2025-10-10T01:21:00.7827386Z RuntimeError: Tried to register an operator (_TestOpaqueObject::queue_push(__torch__.torch.classes.aten.OpaqueObject a, Tensor b) -> ()) with the same name and overload name multiple times. Each overload's schema should only be registered with a single call to def(). 
Duplicate registration: registered at /var/lib/jenkins/workspace/test/test_opaque_obj.py:57. Original registration: registered at /var/lib/jenkins/workspace/test/test_opaque_obj.py:57 2025-10-10T01:21:00.7829088Z =================================== FAILURES =================================== 2025-10-10T01:21:00.7829536Z ________________________ TestOpaqueObject.test_creation ________________________ 2025-10-10T01:21:00.7830182Z Traceback (most recent call last): 2025-10-10T01:21:00.7830622Z File "/var/lib/jenkins/workspace/test/test_opaque_obj.py", line 59, in setUp 2025-10-10T01:21:00.7831210Z torch.library.define( 2025-10-10T01:21:00.7831611Z File "/opt/conda/envs/py_3.10/lib/python3.10/functools.py", line 889, in wrapper 2025-10-10T01:21:00.7832086Z return dispatch(args[0].__class__)(*args, **kw) 2025-10-10T01:21:00.7832619Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py", line 536, in define 2025-10-10T01:21:00.7833180Z lib.define(name + schema, alias_analysis="", tags=tags) 2025-10-10T01:21:00.7833732Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py", line 163, in define 2025-10-10T01:21:00.7834308Z result = self.m.define(schema, alias_analysis, tuple(tags)) 2025-10-10T01:21:00.7836082Z RuntimeError: Tried to register an operator (_TestOpaqueObject::queue_push(__torch__.torch.classes.aten.OpaqueObject a, Tensor b) -> ()) with the same name and overload name multiple times. Each overload's schema should only be registered with a single call to def(). Duplicate registration: registered at /var/lib/jenkins/workspace/test/test_opaque_obj.py:57. 
Original registration: registered at /var/lib/jenkins/workspace/test/test_opaque_obj.py:57 2025-10-10T01:21:00.7838152Z - generated xml file: /var/lib/jenkins/workspace/test/test-reports/python-pytest/test_opaque_obj/test_opaque_obj-9737bcf04884f0b7.xml - 2025-10-10T01:21:00.7838880Z =========================== short test summary info ============================ 2025-10-10T01:21:00.7840937Z FAILED [0.0003s] test_opaque_obj.py::TestOpaqueObject::test_creation - RuntimeError: Tried to register an operator (_TestOpaqueObject::queue_push(__torch__.torch.classes.aten.OpaqueObject a, Tensor b) -> ()) with the same name and overload name multiple times. Each overload's schema should only be registered with a single call to def(). Duplicate registration: registered at /var/lib/jenkins/workspace/test/test_opaque_obj.py:57. Original registration: registered at /var/lib/jenkins/workspace/test/test_opaque_obj.py:57 2025-10-10T01:21:00.7842908Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-10-10T01:21:00.7843339Z ==================== 1 failed, 1 skipped, 2 rerun in 0.29s ===================== 2025-10-10T01:21:00.7843697Z Got exit code 1 2025-10-10T01:21:00.7843928Z Retrying single test... 
2025-10-10T01:21:00.7844444Z Test results will be stored in test-reports/python-pytest/test_opaque_obj/test_opaque_obj-fc7762df66962a95.xml 2025-10-10T01:21:00.7845041Z ============================= test session starts ============================== 2025-10-10T01:21:00.7845571Z platform linux -- Python 3.10.18, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-10-10T01:21:00.7846045Z cachedir: .pytest_cache 2025-10-10T01:21:00.7846617Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-10-10T01:21:00.7847301Z rootdir: /var/lib/jenkins/workspace 2025-10-10T01:21:00.7847604Z configfile: pytest.ini 2025-10-10T01:21:00.7848190Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-10-10T01:21:00.7848886Z collecting ... collected 7 items / 6 deselected / 1 selected 2025-10-10T01:21:00.7849520Z stepcurrent: skipping 1 already run items. 
Running only test/test_opaque_obj.py::TestOpaqueObject::test_creation 2025-10-10T01:21:00.7850084Z Running 1 items in this shard 2025-10-10T01:21:00.7850263Z 2025-10-10T01:21:00.7850891Z test_opaque_obj.py::TestOpaqueObject::test_creation SKIPPED [0.2969s] (test is fast; we disabled it with PYTORCH_TEST_SKIP_FAST) [100%] 2025-10-10T01:21:00.7851397Z 2025-10-10T01:21:00.7851890Z - generated xml file: /var/lib/jenkins/workspace/test/test-reports/python-pytest/test_opaque_obj/test_opaque_obj-fc7762df66962a95.xml - 2025-10-10T01:21:00.7852613Z ======================= 1 skipped, 6 deselected in 0.32s ======================= 2025-10-10T01:21:00.7852965Z Got exit code 0 2025-10-10T01:21:00.7853313Z Test succeeeded in new process, continuing with the rest of the tests 2025-10-10T01:21:00.7854057Z Test results will be stored in test-reports/python-pytest/test_opaque_obj/test_opaque_obj-7b38027311ebcd3d.xml 2025-10-10T01:21:00.7854638Z ============================= test session starts ============================== 2025-10-10T01:21:00.7855169Z platform linux -- Python 3.10.18, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-10-10T01:21:00.7855648Z cachedir: .pytest_cache 2025-10-10T01:21:00.7856214Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-10-10T01:21:00.7856827Z rootdir: /var/lib/jenkins/workspace 2025-10-10T01:21:00.7857117Z configfile: pytest.ini 2025-10-10T01:21:00.7857702Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-10-10T01:21:00.7858401Z collecting ... collected 7 items / 2 deselected / 5 selected 2025-10-10T01:21:00.7858800Z stepcurrent: skipping 2 already run items. 
2025-10-10T01:21:00.7859119Z Running 5 items in this shard 2025-10-10T01:21:00.7859303Z 2025-10-10T01:21:00.7859723Z test_opaque_obj.py::TestOpaqueObject::test_deepcopy SKIPPED [0.2792s] (test is fast; we disabled it with PYTORCH_TEST_SKIP_FAST) [ 20%] 2025-10-10T01:21:00.7860488Z test_opaque_obj.py::TestOpaqueObject::test_eq ('RERUN', {'yellow': True}) [0.0007s] [ 40%] 2025-10-10T01:21:00.7861108Z test_opaque_obj.py::TestOpaqueObject::test_eq ('RERUN', {'yellow': True}) [0.0003s] [ 40%] 2025-10-10T01:21:00.7861694Z test_opaque_obj.py::TestOpaqueObject::test_eq FAILED [0.0003s] [ 40%] 2025-10-10T01:21:00.7862022Z 2025-10-10T01:21:00.7862242Z ==================================== RERUNS ==================================== 2025-10-10T01:21:00.7862655Z ___________________________ TestOpaqueObject.test_eq ___________________________ 2025-10-10T01:21:00.7863057Z Traceback (most recent call last): 2025-10-10T01:21:00.7863496Z File "/var/lib/jenkins/workspace/test/test_opaque_obj.py", line 59, in setUp 2025-10-10T01:21:00.7863947Z torch.library.define( 2025-10-10T01:21:00.7864355Z File "/opt/conda/envs/py_3.10/lib/python3.10/functools.py", line 889, in wrapper 2025-10-10T01:21:00.7864834Z return dispatch(args[0].__class__)(*args, **kw) 2025-10-10T01:21:00.7865368Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py", line 536, in define 2025-10-10T01:21:00.7865929Z lib.define(name + schema, alias_analysis="", tags=tags) 2025-10-10T01:21:00.7866479Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py", line 163, in define 2025-10-10T01:21:00.7867058Z result = self.m.define(schema, alias_analysis, tuple(tags)) 2025-10-10T01:21:00.7868844Z RuntimeError: Tried to register an operator (_TestOpaqueObject::queue_push(__torch__.torch.classes.aten.OpaqueObject a, Tensor b) -> ()) with the same name and overload name multiple times. Each overload's schema should only be registered with a single call to def(). 
Duplicate registration: registered at /var/lib/jenkins/workspace/test/test_opaque_obj.py:57. Original registration: registered at /var/lib/jenkins/workspace/test/test_opaque_obj.py:57 2025-10-10T01:21:00.7870619Z ___________________________ TestOpaqueObject.test_eq ___________________________ 2025-10-10T01:21:00.7871026Z Traceback (most recent call last): 2025-10-10T01:21:00.7871463Z File "/var/lib/jenkins/workspace/test/test_opaque_obj.py", line 59, in setUp 2025-10-10T01:21:00.7871902Z torch.library.define( 2025-10-10T01:21:00.7872315Z File "/opt/conda/envs/py_3.10/lib/python3.10/functools.py", line 889, in wrapper 2025-10-10T01:21:00.7872795Z return dispatch(args[0].__class__)(*args, **kw) 2025-10-10T01:21:00.7873340Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py", line 536, in define 2025-10-10T01:21:00.7873901Z lib.define(name + schema, alias_analysis="", tags=tags) 2025-10-10T01:21:00.7874451Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py", line 163, in define 2025-10-10T01:21:00.7875114Z result = self.m.define(schema, alias_analysis, tuple(tags)) 2025-10-10T01:21:00.7876882Z RuntimeError: Tried to register an operator (_TestOpaqueObject::queue_push(__torch__.torch.classes.aten.OpaqueObject a, Tensor b) -> ()) with the same name and overload name multiple times. Each overload's schema should only be registered with a single call to def(). Duplicate registration: registered at /var/lib/jenkins/workspace/test/test_opaque_obj.py:57. 
Original registration: registered at /var/lib/jenkins/workspace/test/test_opaque_obj.py:57 2025-10-10T01:21:00.7878603Z =================================== FAILURES =================================== 2025-10-10T01:21:00.7879040Z ___________________________ TestOpaqueObject.test_eq ___________________________ 2025-10-10T01:21:00.7879435Z Traceback (most recent call last): 2025-10-10T01:21:00.7879872Z File "/var/lib/jenkins/workspace/test/test_opaque_obj.py", line 59, in setUp 2025-10-10T01:21:00.7880316Z torch.library.define( 2025-10-10T01:21:00.7880727Z File "/opt/conda/envs/py_3.10/lib/python3.10/functools.py", line 889, in wrapper 2025-10-10T01:21:00.7881208Z return dispatch(args[0].__class__)(*args, **kw) 2025-10-10T01:21:00.7881735Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py", line 536, in define 2025-10-10T01:21:00.7882294Z lib.define(name + schema, alias_analysis="", tags=tags) 2025-10-10T01:21:00.7882850Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py", line 163, in define 2025-10-10T01:21:00.7883429Z result = self.m.define(schema, alias_analysis, tuple(tags)) 2025-10-10T01:21:00.7885367Z RuntimeError: Tried to register an operator (_TestOpaqueObject::queue_push(__torch__.torch.classes.aten.OpaqueObject a, Tensor b) -> ()) with the same name and overload name multiple times. Each overload's schema should only be registered with a single call to def(). Duplicate registration: registered at /var/lib/jenkins/workspace/test/test_opaque_obj.py:57. 
Original registration: registered at /var/lib/jenkins/workspace/test/test_opaque_obj.py:57 2025-10-10T01:21:00.7887526Z - generated xml file: /var/lib/jenkins/workspace/test/test-reports/python-pytest/test_opaque_obj/test_opaque_obj-7b38027311ebcd3d.xml - 2025-10-10T01:21:00.7888243Z =========================== short test summary info ============================ 2025-10-10T01:21:00.7890196Z FAILED [0.0003s] test_opaque_obj.py::TestOpaqueObject::test_eq - RuntimeError: Tried to register an operator (_TestOpaqueObject::queue_push(__torch__.torch.classes.aten.OpaqueObject a, Tensor b) -> ()) with the same name and overload name multiple times. Each overload's schema should only be registered with a single call to def(). Duplicate registration: registered at /var/lib/jenkins/workspace/test/test_opaque_obj.py:57. Original registration: registered at /var/lib/jenkins/workspace/test/test_opaque_obj.py:57 2025-10-10T01:21:00.7892135Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-10-10T01:21:00.7892586Z ============= 1 failed, 1 skipped, 2 deselected, 2 rerun in 0.30s ============== 2025-10-10T01:21:00.7892960Z Got exit code 1 2025-10-10T01:21:00.7893189Z Retrying single test... 
2025-10-10T01:21:00.7893704Z Test results will be stored in test-reports/python-pytest/test_opaque_obj/test_opaque_obj-d7344d8b55ee5f0d.xml 2025-10-10T01:21:00.7894301Z ============================= test session starts ============================== 2025-10-10T01:21:00.7894827Z platform linux -- Python 3.10.18, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-10-10T01:21:00.7895304Z cachedir: .pytest_cache 2025-10-10T01:21:00.7895865Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-10-10T01:21:00.7896483Z rootdir: /var/lib/jenkins/workspace 2025-10-10T01:21:00.7896786Z configfile: pytest.ini 2025-10-10T01:21:00.7897371Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-10-10T01:21:00.7898171Z collecting ... collected 7 items / 6 deselected / 1 selected 2025-10-10T01:21:00.7899262Z stepcurrent: skipping 3 already run items. 
Running only test/test_opaque_obj.py::TestOpaqueObject::test_eq 2025-10-10T01:21:00.7899821Z Running 1 items in this shard 2025-10-10T01:21:00.7900004Z 2025-10-10T01:21:00.7900406Z test_opaque_obj.py::TestOpaqueObject::test_eq SKIPPED [0.2709s] (test is fast; we disabled it with PYTORCH_TEST_SKIP_FAST) [100%] 2025-10-10T01:21:00.7900887Z 2025-10-10T01:21:00.7901391Z - generated xml file: /var/lib/jenkins/workspace/test/test-reports/python-pytest/test_opaque_obj/test_opaque_obj-d7344d8b55ee5f0d.xml - 2025-10-10T01:21:00.7902132Z ======================= 1 skipped, 6 deselected in 0.29s ======================= 2025-10-10T01:21:00.7902485Z Got exit code 0 2025-10-10T01:21:00.7902835Z Test succeeeded in new process, continuing with the rest of the tests 2025-10-10T01:21:00.7903498Z Test results will be stored in test-reports/python-pytest/test_opaque_obj/test_opaque_obj-0ab798b2fefbd685.xml 2025-10-10T01:21:00.7904103Z ============================= test session starts ============================== 2025-10-10T01:21:00.7904630Z platform linux -- Python 3.10.18, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-10-10T01:21:00.7905113Z cachedir: .pytest_cache 2025-10-10T01:21:00.7905680Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-10-10T01:21:00.7906300Z rootdir: /var/lib/jenkins/workspace 2025-10-10T01:21:00.7906607Z configfile: pytest.ini 2025-10-10T01:21:00.7907478Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-10-10T01:21:00.7908187Z collecting ... collected 7 items / 4 deselected / 3 selected 2025-10-10T01:21:00.7908595Z stepcurrent: skipping 4 already run items. 
2025-10-10T01:21:00.7908926Z Running 3 items in this shard 2025-10-10T01:21:00.7909103Z 2025-10-10T01:21:00.7909587Z test_opaque_obj.py::TestOpaqueObject::test_make_fx_make_fx_tracing_mode_fake SKIPPED [0.2696s] (test is fast; we disabled it with PYTORCH_TEST_SKIP_FAST) [ 33%] 2025-10-10T01:21:00.7910535Z test_opaque_obj.py::TestOpaqueObject::test_make_fx_make_fx_tracing_mode_symbolic ('RERUN', {'yellow': True}) [0.0007s] [ 66%] 2025-10-10T01:21:00.7911381Z test_opaque_obj.py::TestOpaqueObject::test_make_fx_make_fx_tracing_mode_symbolic ('RERUN', {'yellow': True}) [0.0004s] [ 66%] 2025-10-10T01:21:00.7912158Z test_opaque_obj.py::TestOpaqueObject::test_make_fx_make_fx_tracing_mode_symbolic FAILED [0.0003s] [ 66%] 2025-10-10T01:21:00.7912567Z 2025-10-10T01:21:00.7912703Z ==================================== RERUNS ==================================== 2025-10-10T01:21:00.7913172Z _________ TestOpaqueObject.test_make_fx_make_fx_tracing_mode_symbolic __________ 2025-10-10T01:21:00.7913611Z Traceback (most recent call last): 2025-10-10T01:21:00.7914063Z File "/var/lib/jenkins/workspace/test/test_opaque_obj.py", line 59, in setUp 2025-10-10T01:21:00.7914509Z torch.library.define( 2025-10-10T01:21:00.7914922Z File "/opt/conda/envs/py_3.10/lib/python3.10/functools.py", line 889, in wrapper 2025-10-10T01:21:00.7915392Z return dispatch(args[0].__class__)(*args, **kw) 2025-10-10T01:21:00.7915922Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py", line 536, in define 2025-10-10T01:21:00.7916484Z lib.define(name + schema, alias_analysis="", tags=tags) 2025-10-10T01:21:00.7917037Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py", line 163, in define 2025-10-10T01:21:00.7917611Z result = self.m.define(schema, alias_analysis, tuple(tags)) 2025-10-10T01:21:00.7919385Z RuntimeError: Tried to register an operator (_TestOpaqueObject::queue_push(__torch__.torch.classes.aten.OpaqueObject a, Tensor b) -> ()) with the same name and 
overload name multiple times. Each overload's schema should only be registered with a single call to def(). Duplicate registration: registered at /var/lib/jenkins/workspace/test/test_opaque_obj.py:57. Original registration: registered at /var/lib/jenkins/workspace/test/test_opaque_obj.py:57 2025-10-10T01:21:00.7921363Z _________ TestOpaqueObject.test_make_fx_make_fx_tracing_mode_symbolic __________ 2025-10-10T01:21:00.7921807Z Traceback (most recent call last): 2025-10-10T01:21:00.7922242Z File "/var/lib/jenkins/workspace/test/test_opaque_obj.py", line 59, in setUp 2025-10-10T01:21:00.7922697Z torch.library.define( 2025-10-10T01:21:00.7923110Z File "/opt/conda/envs/py_3.10/lib/python3.10/functools.py", line 889, in wrapper 2025-10-10T01:21:00.7923591Z return dispatch(args[0].__class__)(*args, **kw) 2025-10-10T01:21:00.7924139Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py", line 536, in define 2025-10-10T01:21:00.7924717Z lib.define(name + schema, alias_analysis="", tags=tags) 2025-10-10T01:21:00.7925274Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py", line 163, in define 2025-10-10T01:21:00.7925848Z result = self.m.define(schema, alias_analysis, tuple(tags)) 2025-10-10T01:21:00.7927744Z RuntimeError: Tried to register an operator (_TestOpaqueObject::queue_push(__torch__.torch.classes.aten.OpaqueObject a, Tensor b) -> ()) with the same name and overload name multiple times. Each overload's schema should only be registered with a single call to def(). Duplicate registration: registered at /var/lib/jenkins/workspace/test/test_opaque_obj.py:57. 
Original registration: registered at /var/lib/jenkins/workspace/test/test_opaque_obj.py:57 2025-10-10T01:21:00.7929549Z =================================== FAILURES =================================== 2025-10-10T01:21:00.7930012Z _________ TestOpaqueObject.test_make_fx_make_fx_tracing_mode_symbolic __________ 2025-10-10T01:21:00.7930453Z Traceback (most recent call last): 2025-10-10T01:21:00.7930888Z File "/var/lib/jenkins/workspace/test/test_opaque_obj.py", line 59, in setUp 2025-10-10T01:21:00.7931336Z torch.library.define( 2025-10-10T01:21:00.7931741Z File "/opt/conda/envs/py_3.10/lib/python3.10/functools.py", line 889, in wrapper 2025-10-10T01:21:00.7932217Z return dispatch(args[0].__class__)(*args, **kw) 2025-10-10T01:21:00.7932750Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py", line 536, in define 2025-10-10T01:21:00.7933300Z lib.define(name + schema, alias_analysis="", tags=tags) 2025-10-10T01:21:00.7933858Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py", line 163, in define 2025-10-10T01:21:00.7934424Z result = self.m.define(schema, alias_analysis, tuple(tags)) 2025-10-10T01:21:00.7936179Z RuntimeError: Tried to register an operator (_TestOpaqueObject::queue_push(__torch__.torch.classes.aten.OpaqueObject a, Tensor b) -> ()) with the same name and overload name multiple times. Each overload's schema should only be registered with a single call to def(). Duplicate registration: registered at /var/lib/jenkins/workspace/test/test_opaque_obj.py:57. 
Original registration: registered at /var/lib/jenkins/workspace/test/test_opaque_obj.py:57 2025-10-10T01:21:00.7938228Z - generated xml file: /var/lib/jenkins/workspace/test/test-reports/python-pytest/test_opaque_obj/test_opaque_obj-0ab798b2fefbd685.xml - 2025-10-10T01:21:00.7938947Z =========================== short test summary info ============================ 2025-10-10T01:21:00.7941004Z FAILED [0.0003s] test_opaque_obj.py::TestOpaqueObject::test_make_fx_make_fx_tracing_mode_symbolic - RuntimeError: Tried to register an operator (_TestOpaqueObject::queue_push(__torch__.torch.classes.aten.OpaqueObject a, Tensor b) -> ()) with the same name and overload name multiple times. Each overload's schema should only be registered with a single call to def(). Duplicate registration: registered at /var/lib/jenkins/workspace/test/test_opaque_obj.py:57. Original registration: registered at /var/lib/jenkins/workspace/test/test_opaque_obj.py:57 2025-10-10T01:21:00.7943142Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-10-10T01:21:00.7943578Z ============= 1 failed, 1 skipped, 4 deselected, 2 rerun in 0.29s ============== 2025-10-10T01:21:00.7943940Z Got exit code 1 2025-10-10T01:21:00.7944178Z Retrying single test... 
2025-10-10T01:21:00.7944687Z Test results will be stored in test-reports/python-pytest/test_opaque_obj/test_opaque_obj-635f7d94067fd76e.xml 2025-10-10T01:21:00.7945549Z ============================= test session starts ============================== 2025-10-10T01:21:00.7946072Z platform linux -- Python 3.10.18, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-10-10T01:21:00.7946545Z cachedir: .pytest_cache 2025-10-10T01:21:00.7947109Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-10-10T01:21:00.7947724Z rootdir: /var/lib/jenkins/workspace 2025-10-10T01:21:00.7948021Z configfile: pytest.ini 2025-10-10T01:21:00.7948599Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-10-10T01:21:00.7949292Z collecting ... collected 7 items / 6 deselected / 1 selected 2025-10-10T01:21:00.7950022Z stepcurrent: skipping 5 already run items. 
Running only test/test_opaque_obj.py::TestOpaqueObject::test_make_fx_make_fx_tracing_mode_symbolic 2025-10-10T01:21:00.7950672Z Running 1 items in this shard 2025-10-10T01:21:00.7950850Z 2025-10-10T01:21:00.7951440Z test_opaque_obj.py::TestOpaqueObject::test_make_fx_make_fx_tracing_mode_symbolic SKIPPED [0.2844s] (test is fast; we disabled it with PYTORCH_TEST_SKIP_FAST) [100%] 2025-10-10T01:21:00.7952046Z 2025-10-10T01:21:00.7952520Z - generated xml file: /var/lib/jenkins/workspace/test/test-reports/python-pytest/test_opaque_obj/test_opaque_obj-635f7d94067fd76e.xml - 2025-10-10T01:21:00.7953247Z ======================= 1 skipped, 6 deselected in 0.30s ======================= 2025-10-10T01:21:00.7953600Z Got exit code 0 2025-10-10T01:21:00.7953941Z Test succeeeded in new process, continuing with the rest of the tests 2025-10-10T01:21:00.7954590Z Test results will be stored in test-reports/python-pytest/test_opaque_obj/test_opaque_obj-245949ab9a0dd00e.xml 2025-10-10T01:21:00.7955178Z ============================= test session starts ============================== 2025-10-10T01:21:00.7955706Z platform linux -- Python 3.10.18, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-10-10T01:21:00.7956179Z cachedir: .pytest_cache 2025-10-10T01:21:00.7956752Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-10-10T01:21:00.7957363Z rootdir: /var/lib/jenkins/workspace 2025-10-10T01:21:00.7957653Z configfile: pytest.ini 2025-10-10T01:21:00.7958233Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-10-10T01:21:00.7958933Z collecting ... collected 7 items / 6 deselected / 1 selected 2025-10-10T01:21:00.7959332Z stepcurrent: skipping 6 already run items. 
2025-10-10T01:21:00.7959648Z Running 1 items in this shard 2025-10-10T01:21:00.7959832Z 2025-10-10T01:21:00.7960231Z test_opaque_obj.py::TestOpaqueObject::test_ops SKIPPED [0.2873s] (test is fast; we disabled it with PYTORCH_TEST_SKIP_FAST) [100%] 2025-10-10T01:21:00.7960726Z 2025-10-10T01:21:00.7961203Z - generated xml file: /var/lib/jenkins/workspace/test/test-reports/python-pytest/test_opaque_obj/test_opaque_obj-245949ab9a0dd00e.xml - 2025-10-10T01:21:00.7961930Z ======================= 1 skipped, 6 deselected in 0.30s ======================= 2025-10-10T01:21:00.7963071Z The following tests failed and then succeeded when run in a new process['test/test_opaque_obj.py::TestOpaqueObject::test_creation', 'test/test_opaque_obj.py::TestOpaqueObject::test_eq', 'test/test_opaque_obj.py::TestOpaqueObject::test_make_fx_make_fx_tracing_mode_symbolic'] 2025-10-10T01:21:00.7964155Z 2025-10-10T01:21:00.7964494Z FINISHED PRINTING LOG FILE of test_opaque_obj 1/1 (test/test-reports/test_opaque_obj_1.1_83a0045b8ec005b5_.log) 2025-10-10T01:21:00.7964921Z 2025-10-10T01:21:04.7180982Z Running test_public_bindings 1/1 ... [2025-10-10 01:21:04.717526] 2025-10-10T01:21:04.7181410Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:21:04.7182784Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_public_bindings.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:21:04.717941] 2025-10-10T01:21:08.5912538Z 2025-10-10T01:21:08.5913591Z test_public_bindings 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_public_bindings_1.1_b11c263fd0c85ed8_.log 2025-10-10T01:21:08.5915819Z Running 4 items in this shard: test/test_public_bindings.py::TestPublicBindings::test_correct_module_names, test/test_public_bindings.py::TestPublicBindings::test_modules_can_be_imported, test/test_public_bindings.py::TestPublicBindings::test_no_new_bindings, test/test_public_bindings.py::TestPublicBindings::test_no_new_reexport_callables 2025-10-10T01:21:08.5917469Z 2025-10-10T01:21:09.0151099Z 2025-10-10T01:21:09.0152222Z test_testing 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_testing_1.1_34083e212ec9e153_.log 2025-10-10T01:21:09.0946460Z Running 2073 items in this shard: test/test_testing.py::TestTestingCUDA::test_assertEqual_longMessage_cuda, test/test_testing.py::TestTestingCUDA::test_assertEqual_numpy_cuda_bool, test/test_testing.py::TestTestingCUDA::test_assertEqual_numpy_cuda_complex128, test/test_testing.py::TestTestingCUDA::test_assertEqual_numpy_cuda_complex64, test/test_testing.py::TestTestingCUDA::test_assertEqual_numpy_cuda_float16, test/test_testing.py::TestTestingCUDA::test_assertEqual_numpy_cuda_float32, test/test_testing.py::TestTestingCUDA::test_assertEqual_numpy_cuda_float64, test/test_testing.py::TestTestingCUDA::test_assertEqual_numpy_cuda_int16, test/test_testing.py::TestTestingCUDA::test_assertEqual_numpy_cuda_int32, test/test_testing.py::TestTestingCUDA::test_assertEqual_numpy_cuda_int64, test/test_testing.py::TestTestingCUDA::test_assertEqual_numpy_cuda_int8, test/test_testing.py::TestTestingCUDA::test_assertEqual_numpy_cuda_uint8, test/test_testing.py::TestTestingCUDA::test_cuda_assert_should_not_stop_common_distributed_test_suite_cuda, 
test/test_testing.py::TestTestingCUDA::test_cuda_assert_should_stop_common_device_type_test_suite_cuda, test/test_testing.py::TestTestingCUDA::test_cuda_assert_should_stop_common_utils_test_suite_cuda, test/test_testing.py::TestTestingCUDA::test_get_supported_dtypes_cuda, test/test_testing.py::TestTestingCUDA::test_isclose_atol_rtol_greater_than_zero_cuda_bool, test/test_testing.py::TestTestingCUDA::test_isclose_atol_rtol_greater_than_zero_cuda_float16, test/test_testing.py::TestTestingCUDA::test_isclose_atol_rtol_greater_than_zero_cuda_float32, test/test_testing.py::TestTestingCUDA::test_isclose_atol_rtol_greater_than_zero_cuda_float64, test/test_testing.py::TestTestingCUDA::test_isclose_atol_rtol_greater_than_zero_cuda_int16, test/test_testing.py::TestTestingCUDA::test_isclose_atol_rtol_greater_than_zero_cuda_int32, test/test_testing.py::TestTestingCUDA::test_isclose_atol_rtol_greater_than_zero_cuda_int64, test/test_testing.py::TestTestingCUDA::test_isclose_atol_rtol_greater_than_zero_cuda_int8, test/test_testing.py::TestTestingCUDA::test_isclose_atol_rtol_greater_than_zero_cuda_uint8, test/test_testing.py::TestTestingCUDA::test_isclose_bool_cuda, test/test_testing.py::TestTestingCUDA::test_isclose_complex_cuda_complex128, test/test_testing.py::TestTestingCUDA::test_isclose_complex_cuda_complex64, test/test_testing.py::TestTestingCUDA::test_isclose_equality_shortcut_cuda, test/test_testing.py::TestTestingCUDA::test_isclose_float_cuda_float16, test/test_testing.py::TestTestingCUDA::test_isclose_float_cuda_float32, test/test_testing.py::TestTestingCUDA::test_isclose_float_cuda_float64, test/test_testing.py::TestTestingCUDA::test_isclose_integer_cuda_int16, test/test_testing.py::TestTestingCUDA::test_isclose_integer_cuda_int32, test/test_testing.py::TestTestingCUDA::test_isclose_integer_cuda_int64, test/test_testing.py::TestTestingCUDA::test_isclose_integer_cuda_int8, test/test_testing.py::TestTestingCUDA::test_isclose_integer_cuda_uint8, 
test/test_testing.py::TestTestingCUDA::test_isclose_nan_equality_shortcut_cuda_complex128, test/test_testing.py::TestTestingCUDA::test_isclose_nan_equality_shortcut_cuda_complex64, test/test_testing.py::TestTestingCUDA::test_isclose_nan_equality_shortcut_cuda_float16, test/test_testing.py::TestTestingCUDA::test_isclose_nan_equality_shortcut_cuda_float32, test/test_testing.py::TestTestingCUDA::test_isclose_nan_equality_shortcut_cuda_float64, test/test_testing.py::TestTestingCUDA::test_setup_and_teardown_run_for_device_specific_tests_cuda, test/test_testing.py::TestTestingCUDA::test_supported_dtypes_abs_cuda, test/test_testing.py::TestFrameworkUtils::test_filtering_env_var, test/test_testing.py::TestAssertClose::test_bool, test/test_testing.py::TestAssertClose::test_default_tolerance_selection_mismatching_dtypes, test/test_testing.py::TestAssertClose::test_docstring_examples, test/test_testing.py::TestAssertClose::test_matching, test/test_testing.py::TestAssertClose::test_matching_atol, test/test_testing.py::TestAssertClose::test_matching_conjugate_bit, test/test_testing.py::TestAssertClose::test_matching_nan, test/test_testing.py::TestAssertClose::test_matching_nan_with_equal_nan, test/test_testing.py::TestAssertClose::test_matching_rtol, test/test_testing.py::TestAssertClose::test_meta, test/test_testing.py::TestAssertClose::test_mismatching_dtype, test/test_testing.py::TestAssertClose::test_mismatching_dtype_no_check, test/test_testing.py::TestAssertClose::test_mismatching_layout, test/test_testing.py::TestAssertClose::test_mismatching_layout_no_check, test/test_testing.py::TestAssertClose::test_mismatching_shape, test/test_testing.py::TestAssertClose::test_mismatching_stride, test/test_testing.py::TestAssertClose::test_mismatching_stride_no_check, test/test_testing.py::TestAssertClose::test_mismatching_types, test/test_testing.py::TestAssertClose::test_mismatching_types_subclasses, test/test_testing.py::TestAssertClose::test_mismatching_types_type_equality, 
test/test_testing.py::TestAssertClose::test_mismatching_values, test/test_testing.py::TestAssertClose::test_mismatching_values_atol, test/test_testing.py::TestAssertClose::test_mismatching_values_rtol, test/test_testing.py::TestAssertClose::test_none, test/test_testing.py::TestAssertClose::test_none_mismatch, test/test_testing.py::TestAssertClose::test_numpy, test/test_testing.py::TestAssertClose::test_only_atol, test/test_testing.py::TestAssertClose::test_only_rtol, test/test_testing.py::TestAssertClose::test_scalar, test/test_testing.py::TestAssertClose::test_unexpected_error_compare, test/test_testing.py::TestAssertClose::test_unexpected_error_originate, test/test_testing.py::TestAssertClose::test_unknown_layout, test/test_testing.py::TestAssertClose::test_unknown_type, test/test_testing.py::TestAssertCloseMultiDeviceCUDA::test_mismatching_device_cuda, test/test_testing.py::TestAssertCloseMultiDeviceCUDA::test_mismatching_device_no_check_cuda, test/test_testing.py::TestAssertCloseErrorMessage::test_abs_diff, test/test_testing.py::TestAssertCloseErrorMessage::test_abs_diff_scalar, test/test_testing.py::TestAssertCloseErrorMessage::test_atol, test/test_testing.py::TestAssertCloseErrorMessage::test_identifier_scalars, test/test_testing.py::TestAssertCloseErrorMessage::test_identifier_tensor_likes, test/test_testing.py::TestAssertCloseErrorMessage::test_mismatched_elements, test/test_testing.py::TestAssertCloseErrorMessage::test_msg_callable, test/test_testing.py::TestAssertCloseErrorMessage::test_msg_str, test/test_testing.py::TestAssertCloseErrorMessage::test_not_close, test/test_testing.py::TestAssertCloseErrorMessage::test_not_equal, test/test_testing.py::TestAssertCloseErrorMessage::test_rel_diff, test/test_testing.py::TestAssertCloseErrorMessage::test_rel_diff_scalar, test/test_testing.py::TestAssertCloseErrorMessage::test_rtol, test/test_testing.py::TestAssertCloseErrorMessage::test_small_float_dtype, 
test/test_testing.py::TestAssertCloseErrorMessage::test_zero_div_zero, test/test_testing.py::TestAssertCloseContainer::test_mapping_mismatching_keys, test/test_testing.py::TestAssertCloseContainer::test_mapping_mismatching_values_msg, test/test_testing.py::TestAssertCloseContainer::test_sequence_mismatching_len, test/test_testing.py::TestAssertCloseContainer::test_sequence_mismatching_values_msg, test/test_testing.py::TestAssertCloseSparseCOO::test_matching_coalesced, test/test_testing.py::TestAssertCloseSparseCOO::test_matching_uncoalesced, test/test_testing.py::TestAssertCloseSparseCOO::test_mismatching_indices_msg, test/test_testing.py::TestAssertCloseSparseCOO::test_mismatching_nnz, test/test_testing.py::TestAssertCloseSparseCOO::test_mismatching_sparse_dims, test/test_testing.py::TestAssertCloseSparseCOO::test_mismatching_values_msg, test/test_testing.py::TestAssertCloseSparseCSR::test_matching, test/test_testing.py::TestAssertCloseSparseCSR::test_mismatching_col_indices_msg, test/test_testing.py::TestAssertCloseSparseCSR::test_mismatching_crow_indices_msg, test/test_testing.py::TestAssertCloseSparseCSR::test_mismatching_values_msg, test/test_testing.py::TestAssertCloseSparseCSC::test_matching, test/test_testing.py::TestAssertCloseSparseCSC::test_mismatching_ccol_indices_msg, test/test_testing.py::TestAssertCloseSparseCSC::test_mismatching_row_indices_msg, test/test_testing.py::TestAssertCloseSparseCSC::test_mismatching_values_msg, test/test_testing.py::TestAssertCloseSparseBSR::test_matching, test/test_testing.py::TestAssertCloseSparseBSR::test_mismatching_col_indices_msg, test/test_testing.py::TestAssertCloseSparseBSR::test_mismatching_crow_indices_msg, test/test_testing.py::TestAssertCloseSparseBSR::test_mismatching_values_msg, test/test_testing.py::TestAssertCloseSparseBSC::test_matching, test/test_testing.py::TestAssertCloseSparseBSC::test_mismatching_ccol_indices_msg, test/test_testing.py::TestAssertCloseSparseBSC::test_mismatching_row_indices_msg, 
test/test_testing.py::TestAssertCloseSparseBSC::test_mismatching_values_msg, test/test_testing.py::TestAssertCloseQuantized::test_matching_per_channel, test/test_testing.py::TestAssertCloseQuantized::test_matching_per_tensor, test/test_testing.py::TestAssertCloseQuantized::test_mismatching_is_quantized, test/test_testing.py::TestAssertCloseQuantized::test_mismatching_qscheme, test/test_testing.py::TestMakeTensorCUDA::test_exclude_zero_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_exclude_zero_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_exclude_zero_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_exclude_zero_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_exclude_zero_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_exclude_zero_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_exclude_zero_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_exclude_zero_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_exclude_zero_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_exclude_zero_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_exclude_zero_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_exclude_zero_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_exclude_zero_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types0_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types0_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types0_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types0_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types0_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types0_cuda_float16, 
test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types0_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types0_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types0_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types0_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types0_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types0_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types0_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types1_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types1_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types1_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types1_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types1_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types1_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types1_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types1_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types1_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types1_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types1_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types1_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types1_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types2_cuda_bfloat16, 
test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types2_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types2_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types2_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types2_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types2_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types2_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types2_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types2_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types2_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types2_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types2_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types2_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types3_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types3_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types3_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types3_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types3_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types3_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types3_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types3_cuda_float64, 
test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types3_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types3_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types3_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types3_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types3_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types0_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types0_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types0_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types0_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types0_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types0_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types0_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types0_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types0_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types0_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types0_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types0_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types0_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types1_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types1_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types1_cuda_complex128, 
test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types1_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types1_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types1_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types1_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types1_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types1_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types1_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types1_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types1_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types1_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types2_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types2_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types2_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types2_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types2_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types2_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types2_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types2_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types2_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types2_cuda_int32, 
test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types2_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types2_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types2_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types3_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types3_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types3_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types3_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types3_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types3_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types3_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types3_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types3_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types3_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types3_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types3_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types3_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types0_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types0_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types0_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types0_cuda_complex32, 
test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types0_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types0_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types0_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types0_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types0_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types0_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types0_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types0_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types0_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types1_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types1_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types1_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types1_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types1_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types1_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types1_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types1_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types1_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types1_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types1_cuda_int64, 
test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types1_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types1_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types2_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types2_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types2_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types2_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types2_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types2_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types2_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types2_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types2_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types2_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types2_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types2_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types2_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types3_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types3_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types3_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types3_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types3_cuda_complex64, 
test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types3_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types3_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types3_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types3_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types3_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types3_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types3_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types3_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_low_high_boolean_integral1_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_low_high_boolean_integral1_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_low_high_boolean_integral1_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_low_high_boolean_integral1_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_low_high_boolean_integral1_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_low_high_boolean_integral1_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_low_high_boolean_integral2_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_low_high_boolean_integral2_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_low_high_boolean_integral2_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_low_high_boolean_integral2_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_low_high_boolean_integral2_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_low_high_boolean_integral2_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_low_high_default_smoke_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_low_high_default_smoke_cuda_bool, 
test/test_testing.py::TestMakeTensorCUDA::test_low_high_default_smoke_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_low_high_default_smoke_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_low_high_default_smoke_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_low_high_default_smoke_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_low_high_default_smoke_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_low_high_default_smoke_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_low_high_default_smoke_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_low_high_default_smoke_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_low_high_default_smoke_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_low_high_default_smoke_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_low_high_default_smoke_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high0_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high0_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high0_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high0_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high0_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high0_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high0_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high0_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high0_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high0_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high0_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high0_cuda_int8, 
test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high0_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high1_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high1_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high1_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high1_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high1_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high1_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high1_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high1_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high1_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high1_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high1_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high1_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high1_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high2_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high2_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high2_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high2_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high2_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high2_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high2_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high2_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high2_cuda_int16, 
test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high2_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high2_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high2_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high2_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_low_high_outside_valid_range_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_low_high_outside_valid_range_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_low_high_outside_valid_range_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_low_high_outside_valid_range_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_low_high_outside_valid_range_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_low_high_outside_valid_range_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_low_high_outside_valid_range_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_low_high_outside_valid_range_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_low_high_outside_valid_range_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_low_high_outside_valid_range_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_low_high_outside_valid_range_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_low_high_outside_valid_range_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_low_high_outside_valid_range_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_low_high_smoke_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_low_high_smoke_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_low_high_smoke_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_low_high_smoke_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_low_high_smoke_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_low_high_smoke_cuda_float16, 
test/test_testing.py::TestMakeTensorCUDA::test_low_high_smoke_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_low_high_smoke_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_low_high_smoke_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_low_high_smoke_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_low_high_smoke_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_low_high_smoke_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_low_high_smoke_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape0_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape0_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape0_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape0_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape0_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape0_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape0_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape0_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape0_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape0_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape0_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape0_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape0_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape1_cuda_bfloat16, 
test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape1_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape1_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape1_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape1_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape1_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape1_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape1_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape1_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape1_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape1_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape1_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape1_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape2_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape2_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape2_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape2_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape2_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape2_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape2_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape2_cuda_float64, 
test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape2_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape2_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape2_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape2_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape2_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape3_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape3_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape3_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape3_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape3_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape3_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape3_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape3_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape3_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape3_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape3_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape3_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape3_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape4_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape4_cuda_bool, 
test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape4_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape4_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape4_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape4_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape4_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape4_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape4_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape4_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape4_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape4_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape4_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_memory_format_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_memory_format_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_memory_format_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_memory_format_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_memory_format_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_memory_format_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_memory_format_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_memory_format_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_memory_format_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_memory_format_cuda_int32, 
test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_memory_format_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_memory_format_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_memory_format_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape0_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape0_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape0_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape0_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape0_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape0_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape0_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape0_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape0_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape0_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape0_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape0_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape0_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape1_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape1_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape1_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape1_cuda_complex32, 
test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape1_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape1_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape1_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape1_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape1_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape1_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape1_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape1_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape1_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape2_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape2_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape2_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape2_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape2_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape2_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape2_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape2_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape2_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape2_cuda_int32, 
test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape2_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape2_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape2_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape3_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape3_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape3_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape3_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape3_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape3_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape3_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape3_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape3_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape3_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape3_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape3_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape3_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape4_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape4_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape4_cuda_complex128, 
test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape4_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape4_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape4_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape4_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape4_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape4_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape4_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape4_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape4_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape4_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape5_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape5_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape5_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape5_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape5_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape5_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape5_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape5_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape5_cuda_int16, 
test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape5_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape5_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape5_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape5_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape6_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape6_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape6_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape6_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape6_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape6_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape6_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape6_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape6_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape6_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape6_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape6_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape6_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape0_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape0_cuda_bool, 
test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape0_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape0_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape0_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape0_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape0_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape0_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape0_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape0_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape0_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape0_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape0_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape1_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape1_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape1_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape1_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape1_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape1_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape1_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape1_cuda_float64, 
test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape1_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape1_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape1_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape1_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape1_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape2_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape2_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape2_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape2_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape2_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape2_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape2_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape2_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape2_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape2_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape2_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape2_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape2_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape3_cuda_bfloat16, 
test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape3_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape3_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape3_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape3_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape3_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape3_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape3_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape3_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape3_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape3_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape3_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape3_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape4_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape4_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape4_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape4_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape4_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape4_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape4_cuda_float32, 
test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape4_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape4_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape4_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape4_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape4_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape4_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape5_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape5_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape5_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape5_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape5_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape5_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape5_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape5_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape5_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape5_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape5_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape5_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape5_cuda_uint8, 
test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape6_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape6_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape6_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape6_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape6_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape6_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape6_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape6_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape6_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape6_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape6_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape6_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape6_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_requires_grad_requires_grad_False_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_requires_grad_requires_grad_False_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_requires_grad_requires_grad_False_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_requires_grad_requires_grad_False_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_requires_grad_requires_grad_False_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_requires_grad_requires_grad_False_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_requires_grad_requires_grad_False_cuda_float32, 
test/test_testing.py::TestMakeTensorCUDA::test_requires_grad_requires_grad_False_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_requires_grad_requires_grad_False_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_requires_grad_requires_grad_False_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_requires_grad_requires_grad_False_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_requires_grad_requires_grad_False_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_requires_grad_requires_grad_False_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_requires_grad_requires_grad_True_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_requires_grad_requires_grad_True_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_requires_grad_requires_grad_True_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_requires_grad_requires_grad_True_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_requires_grad_requires_grad_True_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_requires_grad_requires_grad_True_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_requires_grad_requires_grad_True_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_requires_grad_requires_grad_True_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_requires_grad_requires_grad_True_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_requires_grad_requires_grad_True_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_requires_grad_requires_grad_True_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_requires_grad_requires_grad_True_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_requires_grad_requires_grad_True_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape0_splat_shape_False_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape0_splat_shape_False_cuda_bool, 
test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape0_splat_shape_False_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape0_splat_shape_False_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape0_splat_shape_False_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape0_splat_shape_False_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape0_splat_shape_False_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape0_splat_shape_False_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape0_splat_shape_False_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape0_splat_shape_False_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape0_splat_shape_False_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape0_splat_shape_False_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape0_splat_shape_False_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape0_splat_shape_True_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape0_splat_shape_True_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape0_splat_shape_True_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape0_splat_shape_True_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape0_splat_shape_True_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape0_splat_shape_True_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape0_splat_shape_True_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape0_splat_shape_True_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape0_splat_shape_True_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape0_splat_shape_True_cuda_int32, 
test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape0_splat_shape_True_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape0_splat_shape_True_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape0_splat_shape_True_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape1_splat_shape_False_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape1_splat_shape_False_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape1_splat_shape_False_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape1_splat_shape_False_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape1_splat_shape_False_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape1_splat_shape_False_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape1_splat_shape_False_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape1_splat_shape_False_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape1_splat_shape_False_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape1_splat_shape_False_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape1_splat_shape_False_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape1_splat_shape_False_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape1_splat_shape_False_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape1_splat_shape_True_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape1_splat_shape_True_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape1_splat_shape_True_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape1_splat_shape_True_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape1_splat_shape_True_cuda_complex64, 
test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape1_splat_shape_True_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape1_splat_shape_True_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape1_splat_shape_True_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape1_splat_shape_True_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape1_splat_shape_True_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape1_splat_shape_True_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape1_splat_shape_True_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape1_splat_shape_True_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape2_splat_shape_False_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape2_splat_shape_False_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape2_splat_shape_False_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape2_splat_shape_False_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape2_splat_shape_False_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape2_splat_shape_False_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape2_splat_shape_False_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape2_splat_shape_False_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape2_splat_shape_False_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape2_splat_shape_False_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape2_splat_shape_False_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape2_splat_shape_False_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape2_splat_shape_False_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape2_splat_shape_True_cuda_bfloat16, 
test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape2_splat_shape_True_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape2_splat_shape_True_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape2_splat_shape_True_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape2_splat_shape_True_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape2_splat_shape_True_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape2_splat_shape_True_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape2_splat_shape_True_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape2_splat_shape_True_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape2_splat_shape_True_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape2_splat_shape_True_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape2_splat_shape_True_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape2_splat_shape_True_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape3_splat_shape_False_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape3_splat_shape_False_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape3_splat_shape_False_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape3_splat_shape_False_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape3_splat_shape_False_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape3_splat_shape_False_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape3_splat_shape_False_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape3_splat_shape_False_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape3_splat_shape_False_cuda_int16, 
test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape3_splat_shape_False_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape3_splat_shape_False_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape3_splat_shape_False_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape3_splat_shape_False_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape3_splat_shape_True_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape3_splat_shape_True_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape3_splat_shape_True_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape3_splat_shape_True_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape3_splat_shape_True_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape3_splat_shape_True_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape3_splat_shape_True_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape3_splat_shape_True_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape3_splat_shape_True_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape3_splat_shape_True_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape3_splat_shape_True_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape3_splat_shape_True_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape3_splat_shape_True_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape4_splat_shape_False_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape4_splat_shape_False_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape4_splat_shape_False_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape4_splat_shape_False_cuda_complex32, 
test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape4_splat_shape_False_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape4_splat_shape_False_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape4_splat_shape_False_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape4_splat_shape_False_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape4_splat_shape_False_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape4_splat_shape_False_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape4_splat_shape_False_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape4_splat_shape_False_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape4_splat_shape_False_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape4_splat_shape_True_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape4_splat_shape_True_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape4_splat_shape_True_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape4_splat_shape_True_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape4_splat_shape_True_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape4_splat_shape_True_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape4_splat_shape_True_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape4_splat_shape_True_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape4_splat_shape_True_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape4_splat_shape_True_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape4_splat_shape_True_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape4_splat_shape_True_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape4_splat_shape_True_cuda_uint8, 
test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape5_splat_shape_False_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape5_splat_shape_False_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape5_splat_shape_False_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape5_splat_shape_False_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape5_splat_shape_False_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape5_splat_shape_False_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape5_splat_shape_False_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape5_splat_shape_False_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape5_splat_shape_False_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape5_splat_shape_False_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape5_splat_shape_False_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape5_splat_shape_False_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape5_splat_shape_False_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape5_splat_shape_True_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape5_splat_shape_True_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape5_splat_shape_True_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape5_splat_shape_True_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape5_splat_shape_True_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape5_splat_shape_True_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape5_splat_shape_True_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape5_splat_shape_True_cuda_float64, 
test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape5_splat_shape_True_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape5_splat_shape_True_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape5_splat_shape_True_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape5_splat_shape_True_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape5_splat_shape_True_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape6_splat_shape_False_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape6_splat_shape_False_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape6_splat_shape_False_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape6_splat_shape_False_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape6_splat_shape_False_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape6_splat_shape_False_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape6_splat_shape_False_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape6_splat_shape_False_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape6_splat_shape_False_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape6_splat_shape_False_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape6_splat_shape_False_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape6_splat_shape_False_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape6_splat_shape_False_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape6_splat_shape_True_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape6_splat_shape_True_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape6_splat_shape_True_cuda_complex128, 
test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape6_splat_shape_True_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape6_splat_shape_True_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape6_splat_shape_True_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape6_splat_shape_True_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape6_splat_shape_True_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape6_splat_shape_True_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape6_splat_shape_True_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape6_splat_shape_True_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape6_splat_shape_True_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape6_splat_shape_True_cuda_uint8, test/test_testing.py::TestTestParametrization::test_apply_param_specific_decorators, test/test_testing.py::TestTestParametrization::test_compose_param_specific_decorators, test/test_testing.py::TestTestParametrization::test_default_names, test/test_testing.py::TestTestParametrization::test_modules_decorator_misuse_error, test/test_testing.py::TestTestParametrization::test_multiple_handling_of_same_param_error, test/test_testing.py::TestTestParametrization::test_name_fn, test/test_testing.py::TestTestParametrization::test_ops_decorator_misuse_error, test/test_testing.py::TestTestParametrization::test_reparametrize, test/test_testing.py::TestTestParametrization::test_subtest_expected_failure_x_1, test/test_testing.py::TestTestParametrization::test_subtest_expected_failure_x_2, test/test_testing.py::TestTestParametrization::test_subtest_expected_failure_x_3, test/test_testing.py::TestTestParametrization::test_subtest_names, test/test_testing.py::TestTestParametrization::test_two_things_subtest_expected_failure_x_1_y_4, 
test/test_testing.py::TestTestParametrization::test_two_things_subtest_expected_failure_x_1_y_5, test/test_testing.py::TestTestParametrization::test_two_things_subtest_expected_failure_x_1_y_6, test/test_testing.py::TestTestParametrization::test_two_things_subtest_expected_failure_x_2_y_4, test/test_testing.py::TestTestParametrization::test_two_things_subtest_expected_failure_x_2_y_5, test/test_testing.py::TestTestParametrization::test_two_things_subtest_expected_failure_x_2_y_6, test/test_testing.py::TestTestParametrization::test_two_things_subtest_expected_failure_x_3_y_4, test/test_testing.py::TestTestParametrization::test_two_things_subtest_expected_failure_x_3_y_5, test/test_testing.py::TestTestParametrization::test_two_things_subtest_expected_failure_x_3_y_6, test/test_testing.py::TestTestParametrizationDeviceTypeCUDA::test_default_name_non_primitive_cuda, test/test_testing.py::TestTestParametrizationDeviceTypeCUDA::test_default_names_cuda, test/test_testing.py::TestTestParametrizationDeviceTypeCUDA::test_dtypes_composition_invalid_cuda, test/test_testing.py::TestTestParametrizationDeviceTypeCUDA::test_dtypes_composition_valid_cuda, test/test_testing.py::TestTestParametrizationDeviceTypeCUDA::test_empty_param_list_cuda, test/test_testing.py::TestTestParametrizationDeviceTypeCUDA::test_empty_param_names_cuda, test/test_testing.py::TestTestParametrizationDeviceTypeCUDA::test_modules_composition_names_cuda, test/test_testing.py::TestTestParametrizationDeviceTypeCUDA::test_modules_decorator_applies_module_and_param_specific_decorators_cuda, test/test_testing.py::TestTestParametrizationDeviceTypeCUDA::test_multiple_handling_of_same_param_error_cuda, test/test_testing.py::TestTestParametrizationDeviceTypeCUDA::test_name_fn_cuda, test/test_testing.py::TestTestParametrizationDeviceTypeCUDA::test_ops_composition_names_cuda, test/test_testing.py::TestTestParametrizationDeviceTypeCUDA::test_ops_decorator_applies_op_and_param_specific_decorators_cuda, 
test/test_testing.py::TestTestParametrizationDeviceTypeCUDA::test_param_specific_decoration_cuda, test/test_testing.py::TestTestParametrizationDeviceTypeCUDA::test_subtest_expected_failure_x_1_cuda, test/test_testing.py::TestTestParametrizationDeviceTypeCUDA::test_subtest_expected_failure_x_2_cuda, test/test_testing.py::TestTestParametrizationDeviceTypeCUDA::test_subtest_expected_failure_x_3_cuda, test/test_testing.py::TestTestParametrizationDeviceTypeCUDA::test_subtest_names_cuda, test/test_testing.py::TestTestParametrizationDeviceTypeCUDA::test_two_things_subtest_expected_failure_x_1_y_4_cuda, test/test_testing.py::TestTestParametrizationDeviceTypeCUDA::test_two_things_subtest_expected_failure_x_1_y_5_cuda, test/test_testing.py::TestTestParametrizationDeviceTypeCUDA::test_two_things_subtest_expected_failure_x_1_y_6_cuda, test/test_testing.py::TestTestParametrizationDeviceTypeCUDA::test_two_things_subtest_expected_failure_x_2_y_4_cuda, test/test_testing.py::TestTestParametrizationDeviceTypeCUDA::test_two_things_subtest_expected_failure_x_2_y_5_cuda, test/test_testing.py::TestTestParametrizationDeviceTypeCUDA::test_two_things_subtest_expected_failure_x_2_y_6_cuda, test/test_testing.py::TestTestParametrizationDeviceTypeCUDA::test_two_things_subtest_expected_failure_x_3_y_4_cuda, test/test_testing.py::TestTestParametrizationDeviceTypeCUDA::test_two_things_subtest_expected_failure_x_3_y_5_cuda, test/test_testing.py::TestTestParametrizationDeviceTypeCUDA::test_two_things_subtest_expected_failure_x_3_y_6_cuda, test/test_testing.py::TestTestParametrizationDeviceTypeCUDA::test_unparametrized_names_cuda, test/test_testing.py::TestImports::test_circular_dependencies, test/test_testing.py::TestImports::test_lazy_imports_are_lazy, test/test_testing.py::TestImports::test_no_mutate_global_logging_on_import_path_functorch, test/test_testing.py::TestImports::test_no_mutate_global_logging_on_import_path_torch, test/test_testing.py::TestImports::test_no_warning_on_import, 
test/test_testing.py::TestImports::test_not_import_sympy, test/test_testing.py::TestOpInfos::test_sample_input, test/test_testing.py::TestOpInfos::test_sample_input_metadata, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_T_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators___radd___cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators___rand___cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators___rdiv___cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators___rmod___cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators___rmul___cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators___ror___cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators___rpow___cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators___rsub___cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators___rxor___cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators__chunk_cat_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_add_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_amax_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_amin_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_aminmax_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_arange_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_as_strided_scatter_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_atan2_cuda, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_bernoulli_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_bitwise_and_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_bitwise_left_shift_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_bitwise_or_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_bitwise_right_shift_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_bitwise_xor_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_bucketize_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_cat_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_cauchy_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_clamp_max_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_clamp_min_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_complex_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_copysign_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_cov_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_diag_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_diag_embed_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_diagonal_copy_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_diagonal_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_diff_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_div_floor_rounding_cuda, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_div_no_rounding_mode_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_div_trunc_rounding_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_dot_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_dsplit_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_dstack_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_empty_permuted_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_eq_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_exponential_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_eye_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_fft_fft2_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_fft_fft_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_fft_fftn_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_fft_hfft2_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_fft_hfft_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_fft_hfftn_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_fft_ifft2_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_fft_ifft_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_fft_ifftn_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_fft_ihfft2_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_fft_ihfft_cuda, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_fft_ihfftn_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_fft_irfft2_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_fft_irfft_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_fft_irfftn_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_fft_rfft2_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_fft_rfft_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_fft_rfftn_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_fliplr_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_flipud_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_float_power_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_floor_divide_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_fmax_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_fmin_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_fmod_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_gather_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_gcd_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_ge_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_geometric_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_gradient_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_gt_cuda, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_heaviside_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_histogramdd_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_hsplit_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_hstack_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_hypot_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_igamma_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_igammac_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_index_add_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_index_select_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_isclose_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_item_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_jiterator_binary_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_jiterator_binary_return_by_ref_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_kthvalue_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_lcm_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_ldexp_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_le_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_linalg_cross_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_linalg_diagonal_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_linalg_lstsq_cuda, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_linalg_lstsq_grad_oriented_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_linspace_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_linspace_tensor_overload_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_log_normal_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_logaddexp_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_logcumsumexp_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_logical_and_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_logical_or_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_logical_xor_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_logspace_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_logspace_tensor_overload_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_lt_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_masked_fill_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_masked_scatter_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_masked_select_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_max_binary_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_maximum_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_mean_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_median_cuda, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_min_binary_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_minimum_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_movedim_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_mul_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_multinomial_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_narrow_copy_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_narrow_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_native_layer_norm_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_ne_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_neg_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nextafter_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_adaptive_avg_pool1d_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_adaptive_avg_pool2d_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_adaptive_avg_pool3d_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_adaptive_max_pool1d_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_adaptive_max_pool2d_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_adaptive_max_pool3d_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_avg_pool1d_cuda, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_avg_pool2d_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_avg_pool3d_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_conv1d_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_conv2d_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_conv3d_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_embedding_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_gaussian_nll_loss_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_gelu_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_group_norm_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_hardtanh_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_hinge_embedding_loss_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_huber_loss_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_l1_loss_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_margin_ranking_loss_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_max_pool1d_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_max_pool2d_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_max_pool3d_cuda, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_multi_margin_loss_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_multilabel_margin_loss_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_poisson_nll_loss_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_prelu_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_rms_norm_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_rrelu_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_soft_margin_loss_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_softshrink_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_triplet_margin_loss_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_triplet_margin_with_distance_loss_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_normal_in_place_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_ormqr_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_polar_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_pow_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_remainder_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_renorm_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_reshape_as_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_reshape_cuda, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_roll_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_rot90_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_rsub_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_scatter_add_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_scatter_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_signal_windows_bartlett_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_signal_windows_blackman_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_signal_windows_cosine_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_signal_windows_exponential_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_signal_windows_gaussian_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_signal_windows_general_cosine_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_signal_windows_general_hamming_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_signal_windows_hamming_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_signal_windows_hann_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_signal_windows_kaiser_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_signal_windows_nuttall_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_special_chebyshev_polynomial_t_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_special_chebyshev_polynomial_u_cuda, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_special_chebyshev_polynomial_v_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_special_chebyshev_polynomial_w_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_special_hermite_polynomial_h_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_special_hermite_polynomial_he_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_special_laguerre_polynomial_l_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_special_legendre_polynomial_p_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_special_shifted_chebyshev_polynomial_t_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_special_shifted_chebyshev_polynomial_u_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_special_shifted_chebyshev_polynomial_v_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_special_shifted_chebyshev_polynomial_w_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_special_xlog1py_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_special_zeta_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_sub_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_sum_to_size_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_t_copy_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_t_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_take_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_trace_cuda, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_tril_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_triu_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_true_divide_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_unbind_copy_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_unbind_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_uniform_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_vdot_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_view_as_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_view_copy_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_view_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_vsplit_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_vstack_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_where_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_xlogy_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators___radd___cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators___rand___cuda_int64, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators___rdiv___cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators___rmod___cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators___rmul___cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators___ror___cuda_int64, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators___rpow___cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators___rsub___cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators___rxor___cuda_int64, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_abs_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_acos_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_acosh_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_add_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_addcdiv_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_addcmul_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_angle_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_asin_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_asinh_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_atan2_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_atan_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_atanh_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_bfloat16_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_bitwise_and_cuda_int64, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_bitwise_left_shift_cuda_int64, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_bitwise_not_cuda_int64, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_bitwise_or_cuda_int64, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_bitwise_right_shift_cuda_int64, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_bitwise_xor_cuda_int64, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_bool_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_broadcast_tensors_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_bucketize_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_byte_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_cat_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_cdouble_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_ceil_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_cfloat_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_chalf_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_char_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_chunk_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_clamp_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_clamp_max_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_clamp_min_cuda_float32, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_clone_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_complex_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_conj_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_conj_physical_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_contiguous_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_copysign_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_cos_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_cosh_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_deg2rad_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_diag_embed_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_diagonal_copy_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_diagonal_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_digamma_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_div_floor_rounding_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_div_no_rounding_mode_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_div_trunc_rounding_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_double_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_empty_like_cuda_float32, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_eq_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_erf_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_erfc_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_erfinv_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_exp2_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_exp_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_expm1_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_fill_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_flatten_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_float_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_float_power_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_floor_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_floor_divide_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_fmax_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_fmin_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_fmod_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_frac_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_frexp_cuda_float32, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_gcd_cuda_int64, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_ge_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_gt_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_half_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_heaviside_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_hypot_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_i0_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_igamma_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_igammac_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_imag_cuda_complex64, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_index_add_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_index_copy_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_index_fill_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_index_select_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_int_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_isclose_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_isfinite_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_isinf_cuda_float32, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_isnan_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_isneginf_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_isposinf_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_isreal_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_jiterator_binary_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_jiterator_binary_return_by_ref_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_jiterator_unary_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_lcm_cuda_int64, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_ldexp_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_le_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_lgamma_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_log10_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_log1p_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_log2_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_log_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_logaddexp_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_logical_and_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_logical_not_cuda_float32, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_logical_or_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_logical_xor_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_logit_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_logsumexp_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_long_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_lt_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_max_binary_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_maximum_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_min_binary_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_minimum_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_movedim_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_mul_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_nan_to_num_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_narrow_copy_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_narrow_cuda_float32, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_ne_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_neg_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_nextafter_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_nn_functional_celu_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_nn_functional_elu_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_nn_functional_grid_sample_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_nn_functional_group_norm_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_nn_functional_hardshrink_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_nn_functional_hardsigmoid_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_nn_functional_hardtanh_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_nn_functional_hinge_embedding_loss_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_nn_functional_interpolate_bicubic_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_nn_functional_interpolate_bilinear_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_nn_functional_logsigmoid_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_nn_functional_margin_ranking_loss_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_nn_functional_mish_cuda_float32, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_nn_functional_multi_margin_loss_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_nn_functional_multilabel_margin_loss_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_nn_functional_prelu_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_nn_functional_relu6_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_nn_functional_relu_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_nn_functional_rrelu_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_nn_functional_selu_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_nn_functional_silu_complex_cuda_complex64, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_nn_functional_silu_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_nn_functional_softplus_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_nn_functional_softshrink_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_nn_functional_softsign_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_nn_functional_tanhshrink_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_nn_functional_threshold_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_nn_functional_upsample_bilinear_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_permute_copy_cuda_float32, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_permute_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_polar_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_polygamma_polygamma_n_0_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_polygamma_polygamma_n_1_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_polygamma_polygamma_n_2_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_polygamma_polygamma_n_3_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_polygamma_polygamma_n_4_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_positive_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_pow_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_rad2deg_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_real_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_reciprocal_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_remainder_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_reshape_as_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_reshape_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_round_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_round_decimals_0_cuda_float32, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_round_decimals_3_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_round_decimals_neg_3_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_rsqrt_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_rsub_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_sgn_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_short_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_sigmoid_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_sign_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_signal_windows_bartlett_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_signal_windows_blackman_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_signal_windows_cosine_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_signal_windows_exponential_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_signal_windows_gaussian_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_signal_windows_general_cosine_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_signal_windows_general_hamming_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_signal_windows_hamming_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_signal_windows_hann_cuda_float32, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_signal_windows_kaiser_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_signal_windows_nuttall_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_signbit_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_sin_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_sinc_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_sinh_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_airy_ai_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_bessel_j0_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_bessel_j1_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_bessel_y0_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_bessel_y1_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_chebyshev_polynomial_t_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_chebyshev_polynomial_u_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_chebyshev_polynomial_v_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_chebyshev_polynomial_w_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_entr_cuda_float32, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_erfcx_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_hermite_polynomial_h_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_hermite_polynomial_he_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_i0e_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_i1_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_i1e_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_laguerre_polynomial_l_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_legendre_polynomial_p_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_log_ndtr_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_modified_bessel_i0_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_modified_bessel_i1_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_modified_bessel_k0_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_modified_bessel_k1_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_ndtr_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_ndtri_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_polygamma_special_polygamma_n_0_cuda_float32, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_scaled_modified_bessel_k0_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_scaled_modified_bessel_k1_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_spherical_bessel_j0_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_xlog1py_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_zeta_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_sqrt_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_square_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_sub_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_tan_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_tanh_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_true_divide_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_trunc_cuda_float32, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_unsafe_chunk_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_view_as_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_view_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_where_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_xlogy_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_H_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_T_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators___getitem___cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators___radd___cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators___rand___cuda_int64, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators___rdiv___cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators___rmatmul___cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators___rmod___cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators___rmul___cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators___ror___cuda_int64, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators___rpow___cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators___rsub___cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators___rxor___cuda_int64, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators__batch_norm_with_update_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators__chunk_cat_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators__native_batch_norm_legit_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators__segment_reduce_lengths_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators__segment_reduce_offsets_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators__softmax_backward_data_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators__unsafe_masked_index_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators__unsafe_masked_index_put_accumulate_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators__upsample_bilinear2d_aa_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_abs_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_acos_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_acosh_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_add_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_addbmm_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_addcdiv_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_addcmul_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_addmm_cuda_float32, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_addmm_decomposed_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_addmv_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_addr_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_alias_copy_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_all_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_allclose_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_amax_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_amin_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_aminmax_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_angle_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_any_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_arange_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_argmax_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_argmin_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_argsort_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_argwhere_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_as_strided_copy_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_as_strided_cuda_float32, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_as_strided_partial_views_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_as_strided_scatter_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_asin_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_asinh_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_atan2_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_atan_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_atanh_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_atleast_1d_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_atleast_2d_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_atleast_3d_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_baddbmm_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_bernoulli_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_bfloat16_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_bincount_cuda_int64, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_bitwise_and_cuda_int64, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_bitwise_left_shift_cuda_int64, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_bitwise_not_cuda_int64, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_bitwise_or_cuda_int64, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_bitwise_right_shift_cuda_int64, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_bitwise_xor_cuda_int64, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_block_diag_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_bmm_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_bool_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_broadcast_shapes_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_broadcast_tensors_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_broadcast_to_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_bucketize_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_byte_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_cartesian_prod_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_cat_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_cauchy_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_cdist_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_cdouble_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_ceil_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_cfloat_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_chalf_cuda_float32, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_char_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_cholesky_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_cholesky_inverse_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_cholesky_solve_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_chunk_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_clamp_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_clamp_max_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_clamp_min_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_clone_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_column_stack_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_combinations_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_complex_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_conj_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_conj_physical_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_constant_pad_nd_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_contiguous_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_copysign_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_corrcoef_cuda_float32, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_cos_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_cosh_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_count_nonzero_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_cov_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_cross_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_cummax_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_cummin_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_cumprod_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_cumsum_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_cumulative_trapezoid_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_deg2rad_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_diag_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_diag_embed_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_diagflat_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_diagonal_copy_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_diagonal_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_diagonal_scatter_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_diff_cuda_float32, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_digamma_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_dist_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_div_floor_rounding_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_div_no_rounding_mode_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_div_trunc_rounding_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_dot_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_double_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_dsplit_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_dstack_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_einsum_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_empty_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_empty_like_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_empty_permuted_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_empty_strided_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_eq_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_equal_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_erf_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_erfc_cuda_float32, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_erfinv_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_exp2_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_exp_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_expand_as_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_expand_copy_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_expand_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_expm1_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_exponential_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_eye_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_fft_fft2_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_fft_fft_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_fft_fftn_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_fft_fftshift_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_fft_hfft2_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_fft_hfft_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_fft_hfftn_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_fft_ifft2_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_fft_ifft_cuda_float32, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_fft_ifftn_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_fft_ifftshift_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_fft_ihfft2_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_fft_ihfft_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_fft_ihfftn_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_fft_irfft2_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_fft_irfft_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_fft_irfftn_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_fft_rfft2_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_fft_rfft_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_fft_rfftn_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_fill_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_flatten_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_flip_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_fliplr_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_flipud_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_float_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_float_power_cuda_float32, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_floor_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_floor_divide_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_fmax_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_fmin_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_fmod_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_frac_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_frexp_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_full_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_full_like_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_gather_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_gcd_cuda_int64, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_ge_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_geometric_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_geqrf_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_gradient_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_grid_sampler_2d_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_grid_sampler_3d_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_gt_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_half_cuda_float32, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_hash_tensor_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_heaviside_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_histc_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_hsplit_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_hstack_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_hypot_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_i0_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_igamma_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_igammac_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_imag_cuda_complex64, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_index_add_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_index_copy_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_index_fill_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_index_put_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_index_reduce_amax_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_index_reduce_amin_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_index_reduce_mean_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_index_reduce_prod_cuda_float32, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_index_select_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_inner_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_int_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_isclose_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_isfinite_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_isin_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_isinf_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_isnan_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_isneginf_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_isposinf_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_isreal_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_istft_cuda_complex64, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_item_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_jiterator_2inputs_2outputs_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_jiterator_4inputs_with_extra_args_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_jiterator_binary_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_jiterator_binary_return_by_ref_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_jiterator_unary_cuda_float32, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_kron_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_kthvalue_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_lcm_cuda_int64, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_ldexp_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_le_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_lerp_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_lgamma_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_cholesky_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_cholesky_ex_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_cond_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_cross_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_det_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_diagonal_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_eig_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_eigh_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_eigvals_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_eigvalsh_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_householder_product_cuda_float32, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_inv_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_inv_ex_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_ldl_factor_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_ldl_factor_ex_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_ldl_solve_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_lstsq_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_lstsq_grad_oriented_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_lu_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_lu_factor_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_lu_factor_ex_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_lu_solve_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_matrix_norm_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_matrix_power_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_matrix_rank_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_matrix_rank_hermitian_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_multi_dot_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_norm_cuda_float32, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_norm_subgradients_at_zero_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_pinv_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_pinv_hermitian_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_pinv_singular_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_qr_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_slogdet_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_solve_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_solve_ex_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_solve_triangular_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_svd_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_svdvals_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_tensorinv_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_tensorsolve_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_vander_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_vecdot_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_vector_norm_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linspace_cuda_float32, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linspace_tensor_overload_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_log10_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_log1p_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_log2_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_log_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_log_normal_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_log_softmax_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_log_softmax_with_dtype_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_logaddexp2_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_logaddexp_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_logcumsumexp_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_logdet_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_logical_and_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_logical_not_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_logical_or_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_logical_xor_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_logit_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_logspace_cuda_float32, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_logspace_tensor_overload_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_logsumexp_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_long_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_lt_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_lu_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_lu_solve_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_lu_unpack_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_mH_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_mT_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_masked_amax_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_masked_amin_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_masked_argmax_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_masked_argmin_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_masked_cumprod_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_masked_cumsum_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_masked_fill_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_masked_log_softmax_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_masked_logaddexp_cuda_float32, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_masked_logsumexp_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_masked_mean_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_masked_median_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_masked_norm_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_masked_normalize_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_masked_prod_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_masked_scatter_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_masked_select_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_masked_softmax_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_masked_softmin_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_masked_std_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_masked_sum_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_masked_var_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_matmul_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_matrix_exp_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_max_binary_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_max_pool2d_with_indices_backward_cuda_float32, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_max_reduction_no_dim_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_max_reduction_with_dim_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_maximum_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_mean_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_median_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_meshgrid_list_of_tensors_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_meshgrid_variadic_tensors_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_min_binary_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_min_reduction_no_dim_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_min_reduction_with_dim_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_minimum_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_mm_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_mode_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_movedim_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_msort_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_mul_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_multinomial_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_mv_cuda_float32, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nan_to_num_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nanmean_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nanmedian_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nanquantile_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nansum_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_narrow_copy_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_narrow_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_native_batch_norm_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_native_dropout_backward_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_native_layer_norm_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_ne_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_neg_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_new_empty_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_new_empty_strided_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_new_full_cuda_float32, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_new_ones_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_new_zeros_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nextafter_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_alpha_dropout_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_avg_pool1d_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_avg_pool2d_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_avg_pool3d_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_batch_norm_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_batch_norm_without_cudnn_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_bilinear_cuda_float32, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_binary_cross_entropy_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_celu_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_channel_shuffle_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_conv1d_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_conv2d_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_conv3d_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_conv_transpose1d_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_conv_transpose2d_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_conv_transpose3d_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_cosine_embedding_loss_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_cosine_similarity_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_cross_entropy_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_ctc_loss_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_dropout2d_cuda_float32, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_dropout3d_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_dropout_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_elu_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_embedding_bag_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_embedding_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_fractional_max_pool2d_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_fractional_max_pool3d_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_gaussian_nll_loss_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_gelu_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_glu_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_grid_sample_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_group_norm_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_hardshrink_cuda_float32, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_hardsigmoid_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_hardswish_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_hardtanh_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_hinge_embedding_loss_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_huber_loss_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_instance_norm_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_interpolate_area_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_interpolate_bicubic_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_interpolate_bilinear_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_interpolate_linear_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_interpolate_nearest_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_interpolate_trilinear_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_kl_div_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_l1_loss_cuda_float32, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_layer_norm_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_leaky_relu_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_linear_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_local_response_norm_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_logsigmoid_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_margin_ranking_loss_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_max_pool1d_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_max_pool2d_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_max_pool3d_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_max_unpool1d_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_max_unpool1d_grad_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_max_unpool2d_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_max_unpool2d_grad_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_max_unpool3d_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_max_unpool3d_grad_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_mish_cuda_float32, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_mse_loss_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_multi_head_attention_forward_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_multi_margin_loss_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_multilabel_margin_loss_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_nll_loss_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_normalize_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_one_hot_cuda_int64, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_pad_circular_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_pad_constant_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_pad_reflect_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_pad_replicate_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_pad_replicate_negative_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_pairwise_distance_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_pdist_cuda_float32, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_pixel_shuffle_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_pixel_unshuffle_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_poisson_nll_loss_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_prelu_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_relu6_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_relu_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_rms_norm_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_rrelu_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_selu_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_silu_complex_cuda_complex64, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_silu_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_smooth_l1_loss_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_soft_margin_loss_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_softmin_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_softmin_with_dtype_cuda_float32, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_softplus_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_softshrink_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_softsign_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_tanhshrink_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_threshold_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_triplet_margin_loss_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_unfold_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_upsample_bilinear_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_upsample_nearest_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nonzero_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nonzero_static_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_norm_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_norm_fro_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_norm_inf_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_norm_nuc_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_normal_cuda_float32, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_normal_in_place_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_normal_number_mean_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_ones_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_ones_like_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_ormqr_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_outer_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_pca_lowrank_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_permute_copy_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_permute_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_pinverse_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_polar_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_polygamma_polygamma_n_0_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_polygamma_polygamma_n_1_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_polygamma_polygamma_n_2_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_polygamma_polygamma_n_3_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_polygamma_polygamma_n_4_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_positive_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_pow_cuda_float32, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_prod_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_put_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_qr_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_quantile_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_rad2deg_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_rand_like_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_randint_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_randint_like_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_randn_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_randn_like_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_ravel_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_real_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_reciprocal_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_remainder_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_renorm_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_repeat_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_repeat_interleave_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_reshape_as_cuda_float32, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_reshape_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_resize__cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_resize_as__cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_resolve_conj_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_resolve_neg_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_roll_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_rot90_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_round_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_round_decimals_0_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_round_decimals_3_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_round_decimals_neg_3_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_rsqrt_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_rsub_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_scalar_tensor_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_scatter_add_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_scatter_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_scatter_reduce_amax_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_scatter_reduce_amin_cuda_float32, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_scatter_reduce_mean_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_scatter_reduce_prod_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_scatter_reduce_sum_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_searchsorted_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_select_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_select_scatter_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_sgn_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_short_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_sigmoid_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_sign_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_signal_windows_bartlett_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_signal_windows_blackman_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_signal_windows_cosine_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_signal_windows_exponential_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_signal_windows_gaussian_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_signal_windows_general_cosine_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_signal_windows_general_hamming_cuda_float32, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_signal_windows_hamming_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_signal_windows_hann_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_signal_windows_kaiser_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_signal_windows_nuttall_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_signbit_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_sin_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_sinc_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_sinh_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_slice_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_slice_scatter_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_softmax_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_softmax_with_dtype_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_sort_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_sparse_mm_reduce_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_sparse_sampled_addmm_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_airy_ai_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_bessel_j0_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_bessel_j1_cuda_float32, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_bessel_y0_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_bessel_y1_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_chebyshev_polynomial_t_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_chebyshev_polynomial_u_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_chebyshev_polynomial_v_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_chebyshev_polynomial_w_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_entr_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_erfcx_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_hermite_polynomial_h_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_hermite_polynomial_he_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_i0e_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_i1_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_i1e_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_laguerre_polynomial_l_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_legendre_polynomial_p_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_log_ndtr_cuda_float32, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_modified_bessel_i0_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_modified_bessel_i1_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_modified_bessel_k0_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_modified_bessel_k1_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_ndtr_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_ndtri_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_scaled_modified_bessel_k0_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_scaled_modified_bessel_k1_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_spherical_bessel_j0_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_xlog1py_cuda_float32, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_zeta_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_split_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_split_list_args_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_split_with_sizes_copy_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_split_with_sizes_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_sqrt_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_square_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_squeeze_copy_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_squeeze_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_squeeze_multiple_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_stack_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_std_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_std_mean_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_std_mean_unbiased_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_std_unbiased_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_stft_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_sub_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_sum_cuda_float32, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_sum_to_size_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_svd_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_svd_lowrank_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_t_copy_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_t_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_take_along_dim_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_take_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_tan_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_tanh_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_tensor_split_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_tensordot_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_tile_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_to_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_to_sparse_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_topk_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_torch__scaled_mm_cuda_float8_e4m3fn, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_torch_ops_aten__efficient_attention_forward_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_torch_ops_aten__flash_attention_forward_cuda_float16, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_trace_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_transpose_copy_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_transpose_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_trapezoid_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_trapz_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_triangular_solve_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_tril_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_tril_indices_cuda_int64, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_triu_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_triu_indices_cuda_int64, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_true_divide_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_trunc_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_unbind_copy_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_unbind_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_unflatten_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_unfold_copy_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_unfold_cuda_float32, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_uniform_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_unique_consecutive_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_unique_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_unravel_index_cuda_int64, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_unsafe_chunk_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_unsafe_split_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_unsqueeze_copy_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_unsqueeze_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_var_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_var_mean_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_var_mean_unbiased_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_var_unbiased_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_vdot_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_view_as_complex_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_view_as_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_view_as_real_cuda_complex64, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_view_copy_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_view_cuda_float32, 
test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_vsplit_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_vstack_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_where_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_xlogy_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_zero__cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_zeros_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_zeros_like_cuda_float32 2025-10-10T01:21:09.1696481Z 2025-10-10T01:21:12.5812242Z Running inductor/test_aot_inductor 1/1 ... [2025-10-10 01:21:12.580589] 2025-10-10T01:21:12.5813978Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:21:12.5815021Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_aot_inductor.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:21:12.581064] 2025-10-10T01:21:12.8996342Z Running inductor/test_torchinductor 1/1 ... [2025-10-10 01:21:12.898928] 2025-10-10T01:21:12.8997284Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:21:12.8998733Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:21:12.899349] 2025-10-10T01:24:28.8424710Z 2025-10-10T01:24:28.8425933Z test_ops 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_ops_1.1_70958335944aa2bf_.log 2025-10-10T01:24:29.8972626Z Running 33605 items in this shard: test/test_ops.py::TestSelfKwarg::test_self_kwargs, test/test_ops.py::TestCommonCUDA::test_compare_cpu_H_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_T_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu___getitem___cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu___radd___cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu___rand___cuda_int64, test/test_ops.py::TestCommonCUDA::test_compare_cpu___rdiv___cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu___rmatmul___cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu___rmod___cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu___rmul___cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu___ror___cuda_int64, test/test_ops.py::TestCommonCUDA::test_compare_cpu___rpow___cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu___rsub___cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu___rxor___cuda_int64, test/test_ops.py::TestCommonCUDA::test_compare_cpu__batch_norm_with_update_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__chunk_cat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__native_batch_norm_legit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_T_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs__conversions_bfloat16_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs__conversions_bool_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs__conversions_byte_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs__conversions_cdouble_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs__conversions_cfloat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs__conversions_chalf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs__conversions_char_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs__conversions_complex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs__conversions_double_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs__conversions_float_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs__conversions_half_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs__conversions_int_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs__conversions_long_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs__conversions_polar_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs__conversions_short_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_addcdiv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_addcmul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_addr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_alias_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_arange_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_as_strided_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_as_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_as_strided_partial_views_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_as_strided_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_atan2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_atleast_1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_atleast_2d_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_atleast_3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_bitwise_left_shift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_bitwise_right_shift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_block_diag_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_bucketize_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_cauchy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_chunk_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_column_stack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_constant_pad_nd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_contiguous_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_copysign_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_cumprod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_cumsum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_diag_embed_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_diagonal_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_diagonal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_diagonal_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_div_floor_rounding_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_div_no_rounding_mode_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_div_trunc_rounding_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_dot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_dsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_dstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_empty_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_empty_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_empty_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_expand_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_expand_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_expand_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_exponential_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_eye_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_fft_fftshift_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_fft_ifftshift_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_flip_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_fliplr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_flipud_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_fmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_fmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_geometric_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_hsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_hstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_hypot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_igamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_igammac_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_index_add_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_index_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_index_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_index_select_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_istft_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_lerp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_linalg_diagonal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_linalg_matrix_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_linalg_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_linalg_svd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_linalg_svdvals_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_linalg_vector_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_linspace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_linspace_tensor_overload_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_log_normal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_logaddexp2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_logaddexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_logspace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_logspace_tensor_overload_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_logsumexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_masked_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_meshgrid_list_of_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_movedim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_mul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_narrow_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_narrow_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_new_empty_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_new_empty_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_new_full_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_new_ones_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_new_zeros_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nextafter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_alpha_dropout_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_channel_shuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_dropout_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_glu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_hardshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_hardtanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_hinge_embedding_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_huber_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_leaky_relu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_margin_ranking_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_pixel_shuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_pixel_unshuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_poisson_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_relu6_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_softmin_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_softshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_triplet_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_normal__in_place_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_normal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_normal_number_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_ones_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_permute_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_randn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_renorm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_repeat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_reshape_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_reshape_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_rot90_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_rsub_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_select_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_special_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_special_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_special_xlog1py_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_special_zeta_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_split_with_sizes_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_stack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_std_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_stft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_sum_to_size_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_t_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_t_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_take_along_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_to_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_trace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_transpose_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_tril_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_triu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_true_divide_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_unflatten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_unfold_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_unfold_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_unsqueeze_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_unsqueeze_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_var_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_vdot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_view_as_complex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_view_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_view_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_vsplit_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_vstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_xlogy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_zeros_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__segment_reduce_lengths_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__segment_reduce_offsets_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__softmax_backward_data_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__unsafe_masked_index_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__unsafe_masked_index_put_accumulate_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__upsample_bilinear2d_aa_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_addcdiv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_addcmul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_addmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_addmm_decomposed_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_addmv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_addr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_alias_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_arange_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_argsort_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_as_strided_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_as_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_as_strided_partial_views_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_as_strided_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_atan2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_atleast_1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_atleast_2d_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_compare_cpu_atleast_3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_baddbmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_bernoulli_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_bfloat16_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_bincount_cuda_int64, test/test_ops.py::TestCommonCUDA::test_compare_cpu_bitwise_left_shift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_compare_cpu_bitwise_right_shift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_compare_cpu_block_diag_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_bmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_bool_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_bucketize_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_byte_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_cartesian_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_cauchy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_cdist_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_cdouble_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_cfloat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_chalf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_char_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_cholesky_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_cholesky_inverse_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_cholesky_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_chunk_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_column_stack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_combinations_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_complex_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_compare_cpu_constant_pad_nd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_contiguous_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_copysign_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_corrcoef_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_cov_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_cross_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_cummax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_cummin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_cumprod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_cumsum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_cumulative_trapezoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_diag_embed_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_diagonal_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_diagonal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_diagonal_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_dist_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_div_floor_rounding_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_div_no_rounding_mode_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_div_trunc_rounding_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_dot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_double_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_dsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_dstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_einsum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_empty_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_empty_like_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_compare_cpu_empty_permuted_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_empty_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_expand_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_expand_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_expand_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_exponential_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_eye_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_fft_fftshift_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_fft_ifftshift_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_flip_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_fliplr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_flipud_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_float_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_fmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_fmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_full_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_full_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_gather_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_geometric_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_geqrf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_gradient_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_grid_sampler_2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_grid_sampler_3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_half_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_histc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_hsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_hstack_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_compare_cpu_hypot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_igamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_igammac_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_index_add_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_index_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_index_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_index_put_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_index_reduce_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_index_reduce_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_index_reduce_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_index_reduce_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_index_select_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_inner_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_int_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_isin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_istft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_compare_cpu_kron_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_kthvalue_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_ldexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_lerp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_cholesky_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_cholesky_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_cond_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_det_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_diagonal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_eig_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_eigh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_eigvals_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_eigvalsh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_householder_product_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_inv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_inv_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_ldl_factor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_ldl_factor_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_ldl_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_lstsq_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_lstsq_grad_oriented_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_lu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_lu_factor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_lu_factor_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_lu_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_matrix_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_matrix_power_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_matrix_rank_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_matrix_rank_hermitian_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_multi_dot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_norm_subgradients_at_zero_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_pinv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_pinv_hermitian_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_pinv_singular_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_qr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_slogdet_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_solve_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_solve_triangular_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_svd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_svdvals_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_vector_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linspace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linspace_tensor_overload_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_log_normal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_log_softmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_logaddexp2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_logaddexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_logcumsumexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_logdet_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_logspace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_logspace_tensor_overload_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_logsumexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_long_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_lu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_lu_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_lu_unpack_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_compare_cpu_mH_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_mT_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_masked_cumprod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_masked_cumsum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_masked_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_masked_log_softmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_masked_logaddexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_masked_logsumexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_masked_median_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_masked_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_masked_normalize_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_masked_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_masked_select_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_masked_softmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_masked_softmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_matmul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_matrix_exp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_max_pool2d_with_indices_backward_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_max_reduction_no_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_max_reduction_with_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_median_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_meshgrid_list_of_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_min_reduction_no_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_min_reduction_with_dim_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_compare_cpu_mm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_mode_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_movedim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_msort_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_mul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_multinomial_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_mv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nanmedian_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nanquantile_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_narrow_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_narrow_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_native_batch_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_native_dropout_backward_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_new_empty_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_new_empty_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_new_full_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_new_ones_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_new_zeros_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nextafter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_adaptive_max_pool2d_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_alpha_dropout_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_avg_pool1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_avg_pool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_avg_pool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_batch_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_bilinear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_binary_cross_entropy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_channel_shuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_conv1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_conv2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_conv3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_cosine_embedding_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_cosine_similarity_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_cross_entropy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_ctc_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_dropout2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_dropout3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_dropout_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_embedding_bag_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_embedding_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_fractional_max_pool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_fractional_max_pool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_gaussian_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_glu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_grid_sample_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_hardshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_hardswish_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_hardtanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_hinge_embedding_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_huber_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_instance_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_interpolate_area_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_interpolate_bicubic_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_interpolate_bilinear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_interpolate_linear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_interpolate_nearest_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_interpolate_trilinear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_kl_div_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_leaky_relu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_linear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_local_response_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_margin_ranking_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_max_pool1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_max_pool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_max_pool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_max_unpool1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_max_unpool1d_grad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_max_unpool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_max_unpool2d_grad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_max_unpool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_max_unpool3d_grad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_multi_head_attention_forward_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_multi_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_multilabel_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_normalize_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_pad_circular_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_pad_constant_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_pad_reflect_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_pad_replicate_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_pad_replicate_negative_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_pixel_shuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_pixel_unshuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_poisson_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_relu6_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_rrelu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_soft_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_softmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_softmin_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_softshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_triplet_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_unfold_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_upsample_bilinear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_upsample_nearest_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nonzero_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_compare_cpu_nonzero_static_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_norm_fro_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_norm_inf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_norm_nuc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_normal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_normal_in_place_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_normal_number_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_ones_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_ones_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_ormqr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_outer_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_pca_lowrank_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_permute_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_pinverse_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_polar_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_put_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_qr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_quantile_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_rand_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_randint_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_randint_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_randn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_randn_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_renorm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_repeat_interleave_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_compare_cpu_reshape_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_reshape_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_resize__cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_resize_as__cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_resolve_conj_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_resolve_neg_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_rot90_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_rsub_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_scalar_tensor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_scatter_add_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_scatter_reduce_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_scatter_reduce_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_scatter_reduce_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_scatter_reduce_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_scatter_reduce_sum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_select_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_select_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_short_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_slice_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_slice_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_softmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_sort_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_sparse_mm_reduce_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_compare_cpu_sparse_sampled_addmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_special_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_special_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_special_chebyshev_polynomial_v_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_special_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_special_hermite_polynomial_h_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_special_hermite_polynomial_he_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_special_laguerre_polynomial_l_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_special_legendre_polynomial_p_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_special_xlog1py_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_special_zeta_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_split_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_split_list_args_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_split_with_sizes_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_split_with_sizes_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_stack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_std_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_std_mean_unbiased_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_compare_cpu_std_unbiased_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_stft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_sum_to_size_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_svd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_svd_lowrank_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_t_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_t_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_take_along_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_take_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_tensordot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_to_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_to_sparse_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_topk_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_torch__scaled_mm_cuda_float8_e4m3fn, test/test_ops.py::TestCommonCUDA::test_compare_cpu_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_trace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_transpose_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_trapezoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_trapz_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_triangular_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_tril_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_triu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_true_divide_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_unflatten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_unfold_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_unfold_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_compare_cpu_uniform_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_unique_consecutive_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_unique_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_unsafe_chunk_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_unsafe_split_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_unsqueeze_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_unsqueeze_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_var_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_var_mean_unbiased_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_var_unbiased_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_vdot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_view_as_complex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_view_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_view_as_real_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_compare_cpu_view_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_vsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_vstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_xlogy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_zero__cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_zeros_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_zeros_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_H_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_T_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing___getitem___cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing__chunk_cat_cuda_complex32, 
test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_abs_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_acos_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_acosh_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_add_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_alias_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_angle_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_as_strided_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_as_strided_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_as_strided_partial_views_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_as_strided_scatter_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_asin_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_asinh_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_atan_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_atanh_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_atleast_1d_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_atleast_2d_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_atleast_3d_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_bfloat16_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_block_diag_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_bool_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_cat_cuda_complex32, 
test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_cdouble_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_cfloat_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_chalf_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_char_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_chunk_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_clone_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_column_stack_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_conj_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_conj_physical_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_contiguous_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_cos_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_cosh_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_diag_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_diag_embed_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_diagonal_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_diagonal_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_div_no_rounding_mode_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_double_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_dsplit_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_dstack_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_empty_cuda_complex32, 
test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_empty_like_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_empty_permuted_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_eq_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_exp_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_fft_fft2_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_fft_fft_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_fft_fftn_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_fft_fftshift_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_fft_hfft2_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_fft_hfft_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_fft_hfftn_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_fft_ifft2_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_fft_ifft_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_fft_ifftn_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_fft_ifftshift_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_fft_irfft2_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_fft_irfft_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_fft_irfftn_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_fill_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_flatten_cuda_complex32, 
test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_float_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_full_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_hsplit_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_hstack_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_imag_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_index_add_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_index_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_index_fill_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_index_put_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_index_select_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_isfinite_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_isinf_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_isreal_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_item_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_lerp_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_linalg_diagonal_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_log_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_log_softmax_with_dtype_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_long_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_mH_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_mT_cuda_complex32, 
test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_masked_fill_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_movedim_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_mul_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_nanmean_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_nansum_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_narrow_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_narrow_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_neg_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_new_empty_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_new_empty_strided_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_new_full_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_new_ones_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_new_zeros_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_nn_functional_conv1d_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_nn_functional_conv2d_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_nn_functional_conv3d_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_nn_functional_conv_transpose1d_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_nn_functional_conv_transpose2d_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_nn_functional_conv_transpose3d_cuda_complex32, 
test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_nonzero_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_nonzero_static_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_ones_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_ones_like_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_permute_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_permute_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_positive_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_pow_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_prod_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_rand_like_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_randn_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_randn_like_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_ravel_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_real_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_repeat_interleave_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_reshape_as_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_reshape_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_resolve_neg_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_roll_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_rsqrt_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_scalar_tensor_cuda_complex32, 
test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_select_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_sgn_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_sigmoid_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_sin_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_sinh_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_slice_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_split_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_split_with_sizes_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_split_with_sizes_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_sqrt_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_squeeze_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_squeeze_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_squeeze_multiple_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_stack_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_sub_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_sum_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_tan_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_tanh_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_trace_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_transpose_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_transpose_cuda_complex32, 
test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_tril_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_triu_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_true_divide_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_unbind_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_unbind_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_unflatten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_unfold_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_unfold_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_unsafe_chunk_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_unsafe_split_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_unsqueeze_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_unsqueeze_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_view_as_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_view_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_vsplit_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_vstack_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_where_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_zeros_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_zeros_like_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_dtypes_H_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_T_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes___getitem___cuda, 
test/test_ops.py::TestCommonCUDA::test_dtypes___radd___cuda, test/test_ops.py::TestCommonCUDA::test_dtypes___rand___cuda, test/test_ops.py::TestCommonCUDA::test_dtypes___rdiv___cuda, test/test_ops.py::TestCommonCUDA::test_dtypes___rmatmul___cuda, test/test_ops.py::TestCommonCUDA::test_dtypes___rmod___cuda, test/test_ops.py::TestCommonCUDA::test_dtypes___rmul___cuda, test/test_ops.py::TestCommonCUDA::test_dtypes___ror___cuda, test/test_ops.py::TestCommonCUDA::test_dtypes___rpow___cuda, test/test_ops.py::TestCommonCUDA::test_dtypes___rsub___cuda, test/test_ops.py::TestCommonCUDA::test_dtypes___rxor___cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__batch_norm_with_update_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__chunk_cat_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__native_batch_norm_legit_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_T_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs__conversions_bfloat16_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs__conversions_bool_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs__conversions_byte_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs__conversions_cdouble_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs__conversions_cfloat_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs__conversions_chalf_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs__conversions_char_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs__conversions_complex_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs__conversions_double_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs__conversions_float_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs__conversions_half_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs__conversions_int_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs__conversions_long_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs__conversions_polar_cuda, 
test/test_ops.py::TestCommonCUDA::test_dtypes__refs__conversions_short_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_abs_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_acos_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_acosh_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_add_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_addcdiv_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_addcmul_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_addr_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_alias_copy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_all_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_allclose_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_amax_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_amin_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_any_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_arange_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_as_strided_copy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_as_strided_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_as_strided_partial_views_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_as_strided_scatter_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_asin_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_asinh_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_atan2_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_atan_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_atanh_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_atleast_1d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_atleast_2d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_atleast_3d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_bitwise_and_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_bitwise_left_shift_cuda, 
test/test_ops.py::TestCommonCUDA::test_dtypes__refs_bitwise_not_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_bitwise_or_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_bitwise_right_shift_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_bitwise_xor_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_block_diag_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_broadcast_shapes_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_broadcast_tensors_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_broadcast_to_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_bucketize_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_cat_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_cauchy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_ceil_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_chunk_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_clamp_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_clamp_max_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_clamp_min_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_clone_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_column_stack_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_conj_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_conj_physical_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_constant_pad_nd_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_contiguous_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_copysign_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_cos_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_cosh_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_count_nonzero_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_cumprod_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_cumsum_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_deg2rad_cuda, 
test/test_ops.py::TestCommonCUDA::test_dtypes__refs_diag_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_diag_embed_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_diagonal_copy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_diagonal_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_diagonal_scatter_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_digamma_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_div_floor_rounding_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_div_no_rounding_mode_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_div_trunc_rounding_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_dot_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_dsplit_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_dstack_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_empty_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_empty_like_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_empty_strided_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_eq_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_equal_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_erf_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_erfc_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_erfinv_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_exp2_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_exp_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_expand_as_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_expand_copy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_expand_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_expm1_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_exponential_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_eye_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_fft_fft2_cuda, 
test/test_ops.py::TestCommonCUDA::test_dtypes__refs_fft_fft_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_fft_fftn_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_fft_fftshift_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_fft_hfft2_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_fft_hfft_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_fft_hfftn_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_fft_ifft2_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_fft_ifft_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_fft_ifftn_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_fft_ifftshift_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_fft_ihfft2_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_fft_ihfft_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_fft_ihfftn_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_fft_irfft2_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_fft_irfft_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_fft_irfftn_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_fft_rfft2_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_fft_rfft_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_fft_rfftn_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_fill_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_flatten_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_flip_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_fliplr_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_flipud_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_float_power_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_floor_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_floor_divide_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_fmax_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_fmin_cuda, 
test/test_ops.py::TestCommonCUDA::test_dtypes__refs_fmod_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_frac_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_frexp_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_gcd_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_ge_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_geometric_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_gt_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_heaviside_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_hsplit_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_hstack_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_hypot_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_i0_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_igamma_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_igammac_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_imag_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_index_add_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_index_copy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_index_fill_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_index_select_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_isclose_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_isfinite_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_isinf_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_isnan_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_isneginf_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_isposinf_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_isreal_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_istft_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_item_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_lcm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_le_cuda, 
test/test_ops.py::TestCommonCUDA::test_dtypes__refs_lerp_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_lgamma_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_linalg_cross_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_linalg_diagonal_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_linalg_matrix_norm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_linalg_norm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_linalg_svd_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_linalg_svdvals_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_linalg_vecdot_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_linalg_vector_norm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_linspace_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_linspace_tensor_overload_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_log10_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_log1p_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_log2_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_log_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_log_normal_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_log_softmax_with_dtype_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_logaddexp2_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_logaddexp_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_logical_and_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_logical_not_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_logical_or_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_logical_xor_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_logspace_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_logspace_tensor_overload_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_logsumexp_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_lt_cuda, 
test/test_ops.py::TestCommonCUDA::test_dtypes__refs_masked_fill_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_maximum_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_mean_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_meshgrid_list_of_tensors_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_meshgrid_variadic_tensors_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_minimum_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_movedim_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_mul_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nan_to_num_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_narrow_copy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_narrow_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_native_layer_norm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_ne_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_neg_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_new_empty_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_new_empty_strided_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_new_full_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_new_ones_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_new_zeros_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nextafter_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_alpha_dropout_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_celu_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_channel_shuffle_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_dropout_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_elu_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_gelu_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_glu_cuda, 
test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_group_norm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_hardshrink_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_hardtanh_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_hinge_embedding_loss_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_huber_loss_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_l1_loss_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_layer_norm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_leaky_relu_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_log_softmax_with_dtype_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_margin_ranking_loss_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_mish_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_mse_loss_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_nll_loss_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_pairwise_distance_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_pdist_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_pixel_shuffle_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_pixel_unshuffle_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_poisson_nll_loss_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_prelu_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_relu6_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_relu_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_selu_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_smooth_l1_loss_cuda, 
test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_softmax_with_dtype_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_softmin_with_dtype_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_softplus_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_softshrink_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_tanhshrink_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_threshold_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_triplet_margin_loss_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_norm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_normal__in_place_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_normal_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_normal_number_mean_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_ones_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_permute_copy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_permute_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_positive_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_pow_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_prod_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_rad2deg_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_randn_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_ravel_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_real_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_reciprocal_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_remainder_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_renorm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_repeat_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_reshape_as_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_reshape_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_roll_cuda, 
test/test_ops.py::TestCommonCUDA::test_dtypes__refs_rot90_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_round_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_rsqrt_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_rsub_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_select_scatter_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_sgn_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_sigmoid_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_sign_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_signbit_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_sin_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_sinc_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_sinh_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_softmax_with_dtype_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_special_bessel_j0_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_special_bessel_j1_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_special_entr_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_special_erfcx_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_special_i0e_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_special_i1_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_special_i1e_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_special_log_ndtr_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_special_log_softmax_with_dtype_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_special_logit_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_special_multigammaln_mvlgamma_p_1_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_special_multigammaln_mvlgamma_p_3_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_special_multigammaln_mvlgamma_p_5_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_special_ndtr_cuda, 
test/test_ops.py::TestCommonCUDA::test_dtypes__refs_special_ndtri_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_special_softmax_with_dtype_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_special_spherical_bessel_j0_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_special_xlog1py_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_special_zeta_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_split_with_sizes_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_sqrt_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_square_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_squeeze_copy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_squeeze_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_squeeze_multiple_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_stack_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_std_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_std_mean_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_stft_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_sub_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_sum_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_sum_to_size_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_t_copy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_t_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_take_along_dim_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_tan_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_tanh_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_tensor_split_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_to_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_trace_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_transpose_copy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_transpose_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_tril_cuda, 
test/test_ops.py::TestCommonCUDA::test_dtypes__refs_tril_indices_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_triu_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_triu_indices_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_true_divide_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_trunc_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_unbind_copy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_unbind_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_unflatten_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_unfold_copy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_unfold_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_unsqueeze_copy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_unsqueeze_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_var_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_var_mean_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_vdot_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_view_as_complex_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_view_as_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_view_copy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_view_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_vsplit_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_vstack_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_where_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_xlogy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_zeros_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__segment_reduce_lengths_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__segment_reduce_offsets_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__softmax_backward_data_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__unsafe_masked_index_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__unsafe_masked_index_put_accumulate_cuda, 
test/test_ops.py::TestCommonCUDA::test_dtypes__upsample_bilinear2d_aa_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_abs_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_acos_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_acosh_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_add_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_addbmm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_addcdiv_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_addcmul_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_addmm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_addmm_decomposed_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_addmv_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_addr_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_alias_copy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_all_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_allclose_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_amax_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_amin_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_aminmax_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_angle_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_any_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_arange_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_argmax_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_argmin_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_argsort_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_argwhere_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_as_strided_copy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_as_strided_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_as_strided_partial_views_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_as_strided_scatter_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_asin_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_asinh_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_atan2_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_atan_cuda, 
test/test_ops.py::TestCommonCUDA::test_dtypes_atanh_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_atleast_1d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_atleast_2d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_atleast_3d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_baddbmm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_bernoulli_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_bfloat16_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_bincount_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_bitwise_and_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_bitwise_left_shift_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_bitwise_not_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_bitwise_or_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_bitwise_right_shift_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_bitwise_xor_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_block_diag_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_bmm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_bool_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_broadcast_shapes_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_broadcast_tensors_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_broadcast_to_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_bucketize_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_byte_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_cartesian_prod_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_cat_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_cauchy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_cdist_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_cdouble_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_ceil_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_cfloat_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_chalf_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_char_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_cholesky_cuda, 
test/test_ops.py::TestCommonCUDA::test_dtypes_cholesky_inverse_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_cholesky_solve_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_chunk_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_clamp_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_clamp_max_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_clamp_min_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_clone_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_column_stack_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_combinations_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_complex_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_conj_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_conj_physical_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_constant_pad_nd_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_contiguous_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_copysign_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_corrcoef_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_cos_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_cosh_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_count_nonzero_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_cov_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_cross_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_cummax_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_cummin_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_cumprod_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_cumsum_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_cumulative_trapezoid_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_deg2rad_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_diag_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_diag_embed_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_diagflat_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_diagonal_copy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_diagonal_cuda, 
test/test_ops.py::TestCommonCUDA::test_dtypes_diagonal_scatter_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_diff_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_digamma_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_dist_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_div_floor_rounding_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_div_no_rounding_mode_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_div_trunc_rounding_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_dot_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_double_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_dsplit_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_dstack_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_einsum_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_empty_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_empty_like_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_empty_permuted_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_empty_strided_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_eq_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_equal_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_erf_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_erfc_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_erfinv_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_exp2_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_exp_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_expand_as_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_expand_copy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_expand_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_expm1_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_exponential_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_eye_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_fft_fft2_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_fft_fft_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_fft_fftn_cuda, 
test/test_ops.py::TestCommonCUDA::test_dtypes_fft_fftshift_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_fft_hfft2_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_fft_hfft_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_fft_hfftn_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_fft_ifft2_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_fft_ifft_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_fft_ifftn_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_fft_ifftshift_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_fft_ihfft2_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_fft_ihfft_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_fft_ihfftn_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_fft_irfft2_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_fft_irfft_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_fft_irfftn_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_fft_rfft2_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_fft_rfft_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_fft_rfftn_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_fill_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_flatten_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_flip_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_fliplr_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_flipud_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_float_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_float_power_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_floor_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_floor_divide_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_fmax_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_fmin_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_fmod_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_frac_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_frexp_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_full_cuda, 
test/test_ops.py::TestCommonCUDA::test_dtypes_full_like_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_gather_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_gcd_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_ge_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_geometric_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_geqrf_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_gradient_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_grid_sampler_2d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_grid_sampler_3d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_gt_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_half_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_hash_tensor_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_heaviside_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_histc_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_histogram_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_histogramdd_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_hsplit_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_hstack_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_hypot_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_i0_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_igamma_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_igammac_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_imag_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_index_add_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_index_copy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_index_fill_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_index_put_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_index_reduce_amax_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_index_reduce_amin_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_index_reduce_mean_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_index_reduce_prod_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_index_select_cuda, 
test/test_ops.py::TestCommonCUDA::test_dtypes_inner_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_int_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_isclose_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_isfinite_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_isin_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_isinf_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_isnan_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_isneginf_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_isposinf_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_isreal_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_istft_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_item_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_jiterator_2inputs_2outputs_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_jiterator_4inputs_with_extra_args_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_jiterator_binary_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_jiterator_binary_return_by_ref_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_jiterator_unary_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_kron_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_kthvalue_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_lcm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_ldexp_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_le_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_lerp_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_lgamma_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_cholesky_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_cholesky_ex_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_cond_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_cross_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_det_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_diagonal_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_eig_cuda, 
test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_eigh_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_eigvals_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_eigvalsh_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_householder_product_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_inv_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_inv_ex_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_ldl_factor_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_ldl_factor_ex_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_ldl_solve_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_lstsq_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_lstsq_grad_oriented_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_lu_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_lu_factor_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_lu_factor_ex_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_lu_solve_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_matrix_norm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_matrix_power_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_matrix_rank_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_matrix_rank_hermitian_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_multi_dot_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_norm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_norm_subgradients_at_zero_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_pinv_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_pinv_hermitian_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_pinv_singular_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_qr_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_slogdet_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_solve_cuda, 
test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_solve_ex_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_solve_triangular_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_svd_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_svdvals_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_tensorinv_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_tensorsolve_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_vander_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_vecdot_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_vector_norm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linspace_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linspace_tensor_overload_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_log10_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_log1p_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_log2_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_log_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_log_normal_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_log_softmax_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_log_softmax_with_dtype_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_logaddexp2_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_logaddexp_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_logcumsumexp_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_logdet_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_logical_and_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_logical_not_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_logical_or_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_logical_xor_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_logit_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_logspace_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_logspace_tensor_overload_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_logsumexp_cuda, 
test/test_ops.py::TestCommonCUDA::test_dtypes_long_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_lt_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_lu_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_lu_solve_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_lu_unpack_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_mH_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_mT_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_masked_amax_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_masked_amin_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_masked_argmax_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_masked_argmin_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_masked_cumprod_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_masked_cumsum_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_masked_fill_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_masked_log_softmax_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_masked_logaddexp_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_masked_logsumexp_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_masked_mean_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_masked_median_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_masked_norm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_masked_normalize_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_masked_prod_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_masked_scatter_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_masked_select_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_masked_softmax_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_masked_softmin_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_masked_std_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_masked_sum_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_masked_var_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_matmul_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_matrix_exp_cuda, 
test/test_ops.py::TestCommonCUDA::test_dtypes_max_binary_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_max_pool2d_with_indices_backward_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_max_reduction_no_dim_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_max_reduction_with_dim_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_maximum_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_mean_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_median_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_meshgrid_list_of_tensors_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_meshgrid_variadic_tensors_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_min_binary_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_min_reduction_no_dim_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_min_reduction_with_dim_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_minimum_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_mm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_mode_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_movedim_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_msort_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_mul_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_multinomial_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_mv_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_mvlgamma_mvlgamma_p_1_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_mvlgamma_mvlgamma_p_3_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_mvlgamma_mvlgamma_p_5_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nan_to_num_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nanmean_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nanmedian_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nanquantile_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nansum_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_narrow_copy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_narrow_cuda, 
test/test_ops.py::TestCommonCUDA::test_dtypes_native_batch_norm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_native_dropout_backward_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_native_layer_norm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_ne_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_neg_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_new_empty_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_new_empty_strided_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_new_full_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_new_ones_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_new_zeros_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nextafter_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_adaptive_avg_pool1d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_adaptive_avg_pool2d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_adaptive_avg_pool3d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_adaptive_max_pool1d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_adaptive_max_pool2d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_adaptive_max_pool3d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_alpha_dropout_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_avg_pool1d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_avg_pool2d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_avg_pool3d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_batch_norm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_batch_norm_without_cudnn_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_bilinear_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_binary_cross_entropy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_binary_cross_entropy_with_logits_cuda, 
test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_celu_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_channel_shuffle_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_conv1d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_conv2d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_conv3d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_conv_transpose1d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_conv_transpose2d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_conv_transpose3d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_cosine_embedding_loss_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_cosine_similarity_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_cross_entropy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_ctc_loss_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_dropout2d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_dropout3d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_dropout_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_elu_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_embedding_bag_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_embedding_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_feature_alpha_dropout_with_train_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_feature_alpha_dropout_without_train_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_fractional_max_pool2d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_fractional_max_pool3d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_gaussian_nll_loss_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_gelu_cuda, 
test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_glu_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_grid_sample_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_group_norm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_hardshrink_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_hardsigmoid_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_hardswish_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_hardtanh_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_hinge_embedding_loss_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_huber_loss_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_instance_norm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_interpolate_area_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_interpolate_bicubic_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_interpolate_bilinear_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_interpolate_linear_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_interpolate_nearest-exact_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_interpolate_nearest_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_interpolate_trilinear_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_kl_div_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_l1_loss_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_layer_norm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_leaky_relu_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_linear_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_local_response_norm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_logsigmoid_cuda, 
test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_margin_ranking_loss_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_max_pool1d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_max_pool2d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_max_pool3d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_max_unpool1d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_max_unpool1d_grad_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_max_unpool2d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_max_unpool2d_grad_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_max_unpool3d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_max_unpool3d_grad_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_mish_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_mse_loss_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_multi_head_attention_forward_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_multi_margin_loss_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_multilabel_margin_loss_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_multilabel_soft_margin_loss_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_nll_loss_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_normalize_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_one_hot_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_pad_circular_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_pad_constant_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_pad_reflect_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_pad_replicate_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_pad_replicate_negative_cuda, 
test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_pairwise_distance_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_pdist_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_pixel_shuffle_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_pixel_unshuffle_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_poisson_nll_loss_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_prelu_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_relu6_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_relu_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_rms_norm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_rrelu_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_scaled_dot_product_attention_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_selu_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_silu_complex_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_silu_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_smooth_l1_loss_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_soft_margin_loss_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_softmin_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_softmin_with_dtype_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_softplus_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_softshrink_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_softsign_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_tanhshrink_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_threshold_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_triplet_margin_loss_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_triplet_margin_with_distance_loss_cuda, 
test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_unfold_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_upsample_bilinear_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_upsample_nearest_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nonzero_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nonzero_static_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_norm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_norm_fro_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_norm_inf_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_norm_nuc_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_normal_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_normal_in_place_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_normal_number_mean_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_ones_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_ones_like_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_ormqr_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_outer_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_pca_lowrank_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_permute_copy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_permute_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_pinverse_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_polar_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_polygamma_polygamma_n_0_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_polygamma_polygamma_n_1_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_polygamma_polygamma_n_2_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_polygamma_polygamma_n_3_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_polygamma_polygamma_n_4_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_positive_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_pow_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_prod_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_put_cuda, 
test/test_ops.py::TestCommonCUDA::test_dtypes_qr_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_quantile_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_rad2deg_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_rand_like_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_randint_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_randint_like_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_randn_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_randn_like_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_ravel_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_real_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_reciprocal_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_remainder_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_renorm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_repeat_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_repeat_interleave_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_reshape_as_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_reshape_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_resize__cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_resize_as__cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_resolve_conj_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_resolve_neg_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_roll_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_rot90_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_round_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_round_decimals_0_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_round_decimals_3_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_round_decimals_neg_3_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_rsqrt_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_rsub_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_scalar_tensor_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_scatter_add_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_scatter_cuda, 
test/test_ops.py::TestCommonCUDA::test_dtypes_scatter_reduce_amax_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_scatter_reduce_amin_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_scatter_reduce_mean_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_scatter_reduce_prod_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_scatter_reduce_sum_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_searchsorted_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_select_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_select_scatter_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_sgn_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_short_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_sigmoid_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_sign_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_signal_windows_bartlett_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_signal_windows_blackman_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_signal_windows_cosine_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_signal_windows_exponential_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_signal_windows_gaussian_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_signal_windows_general_cosine_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_signal_windows_general_hamming_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_signal_windows_hamming_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_signal_windows_hann_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_signal_windows_kaiser_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_signal_windows_nuttall_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_signbit_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_sin_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_sinc_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_sinh_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_slice_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_slice_scatter_cuda, 
test/test_ops.py::TestCommonCUDA::test_dtypes_softmax_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_softmax_with_dtype_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_sort_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_sparse_mm_reduce_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_sparse_sampled_addmm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_airy_ai_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_bessel_j0_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_bessel_j1_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_bessel_y0_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_bessel_y1_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_chebyshev_polynomial_t_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_chebyshev_polynomial_u_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_chebyshev_polynomial_v_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_chebyshev_polynomial_w_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_entr_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_erfcx_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_hermite_polynomial_h_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_hermite_polynomial_he_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_i0e_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_i1_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_i1e_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_laguerre_polynomial_l_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_legendre_polynomial_p_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_log_ndtr_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_modified_bessel_i0_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_modified_bessel_i1_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_modified_bessel_k0_cuda, 
test/test_ops.py::TestCommonCUDA::test_dtypes_special_modified_bessel_k1_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_ndtr_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_ndtri_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_polygamma_special_polygamma_n_0_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_scaled_modified_bessel_k0_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_scaled_modified_bessel_k1_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_shifted_chebyshev_polynomial_t_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_shifted_chebyshev_polynomial_u_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_shifted_chebyshev_polynomial_v_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_shifted_chebyshev_polynomial_w_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_spherical_bessel_j0_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_xlog1py_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_zeta_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_split_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_split_list_args_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_split_with_sizes_copy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_split_with_sizes_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_sqrt_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_square_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_squeeze_copy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_squeeze_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_squeeze_multiple_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_stack_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_std_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_std_mean_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_std_mean_unbiased_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_std_unbiased_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_stft_cuda, 
test/test_ops.py::TestCommonCUDA::test_dtypes_sub_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_sum_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_sum_to_size_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_svd_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_svd_lowrank_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_t_copy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_t_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_take_along_dim_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_take_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_tan_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_tanh_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_tensor_split_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_tensordot_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_tile_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_to_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_to_sparse_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_topk_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_torch__scaled_mm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_torch_ops_aten__efficient_attention_forward_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_torch_ops_aten__flash_attention_forward_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_torch_ops_aten__safe_softmax_default_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_trace_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_transpose_copy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_transpose_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_trapezoid_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_trapz_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_triangular_solve_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_tril_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_tril_indices_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_triu_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_triu_indices_cuda, 
test/test_ops.py::TestCommonCUDA::test_dtypes_true_divide_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_trunc_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_unbind_copy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_unbind_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_unflatten_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_unfold_copy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_unfold_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_uniform_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_unique_consecutive_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_unique_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_unravel_index_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_unsafe_chunk_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_unsafe_split_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_unsqueeze_copy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_unsqueeze_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_var_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_var_mean_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_var_mean_unbiased_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_var_unbiased_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_vdot_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_view_as_complex_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_view_as_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_view_as_real_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_view_copy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_view_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_vsplit_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_vstack_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_where_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_xlogy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_zero__cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_zeros_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_zeros_like_cuda, 
test/test_ops.py::TestCommonCUDA::test_errors_T_cuda, test/test_ops.py::TestCommonCUDA::test_errors___radd___cuda, test/test_ops.py::TestCommonCUDA::test_errors___rand___cuda, test/test_ops.py::TestCommonCUDA::test_errors___rdiv___cuda, test/test_ops.py::TestCommonCUDA::test_errors___rmod___cuda, test/test_ops.py::TestCommonCUDA::test_errors___rmul___cuda, test/test_ops.py::TestCommonCUDA::test_errors___ror___cuda, test/test_ops.py::TestCommonCUDA::test_errors___rpow___cuda, test/test_ops.py::TestCommonCUDA::test_errors___rsub___cuda, test/test_ops.py::TestCommonCUDA::test_errors___rxor___cuda, test/test_ops.py::TestCommonCUDA::test_errors__chunk_cat_cuda, test/test_ops.py::TestCommonCUDA::test_errors_add_cuda, test/test_ops.py::TestCommonCUDA::test_errors_amax_cuda, test/test_ops.py::TestCommonCUDA::test_errors_amin_cuda, test/test_ops.py::TestCommonCUDA::test_errors_aminmax_cuda, test/test_ops.py::TestCommonCUDA::test_errors_arange_cuda, test/test_ops.py::TestCommonCUDA::test_errors_as_strided_scatter_cuda, test/test_ops.py::TestCommonCUDA::test_errors_atan2_cuda, test/test_ops.py::TestCommonCUDA::test_errors_bernoulli_cuda, test/test_ops.py::TestCommonCUDA::test_errors_bitwise_and_cuda, test/test_ops.py::TestCommonCUDA::test_errors_bitwise_left_shift_cuda, test/test_ops.py::TestCommonCUDA::test_errors_bitwise_or_cuda, test/test_ops.py::TestCommonCUDA::test_errors_bitwise_right_shift_cuda, test/test_ops.py::TestCommonCUDA::test_errors_bitwise_xor_cuda, test/test_ops.py::TestCommonCUDA::test_errors_bucketize_cuda, test/test_ops.py::TestCommonCUDA::test_errors_cat_cuda, test/test_ops.py::TestCommonCUDA::test_errors_cauchy_cuda, test/test_ops.py::TestCommonCUDA::test_errors_clamp_max_cuda, test/test_ops.py::TestCommonCUDA::test_errors_clamp_min_cuda, test/test_ops.py::TestCommonCUDA::test_errors_complex_cuda, test/test_ops.py::TestCommonCUDA::test_errors_copysign_cuda, test/test_ops.py::TestCommonCUDA::test_errors_cov_cuda, 
test/test_ops.py::TestCommonCUDA::test_errors_diag_cuda, test/test_ops.py::TestCommonCUDA::test_errors_diag_embed_cuda, test/test_ops.py::TestCommonCUDA::test_errors_diagonal_copy_cuda, test/test_ops.py::TestCommonCUDA::test_errors_diagonal_cuda, test/test_ops.py::TestCommonCUDA::test_errors_diff_cuda, test/test_ops.py::TestCommonCUDA::test_errors_div_floor_rounding_cuda, test/test_ops.py::TestCommonCUDA::test_errors_div_no_rounding_mode_cuda, test/test_ops.py::TestCommonCUDA::test_errors_div_trunc_rounding_cuda, test/test_ops.py::TestCommonCUDA::test_errors_dot_cuda, test/test_ops.py::TestCommonCUDA::test_errors_dsplit_cuda, test/test_ops.py::TestCommonCUDA::test_errors_dstack_cuda, test/test_ops.py::TestCommonCUDA::test_errors_empty_permuted_cuda, test/test_ops.py::TestCommonCUDA::test_errors_eq_cuda, test/test_ops.py::TestCommonCUDA::test_errors_exponential_cuda, test/test_ops.py::TestCommonCUDA::test_errors_eye_cuda, test/test_ops.py::TestCommonCUDA::test_errors_fft_fft2_cuda, test/test_ops.py::TestCommonCUDA::test_errors_fft_fft_cuda, test/test_ops.py::TestCommonCUDA::test_errors_fft_fftn_cuda, test/test_ops.py::TestCommonCUDA::test_errors_fft_hfft2_cuda, test/test_ops.py::TestCommonCUDA::test_errors_fft_hfft_cuda, test/test_ops.py::TestCommonCUDA::test_errors_fft_hfftn_cuda, test/test_ops.py::TestCommonCUDA::test_errors_fft_ifft2_cuda, test/test_ops.py::TestCommonCUDA::test_errors_fft_ifft_cuda, test/test_ops.py::TestCommonCUDA::test_errors_fft_ifftn_cuda, test/test_ops.py::TestCommonCUDA::test_errors_fft_ihfft2_cuda, test/test_ops.py::TestCommonCUDA::test_errors_fft_ihfft_cuda, test/test_ops.py::TestCommonCUDA::test_errors_fft_ihfftn_cuda, test/test_ops.py::TestCommonCUDA::test_errors_fft_irfft2_cuda, test/test_ops.py::TestCommonCUDA::test_errors_fft_irfft_cuda, test/test_ops.py::TestCommonCUDA::test_errors_fft_irfftn_cuda, test/test_ops.py::TestCommonCUDA::test_errors_fft_rfft2_cuda, test/test_ops.py::TestCommonCUDA::test_errors_fft_rfft_cuda, 
test/test_ops.py::TestCommonCUDA::test_errors_fft_rfftn_cuda, test/test_ops.py::TestCommonCUDA::test_errors_fliplr_cuda, test/test_ops.py::TestCommonCUDA::test_errors_flipud_cuda, test/test_ops.py::TestCommonCUDA::test_errors_float_power_cuda, test/test_ops.py::TestCommonCUDA::test_errors_floor_divide_cuda, test/test_ops.py::TestCommonCUDA::test_errors_fmax_cuda, test/test_ops.py::TestCommonCUDA::test_errors_fmin_cuda, test/test_ops.py::TestCommonCUDA::test_errors_fmod_cuda, test/test_ops.py::TestCommonCUDA::test_errors_gather_cuda, test/test_ops.py::TestCommonCUDA::test_errors_gcd_cuda, test/test_ops.py::TestCommonCUDA::test_errors_ge_cuda, test/test_ops.py::TestCommonCUDA::test_errors_geometric_cuda, test/test_ops.py::TestCommonCUDA::test_errors_gradient_cuda, test/test_ops.py::TestCommonCUDA::test_errors_gt_cuda, test/test_ops.py::TestCommonCUDA::test_errors_heaviside_cuda, test/test_ops.py::TestCommonCUDA::test_errors_histogramdd_cuda, test/test_ops.py::TestCommonCUDA::test_errors_hsplit_cuda, test/test_ops.py::TestCommonCUDA::test_errors_hstack_cuda, test/test_ops.py::TestCommonCUDA::test_errors_hypot_cuda, test/test_ops.py::TestCommonCUDA::test_errors_igamma_cuda, test/test_ops.py::TestCommonCUDA::test_errors_igammac_cuda, test/test_ops.py::TestCommonCUDA::test_errors_index_add_cuda, test/test_ops.py::TestCommonCUDA::test_errors_index_select_cuda, test/test_ops.py::TestCommonCUDA::test_errors_isclose_cuda, test/test_ops.py::TestCommonCUDA::test_errors_item_cuda, test/test_ops.py::TestCommonCUDA::test_errors_jiterator_binary_cuda, test/test_ops.py::TestCommonCUDA::test_errors_jiterator_binary_return_by_ref_cuda, test/test_ops.py::TestCommonCUDA::test_errors_kthvalue_cuda, test/test_ops.py::TestCommonCUDA::test_errors_lcm_cuda, test/test_ops.py::TestCommonCUDA::test_errors_ldexp_cuda, test/test_ops.py::TestCommonCUDA::test_errors_le_cuda, test/test_ops.py::TestCommonCUDA::test_errors_linalg_cross_cuda, 
test/test_ops.py::TestCommonCUDA::test_errors_linalg_diagonal_cuda, test/test_ops.py::TestCommonCUDA::test_errors_linalg_lstsq_cuda, test/test_ops.py::TestCommonCUDA::test_errors_linalg_lstsq_grad_oriented_cuda, test/test_ops.py::TestCommonCUDA::test_errors_linspace_cuda, test/test_ops.py::TestCommonCUDA::test_errors_linspace_tensor_overload_cuda, test/test_ops.py::TestCommonCUDA::test_errors_log_normal_cuda, test/test_ops.py::TestCommonCUDA::test_errors_logaddexp_cuda, test/test_ops.py::TestCommonCUDA::test_errors_logcumsumexp_cuda, test/test_ops.py::TestCommonCUDA::test_errors_logical_and_cuda, test/test_ops.py::TestCommonCUDA::test_errors_logical_or_cuda, test/test_ops.py::TestCommonCUDA::test_errors_logical_xor_cuda, test/test_ops.py::TestCommonCUDA::test_errors_logspace_cuda, test/test_ops.py::TestCommonCUDA::test_errors_logspace_tensor_overload_cuda, test/test_ops.py::TestCommonCUDA::test_errors_lt_cuda, test/test_ops.py::TestCommonCUDA::test_errors_masked_fill_cuda, test/test_ops.py::TestCommonCUDA::test_errors_masked_scatter_cuda, test/test_ops.py::TestCommonCUDA::test_errors_masked_select_cuda, test/test_ops.py::TestCommonCUDA::test_errors_max_binary_cuda, test/test_ops.py::TestCommonCUDA::test_errors_maximum_cuda, test/test_ops.py::TestCommonCUDA::test_errors_mean_cuda, test/test_ops.py::TestCommonCUDA::test_errors_median_cuda, test/test_ops.py::TestCommonCUDA::test_errors_min_binary_cuda, test/test_ops.py::TestCommonCUDA::test_errors_minimum_cuda, test/test_ops.py::TestCommonCUDA::test_errors_movedim_cuda, test/test_ops.py::TestCommonCUDA::test_errors_mul_cuda, test/test_ops.py::TestCommonCUDA::test_errors_multinomial_cuda, test/test_ops.py::TestCommonCUDA::test_errors_narrow_copy_cuda, test/test_ops.py::TestCommonCUDA::test_errors_narrow_cuda, test/test_ops.py::TestCommonCUDA::test_errors_native_layer_norm_cuda, test/test_ops.py::TestCommonCUDA::test_errors_ne_cuda, test/test_ops.py::TestCommonCUDA::test_errors_neg_cuda, 
test/test_ops.py::TestCommonCUDA::test_errors_nextafter_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_adaptive_avg_pool1d_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_adaptive_avg_pool2d_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_adaptive_avg_pool3d_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_adaptive_max_pool1d_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_adaptive_max_pool2d_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_adaptive_max_pool3d_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_avg_pool1d_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_avg_pool2d_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_avg_pool3d_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_conv1d_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_conv2d_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_conv3d_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_embedding_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_gaussian_nll_loss_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_gelu_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_group_norm_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_hardtanh_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_hinge_embedding_loss_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_huber_loss_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_l1_loss_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_margin_ranking_loss_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_max_pool1d_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_max_pool2d_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_max_pool3d_cuda, 
test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_multi_margin_loss_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_multilabel_margin_loss_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_poisson_nll_loss_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_prelu_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_rms_norm_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_rrelu_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_soft_margin_loss_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_softshrink_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_triplet_margin_loss_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_triplet_margin_with_distance_loss_cuda, test/test_ops.py::TestCommonCUDA::test_errors_normal_in_place_cuda, test/test_ops.py::TestCommonCUDA::test_errors_ormqr_cuda, test/test_ops.py::TestCommonCUDA::test_errors_polar_cuda, test/test_ops.py::TestCommonCUDA::test_errors_pow_cuda, test/test_ops.py::TestCommonCUDA::test_errors_remainder_cuda, test/test_ops.py::TestCommonCUDA::test_errors_renorm_cuda, test/test_ops.py::TestCommonCUDA::test_errors_reshape_as_cuda, test/test_ops.py::TestCommonCUDA::test_errors_reshape_cuda, test/test_ops.py::TestCommonCUDA::test_errors_roll_cuda, test/test_ops.py::TestCommonCUDA::test_errors_rot90_cuda, test/test_ops.py::TestCommonCUDA::test_errors_rsub_cuda, test/test_ops.py::TestCommonCUDA::test_errors_scatter_add_cuda, test/test_ops.py::TestCommonCUDA::test_errors_scatter_cuda, test/test_ops.py::TestCommonCUDA::test_errors_signal_windows_bartlett_cuda, test/test_ops.py::TestCommonCUDA::test_errors_signal_windows_blackman_cuda, test/test_ops.py::TestCommonCUDA::test_errors_signal_windows_cosine_cuda, test/test_ops.py::TestCommonCUDA::test_errors_signal_windows_exponential_cuda, test/test_ops.py::TestCommonCUDA::test_errors_signal_windows_gaussian_cuda, 
test/test_ops.py::TestCommonCUDA::test_errors_signal_windows_general_cosine_cuda, test/test_ops.py::TestCommonCUDA::test_errors_signal_windows_general_hamming_cuda, test/test_ops.py::TestCommonCUDA::test_errors_signal_windows_hamming_cuda, test/test_ops.py::TestCommonCUDA::test_errors_signal_windows_hann_cuda, test/test_ops.py::TestCommonCUDA::test_errors_signal_windows_kaiser_cuda, test/test_ops.py::TestCommonCUDA::test_errors_signal_windows_nuttall_cuda, test/test_ops.py::TestCommonCUDA::test_errors_sparse_mul_layout0_cuda, test/test_ops.py::TestCommonCUDA::test_errors_sparse_mul_layout1_cuda, test/test_ops.py::TestCommonCUDA::test_errors_sparse_mul_layout2_cuda, test/test_ops.py::TestCommonCUDA::test_errors_sparse_mul_layout3_cuda, test/test_ops.py::TestCommonCUDA::test_errors_sparse_mul_layout4_cuda, test/test_ops.py::TestCommonCUDA::test_errors_sparse_randn_like_layout0_cuda, test/test_ops.py::TestCommonCUDA::test_errors_sparse_randn_like_layout1_cuda, test/test_ops.py::TestCommonCUDA::test_errors_sparse_randn_like_layout2_cuda, test/test_ops.py::TestCommonCUDA::test_errors_sparse_randn_like_layout3_cuda, test/test_ops.py::TestCommonCUDA::test_errors_sparse_randn_like_layout4_cuda, test/test_ops.py::TestCommonCUDA::test_errors_sparse_sum_layout0_cuda, test/test_ops.py::TestCommonCUDA::test_errors_sparse_sum_layout1_cuda, test/test_ops.py::TestCommonCUDA::test_errors_sparse_sum_layout2_cuda, test/test_ops.py::TestCommonCUDA::test_errors_sparse_sum_layout3_cuda, test/test_ops.py::TestCommonCUDA::test_errors_sparse_sum_layout4_cuda, test/test_ops.py::TestCommonCUDA::test_errors_sparse_zeros_like_layout0_cuda, test/test_ops.py::TestCommonCUDA::test_errors_sparse_zeros_like_layout1_cuda, test/test_ops.py::TestCommonCUDA::test_errors_sparse_zeros_like_layout2_cuda, test/test_ops.py::TestCommonCUDA::test_errors_sparse_zeros_like_layout3_cuda, test/test_ops.py::TestCommonCUDA::test_errors_sparse_zeros_like_layout4_cuda, 
test/test_ops.py::TestCommonCUDA::test_errors_special_chebyshev_polynomial_t_cuda, test/test_ops.py::TestCommonCUDA::test_errors_special_chebyshev_polynomial_u_cuda, test/test_ops.py::TestCommonCUDA::test_errors_special_chebyshev_polynomial_v_cuda, test/test_ops.py::TestCommonCUDA::test_errors_special_chebyshev_polynomial_w_cuda, test/test_ops.py::TestCommonCUDA::test_errors_special_hermite_polynomial_h_cuda, test/test_ops.py::TestCommonCUDA::test_errors_special_hermite_polynomial_he_cuda, test/test_ops.py::TestCommonCUDA::test_errors_special_laguerre_polynomial_l_cuda, test/test_ops.py::TestCommonCUDA::test_errors_special_legendre_polynomial_p_cuda, test/test_ops.py::TestCommonCUDA::test_errors_special_shifted_chebyshev_polynomial_t_cuda, test/test_ops.py::TestCommonCUDA::test_errors_special_shifted_chebyshev_polynomial_u_cuda, test/test_ops.py::TestCommonCUDA::test_errors_special_shifted_chebyshev_polynomial_v_cuda, test/test_ops.py::TestCommonCUDA::test_errors_special_shifted_chebyshev_polynomial_w_cuda, test/test_ops.py::TestCommonCUDA::test_errors_special_xlog1py_cuda, test/test_ops.py::TestCommonCUDA::test_errors_special_zeta_cuda, test/test_ops.py::TestCommonCUDA::test_errors_sub_cuda, test/test_ops.py::TestCommonCUDA::test_errors_sum_to_size_cuda, test/test_ops.py::TestCommonCUDA::test_errors_t_copy_cuda, test/test_ops.py::TestCommonCUDA::test_errors_t_cuda, test/test_ops.py::TestCommonCUDA::test_errors_take_cuda, test/test_ops.py::TestCommonCUDA::test_errors_trace_cuda, test/test_ops.py::TestCommonCUDA::test_errors_tril_cuda, test/test_ops.py::TestCommonCUDA::test_errors_triu_cuda, test/test_ops.py::TestCommonCUDA::test_errors_true_divide_cuda, test/test_ops.py::TestCommonCUDA::test_errors_unbind_copy_cuda, test/test_ops.py::TestCommonCUDA::test_errors_unbind_cuda, test/test_ops.py::TestCommonCUDA::test_errors_uniform_cuda, test/test_ops.py::TestCommonCUDA::test_errors_vdot_cuda, test/test_ops.py::TestCommonCUDA::test_errors_view_as_cuda, 
test/test_ops.py::TestCommonCUDA::test_errors_view_copy_cuda, test/test_ops.py::TestCommonCUDA::test_errors_view_cuda, test/test_ops.py::TestCommonCUDA::test_errors_vsplit_cuda, test/test_ops.py::TestCommonCUDA::test_errors_vstack_cuda, test/test_ops.py::TestCommonCUDA::test_errors_where_cuda, test/test_ops.py::TestCommonCUDA::test_errors_xlogy_cuda, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch__batch_norm_with_update_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch__chunk_cat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch__native_batch_norm_legit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_abs_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_acos_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_acosh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_add_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_addbmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_addcdiv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_addcmul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_addmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_addmm_decomposed_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_addmv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_addr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_alias_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_all_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_aminmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_angle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_any_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_arange_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_argmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_argmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_as_strided_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_asin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_asinh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_atan2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_atan_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_atanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_baddbmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_bernoulli_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_bmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_bucketize_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_cat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_ceil_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_cholesky_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_cholesky_inverse_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_cholesky_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_clamp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_clamp_max_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_clamp_min_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_column_stack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_complex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_conj_physical_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_copysign_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_cos_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_cosh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_cross_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_cummax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_cummin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_cumprod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_cumsum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_deg2rad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_diag_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_diagonal_copy_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_diff_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_digamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_div_floor_rounding_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_div_no_rounding_mode_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_div_trunc_rounding_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_dot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_dstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_empty_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_eq_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_equal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_erf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_erfc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_erfinv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_exp2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_exp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_expand_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_expm1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_eye_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_fft_fft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_fft_fft_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_fft_fftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_fft_hfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_fft_hfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_fft_hfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_fft_ifft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_fft_ifft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_fft_ifftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_fft_ihfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_fft_ihfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_fft_ihfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_fft_irfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_fft_irfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_fft_irfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_fft_rfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_fft_rfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_fft_rfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_float_power_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_floor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_floor_divide_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_fmax_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_fmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_fmod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_frac_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_frexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_full_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_gather_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_ge_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_geqrf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_gt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_hash_tensor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_heaviside_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_histc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_hstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_hypot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_i0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_igamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_igammac_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_index_add_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_index_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_index_reduce_amax_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_index_reduce_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_index_reduce_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_index_reduce_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_index_select_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_inner_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_isin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_isneginf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_isposinf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_kron_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_kthvalue_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_ldexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_le_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_lerp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_lgamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_cholesky_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_cholesky_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_cond_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_cross_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_det_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_eig_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_eigh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_eigvals_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_eigvalsh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_householder_product_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_inv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_inv_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_ldl_factor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_ldl_factor_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_ldl_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_lstsq_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_lu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_lu_factor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_lu_factor_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_lu_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_matrix_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_matrix_power_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_matrix_rank_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_matrix_rank_hermitian_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_multi_dot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_norm_subgradients_at_zero_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_pinv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_pinv_hermitian_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_qr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_slogdet_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_solve_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_solve_triangular_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_svd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_svdvals_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_tensorinv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_tensorsolve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_vecdot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_vector_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linspace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linspace_tensor_overload_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_log10_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_log1p_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_log2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_log_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_log_softmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_logaddexp2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_logaddexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_logcumsumexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_logical_and_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_logical_not_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_logical_or_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_logical_xor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_logit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_logspace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_logspace_tensor_overload_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_logsumexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_lt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_lu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_lu_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_lu_unpack_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_masked_select_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_matmul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_max_binary_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_max_reduction_no_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_max_reduction_with_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_maximum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_min_binary_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_min_reduction_no_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_min_reduction_with_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_minimum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_mm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_mode_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_msort_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_mul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_multinomial_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_mv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_mvlgamma_mvlgamma_p_3_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_nan_to_num_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_nanmean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_nanquantile_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_nansum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_narrow_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_native_batch_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_ne_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_neg_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_nextafter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_nn_functional_avg_pool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_nn_functional_avg_pool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_nn_functional_gelu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_nn_functional_hardshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_nn_functional_linear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_nn_functional_logsigmoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_nn_functional_normalize_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_nn_functional_softplus_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_nn_functional_softshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_nonzero_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_norm_fro_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_norm_inf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_norm_nuc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_normal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_normal_number_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_ones_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_ormqr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_outer_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_permute_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_polar_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_polygamma_polygamma_n_0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_polygamma_polygamma_n_1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_polygamma_polygamma_n_2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_polygamma_polygamma_n_3_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_polygamma_polygamma_n_4_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_pow_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_qr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_quantile_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_rad2deg_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_randn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_reciprocal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_remainder_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_renorm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_round_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_round_decimals_0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_round_decimals_3_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_round_decimals_neg_3_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_rsqrt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_scatter_add_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_scatter_reduce_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_scatter_reduce_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_scatter_reduce_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_scatter_reduce_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_scatter_reduce_sum_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_searchsorted_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_sgn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_sigmoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_sign_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_signbit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_sin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_sinc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_sinh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_slice_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_softmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_sort_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_sparse_sampled_addmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_airy_ai_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_bessel_j0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_bessel_j1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_bessel_y0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_bessel_y1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_chebyshev_polynomial_t_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_chebyshev_polynomial_v_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_entr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_erfcx_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_hermite_polynomial_h_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_hermite_polynomial_he_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_i0e_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_i1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_i1e_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_laguerre_polynomial_l_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_legendre_polynomial_p_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_log_ndtr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_modified_bessel_i0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_modified_bessel_i1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_modified_bessel_k0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_modified_bessel_k1_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_ndtr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_ndtri_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_scaled_modified_bessel_k0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_scaled_modified_bessel_k1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_spherical_bessel_j0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_xlog1py_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_zeta_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_split_with_sizes_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_sqrt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_square_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_squeeze_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_stack_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_std_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_sub_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_svd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_t_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_take_along_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_take_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_tan_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_tanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_tensordot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_topk_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_transpose_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_triangular_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_tril_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_triu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_true_divide_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_trunc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_unbind_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_unfold_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_unsqueeze_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_var_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_vdot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_view_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_vstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_where_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_xlogy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_zeros_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_H_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_H_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_T_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_T_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices___getitem___cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices___getitem___cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices___radd___cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices___radd___cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices___rand___cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices___rdiv___cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices___rdiv___cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices___rmatmul___cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices___rmod___cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices___rmod___cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices___rmul___cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices___rmul___cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices___ror___cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices___rpow___cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_multiple_devices___rpow___cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices___rsub___cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices___rsub___cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices___rxor___cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices__batch_norm_with_update_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices__chunk_cat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices__chunk_cat_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices__native_batch_norm_legit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices__segment_reduce_lengths_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices__segment_reduce_offsets_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices__softmax_backward_data_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices__unsafe_masked_index_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices__unsafe_masked_index_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices__unsafe_masked_index_put_accumulate_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices__unsafe_masked_index_put_accumulate_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices__upsample_bilinear2d_aa_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_abs_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_abs_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_acos_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_acos_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_acosh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_acosh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_add_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_multiple_devices_add_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_addbmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_addcdiv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_addcmul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_addcmul_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_addmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_addmm_decomposed_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_addmv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_addr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_addr_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_alias_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_alias_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_all_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_all_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_allclose_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_amax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_amin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_aminmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_aminmax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_angle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_angle_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_any_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_any_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_arange_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_multiple_devices_arange_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_argmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_argmax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_argmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_argmin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_argsort_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_argsort_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_argwhere_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_argwhere_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_as_strided_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_as_strided_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_as_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_as_strided_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_as_strided_partial_views_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_as_strided_partial_views_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_as_strided_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_as_strided_scatter_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_asin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_asin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_asinh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_asinh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_atan2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_atan2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_atan_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_atan_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_multiple_devices_atanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_atanh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_atleast_1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_atleast_1d_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_atleast_2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_atleast_2d_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_atleast_3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_atleast_3d_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_baddbmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_bernoulli_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_bfloat16_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_bfloat16_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_bincount_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_bitwise_and_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_bitwise_left_shift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_bitwise_not_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_bitwise_or_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_bitwise_right_shift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_bitwise_xor_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_block_diag_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_block_diag_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_bmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_bool_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_bool_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_broadcast_shapes_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_multiple_devices_broadcast_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_broadcast_tensors_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_broadcast_to_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_broadcast_to_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_bucketize_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_bucketize_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_byte_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_byte_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cartesian_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cartesian_prod_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cat_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cauchy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cdist_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cdouble_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cdouble_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_ceil_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_ceil_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cfloat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cfloat_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_chalf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_chalf_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_char_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_char_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cholesky_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_multiple_devices_cholesky_inverse_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cholesky_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_chunk_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_chunk_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_clamp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_clamp_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_clamp_max_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_clamp_max_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_clamp_min_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_clamp_min_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_clone_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_clone_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_column_stack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_column_stack_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_combinations_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_combinations_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_complex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_conj_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_conj_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_conj_physical_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_conj_physical_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_constant_pad_nd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_constant_pad_nd_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_contiguous_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_contiguous_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_multiple_devices_copysign_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_copysign_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_corrcoef_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_corrcoef_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cos_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cos_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cosh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cosh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_count_nonzero_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_count_nonzero_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cov_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cov_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cross_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cross_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cummax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cummax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cummin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cummin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cumprod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cumprod_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cumsum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cumsum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cumulative_trapezoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cumulative_trapezoid_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_deg2rad_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_multiple_devices_deg2rad_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_diag_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_diag_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_diag_embed_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_diag_embed_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_diagflat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_diagflat_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_diagonal_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_diagonal_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_diagonal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_diagonal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_diagonal_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_diagonal_scatter_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_diff_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_diff_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_digamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_digamma_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_dist_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_div_floor_rounding_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_div_floor_rounding_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_div_no_rounding_mode_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_div_no_rounding_mode_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_div_trunc_rounding_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_div_trunc_rounding_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_multiple_devices_dot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_double_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_double_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_dsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_dsplit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_dstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_dstack_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_einsum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_empty_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_empty_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_empty_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_empty_like_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_empty_permuted_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_empty_permuted_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_empty_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_empty_strided_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_eq_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_eq_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_equal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_equal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_erf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_erf_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_erfc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_erfc_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_erfinv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_erfinv_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_multiple_devices_exp2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_exp2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_exp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_exp_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_expand_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_expand_as_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_expand_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_expand_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_expand_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_expand_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_expm1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_expm1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_exponential_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_eye_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_eye_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_fft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_fft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_fft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_fft_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_fftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_fftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_fftshift_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_fftshift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_hfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_hfft2_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_hfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_hfft_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_hfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_hfftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_ifft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_ifft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_ifft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_ifft_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_ifftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_ifftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_ifftshift_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_ifftshift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_ihfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_ihfft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_ihfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_ihfft_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_ihfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_ihfftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_irfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_irfft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_irfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_irfft_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_irfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_irfftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_rfft2_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_rfft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_rfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_rfft_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_rfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_rfftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fill_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_flatten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_flatten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_flip_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_flip_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fliplr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fliplr_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_flipud_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_flipud_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_float_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_float_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_float_power_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_float_power_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_floor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_floor_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_floor_divide_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_floor_divide_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fmax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fmin_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_multiple_devices_fmin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fmod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fmod_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_frac_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_frexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_full_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_full_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_full_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_full_like_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_gather_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_gather_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_gcd_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_ge_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_ge_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_geometric_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_geometric_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_geqrf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_gradient_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_gradient_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_grid_sampler_2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_grid_sampler_3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_gt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_gt_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_half_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_half_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_hash_tensor_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_multiple_devices_hash_tensor_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_heaviside_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_heaviside_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_histc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_histc_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_hsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_hsplit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_hstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_hstack_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_hypot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_i0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_i0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_igamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_igammac_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_index_add_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_index_add_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_index_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_index_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_index_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_index_fill_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_index_put_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_index_put_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_index_reduce_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_index_reduce_amax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_index_reduce_amin_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_multiple_devices_index_reduce_amin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_index_reduce_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_index_reduce_mean_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_index_reduce_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_index_reduce_prod_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_index_select_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_index_select_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_inner_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_int_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_int_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_isclose_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_isclose_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_isfinite_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_isfinite_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_isin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_isin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_isinf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_isinf_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_isnan_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_isnan_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_isneginf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_isneginf_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_isposinf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_isposinf_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_isreal_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_multiple_devices_isreal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_item_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_item_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_jiterator_2inputs_2outputs_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_jiterator_2inputs_2outputs_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_jiterator_4inputs_with_extra_args_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_jiterator_4inputs_with_extra_args_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_jiterator_binary_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_jiterator_binary_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_jiterator_binary_return_by_ref_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_jiterator_binary_return_by_ref_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_jiterator_unary_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_jiterator_unary_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_kron_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_kron_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_kthvalue_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_kthvalue_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_lcm_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_ldexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_ldexp_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_le_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_le_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_lerp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_lgamma_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_multiple_devices_lgamma_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_cholesky_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_cholesky_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_cond_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_cross_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_cross_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_det_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_diagonal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_diagonal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_eig_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_eigh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_eigvals_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_eigvalsh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_householder_product_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_inv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_inv_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_ldl_factor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_ldl_factor_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_ldl_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_lstsq_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_lstsq_grad_oriented_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_lu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_lu_factor_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_lu_factor_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_lu_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_matrix_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_matrix_power_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_matrix_rank_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_matrix_rank_hermitian_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_multi_dot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_norm_subgradients_at_zero_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_pinv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_pinv_hermitian_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_pinv_singular_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_qr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_slogdet_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_solve_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_solve_triangular_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_svd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_svdvals_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_tensorinv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_tensorsolve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_vander_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_vander_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_vecdot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_vector_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linspace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linspace_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linspace_tensor_overload_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linspace_tensor_overload_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_log10_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_log10_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_log1p_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_log1p_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_log2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_log2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_log_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_log_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_log_normal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_log_softmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_log_softmax_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_logaddexp2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_logaddexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_logcumsumexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_logdet_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_logical_and_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_multiple_devices_logical_and_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_logical_not_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_logical_not_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_logical_or_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_logical_or_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_logical_xor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_logical_xor_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_logit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_logit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_logspace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_logspace_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_logspace_tensor_overload_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_logspace_tensor_overload_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_logsumexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_logsumexp_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_long_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_long_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_lt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_lt_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_lu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_lu_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_lu_unpack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_mH_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_mH_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_mT_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_multiple_devices_mT_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_amax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_amin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_argmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_argmax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_argmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_argmin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_cumprod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_cumprod_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_cumsum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_cumsum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_fill_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_log_softmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_logaddexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_logsumexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_logsumexp_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_median_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_normalize_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_prod_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_prod_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_scatter_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_select_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_select_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_softmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_softmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_std_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_std_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_sum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_sum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_var_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_var_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_matmul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_matrix_exp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_max_binary_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_max_binary_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_max_pool2d_with_indices_backward_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_max_reduction_no_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_max_reduction_no_dim_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_max_reduction_with_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_max_reduction_with_dim_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_maximum_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_multiple_devices_maximum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_median_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_median_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_meshgrid_list_of_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_meshgrid_list_of_tensors_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_meshgrid_variadic_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_meshgrid_variadic_tensors_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_min_binary_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_min_binary_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_min_reduction_no_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_min_reduction_no_dim_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_min_reduction_with_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_min_reduction_with_dim_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_minimum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_minimum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_mm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_mode_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_mode_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_movedim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_movedim_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_msort_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_msort_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_mul_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_multiple_devices_mul_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_multinomial_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_mv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_mvlgamma_mvlgamma_p_1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_mvlgamma_mvlgamma_p_3_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_mvlgamma_mvlgamma_p_5_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nan_to_num_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nan_to_num_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nanmean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nanmedian_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nanmedian_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nanquantile_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nansum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nansum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_narrow_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_narrow_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_narrow_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_narrow_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_native_batch_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_native_dropout_backward_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_native_layer_norm_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_multiple_devices_ne_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_ne_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_neg_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_neg_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_new_empty_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_new_empty_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_new_empty_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_new_empty_strided_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_new_full_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_new_full_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_new_ones_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_new_ones_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_new_zeros_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_new_zeros_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nextafter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_alpha_dropout_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_avg_pool1d_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_avg_pool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_avg_pool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_batch_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_batch_norm_without_cudnn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_bilinear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_binary_cross_entropy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_celu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_channel_shuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_channel_shuffle_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_conv1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_conv2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_conv3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_conv_transpose1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_conv_transpose2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_conv_transpose3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_cosine_embedding_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_cosine_embedding_loss_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_cosine_similarity_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_cross_entropy_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_ctc_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_dropout2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_dropout3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_dropout_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_elu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_embedding_bag_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_embedding_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_feature_alpha_dropout_without_train_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_fractional_max_pool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_fractional_max_pool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_gaussian_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_gelu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_glu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_grid_sample_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_group_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_hardshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_hardsigmoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_hardswish_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_hardtanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_hardtanh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_hinge_embedding_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_huber_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_instance_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_interpolate_area_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_interpolate_bicubic_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_interpolate_bilinear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_interpolate_linear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_interpolate_nearest_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_interpolate_trilinear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_kl_div_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_l1_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_layer_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_leaky_relu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_linear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_local_response_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_logsigmoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_margin_ranking_loss_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_margin_ranking_loss_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_max_pool1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_max_pool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_max_pool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_max_unpool1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_max_unpool1d_grad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_max_unpool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_max_unpool2d_grad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_max_unpool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_max_unpool3d_grad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_mish_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_mse_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_multi_head_attention_forward_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_multi_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_multilabel_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_normalize_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_one_hot_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_pad_circular_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_pad_circular_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_pad_constant_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_pad_constant_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_pad_reflect_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_pad_reflect_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_pad_replicate_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_pad_replicate_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_pad_replicate_negative_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_pad_replicate_negative_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_pairwise_distance_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_pairwise_distance_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_pdist_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_pixel_shuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_pixel_shuffle_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_pixel_unshuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_pixel_unshuffle_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_poisson_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_poisson_nll_loss_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_prelu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_relu6_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_relu6_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_relu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_relu_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_rms_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_rrelu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_selu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_silu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_smooth_l1_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_soft_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_softmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_softmin_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_softmin_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_softplus_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_softshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_softsign_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_softsign_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_tanhshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_tanhshrink_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_threshold_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_threshold_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_triplet_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_triplet_margin_loss_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_triplet_margin_with_distance_loss_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_unfold_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_upsample_bilinear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_upsample_nearest_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nonzero_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nonzero_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nonzero_static_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nonzero_static_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_norm_fro_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_norm_inf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_norm_nuc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_normal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_normal_in_place_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_normal_number_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_ones_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_ones_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_ones_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_ones_like_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_multiple_devices_ormqr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_outer_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_outer_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_pca_lowrank_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_permute_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_permute_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_permute_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_permute_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_pinverse_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_polar_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_polygamma_polygamma_n_0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_polygamma_polygamma_n_0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_polygamma_polygamma_n_1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_polygamma_polygamma_n_1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_polygamma_polygamma_n_2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_polygamma_polygamma_n_2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_polygamma_polygamma_n_3_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_polygamma_polygamma_n_3_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_polygamma_polygamma_n_4_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_polygamma_polygamma_n_4_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_positive_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_positive_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_pow_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_multiple_devices_pow_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_prod_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_put_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_put_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_qr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_quantile_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_rad2deg_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_rad2deg_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_rand_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_randint_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_randint_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_randint_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_randint_like_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_randn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_randn_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_ravel_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_ravel_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_real_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_real_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_reciprocal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_reciprocal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_remainder_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_remainder_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_renorm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_repeat_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_multiple_devices_repeat_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_repeat_interleave_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_repeat_interleave_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_reshape_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_reshape_as_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_reshape_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_reshape_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_resize__cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_resize__cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_resize_as__cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_resize_as__cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_resolve_conj_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_resolve_conj_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_resolve_neg_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_resolve_neg_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_roll_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_roll_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_rot90_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_rot90_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_round_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_round_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_round_decimals_0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_round_decimals_3_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_round_decimals_neg_3_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_rsqrt_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_multiple_devices_rsqrt_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_rsub_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_rsub_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_scalar_tensor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_scalar_tensor_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_scatter_add_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_scatter_add_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_scatter_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_scatter_reduce_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_scatter_reduce_amax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_scatter_reduce_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_scatter_reduce_amin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_scatter_reduce_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_scatter_reduce_mean_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_scatter_reduce_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_scatter_reduce_prod_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_scatter_reduce_sum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_scatter_reduce_sum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_searchsorted_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_searchsorted_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_select_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_select_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_select_scatter_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_multiple_devices_select_scatter_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_sgn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_sgn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_short_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_short_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_sigmoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_sigmoid_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_sign_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_sign_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_signal_windows_bartlett_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_signal_windows_blackman_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_signal_windows_cosine_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_signal_windows_exponential_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_signal_windows_gaussian_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_signal_windows_general_cosine_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_signal_windows_general_hamming_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_signal_windows_hamming_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_signal_windows_hann_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_signal_windows_kaiser_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_signal_windows_nuttall_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_signbit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_signbit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_sin_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_multiple_devices_sin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_sinc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_sinc_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_sinh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_sinh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_slice_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_slice_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_slice_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_slice_scatter_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_softmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_softmax_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_sort_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_sort_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_sparse_mm_reduce_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_sparse_sampled_addmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_airy_ai_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_airy_ai_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_bessel_j0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_bessel_j0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_bessel_j1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_bessel_j1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_bessel_y0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_bessel_y0_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_bessel_y1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_bessel_y1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_chebyshev_polynomial_t_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_chebyshev_polynomial_u_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_chebyshev_polynomial_v_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_chebyshev_polynomial_v_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_chebyshev_polynomial_w_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_entr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_entr_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_erfcx_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_erfcx_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_hermite_polynomial_h_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_hermite_polynomial_h_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_hermite_polynomial_he_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_hermite_polynomial_he_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_i0e_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_i0e_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_i1_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_i1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_i1e_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_i1e_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_laguerre_polynomial_l_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_laguerre_polynomial_l_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_legendre_polynomial_p_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_legendre_polynomial_p_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_log_ndtr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_log_ndtr_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_modified_bessel_i0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_modified_bessel_i0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_modified_bessel_i1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_modified_bessel_i1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_modified_bessel_k0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_modified_bessel_k0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_modified_bessel_k1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_modified_bessel_k1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_ndtr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_ndtr_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_ndtri_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_ndtri_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_polygamma_special_polygamma_n_0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_scaled_modified_bessel_k0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_scaled_modified_bessel_k0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_scaled_modified_bessel_k1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_scaled_modified_bessel_k1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_shifted_chebyshev_polynomial_t_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_shifted_chebyshev_polynomial_u_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_shifted_chebyshev_polynomial_v_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_shifted_chebyshev_polynomial_w_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_spherical_bessel_j0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_spherical_bessel_j0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_xlog1py_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_xlog1py_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_zeta_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_zeta_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_split_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_split_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_split_list_args_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_split_list_args_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_split_with_sizes_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_split_with_sizes_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_split_with_sizes_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_split_with_sizes_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_sqrt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_sqrt_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_square_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_square_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_squeeze_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_squeeze_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_squeeze_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_squeeze_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_squeeze_multiple_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_squeeze_multiple_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_stack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_stack_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_std_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_std_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_std_mean_unbiased_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_multiple_devices_std_unbiased_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_stft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_sub_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_sub_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_sum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_sum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_sum_to_size_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_sum_to_size_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_svd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_svd_lowrank_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_t_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_t_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_t_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_t_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_take_along_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_take_along_dim_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_take_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_take_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_tan_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_tan_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_tanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_tanh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_tensor_split_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_tensor_split_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_tensordot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_tile_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_multiple_devices_tile_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_to_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_to_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_to_sparse_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_to_sparse_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_topk_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_topk_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_torch_ops_aten__efficient_attention_forward_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_torch_ops_aten__safe_softmax_default_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_trace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_trace_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_transpose_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_transpose_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_transpose_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_transpose_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_trapezoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_trapezoid_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_trapz_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_trapz_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_triangular_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_tril_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_tril_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_tril_indices_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_multiple_devices_triu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_triu_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_triu_indices_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_true_divide_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_true_divide_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_trunc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_trunc_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_unbind_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_unbind_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_unbind_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_unbind_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_unflatten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_unflatten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_unfold_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_unfold_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_unfold_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_unfold_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_uniform_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_unique_consecutive_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_unique_consecutive_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_unique_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_unique_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_unravel_index_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_unsafe_chunk_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_unsafe_chunk_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_multiple_devices_unsafe_split_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_unsafe_split_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_unsqueeze_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_unsqueeze_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_unsqueeze_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_unsqueeze_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_var_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_var_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_var_mean_unbiased_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_var_unbiased_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_vdot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_view_as_complex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_view_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_view_as_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_view_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_view_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_view_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_view_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_vsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_vsplit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_vstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_vstack_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_where_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_where_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_xlogy_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_multiple_devices_xlogy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_zero__cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_zero__cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_zeros_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_zeros_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_zeros_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_zeros_like_cuda_int64, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_H_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_T_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values___getitem___cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values___radd___cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values___rand___cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values___rdiv___cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values___rmul___cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values___ror___cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values___rxor___cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values__chunk_cat_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values__unsafe_masked_index_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values__unsafe_masked_index_put_accumulate_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_abs_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_acos_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_acosh_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_add_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_addr_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_alias_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_all_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_amax_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_amin_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_aminmax_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_angle_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_any_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_argsort_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_argwhere_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_as_strided_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_as_strided_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_as_strided_partial_views_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_as_strided_scatter_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_asin_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_asinh_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_atan2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_atan_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_atanh_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_atleast_1d_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_atleast_2d_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_atleast_3d_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_bfloat16_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_bitwise_and_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_bitwise_not_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_bitwise_or_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_bitwise_xor_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_block_diag_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_bool_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_broadcast_tensors_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_broadcast_to_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_byte_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_cartesian_prod_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_cat_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_cdouble_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_cfloat_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_chalf_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_char_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_chunk_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_clamp_max_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_clamp_min_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_clone_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_column_stack_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_combinations_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_conj_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_conj_physical_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_constant_pad_nd_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_contiguous_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_copysign_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_cos_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_cosh_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_count_nonzero_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_cummax_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_cummin_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_deg2rad_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_diag_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_diag_embed_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_diagflat_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_diagonal_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_diagonal_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_diagonal_scatter_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_diff_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_digamma_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_div_no_rounding_mode_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_double_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_dsplit_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_dstack_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_empty_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_empty_like_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_empty_permuted_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_empty_strided_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_eq_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_equal_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_erf_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_erfc_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_erfinv_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_exp2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_exp_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_expand_as_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_expand_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_expand_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_expm1_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_eye_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_fft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_fft_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_fftn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_fftshift_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_hfft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_hfft_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_hfftn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_ifft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_ifft_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_ifftn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_ifftshift_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_ihfft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_ihfft_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_ihfftn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_irfft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_irfft_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_irfftn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_rfft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_rfft_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_rfftn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fill_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_flatten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_flip_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fliplr_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_flipud_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_float_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_float_power_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fmax_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fmin_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_full_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_full_like_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_gather_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_ge_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_gt_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_half_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_hash_tensor_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_heaviside_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_hsplit_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_hstack_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_i0_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_index_add_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_index_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_index_fill_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_index_put_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_index_select_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_int_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_isclose_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_isfinite_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_isinf_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_isnan_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_isneginf_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_isposinf_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_isreal_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_item_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_jiterator_2inputs_2outputs_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_jiterator_4inputs_with_extra_args_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_jiterator_binary_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_jiterator_binary_return_by_ref_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_jiterator_unary_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_kron_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_ldexp_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_le_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_lgamma_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_linalg_diagonal_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_log10_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_log1p_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_log2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_log_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_log_softmax_with_dtype_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_logical_and_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_logical_not_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_logical_or_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_logical_xor_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_logit_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_logsumexp_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_long_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_lt_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_mH_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_mT_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_masked_fill_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_masked_prod_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_masked_scatter_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_masked_select_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_masked_sum_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_max_binary_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_max_reduction_no_dim_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_max_reduction_with_dim_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_maximum_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_meshgrid_list_of_tensors_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_meshgrid_variadic_tensors_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_min_binary_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_min_reduction_no_dim_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_min_reduction_with_dim_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_minimum_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_mode_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_movedim_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_msort_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_mul_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_nan_to_num_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_nansum_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_narrow_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_narrow_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_ne_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_new_empty_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_new_empty_strided_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_new_full_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_new_ones_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_new_zeros_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_nn_functional_channel_shuffle_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_nn_functional_cosine_embedding_loss_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_nn_functional_feature_alpha_dropout_without_train_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_nn_functional_pad_circular_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_nn_functional_pad_constant_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_nn_functional_pixel_shuffle_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_nn_functional_pixel_unshuffle_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_nn_functional_softsign_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_nn_functional_unfold_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_nonzero_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_nonzero_static_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_ones_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_ones_like_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_outer_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_permute_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_permute_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_polygamma_polygamma_n_0_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_polygamma_polygamma_n_1_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_polygamma_polygamma_n_2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_polygamma_polygamma_n_3_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_polygamma_polygamma_n_4_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_prod_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_put_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_rad2deg_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_ravel_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_real_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_reciprocal_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_repeat_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_repeat_interleave_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_reshape_as_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_reshape_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_resize__cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_resize_as__cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_resolve_conj_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_resolve_neg_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_roll_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_rot90_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_rsqrt_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_scalar_tensor_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_scatter_add_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_scatter_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_scatter_reduce_sum_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_select_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_select_scatter_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_sgn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_short_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_sigmoid_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_sign_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_signbit_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_sin_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_sinc_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_sinh_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_slice_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_slice_scatter_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_softmax_with_dtype_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_sort_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_airy_ai_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_bessel_j0_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_bessel_j1_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_bessel_y0_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_bessel_y1_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_chebyshev_polynomial_t_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_chebyshev_polynomial_u_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_chebyshev_polynomial_v_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_chebyshev_polynomial_w_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_entr_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_erfcx_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_hermite_polynomial_h_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_hermite_polynomial_he_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_i0e_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_i1_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_i1e_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_laguerre_polynomial_l_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_legendre_polynomial_p_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_log_ndtr_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_modified_bessel_i0_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_modified_bessel_i1_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_modified_bessel_k0_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_modified_bessel_k1_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_ndtr_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_ndtri_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_polygamma_special_polygamma_n_0_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_scaled_modified_bessel_k0_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_scaled_modified_bessel_k1_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_shifted_chebyshev_polynomial_t_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_shifted_chebyshev_polynomial_u_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_shifted_chebyshev_polynomial_v_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_shifted_chebyshev_polynomial_w_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_spherical_bessel_j0_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_xlog1py_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_zeta_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_split_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_split_list_args_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_split_with_sizes_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_split_with_sizes_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_sqrt_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_square_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_squeeze_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_squeeze_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_squeeze_multiple_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_stack_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_sum_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_sum_to_size_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_t_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_t_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_take_along_dim_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_take_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_tan_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_tanh_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_tensor_split_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_tile_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_to_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_to_sparse_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_torch_ops_aten__safe_softmax_default_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_trace_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_transpose_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_transpose_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_tril_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_triu_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_true_divide_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_unbind_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_unbind_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_unflatten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_unfold_copy_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_unfold_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_unique_consecutive_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_unique_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_unsafe_chunk_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_unsafe_split_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_unsqueeze_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_unsqueeze_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_view_as_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_view_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_view_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_vsplit_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_vstack_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_where_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_xlogy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_zero__cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_zeros_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_zeros_like_cuda_bool, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_H_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_H_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_H_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_T_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_T_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_T_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___getitem___cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___getitem___cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___getitem___cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___radd___cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___radd___cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___radd___cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___rand___cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___rdiv___cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___rdiv___cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___rdiv___cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___rmatmul___cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___rmatmul___cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___rmod___cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___rmod___cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___rmul___cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___rmul___cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___rmul___cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___ror___cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___rpow___cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___rpow___cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___rpow___cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___rsub___cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___rsub___cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___rsub___cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___rxor___cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples__batch_norm_with_update_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples__chunk_cat_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples__chunk_cat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples__chunk_cat_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples__native_batch_norm_legit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples__segment_reduce_lengths_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples__segment_reduce_offsets_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples__softmax_backward_data_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples__unsafe_masked_index_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples__unsafe_masked_index_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples__unsafe_masked_index_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples__unsafe_masked_index_put_accumulate_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples__unsafe_masked_index_put_accumulate_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples__unsafe_masked_index_put_accumulate_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples__upsample_bilinear2d_aa_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_abs_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_abs_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_abs_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_acos_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_acos_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_acos_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_acosh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_acosh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_acosh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_add_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_add_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_add_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_addbmm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_addbmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_addcdiv_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_addcdiv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_addcmul_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_addcmul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_addcmul_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_addmm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_addmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_addmm_decomposed_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_addmm_decomposed_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_addmv_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_addmv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_addr_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_addr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_addr_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_alias_copy_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_alias_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_alias_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_all_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_all_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_all_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_allclose_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_allclose_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_amax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_amin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_aminmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_aminmax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_angle_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_angle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_angle_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_any_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_any_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_any_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_arange_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_arange_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_argmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_argmax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_argmin_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_argmin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_argsort_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_argsort_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_argwhere_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_argwhere_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_argwhere_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_as_strided_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_as_strided_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_as_strided_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_as_strided_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_as_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_as_strided_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_as_strided_partial_views_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_as_strided_partial_views_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_as_strided_partial_views_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_as_strided_scatter_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_as_strided_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_as_strided_scatter_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_asin_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_asin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_asin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_asinh_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_asinh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_asinh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_atan2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_atan2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_atan_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_atan_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_atan_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_atanh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_atanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_atanh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_atleast_1d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_atleast_1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_atleast_1d_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_atleast_2d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_atleast_2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_atleast_2d_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_atleast_3d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_atleast_3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_atleast_3d_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_baddbmm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_baddbmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_bernoulli_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_bfloat16_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_bfloat16_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_bfloat16_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_bincount_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_bitwise_and_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_bitwise_left_shift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_bitwise_not_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_bitwise_or_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_bitwise_right_shift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_bitwise_xor_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_block_diag_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_block_diag_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_block_diag_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_bmm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_bmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_bool_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_bool_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_bool_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_broadcast_shapes_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_broadcast_tensors_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_broadcast_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_broadcast_tensors_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_broadcast_to_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_broadcast_to_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_broadcast_to_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_bucketize_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_bucketize_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_byte_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_byte_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_byte_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cartesian_prod_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cartesian_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cartesian_prod_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cat_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cat_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cauchy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cdist_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cdouble_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cdouble_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cdouble_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_ceil_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_ceil_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cfloat_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cfloat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cfloat_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_chalf_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_chalf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_chalf_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_char_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_char_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_char_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cholesky_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cholesky_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cholesky_inverse_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cholesky_inverse_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cholesky_solve_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cholesky_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_chunk_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_chunk_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_chunk_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_clamp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_clamp_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_clamp_max_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_clamp_max_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_clamp_min_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_clamp_min_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_clone_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_clone_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_clone_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_column_stack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_column_stack_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_column_stack_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_combinations_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_combinations_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_combinations_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_complex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_conj_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_conj_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_conj_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_conj_physical_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_conj_physical_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_conj_physical_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_constant_pad_nd_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_constant_pad_nd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_constant_pad_nd_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_contiguous_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_contiguous_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_contiguous_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_copysign_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_copysign_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_corrcoef_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_corrcoef_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_corrcoef_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cos_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cos_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cos_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cosh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cosh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cosh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_count_nonzero_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_count_nonzero_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_count_nonzero_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cov_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cov_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cov_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cross_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cross_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cross_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cummax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cummax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cummin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cummin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cumprod_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cumprod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cumprod_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cumsum_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cumsum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cumsum_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cumulative_trapezoid_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cumulative_trapezoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cumulative_trapezoid_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_deg2rad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_deg2rad_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_diag_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_diag_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_diag_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_diag_embed_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_diag_embed_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_diag_embed_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_diagflat_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_diagflat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_diagflat_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_diagonal_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_diagonal_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_diagonal_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_diagonal_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_diagonal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_diagonal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_diagonal_scatter_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_diagonal_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_diagonal_scatter_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_diff_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_diff_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_diff_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_digamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_digamma_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_dist_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_dist_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_div_floor_rounding_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_div_floor_rounding_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_div_no_rounding_mode_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_div_no_rounding_mode_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_div_no_rounding_mode_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_div_trunc_rounding_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_div_trunc_rounding_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_dot_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_dot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_double_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_double_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_double_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_dsplit_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_dsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_dsplit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_dstack_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_dstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_dstack_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_einsum_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_einsum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_empty_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_empty_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_empty_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_empty_like_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_empty_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_empty_like_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_empty_permuted_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_empty_permuted_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_empty_permuted_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_empty_strided_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_empty_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_empty_strided_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_eq_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_eq_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_eq_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_equal_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_equal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_equal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_erf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_erf_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_erfc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_erfc_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_erfinv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_erfinv_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_exp2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_exp2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_exp2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_exp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_exp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_exp_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_expand_as_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_expand_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_expand_as_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_expand_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_expand_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_expand_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_expand_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_expand_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_expand_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_expm1_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_expm1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_expm1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_exponential_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_eye_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_eye_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_eye_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_fft2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_fft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_fft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_fft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_fft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_fft_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_fftn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_fftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_fftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_fftshift_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_fftshift_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_fftshift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_hfft2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_hfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_hfft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_hfft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_hfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_hfft_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_hfftn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_hfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_hfftn_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_ifft2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_ifft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_ifft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_ifft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_ifft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_ifft_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_ifftn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_ifftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_ifftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_ifftshift_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_ifftshift_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_ifftshift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_ihfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_ihfft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_ihfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_ihfft_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_ihfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_ihfftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_irfft2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_irfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_irfft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_irfft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_irfft_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_irfft_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_irfftn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_irfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_irfftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_rfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_rfft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_rfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_rfft_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_rfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_rfftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fill_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fill_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_flatten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_flatten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_flatten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_flip_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_flip_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_flip_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fliplr_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fliplr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fliplr_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_flipud_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_flipud_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_flipud_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_float_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_float_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_float_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_float_power_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_float_power_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_float_power_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_floor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_floor_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_floor_divide_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_floor_divide_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fmax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fmin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fmod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fmod_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_frac_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_frexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_full_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_full_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_full_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_full_like_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_full_like_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_full_like_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_gather_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_gather_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_gather_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_gcd_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_ge_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_ge_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_geometric_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_geometric_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_geqrf_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_geqrf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_gradient_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_gradient_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_gradient_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_grid_sampler_2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_grid_sampler_3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_gt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_gt_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_half_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_half_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_half_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_hash_tensor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_hash_tensor_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_heaviside_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_heaviside_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_histc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_histc_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_hsplit_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_hsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_hsplit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_hstack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_hstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_hstack_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_hypot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_i0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_i0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_igamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_igammac_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_imag_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_index_add_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_index_add_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_index_add_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_index_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_index_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_index_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_index_fill_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_index_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_index_fill_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_index_put_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_index_put_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_index_put_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_index_reduce_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_index_reduce_amax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_index_reduce_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_index_reduce_amin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_index_reduce_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_index_reduce_mean_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_index_reduce_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_index_reduce_prod_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_index_select_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_index_select_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_index_select_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_inner_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_inner_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_int_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_int_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_int_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_isclose_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_isclose_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_isclose_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_isfinite_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_isfinite_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_isfinite_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_isin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_isin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_isinf_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_isinf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_isinf_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_isnan_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_isnan_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_isnan_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_isneginf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_isneginf_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_isposinf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_isposinf_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_isreal_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_isreal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_isreal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_istft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_item_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_item_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_item_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_jiterator_2inputs_2outputs_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_jiterator_2inputs_2outputs_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_jiterator_2inputs_2outputs_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_jiterator_4inputs_with_extra_args_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_jiterator_4inputs_with_extra_args_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_jiterator_4inputs_with_extra_args_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_jiterator_binary_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_jiterator_binary_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_jiterator_binary_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_jiterator_binary_return_by_ref_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_jiterator_binary_return_by_ref_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_jiterator_binary_return_by_ref_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_jiterator_unary_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_jiterator_unary_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_jiterator_unary_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_kron_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_kron_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_kron_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_kthvalue_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_kthvalue_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_lcm_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_ldexp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_ldexp_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_ldexp_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_le_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_le_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_lerp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_lerp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_lgamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_lgamma_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_cholesky_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_cholesky_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_cholesky_ex_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_cholesky_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_cond_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_cond_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_cross_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_cross_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_cross_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_det_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_det_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_diagonal_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_diagonal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_diagonal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_eig_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_eig_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_eigh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_eigh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_eigvals_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_eigvals_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_eigvalsh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_eigvalsh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_householder_product_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_householder_product_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_inv_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_inv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_inv_ex_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_inv_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_ldl_factor_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_ldl_factor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_ldl_factor_ex_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_ldl_factor_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_ldl_solve_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_ldl_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_lstsq_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_lstsq_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_lstsq_grad_oriented_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_lstsq_grad_oriented_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_lu_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_lu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_lu_factor_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_lu_factor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_lu_factor_ex_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_lu_factor_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_lu_solve_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_lu_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_matrix_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_matrix_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_matrix_power_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_matrix_power_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_matrix_rank_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_matrix_rank_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_matrix_rank_hermitian_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_matrix_rank_hermitian_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_multi_dot_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_multi_dot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_norm_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_norm_subgradients_at_zero_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_norm_subgradients_at_zero_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_pinv_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_pinv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_pinv_hermitian_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_pinv_hermitian_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_pinv_singular_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_pinv_singular_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_qr_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_qr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_slogdet_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_slogdet_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_solve_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_solve_ex_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_solve_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_solve_triangular_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_solve_triangular_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_svd_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_svd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_svdvals_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_svdvals_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_tensorinv_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_tensorinv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_tensorsolve_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_tensorsolve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_vander_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_vander_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_vander_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_vecdot_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_vecdot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_vector_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_vector_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linspace_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linspace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linspace_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linspace_tensor_overload_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linspace_tensor_overload_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linspace_tensor_overload_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_log10_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_log10_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_log10_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_log1p_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_log1p_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_log1p_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_log2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_log2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_log2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_log_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_log_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_log_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_log_normal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_log_softmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_log_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_log_softmax_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logaddexp2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logaddexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logcumsumexp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logcumsumexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logdet_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logdet_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logical_and_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logical_and_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logical_and_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logical_not_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logical_not_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logical_not_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logical_or_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logical_or_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logical_or_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logical_xor_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logical_xor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logical_xor_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logspace_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logspace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logspace_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logspace_tensor_overload_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logspace_tensor_overload_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logspace_tensor_overload_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logsumexp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logsumexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logsumexp_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_long_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_long_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_long_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_lt_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_lt_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_lu_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_lu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_lu_solve_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_lu_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_lu_unpack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_lu_unpack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_mH_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_mH_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_mH_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_mT_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_mT_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_mT_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_amax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_amin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_argmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_argmax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_argmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_argmin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_cumprod_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_cumprod_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_cumprod_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_cumsum_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_cumsum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_cumsum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_fill_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_fill_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_log_softmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_logaddexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_logsumexp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_logsumexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_logsumexp_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_mean_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_median_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_normalize_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_normalize_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_prod_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_prod_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_scatter_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_scatter_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_select_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_select_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_select_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_softmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_softmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_std_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_std_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_std_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_sum_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_sum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_sum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_var_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_var_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_var_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_matmul_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_matmul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_matrix_exp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_matrix_exp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_max_binary_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_max_binary_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_max_pool2d_with_indices_backward_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_max_reduction_no_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_max_reduction_no_dim_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_max_reduction_with_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_max_reduction_with_dim_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_maximum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_maximum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_mean_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_median_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_median_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_meshgrid_list_of_tensors_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_meshgrid_list_of_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_meshgrid_list_of_tensors_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_meshgrid_variadic_tensors_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_meshgrid_variadic_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_meshgrid_variadic_tensors_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_min_binary_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_min_binary_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_min_reduction_no_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_min_reduction_no_dim_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_min_reduction_with_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_min_reduction_with_dim_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_minimum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_minimum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_mm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_mm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_mode_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_mode_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_movedim_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_movedim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_movedim_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_msort_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_msort_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_mul_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_mul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_mul_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_multinomial_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_mv_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_mv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_mvlgamma_mvlgamma_p_1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_mvlgamma_mvlgamma_p_3_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_mvlgamma_mvlgamma_p_5_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nan_to_num_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nan_to_num_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nanmean_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nanmean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nanmedian_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nanmedian_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nanquantile_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nansum_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nansum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nansum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_narrow_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_narrow_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_narrow_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_narrow_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_narrow_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_narrow_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_native_batch_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_native_dropout_backward_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_native_layer_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_ne_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_ne_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_ne_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_neg_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_neg_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_neg_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_new_empty_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_new_empty_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_new_empty_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_new_empty_strided_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_new_empty_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_new_empty_strided_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_new_full_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_new_full_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_new_full_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_new_ones_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_new_ones_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_new_ones_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_new_zeros_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_new_zeros_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_new_zeros_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nextafter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_adaptive_avg_pool2d_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_alpha_dropout_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_avg_pool1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_avg_pool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_avg_pool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_batch_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_batch_norm_without_cudnn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_bilinear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_binary_cross_entropy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_celu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_channel_shuffle_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_channel_shuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_channel_shuffle_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_conv1d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_conv1d_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_conv2d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_conv2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_conv3d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_conv3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_conv_transpose1d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_conv_transpose1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_conv_transpose2d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_conv_transpose2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_conv_transpose3d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_conv_transpose3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_cosine_embedding_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_cosine_embedding_loss_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_cosine_similarity_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_cross_entropy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_ctc_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_dropout2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_dropout3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_dropout_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_elu_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_embedding_bag_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_embedding_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_feature_alpha_dropout_without_train_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_feature_alpha_dropout_without_train_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_fractional_max_pool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_fractional_max_pool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_gaussian_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_gelu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_glu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_grid_sample_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_group_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_hardshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_hardsigmoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_hardswish_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_hardtanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_hardtanh_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_hinge_embedding_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_huber_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_instance_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_interpolate_area_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_interpolate_bicubic_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_interpolate_bilinear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_interpolate_linear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_interpolate_nearest_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_interpolate_trilinear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_kl_div_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_l1_loss_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_l1_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_layer_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_leaky_relu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_linear_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_linear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_local_response_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_logsigmoid_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_margin_ranking_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_margin_ranking_loss_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_max_pool1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_max_pool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_max_pool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_max_unpool1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_max_unpool1d_grad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_max_unpool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_max_unpool2d_grad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_max_unpool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_max_unpool3d_grad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_mish_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_mse_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_multi_head_attention_forward_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_multi_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_multilabel_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_normalize_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_normalize_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_one_hot_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_pad_circular_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_pad_circular_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_pad_circular_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_pad_constant_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_pad_constant_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_pad_constant_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_pad_reflect_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_pad_reflect_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_pad_reflect_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_pad_replicate_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_pad_replicate_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_pad_replicate_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_pad_replicate_negative_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_pad_replicate_negative_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_pad_replicate_negative_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_pairwise_distance_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_pairwise_distance_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_pairwise_distance_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_pdist_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_pixel_shuffle_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_pixel_shuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_pixel_shuffle_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_pixel_unshuffle_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_pixel_unshuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_pixel_unshuffle_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_poisson_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_poisson_nll_loss_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_prelu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_relu6_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_relu6_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_relu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_relu_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_rms_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_rms_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_rrelu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_scaled_dot_product_attention_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_selu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_silu_complex_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_silu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_smooth_l1_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_soft_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_softmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_softmin_with_dtype_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_softmin_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_softmin_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_softplus_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_softshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_softsign_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_softsign_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_softsign_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_tanhshrink_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_tanhshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_tanhshrink_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_threshold_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_threshold_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_triplet_margin_loss_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_triplet_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_triplet_margin_loss_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_triplet_margin_with_distance_loss_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_triplet_margin_with_distance_loss_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_unfold_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_unfold_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_upsample_bilinear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_upsample_nearest_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nonzero_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nonzero_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nonzero_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nonzero_static_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nonzero_static_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nonzero_static_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_norm_fro_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_norm_fro_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_norm_inf_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_norm_inf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_norm_nuc_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_norm_nuc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_normal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_normal_in_place_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_normal_in_place_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_normal_number_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_ones_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_ones_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_ones_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_ones_like_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_ones_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_ones_like_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_ormqr_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_ormqr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_outer_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_outer_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_outer_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_pca_lowrank_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_pca_lowrank_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_permute_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_permute_copy_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_permute_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_permute_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_permute_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_permute_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_pinverse_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_pinverse_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_polar_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_polygamma_polygamma_n_0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_polygamma_polygamma_n_0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_polygamma_polygamma_n_1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_polygamma_polygamma_n_1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_polygamma_polygamma_n_2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_polygamma_polygamma_n_2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_polygamma_polygamma_n_3_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_polygamma_polygamma_n_3_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_polygamma_polygamma_n_4_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_polygamma_polygamma_n_4_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_positive_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_positive_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_positive_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_pow_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_pow_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_pow_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_prod_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_prod_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_put_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_put_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_put_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_qr_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_qr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_quantile_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_rad2deg_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_rad2deg_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_rand_like_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_rand_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_randint_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_randint_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_randint_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_randint_like_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_randn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_randn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_randn_like_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_randn_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_ravel_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_ravel_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_ravel_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_real_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_real_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_real_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_reciprocal_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_reciprocal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_reciprocal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_remainder_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_remainder_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_renorm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_renorm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_repeat_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_repeat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_repeat_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_repeat_interleave_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_repeat_interleave_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_repeat_interleave_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_reshape_as_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_reshape_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_reshape_as_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_reshape_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_reshape_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_reshape_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_resize__cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_resize__cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_resize__cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_resize_as__cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_resize_as__cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_resize_as__cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_resolve_conj_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_resolve_conj_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_resolve_conj_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_resolve_neg_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_resolve_neg_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_resolve_neg_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_roll_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_roll_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_roll_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_rot90_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_rot90_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_rot90_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_round_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_round_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_round_decimals_0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_round_decimals_3_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_round_decimals_neg_3_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_rsqrt_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_rsqrt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_rsqrt_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_rsub_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_rsub_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_rsub_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_scalar_tensor_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_scalar_tensor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_scalar_tensor_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_scatter_add_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_scatter_add_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_scatter_add_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_scatter_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_scatter_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_scatter_reduce_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_scatter_reduce_amax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_scatter_reduce_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_scatter_reduce_amin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_scatter_reduce_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_scatter_reduce_mean_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_scatter_reduce_prod_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_scatter_reduce_prod_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_scatter_reduce_sum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_scatter_reduce_sum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_searchsorted_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_searchsorted_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_select_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_select_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_select_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_select_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_select_scatter_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sgn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sgn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sgn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_short_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_short_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_short_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sigmoid_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sigmoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sigmoid_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sign_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sign_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_signal_windows_bartlett_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_signal_windows_blackman_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_signal_windows_cosine_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_signal_windows_exponential_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_signal_windows_gaussian_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_signal_windows_general_cosine_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_signal_windows_general_hamming_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_signal_windows_hamming_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_signal_windows_hann_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_signal_windows_kaiser_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_signal_windows_nuttall_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_signbit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_signbit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sin_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sinc_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sinc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sinc_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sinh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sinh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sinh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_slice_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_slice_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_slice_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_slice_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_slice_scatter_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_softmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_softmax_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sort_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sort_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sparse_mm_reduce_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sparse_sampled_addmm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sparse_sampled_addmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_airy_ai_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_airy_ai_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_bessel_j0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_bessel_j0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_bessel_j1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_bessel_j1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_bessel_y0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_bessel_y0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_bessel_y1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_bessel_y1_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_chebyshev_polynomial_t_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_chebyshev_polynomial_u_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_chebyshev_polynomial_v_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_chebyshev_polynomial_v_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_chebyshev_polynomial_w_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_entr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_entr_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_erfcx_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_erfcx_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_hermite_polynomial_h_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_hermite_polynomial_h_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_hermite_polynomial_he_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_hermite_polynomial_he_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_i0e_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_i0e_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_i1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_i1_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_i1e_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_i1e_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_laguerre_polynomial_l_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_laguerre_polynomial_l_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_legendre_polynomial_p_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_legendre_polynomial_p_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_log_ndtr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_log_ndtr_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_modified_bessel_i0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_modified_bessel_i0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_modified_bessel_i1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_modified_bessel_i1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_modified_bessel_k0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_modified_bessel_k0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_modified_bessel_k1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_modified_bessel_k1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_ndtr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_ndtr_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_ndtri_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_ndtri_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_polygamma_special_polygamma_n_0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_scaled_modified_bessel_k0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_scaled_modified_bessel_k0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_scaled_modified_bessel_k1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_scaled_modified_bessel_k1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_shifted_chebyshev_polynomial_t_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_shifted_chebyshev_polynomial_u_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_shifted_chebyshev_polynomial_v_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_shifted_chebyshev_polynomial_w_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_spherical_bessel_j0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_spherical_bessel_j0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_xlog1py_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_xlog1py_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_zeta_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_zeta_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_split_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_split_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_split_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_split_list_args_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_split_list_args_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_split_list_args_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_split_with_sizes_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_split_with_sizes_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_split_with_sizes_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_split_with_sizes_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_split_with_sizes_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_split_with_sizes_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sqrt_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sqrt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sqrt_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_square_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_square_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_square_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_squeeze_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_squeeze_copy_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_squeeze_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_squeeze_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_squeeze_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_squeeze_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_squeeze_multiple_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_squeeze_multiple_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_squeeze_multiple_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_stack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_stack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_stack_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_std_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_std_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_std_mean_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_std_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_std_mean_unbiased_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_std_mean_unbiased_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_std_unbiased_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_std_unbiased_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_stft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_stft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sub_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sub_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sub_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sum_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sum_to_size_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sum_to_size_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sum_to_size_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_svd_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_svd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_svd_lowrank_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_svd_lowrank_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_t_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_t_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_t_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_t_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_t_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_t_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_take_along_dim_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_take_along_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_take_along_dim_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_take_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_take_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_take_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_tan_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_tan_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_tan_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_tanh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_tanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_tanh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_tensor_split_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_tensor_split_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_tensor_split_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_tensordot_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_tensordot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_tile_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_tile_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_tile_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_to_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_to_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_to_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_to_sparse_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_to_sparse_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_to_sparse_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_topk_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_topk_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_torch_ops_aten__efficient_attention_forward_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_torch_ops_aten__safe_softmax_default_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_trace_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_trace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_trace_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_transpose_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_transpose_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_transpose_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_transpose_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_transpose_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_transpose_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_trapezoid_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_trapezoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_trapezoid_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_trapz_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_trapz_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_trapz_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_triangular_solve_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_triangular_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_tril_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_tril_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_tril_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_tril_indices_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_triu_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_triu_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_triu_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_triu_indices_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_true_divide_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_true_divide_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_true_divide_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_trunc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_trunc_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unbind_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unbind_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unbind_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unbind_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unbind_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unbind_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unflatten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unflatten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unflatten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unfold_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unfold_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unfold_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unfold_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unfold_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unfold_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_uniform_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_uniform_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unique_consecutive_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unique_consecutive_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unique_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unique_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unravel_index_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unsafe_chunk_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unsafe_chunk_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unsafe_chunk_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unsafe_split_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unsafe_split_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unsafe_split_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unsqueeze_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unsqueeze_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unsqueeze_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unsqueeze_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unsqueeze_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unsqueeze_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_var_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_var_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_var_mean_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_var_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_var_mean_unbiased_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_var_mean_unbiased_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_var_unbiased_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_var_unbiased_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_vdot_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_vdot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_view_as_complex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_view_as_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_view_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_view_as_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_view_as_real_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_view_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_view_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_view_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_view_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_view_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_view_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_vsplit_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_vsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_vsplit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_vstack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_vstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_vstack_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_where_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_where_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_where_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_xlogy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_xlogy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_zero__cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_zero__cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_zero__cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_zeros_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_zeros_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_zeros_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_zeros_like_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_zeros_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_zeros_like_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_addbmm_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_addbmm_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_allclose_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_allclose_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_aminmax_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_aminmax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_argwhere_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_argwhere_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_argwhere_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_broadcast_tensors_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_broadcast_tensors_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_broadcast_tensors_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_numpy_ref_broadcast_to_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_broadcast_to_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_broadcast_to_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_cat_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_cat_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_cat_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_clamp_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_clamp_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_clone_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_clone_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_clone_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_diag_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_diag_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_diag_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_diagflat_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_diagflat_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_diagflat_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_diff_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_diff_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_diff_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_equal_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_equal_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_equal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_flatten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_flatten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_flatten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_item_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_item_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_numpy_ref_item_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_jiterator_2inputs_2outputs_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_jiterator_2inputs_2outputs_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_jiterator_2inputs_2outputs_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_jiterator_4inputs_with_extra_args_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_jiterator_4inputs_with_extra_args_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_jiterator_4inputs_with_extra_args_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_linalg_cross_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_linalg_cross_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_linalg_cross_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_linalg_tensorinv_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_linalg_tensorinv_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_linalg_tensorsolve_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_linalg_tensorsolve_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_linalg_vander_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_linalg_vander_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_linalg_vander_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_linalg_vecdot_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_linalg_vecdot_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_meshgrid_variadic_tensors_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_meshgrid_variadic_tensors_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_meshgrid_variadic_tensors_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_native_layer_norm_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_numpy_ref_nn_functional_conv_transpose1d_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_nn_functional_conv_transpose1d_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_nn_functional_conv_transpose2d_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_nn_functional_conv_transpose2d_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_nn_functional_conv_transpose3d_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_nn_functional_conv_transpose3d_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_nn_functional_gelu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_nn_functional_group_norm_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_nn_functional_l1_loss_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_nn_functional_l1_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_nn_functional_layer_norm_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_nn_functional_mse_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_nn_functional_one_hot_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_nn_functional_pairwise_distance_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_nn_functional_pairwise_distance_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_nn_functional_pairwise_distance_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_nn_functional_pdist_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_nn_functional_rms_norm_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_nn_functional_rms_norm_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_nn_functional_smooth_l1_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_permute_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_permute_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_numpy_ref_permute_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_ravel_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_ravel_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_ravel_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_repeat_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_repeat_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_repeat_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_roll_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_roll_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_roll_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_searchsorted_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_searchsorted_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_signal_windows_bartlett_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_signal_windows_blackman_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_signal_windows_cosine_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_signal_windows_exponential_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_signal_windows_gaussian_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_signal_windows_general_cosine_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_signal_windows_general_hamming_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_signal_windows_hamming_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_signal_windows_hann_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_signal_windows_kaiser_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_signal_windows_nuttall_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_squeeze_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_squeeze_copy_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_numpy_ref_squeeze_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_squeeze_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_squeeze_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_squeeze_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_squeeze_multiple_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_squeeze_multiple_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_squeeze_multiple_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_tensor_split_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_tensor_split_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_tensor_split_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_tile_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_tile_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_tile_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_transpose_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_transpose_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_transpose_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_tril_indices_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_triu_indices_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_unbind_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_unbind_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_unbind_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_unbind_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_unbind_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_unbind_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_unravel_index_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_view_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_view_copy_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_numpy_ref_view_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_where_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_where_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_where_cuda_int64, test/test_ops.py::TestCommonCUDA::test_out_H_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_T_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out___getitem___cuda_float32, test/test_ops.py::TestCommonCUDA::test_out___radd___cuda_float32, test/test_ops.py::TestCommonCUDA::test_out___rand___cuda_int64, test/test_ops.py::TestCommonCUDA::test_out___rdiv___cuda_float32, test/test_ops.py::TestCommonCUDA::test_out___rmatmul___cuda_float32, test/test_ops.py::TestCommonCUDA::test_out___rmod___cuda_float32, test/test_ops.py::TestCommonCUDA::test_out___rmul___cuda_float32, test/test_ops.py::TestCommonCUDA::test_out___ror___cuda_int64, test/test_ops.py::TestCommonCUDA::test_out___rpow___cuda_float32, test/test_ops.py::TestCommonCUDA::test_out___rsub___cuda_float32, test/test_ops.py::TestCommonCUDA::test_out___rxor___cuda_int64, test/test_ops.py::TestCommonCUDA::test_out__batch_norm_with_update_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__chunk_cat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__native_batch_norm_legit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_T_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs__conversions_bfloat16_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs__conversions_bool_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs__conversions_byte_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs__conversions_cdouble_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs__conversions_cfloat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs__conversions_chalf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs__conversions_char_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_out__refs__conversions_complex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs__conversions_double_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs__conversions_float_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs__conversions_half_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs__conversions_int_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs__conversions_long_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs__conversions_polar_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs__conversions_short_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_abs_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_acos_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_acosh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_add_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_addcdiv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_addcmul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_addr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_alias_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_all_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_allclose_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_any_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_arange_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_as_strided_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_as_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_as_strided_partial_views_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_as_strided_scatter_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_out__refs_asin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_asinh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_atan2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_atan_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_atanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_atleast_1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_atleast_2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_atleast_3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_bitwise_and_cuda_int64, test/test_ops.py::TestCommonCUDA::test_out__refs_bitwise_left_shift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_out__refs_bitwise_not_cuda_int64, test/test_ops.py::TestCommonCUDA::test_out__refs_bitwise_or_cuda_int64, test/test_ops.py::TestCommonCUDA::test_out__refs_bitwise_right_shift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_out__refs_bitwise_xor_cuda_int64, test/test_ops.py::TestCommonCUDA::test_out__refs_block_diag_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_broadcast_shapes_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_broadcast_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_broadcast_to_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_bucketize_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_cat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_cauchy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_ceil_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_chunk_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_clamp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_clamp_max_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_clamp_min_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_clone_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_out__refs_column_stack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_conj_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_conj_physical_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_constant_pad_nd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_contiguous_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_copysign_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_cos_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_cosh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_count_nonzero_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_cumprod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_cumsum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_deg2rad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_diag_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_diag_embed_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_diagonal_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_diagonal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_diagonal_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_digamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_div_floor_rounding_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_div_no_rounding_mode_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_div_trunc_rounding_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_dot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_dsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_dstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_empty_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_empty_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_empty_strided_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_out__refs_eq_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_equal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_erf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_erfc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_erfinv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_exp2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_exp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_expand_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_expand_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_expand_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_expm1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_exponential_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_eye_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_fft_fft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_fft_fft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_fft_fftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_fft_fftshift_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_fft_hfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_fft_hfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_fft_hfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_fft_ifft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_fft_ifft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_fft_ifftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_fft_ifftshift_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_fft_ihfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_fft_ihfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_fft_ihfftn_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_out__refs_fft_irfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_fft_irfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_fft_irfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_fft_rfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_fft_rfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_fft_rfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_flatten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_flip_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_fliplr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_flipud_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_float_power_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_floor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_floor_divide_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_fmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_fmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_fmod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_frac_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_frexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_gcd_cuda_int64, test/test_ops.py::TestCommonCUDA::test_out__refs_ge_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_geometric_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_gt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_heaviside_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_hsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_hstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_hypot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_i0_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_out__refs_igamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_igammac_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_imag_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out__refs_index_add_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_index_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_index_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_index_select_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_isclose_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_isfinite_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_isinf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_isnan_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_isneginf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_isposinf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_isreal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_istft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out__refs_item_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_lcm_cuda_int64, test/test_ops.py::TestCommonCUDA::test_out__refs_le_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_lerp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_lgamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_linalg_cross_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_linalg_diagonal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_linalg_matrix_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_linalg_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_linalg_svd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_linalg_svdvals_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_linalg_vecdot_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_out__refs_linalg_vector_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_linspace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_linspace_tensor_overload_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_log10_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_log1p_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_log2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_log_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_log_normal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_logaddexp2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_logaddexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_logical_and_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_logical_not_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_logical_or_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_logical_xor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_logspace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_logspace_tensor_overload_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_logsumexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_lt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_masked_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_maximum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_meshgrid_list_of_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_meshgrid_variadic_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_minimum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_movedim_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_out__refs_mul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nan_to_num_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_narrow_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_narrow_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_native_layer_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_ne_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_neg_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_new_empty_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_new_empty_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_new_full_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_new_ones_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_new_zeros_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nextafter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_alpha_dropout_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_celu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_channel_shuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_dropout_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_elu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_gelu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_glu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_group_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_hardshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_hardtanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_hinge_embedding_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_huber_loss_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_l1_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_layer_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_leaky_relu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_margin_ranking_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_mish_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_mse_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_pairwise_distance_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_pdist_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_pixel_shuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_pixel_unshuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_poisson_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_prelu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_relu6_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_relu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_selu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_smooth_l1_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_softmin_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_softplus_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_softshrink_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_tanhshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_threshold_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_triplet_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_normal__in_place_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_normal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_normal_number_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_ones_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_permute_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_permute_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_positive_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_pow_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_rad2deg_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_randn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_ravel_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_real_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_reciprocal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_remainder_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_renorm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_repeat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_reshape_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_reshape_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_roll_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_rot90_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_round_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_rsqrt_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_out__refs_rsub_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_select_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_sgn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_sigmoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_sign_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_signbit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_sin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_sinc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_sinh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_special_bessel_j0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_special_bessel_j1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_special_entr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_special_erfcx_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_special_i0e_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_special_i1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_special_i1e_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_special_log_ndtr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_special_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_special_logit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_special_multigammaln_mvlgamma_p_1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_special_multigammaln_mvlgamma_p_3_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_special_multigammaln_mvlgamma_p_5_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_special_ndtr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_special_ndtri_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_out__refs_special_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_special_spherical_bessel_j0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_special_xlog1py_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_special_zeta_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_split_with_sizes_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_sqrt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_square_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_squeeze_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_squeeze_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_squeeze_multiple_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_stack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_std_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_std_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_stft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_sub_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_sum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_sum_to_size_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_t_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_t_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_take_along_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_tan_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_tanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_tensor_split_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_to_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_trace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_transpose_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_transpose_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_out__refs_tril_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_tril_indices_cuda_int64, test/test_ops.py::TestCommonCUDA::test_out__refs_triu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_triu_indices_cuda_int64, test/test_ops.py::TestCommonCUDA::test_out__refs_true_divide_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_trunc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_unbind_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_unbind_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_unflatten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_unfold_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_unfold_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_unsqueeze_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_unsqueeze_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_var_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_var_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_vdot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_view_as_complex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_view_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_view_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_view_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_vsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_vstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_where_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_xlogy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_zeros_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__segment_reduce_lengths_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__segment_reduce_offsets_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_out__softmax_backward_data_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__unsafe_masked_index_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__unsafe_masked_index_put_accumulate_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__upsample_bilinear2d_aa_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_abs_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_acos_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_acosh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_add_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_addbmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_addcdiv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_addcmul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_addmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_addmm_decomposed_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_addmv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_addr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_alias_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_all_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_allclose_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_aminmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_angle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_any_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_arange_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_argmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_argmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_argsort_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_argwhere_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_as_strided_copy_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_out_as_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_as_strided_partial_views_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_as_strided_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_asin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_asinh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_atan2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_atan_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_atanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_atleast_1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_atleast_2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_atleast_3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_baddbmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_bernoulli_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_bfloat16_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_bincount_cuda_int64, test/test_ops.py::TestCommonCUDA::test_out_bitwise_and_cuda_int64, test/test_ops.py::TestCommonCUDA::test_out_bitwise_left_shift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_out_bitwise_not_cuda_int64, test/test_ops.py::TestCommonCUDA::test_out_bitwise_or_cuda_int64, test/test_ops.py::TestCommonCUDA::test_out_bitwise_right_shift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_out_bitwise_xor_cuda_int64, test/test_ops.py::TestCommonCUDA::test_out_block_diag_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_bmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_bool_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_broadcast_shapes_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_broadcast_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_broadcast_to_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_bucketize_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_byte_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_out_cartesian_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_cat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_cauchy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_cdist_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_cdouble_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_ceil_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_cfloat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_chalf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_char_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_cholesky_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_cholesky_inverse_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_cholesky_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_chunk_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_clamp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_clamp_max_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_clamp_min_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_clone_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_column_stack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_combinations_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_complex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_conj_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_conj_physical_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_constant_pad_nd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_contiguous_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_copysign_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_corrcoef_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_cos_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_cosh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_count_nonzero_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_cov_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_out_cross_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_cummax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_cummin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_cumprod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_cumsum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_cumulative_trapezoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_deg2rad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_diag_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_diag_embed_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_diagflat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_diagonal_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_diagonal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_diagonal_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_diff_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_digamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_dist_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_div_floor_rounding_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_div_no_rounding_mode_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_div_trunc_rounding_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_dot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_double_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_dsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_dstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_einsum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_empty_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_empty_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_empty_permuted_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_empty_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_eq_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_out_equal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_erf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_erfc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_erfinv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_exp2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_exp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_expand_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_expand_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_expand_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_expm1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_exponential_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_eye_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fft_fft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fft_fft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fft_fftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fft_fftshift_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fft_hfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fft_hfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fft_hfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fft_ifft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fft_ifft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fft_ifftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fft_ifftshift_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fft_ihfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fft_ihfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fft_ihfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fft_irfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fft_irfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fft_irfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fft_rfft2_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_out_fft_rfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fft_rfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_flatten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_flip_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fliplr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_flipud_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_float_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_float_power_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_floor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_floor_divide_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fmod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_frac_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_frexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_full_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_full_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_gather_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_gcd_cuda_int64, test/test_ops.py::TestCommonCUDA::test_out_ge_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_geometric_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_geqrf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_gradient_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_grid_sampler_2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_grid_sampler_3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_gt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_half_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_hash_tensor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_heaviside_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_out_histc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_hsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_hstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_hypot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_i0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_igamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_igammac_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_imag_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_index_add_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_index_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_index_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_index_put_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_index_reduce_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_index_reduce_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_index_reduce_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_index_reduce_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_index_select_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_inner_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_int_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_integral_dtype__refs_prod_cuda_int16, test/test_ops.py::TestCommonCUDA::test_out_integral_dtype__refs_sum_cuda_int16, test/test_ops.py::TestCommonCUDA::test_out_isclose_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_isfinite_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_isin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_isinf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_isnan_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_isneginf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_isposinf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_isreal_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_out_istft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_item_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_jiterator_2inputs_2outputs_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_jiterator_4inputs_with_extra_args_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_jiterator_binary_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_jiterator_binary_return_by_ref_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_jiterator_unary_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_kron_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_kthvalue_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_lcm_cuda_int64, test/test_ops.py::TestCommonCUDA::test_out_ldexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_le_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_lerp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_lgamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_cholesky_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_cholesky_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_cond_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_cross_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_det_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_diagonal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_eig_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_eigh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_eigvals_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_eigvalsh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_householder_product_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_inv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_inv_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_ldl_factor_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_out_linalg_ldl_factor_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_ldl_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_lstsq_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_lstsq_grad_oriented_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_lu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_lu_factor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_lu_factor_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_lu_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_matrix_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_matrix_power_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_matrix_rank_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_matrix_rank_hermitian_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_multi_dot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_norm_subgradients_at_zero_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_pinv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_pinv_hermitian_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_pinv_singular_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_qr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_slogdet_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_solve_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_solve_triangular_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_svd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_svdvals_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_tensorinv_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_out_linalg_tensorsolve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_vander_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_vecdot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_vector_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linspace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linspace_tensor_overload_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_log10_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_log1p_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_log2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_log_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_log_normal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_log_softmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_logaddexp2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_logaddexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_logcumsumexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_logdet_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_logical_and_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_logical_not_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_logical_or_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_logical_xor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_logit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_logspace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_logspace_tensor_overload_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_logsumexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_long_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_lt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_lu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_lu_solve_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_out_lu_unpack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_mH_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_mT_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_masked_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_masked_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_masked_argmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_masked_argmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_masked_cumprod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_masked_cumsum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_masked_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_masked_log_softmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_masked_logaddexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_masked_logsumexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_masked_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_masked_median_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_masked_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_masked_normalize_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_masked_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_masked_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_masked_select_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_masked_softmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_masked_softmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_masked_std_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_masked_sum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_masked_var_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_matmul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_matrix_exp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_max_binary_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_out_max_pool2d_with_indices_backward_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_max_reduction_no_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_max_reduction_with_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_maximum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_median_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_meshgrid_list_of_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_meshgrid_variadic_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_min_binary_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_min_reduction_no_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_min_reduction_with_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_minimum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_mm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_mode_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_movedim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_msort_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_mul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_multinomial_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_mv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nan_to_num_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nanmean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nanmedian_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nanquantile_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nansum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_narrow_copy_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_out_narrow_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_native_batch_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_native_dropout_backward_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_native_layer_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_ne_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_neg_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_new_empty_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_new_empty_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_new_full_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_new_ones_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_new_zeros_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nextafter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_alpha_dropout_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_avg_pool1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_avg_pool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_avg_pool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_batch_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_batch_norm_without_cudnn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_bilinear_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_out_nn_functional_binary_cross_entropy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_celu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_channel_shuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_conv1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_conv2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_conv3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_conv_transpose1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_conv_transpose2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_conv_transpose3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_cosine_embedding_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_cosine_similarity_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_cross_entropy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_ctc_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_dropout2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_dropout3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_dropout_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_elu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_embedding_bag_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_embedding_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_fractional_max_pool2d_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_out_nn_functional_fractional_max_pool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_gaussian_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_gelu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_glu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_grid_sample_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_group_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_hardshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_hardsigmoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_hardswish_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_hardtanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_hinge_embedding_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_huber_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_instance_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_interpolate_area_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_interpolate_bicubic_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_interpolate_bilinear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_interpolate_linear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_interpolate_nearest_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_interpolate_trilinear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_kl_div_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_l1_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_layer_norm_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_out_nn_functional_leaky_relu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_linear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_local_response_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_logsigmoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_margin_ranking_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_max_pool1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_max_pool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_max_pool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_max_unpool1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_max_unpool1d_grad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_max_unpool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_max_unpool2d_grad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_max_unpool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_max_unpool3d_grad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_mish_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_mse_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_multi_head_attention_forward_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_multi_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_multilabel_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_normalize_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_one_hot_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_out_nn_functional_pad_circular_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_pad_constant_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_pad_reflect_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_pad_replicate_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_pad_replicate_negative_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_pairwise_distance_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_pdist_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_pixel_shuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_pixel_unshuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_poisson_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_prelu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_relu6_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_relu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_rms_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_rrelu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_selu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_silu_complex_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_silu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_smooth_l1_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_soft_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_softmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_softmin_with_dtype_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_out_nn_functional_softplus_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_softshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_softsign_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_tanhshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_threshold_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_triplet_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_unfold_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_upsample_bilinear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_upsample_nearest_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nonzero_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nonzero_static_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_norm_fro_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_norm_inf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_norm_nuc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_normal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_normal_in_place_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_normal_number_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_ones_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_ones_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_ormqr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_outer_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_pca_lowrank_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_permute_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_permute_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_pinverse_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_out_polar_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_polygamma_polygamma_n_0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_polygamma_polygamma_n_1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_polygamma_polygamma_n_2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_polygamma_polygamma_n_3_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_polygamma_polygamma_n_4_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_positive_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_pow_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_put_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_qr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_quantile_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_rad2deg_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_rand_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_randint_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_randint_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_randn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_randn_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_ravel_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_real_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_reciprocal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_remainder_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_renorm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_repeat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_repeat_interleave_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error__batch_norm_with_update_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error__native_batch_norm_legit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_abs_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_abs_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_acos_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_acos_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_acosh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_acosh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_add_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_add_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_addbmm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_addbmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_addcdiv_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_addcdiv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_addcmul_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_addcmul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_addmm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_addmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_addmm_decomposed_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_addmm_decomposed_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_addmv_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_addmv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_addr_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_addr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_alias_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_alias_copy_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_angle_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_angle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_arange_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_as_strided_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_as_strided_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_asin_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_asin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_asinh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_asinh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_atan2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_atan_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_atan_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_atanh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_atanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_baddbmm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_baddbmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_bernoulli_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_bmm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_bmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cat_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cat_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_ceil_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cholesky_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cholesky_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cholesky_inverse_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cholesky_inverse_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cholesky_solve_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cholesky_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_clamp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_clamp_max_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_clamp_min_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_column_stack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_column_stack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_complex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_conj_physical_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_conj_physical_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_copysign_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cos_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cos_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cosh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cosh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cross_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cross_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cummax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cummin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cumprod_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cumprod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cumsum_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cumsum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_deg2rad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_diag_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_diag_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_diagonal_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_diagonal_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_diff_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_diff_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_digamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_div_floor_rounding_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_div_no_rounding_mode_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_div_no_rounding_mode_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_div_trunc_rounding_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_dot_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_dot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_dstack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_dstack_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_erf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_erfc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_erfinv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_exp2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_exp2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_exp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_exp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_expand_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_expand_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_expm1_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_expm1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_fft2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_fft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_fft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_fft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_fftn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_fftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_hfft2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_hfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_hfft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_hfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_hfftn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_hfftn_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_ifft2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_ifft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_ifft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_ifft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_ifftn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_ifftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_ihfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_ihfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_ihfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_irfft2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_irfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_irfft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_irfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_irfftn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_irfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_rfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_rfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_rfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_float_power_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_float_power_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_floor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fmax_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fmod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_frac_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_frexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_full_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_full_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_gather_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_gather_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_hstack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_hstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_hypot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_i0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_index_add_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_index_add_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_index_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_index_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_index_reduce_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_index_reduce_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_index_reduce_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_index_reduce_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_index_select_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_index_select_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_inner_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_inner_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_kron_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_kron_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_kthvalue_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_ldexp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_ldexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_lerp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_lerp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_lgamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_cholesky_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_cholesky_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_cholesky_ex_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_cholesky_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_cond_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_cond_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_cross_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_cross_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_det_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_det_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_eig_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_eig_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_eigh_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_eigh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_eigvals_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_eigvals_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_eigvalsh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_eigvalsh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_householder_product_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_householder_product_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_inv_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_inv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_inv_ex_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_inv_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_lstsq_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_lstsq_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_lu_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_lu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_lu_factor_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_lu_factor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_lu_factor_ex_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_lu_factor_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_lu_solve_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_lu_solve_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_matrix_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_matrix_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_matrix_power_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_matrix_power_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_multi_dot_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_multi_dot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_norm_subgradients_at_zero_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_norm_subgradients_at_zero_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_pinv_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_pinv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_pinv_hermitian_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_pinv_hermitian_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_qr_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_qr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_slogdet_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_slogdet_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_solve_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_solve_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_solve_ex_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_solve_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_solve_triangular_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_solve_triangular_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_svd_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_svd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_svdvals_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_svdvals_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_tensorinv_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_tensorinv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_tensorsolve_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_tensorsolve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_vecdot_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_vecdot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_vector_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_vector_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linspace_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linspace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linspace_tensor_overload_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linspace_tensor_overload_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_log10_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_log10_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_log1p_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_log1p_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_log2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_log2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_log_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_log_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_log_softmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_log_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_logaddexp2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_logaddexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_logcumsumexp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_logcumsumexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_logit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_logspace_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_logspace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_logspace_tensor_overload_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_logspace_tensor_overload_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_logsumexp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_logsumexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_lu_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_lu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_lu_solve_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_lu_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_lu_unpack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_lu_unpack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_masked_select_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_masked_select_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_matmul_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_matmul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_max_binary_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_max_reduction_no_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_max_reduction_with_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_maximum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_mean_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_min_binary_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_min_reduction_no_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_min_reduction_with_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_minimum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_mm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_mm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_mode_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_msort_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_mul_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_mul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_mv_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_mv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_nan_to_num_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_nanmean_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_nanmean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_nanquantile_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_nansum_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_nansum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_native_batch_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_neg_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_neg_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_nn_functional_avg_pool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_nn_functional_avg_pool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_nn_functional_gelu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_nn_functional_hardshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_nn_functional_linear_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_nn_functional_linear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_nn_functional_logsigmoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_nn_functional_normalize_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_nn_functional_normalize_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_nn_functional_softplus_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_nn_functional_softshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_norm_fro_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_norm_fro_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_norm_inf_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_norm_inf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_norm_nuc_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_norm_nuc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_normal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_normal_number_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_ones_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_ones_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_ormqr_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_ormqr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_outer_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_outer_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_permute_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_permute_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_polar_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_polygamma_polygamma_n_0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_polygamma_polygamma_n_1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_polygamma_polygamma_n_2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_polygamma_polygamma_n_3_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_polygamma_polygamma_n_4_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_pow_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_pow_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_qr_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_qr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_quantile_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_rad2deg_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_reciprocal_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_reciprocal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_remainder_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_renorm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_renorm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_round_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_round_decimals_0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_round_decimals_3_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_round_decimals_neg_3_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_rsqrt_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_rsqrt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_scatter_add_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_scatter_add_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_scatter_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_scatter_reduce_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_scatter_reduce_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_scatter_reduce_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_scatter_reduce_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_scatter_reduce_sum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_sgn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_sgn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_sigmoid_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_sigmoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_sign_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_sin_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_sin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_sinc_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_sinc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_sinh_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_sinh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_slice_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_softmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_sort_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_sparse_sampled_addmm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_sparse_sampled_addmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_special_entr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_special_erfcx_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_special_i0e_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_special_i1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_special_i1e_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_special_log_ndtr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_special_ndtr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_special_ndtri_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_special_xlog1py_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_split_with_sizes_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_split_with_sizes_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_sqrt_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_sqrt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_square_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_square_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_squeeze_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_squeeze_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_stack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_stack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_std_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_std_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_sub_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_sub_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_svd_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_svd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_t_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_t_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_take_along_dim_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_take_along_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_take_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_take_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_tan_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_tan_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_tanh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_tanh_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_tensordot_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_tensordot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_topk_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_transpose_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_transpose_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_triangular_solve_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_triangular_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_tril_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_tril_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_triu_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_triu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_true_divide_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_true_divide_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_trunc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_unbind_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_unbind_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_unfold_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_unfold_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_unsqueeze_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_unsqueeze_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_var_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_var_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_vdot_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_vdot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_view_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_view_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_vstack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_vstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_where_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_where_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_xlogy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_zeros_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_zeros_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_reshape_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_reshape_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_resize__cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_resize_as__cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_resolve_conj_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_resolve_neg_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_roll_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_rot90_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_round_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_round_decimals_0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_round_decimals_3_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_round_decimals_neg_3_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_rsqrt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_rsub_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_scalar_tensor_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_out_scatter_add_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_scatter_reduce_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_scatter_reduce_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_scatter_reduce_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_scatter_reduce_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_scatter_reduce_sum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_searchsorted_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_select_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_select_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_sgn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_short_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_sigmoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_sign_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_signal_windows_bartlett_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_signal_windows_blackman_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_signal_windows_cosine_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_signal_windows_exponential_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_signal_windows_gaussian_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_signal_windows_general_cosine_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_signal_windows_general_hamming_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_signal_windows_hamming_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_signal_windows_hann_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_signal_windows_kaiser_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_signal_windows_nuttall_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_signbit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_sin_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_out_sinc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_sinh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_slice_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_slice_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_softmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_sort_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_sparse_mm_reduce_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_sparse_sampled_addmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_airy_ai_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_bessel_j0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_bessel_j1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_bessel_y0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_bessel_y1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_chebyshev_polynomial_v_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_entr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_erfcx_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_hermite_polynomial_h_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_hermite_polynomial_he_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_i0e_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_i1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_i1e_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_laguerre_polynomial_l_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_out_special_legendre_polynomial_p_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_log_ndtr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_modified_bessel_i0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_modified_bessel_i1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_modified_bessel_k0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_modified_bessel_k1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_ndtr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_ndtri_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_scaled_modified_bessel_k0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_scaled_modified_bessel_k1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_spherical_bessel_j0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_xlog1py_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_zeta_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_split_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_split_list_args_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_split_with_sizes_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_split_with_sizes_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_sqrt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_square_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_out_squeeze_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_squeeze_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_squeeze_multiple_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_stack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_std_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_std_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_std_mean_unbiased_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_std_unbiased_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_stft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_sub_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_sum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_sum_to_size_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_svd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_svd_lowrank_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_t_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_t_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_take_along_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_take_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_tan_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_tanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_tensor_split_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_tensordot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_tile_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_to_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_to_sparse_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_topk_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_torch__scaled_mm_cuda_float8_e4m3fn, test/test_ops.py::TestCommonCUDA::test_out_torch_ops_aten__efficient_attention_forward_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_torch_ops_aten__flash_attention_forward_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_out_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_trace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_transpose_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_transpose_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_trapezoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_trapz_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_triangular_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_tril_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_tril_indices_cuda_int64, test/test_ops.py::TestCommonCUDA::test_out_triu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_triu_indices_cuda_int64, test/test_ops.py::TestCommonCUDA::test_out_true_divide_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_trunc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_unbind_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_unbind_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_unflatten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_unfold_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_unfold_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_uniform_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_unique_consecutive_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_unique_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_unravel_index_cuda_int64, test/test_ops.py::TestCommonCUDA::test_out_unsafe_chunk_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_unsafe_split_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_unsqueeze_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_unsqueeze_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_var_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_var_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_var_mean_unbiased_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_out_var_unbiased_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_vdot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_view_as_complex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_view_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_view_as_real_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_view_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_view_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_vsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_vstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_warning_H_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_T_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning___getitem___cuda, test/test_ops.py::TestCommonCUDA::test_out_warning___radd___cuda, test/test_ops.py::TestCommonCUDA::test_out_warning___rand___cuda, test/test_ops.py::TestCommonCUDA::test_out_warning___rdiv___cuda, test/test_ops.py::TestCommonCUDA::test_out_warning___rmatmul___cuda, test/test_ops.py::TestCommonCUDA::test_out_warning___rmod___cuda, test/test_ops.py::TestCommonCUDA::test_out_warning___rmul___cuda, test/test_ops.py::TestCommonCUDA::test_out_warning___ror___cuda, test/test_ops.py::TestCommonCUDA::test_out_warning___rpow___cuda, test/test_ops.py::TestCommonCUDA::test_out_warning___rsub___cuda, test/test_ops.py::TestCommonCUDA::test_out_warning___rxor___cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__batch_norm_with_update_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__chunk_cat_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__native_batch_norm_legit_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_T_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs__conversions_bfloat16_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs__conversions_bool_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs__conversions_byte_cuda, 
test/test_ops.py::TestCommonCUDA::test_out_warning__refs__conversions_cdouble_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs__conversions_cfloat_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs__conversions_chalf_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs__conversions_char_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs__conversions_complex_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs__conversions_double_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs__conversions_float_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs__conversions_half_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs__conversions_int_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs__conversions_long_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs__conversions_polar_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs__conversions_short_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_abs_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_acos_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_acosh_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_add_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_addcdiv_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_addcmul_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_addr_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_alias_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_all_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_allclose_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_amax_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_amin_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_any_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_arange_cuda, 
test/test_ops.py::TestCommonCUDA::test_out_warning__refs_as_strided_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_as_strided_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_as_strided_partial_views_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_as_strided_scatter_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_asin_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_asinh_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_atan2_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_atan_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_atanh_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_atleast_1d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_atleast_2d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_atleast_3d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_bitwise_and_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_bitwise_left_shift_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_bitwise_not_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_bitwise_or_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_bitwise_right_shift_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_bitwise_xor_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_block_diag_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_broadcast_shapes_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_broadcast_tensors_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_broadcast_to_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_bucketize_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_cat_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_cauchy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_ceil_cuda, 
test/test_ops.py::TestCommonCUDA::test_out_warning__refs_chunk_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_clamp_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_clamp_max_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_clamp_min_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_clone_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_column_stack_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_conj_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_conj_physical_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_constant_pad_nd_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_contiguous_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_copysign_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_cos_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_cosh_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_count_nonzero_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_cumprod_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_cumsum_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_deg2rad_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_diag_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_diag_embed_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_diagonal_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_diagonal_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_diagonal_scatter_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_digamma_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_div_floor_rounding_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_div_no_rounding_mode_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_div_trunc_rounding_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_dot_cuda, 
test/test_ops.py::TestCommonCUDA::test_out_warning__refs_dsplit_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_dstack_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_empty_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_empty_like_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_empty_strided_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_eq_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_equal_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_erf_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_erfc_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_erfinv_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_exp2_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_exp_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_expand_as_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_expand_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_expand_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_expm1_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_exponential_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_eye_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_fft_fft2_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_fft_fft_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_fft_fftn_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_fft_fftshift_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_fft_hfft2_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_fft_hfft_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_fft_hfftn_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_fft_ifft2_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_fft_ifft_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_fft_ifftn_cuda, 
test/test_ops.py::TestCommonCUDA::test_out_warning__refs_fft_ifftshift_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_fft_ihfft2_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_fft_ihfft_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_fft_ihfftn_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_fft_irfft2_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_fft_irfft_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_fft_irfftn_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_fft_rfft2_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_fft_rfft_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_fft_rfftn_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_fill_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_flatten_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_flip_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_fliplr_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_flipud_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_float_power_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_floor_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_floor_divide_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_fmax_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_fmin_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_fmod_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_frac_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_frexp_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_gcd_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_ge_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_geometric_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_gt_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_heaviside_cuda, 
test/test_ops.py::TestCommonCUDA::test_out_warning__refs_hsplit_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_hstack_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_hypot_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_i0_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_igamma_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_igammac_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_imag_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_index_add_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_index_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_index_fill_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_index_select_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_isclose_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_isfinite_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_isinf_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_isnan_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_isneginf_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_isposinf_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_isreal_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_istft_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_item_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_lcm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_le_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_lerp_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_lgamma_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_linalg_cross_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_linalg_diagonal_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_linalg_matrix_norm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_linalg_norm_cuda, 
test/test_ops.py::TestCommonCUDA::test_out_warning__refs_linalg_svd_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_linalg_svdvals_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_linalg_vecdot_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_linalg_vector_norm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_linspace_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_linspace_tensor_overload_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_log10_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_log1p_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_log2_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_log_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_log_normal_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_log_softmax_with_dtype_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_logaddexp2_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_logaddexp_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_logical_and_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_logical_not_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_logical_or_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_logical_xor_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_logspace_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_logspace_tensor_overload_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_logsumexp_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_lt_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_masked_fill_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_maximum_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_mean_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_meshgrid_list_of_tensors_cuda, 
test/test_ops.py::TestCommonCUDA::test_out_warning__refs_meshgrid_variadic_tensors_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_minimum_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_movedim_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_mul_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nan_to_num_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_narrow_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_narrow_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_native_layer_norm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_ne_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_neg_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_new_empty_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_new_empty_strided_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_new_full_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_new_ones_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_new_zeros_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nextafter_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_alpha_dropout_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_celu_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_channel_shuffle_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_dropout_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_elu_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_gelu_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_glu_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_group_norm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_hardshrink_cuda, 
test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_hardtanh_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_hinge_embedding_loss_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_huber_loss_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_l1_loss_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_layer_norm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_leaky_relu_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_log_softmax_with_dtype_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_margin_ranking_loss_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_mish_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_mse_loss_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_nll_loss_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_pairwise_distance_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_pdist_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_pixel_shuffle_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_pixel_unshuffle_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_poisson_nll_loss_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_prelu_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_relu6_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_relu_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_selu_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_smooth_l1_loss_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_softmax_with_dtype_cuda, 
test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_softmin_with_dtype_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_softplus_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_softshrink_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_tanhshrink_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_threshold_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_triplet_margin_loss_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_norm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_normal__in_place_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_normal_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_normal_number_mean_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_ones_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_permute_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_permute_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_positive_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_pow_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_prod_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_rad2deg_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_randn_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_ravel_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_real_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_reciprocal_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_remainder_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_renorm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_repeat_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_reshape_as_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_reshape_cuda, 
test/test_ops.py::TestCommonCUDA::test_out_warning__refs_roll_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_rot90_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_round_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_rsqrt_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_rsub_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_select_scatter_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_sgn_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_sigmoid_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_sign_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_signbit_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_sin_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_sinc_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_sinh_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_softmax_with_dtype_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_special_bessel_j0_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_special_bessel_j1_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_special_entr_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_special_erfcx_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_special_i0e_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_special_i1_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_special_i1e_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_special_log_ndtr_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_special_log_softmax_with_dtype_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_special_logit_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_special_multigammaln_mvlgamma_p_1_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_special_multigammaln_mvlgamma_p_3_cuda, 
test/test_ops.py::TestCommonCUDA::test_out_warning__refs_special_multigammaln_mvlgamma_p_5_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_special_ndtr_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_special_ndtri_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_special_softmax_with_dtype_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_special_spherical_bessel_j0_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_special_xlog1py_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_special_zeta_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_split_with_sizes_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_sqrt_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_square_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_squeeze_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_squeeze_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_squeeze_multiple_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_stack_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_std_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_std_mean_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_stft_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_sub_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_sum_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_sum_to_size_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_t_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_t_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_take_along_dim_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_tan_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_tanh_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_tensor_split_cuda, 
test/test_ops.py::TestCommonCUDA::test_out_warning__refs_to_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_trace_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_transpose_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_transpose_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_tril_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_tril_indices_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_triu_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_triu_indices_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_true_divide_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_trunc_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_unbind_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_unbind_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_unflatten_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_unfold_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_unfold_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_unsqueeze_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_unsqueeze_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_var_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_var_mean_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_vdot_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_view_as_complex_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_view_as_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_view_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_view_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_vsplit_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_vstack_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_where_cuda, 
test/test_ops.py::TestCommonCUDA::test_out_warning__refs_xlogy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_zeros_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__segment_reduce_lengths_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__segment_reduce_offsets_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__softmax_backward_data_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__unsafe_masked_index_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__unsafe_masked_index_put_accumulate_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__upsample_bilinear2d_aa_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_abs_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_acos_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_acosh_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_add_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_addbmm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_addcdiv_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_addcmul_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_addmm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_addmm_decomposed_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_addmv_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_addr_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_alias_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_all_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_allclose_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_amax_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_amin_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_aminmax_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_angle_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_any_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_arange_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_argmax_cuda, 
test/test_ops.py::TestCommonCUDA::test_out_warning_argmin_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_argsort_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_argwhere_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_as_strided_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_as_strided_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_as_strided_partial_views_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_as_strided_scatter_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_asin_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_asinh_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_atan2_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_atan_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_atanh_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_atleast_1d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_atleast_2d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_atleast_3d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_baddbmm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_bernoulli_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_bfloat16_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_bincount_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_bitwise_and_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_bitwise_left_shift_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_bitwise_not_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_bitwise_or_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_bitwise_right_shift_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_bitwise_xor_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_block_diag_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_bmm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_bool_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_broadcast_shapes_cuda, 
test/test_ops.py::TestCommonCUDA::test_out_warning_broadcast_tensors_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_broadcast_to_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_bucketize_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_byte_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_cartesian_prod_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_cat_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_cauchy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_cdist_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_cdouble_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_ceil_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_cfloat_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_chalf_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_char_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_cholesky_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_cholesky_inverse_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_cholesky_solve_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_chunk_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_clamp_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_clamp_max_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_clamp_min_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_clone_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_column_stack_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_combinations_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_complex_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_conj_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_conj_physical_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_constant_pad_nd_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_contiguous_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_copysign_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_corrcoef_cuda, 
test/test_ops.py::TestCommonCUDA::test_out_warning_cos_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_cosh_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_count_nonzero_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_cov_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_cross_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_cummax_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_cummin_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_cumprod_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_cumsum_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_cumulative_trapezoid_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_deg2rad_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_diag_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_diag_embed_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_diagflat_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_diagonal_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_diagonal_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_diagonal_scatter_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_diff_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_digamma_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_dist_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_div_floor_rounding_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_div_no_rounding_mode_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_div_trunc_rounding_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_dot_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_double_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_dsplit_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_dstack_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_einsum_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_empty_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_empty_like_cuda, 
test/test_ops.py::TestCommonCUDA::test_out_warning_empty_permuted_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_empty_strided_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_eq_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_equal_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_erf_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_erfc_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_erfinv_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_exp2_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_exp_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_expand_as_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_expand_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_expand_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_expm1_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_exponential_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_eye_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_fft_fft2_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_fft_fft_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_fft_fftn_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_fft_fftshift_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_fft_hfft2_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_fft_hfft_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_fft_hfftn_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_fft_ifft2_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_fft_ifft_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_fft_ifftn_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_fft_ifftshift_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_fft_ihfft2_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_fft_ihfft_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_fft_ihfftn_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_fft_irfft2_cuda, 
test/test_ops.py::TestCommonCUDA::test_out_warning_fft_irfft_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_fft_irfftn_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_fft_rfft2_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_fft_rfft_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_fft_rfftn_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_fill_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_flatten_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_flip_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_fliplr_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_flipud_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_float_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_float_power_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_floor_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_floor_divide_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_fmax_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_fmin_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_fmod_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_frac_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_frexp_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_full_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_full_like_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_gather_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_gcd_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_ge_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_geometric_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_geqrf_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_gradient_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_grid_sampler_2d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_grid_sampler_3d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_gt_cuda, 
test/test_ops.py::TestCommonCUDA::test_out_warning_half_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_hash_tensor_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_heaviside_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_histc_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_histogram_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_histogramdd_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_hsplit_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_hstack_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_hypot_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_i0_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_igamma_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_igammac_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_imag_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_index_add_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_index_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_index_fill_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_index_put_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_index_reduce_amax_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_index_reduce_amin_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_index_reduce_mean_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_index_reduce_prod_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_index_select_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_inner_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_int_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_isclose_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_isfinite_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_isin_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_isinf_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_isnan_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_isneginf_cuda, 
test/test_ops.py::TestCommonCUDA::test_out_warning_isposinf_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_isreal_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_istft_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_item_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_jiterator_2inputs_2outputs_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_jiterator_4inputs_with_extra_args_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_jiterator_binary_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_jiterator_binary_return_by_ref_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_jiterator_unary_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_kron_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_kthvalue_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_lcm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_ldexp_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_le_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_lerp_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_lgamma_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_cholesky_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_cholesky_ex_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_cond_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_cross_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_det_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_diagonal_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_eig_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_eigh_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_eigvals_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_eigvalsh_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_householder_product_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_inv_cuda, 
test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_inv_ex_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_ldl_factor_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_ldl_factor_ex_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_ldl_solve_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_lstsq_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_lstsq_grad_oriented_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_lu_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_lu_factor_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_lu_factor_ex_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_lu_solve_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_matrix_norm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_matrix_power_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_matrix_rank_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_matrix_rank_hermitian_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_multi_dot_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_norm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_norm_subgradients_at_zero_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_pinv_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_pinv_hermitian_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_pinv_singular_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_qr_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_slogdet_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_solve_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_solve_ex_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_solve_triangular_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_svd_cuda, 
test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_svdvals_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_tensorinv_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_tensorsolve_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_vander_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_vecdot_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_vector_norm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linspace_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linspace_tensor_overload_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_log10_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_log1p_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_log2_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_log_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_log_normal_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_log_softmax_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_log_softmax_with_dtype_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_logaddexp2_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_logaddexp_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_logcumsumexp_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_logdet_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_logical_and_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_logical_not_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_logical_or_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_logical_xor_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_logit_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_logspace_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_logspace_tensor_overload_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_logsumexp_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_long_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_lt_cuda, 
test/test_ops.py::TestCommonCUDA::test_out_warning_lu_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_lu_solve_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_lu_unpack_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_mH_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_mT_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_amax_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_amin_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_argmax_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_argmin_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_cumprod_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_cumsum_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_fill_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_log_softmax_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_logaddexp_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_logsumexp_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_mean_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_median_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_norm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_normalize_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_prod_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_scatter_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_select_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_softmax_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_softmin_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_std_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_sum_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_var_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_matmul_cuda, 
test/test_ops.py::TestCommonCUDA::test_out_warning_matrix_exp_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_max_binary_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_max_pool2d_with_indices_backward_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_max_reduction_no_dim_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_max_reduction_with_dim_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_maximum_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_mean_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_median_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_meshgrid_list_of_tensors_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_meshgrid_variadic_tensors_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_min_binary_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_min_reduction_no_dim_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_min_reduction_with_dim_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_minimum_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_mm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_mode_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_movedim_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_msort_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_mul_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_multinomial_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_mv_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_mvlgamma_mvlgamma_p_1_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_mvlgamma_mvlgamma_p_3_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_mvlgamma_mvlgamma_p_5_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nan_to_num_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nanmean_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nanmedian_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nanquantile_cuda, 
test/test_ops.py::TestCommonCUDA::test_out_warning_nansum_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_narrow_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_narrow_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_native_batch_norm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_native_dropout_backward_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_native_layer_norm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_ne_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_neg_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_new_empty_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_new_empty_strided_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_new_full_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_new_ones_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_new_zeros_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nextafter_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_adaptive_avg_pool1d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_adaptive_avg_pool2d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_adaptive_avg_pool3d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_adaptive_max_pool1d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_adaptive_max_pool2d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_adaptive_max_pool3d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_alpha_dropout_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_avg_pool1d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_avg_pool2d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_avg_pool3d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_batch_norm_cuda, 
test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_batch_norm_without_cudnn_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_bilinear_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_binary_cross_entropy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_binary_cross_entropy_with_logits_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_celu_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_channel_shuffle_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_conv1d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_conv2d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_conv3d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_conv_transpose1d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_conv_transpose2d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_conv_transpose3d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_cosine_embedding_loss_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_cosine_similarity_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_cross_entropy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_ctc_loss_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_dropout2d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_dropout3d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_dropout_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_elu_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_embedding_bag_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_embedding_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_feature_alpha_dropout_with_train_cuda, 
test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_feature_alpha_dropout_without_train_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_fractional_max_pool2d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_fractional_max_pool3d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_gaussian_nll_loss_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_gelu_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_glu_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_grid_sample_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_group_norm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_hardshrink_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_hardsigmoid_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_hardswish_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_hardtanh_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_hinge_embedding_loss_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_huber_loss_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_instance_norm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_interpolate_area_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_interpolate_bicubic_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_interpolate_bilinear_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_interpolate_linear_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_interpolate_nearest-exact_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_interpolate_nearest_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_interpolate_trilinear_cuda, 
test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_kl_div_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_l1_loss_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_layer_norm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_leaky_relu_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_linear_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_local_response_norm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_logsigmoid_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_margin_ranking_loss_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_max_pool1d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_max_pool2d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_max_pool3d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_max_unpool1d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_max_unpool1d_grad_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_max_unpool2d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_max_unpool2d_grad_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_max_unpool3d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_max_unpool3d_grad_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_mish_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_mse_loss_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_multi_head_attention_forward_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_multi_margin_loss_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_multilabel_margin_loss_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_multilabel_soft_margin_loss_cuda, 
test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_nll_loss_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_normalize_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_one_hot_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_pad_circular_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_pad_constant_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_pad_reflect_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_pad_replicate_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_pad_replicate_negative_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_pairwise_distance_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_pdist_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_pixel_shuffle_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_pixel_unshuffle_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_poisson_nll_loss_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_prelu_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_relu6_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_relu_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_rms_norm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_rrelu_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_scaled_dot_product_attention_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_selu_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_silu_complex_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_silu_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_smooth_l1_loss_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_soft_margin_loss_cuda, 
test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_softmin_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_softmin_with_dtype_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_softplus_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_softshrink_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_softsign_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_tanhshrink_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_threshold_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_triplet_margin_loss_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_triplet_margin_with_distance_loss_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_unfold_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_upsample_bilinear_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_upsample_nearest_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nonzero_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nonzero_static_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_norm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_norm_fro_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_norm_inf_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_norm_nuc_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_normal_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_normal_in_place_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_normal_number_mean_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_ones_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_ones_like_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_ormqr_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_outer_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_pca_lowrank_cuda, 
test/test_ops.py::TestCommonCUDA::test_out_warning_permute_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_permute_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_pinverse_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_polar_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_polygamma_polygamma_n_0_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_polygamma_polygamma_n_1_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_polygamma_polygamma_n_2_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_polygamma_polygamma_n_3_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_polygamma_polygamma_n_4_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_positive_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_pow_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_prod_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_put_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_qr_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_quantile_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_rad2deg_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_rand_like_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_randint_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_randint_like_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_randn_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_randn_like_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_ravel_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_real_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_reciprocal_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_remainder_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_renorm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_repeat_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_repeat_interleave_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_reshape_as_cuda, 
test/test_ops.py::TestCommonCUDA::test_out_warning_reshape_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_resize__cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_resize_as__cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_resolve_conj_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_resolve_neg_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_roll_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_rot90_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_round_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_round_decimals_0_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_round_decimals_3_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_round_decimals_neg_3_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_rsqrt_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_rsub_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_scalar_tensor_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_scatter_add_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_scatter_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_scatter_reduce_amax_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_scatter_reduce_amin_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_scatter_reduce_mean_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_scatter_reduce_prod_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_scatter_reduce_sum_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_searchsorted_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_select_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_select_scatter_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_sgn_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_short_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_sigmoid_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_sign_cuda, 
test/test_ops.py::TestCommonCUDA::test_out_warning_signal_windows_bartlett_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_signal_windows_blackman_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_signal_windows_cosine_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_signal_windows_exponential_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_signal_windows_gaussian_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_signal_windows_general_cosine_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_signal_windows_general_hamming_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_signal_windows_hamming_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_signal_windows_hann_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_signal_windows_kaiser_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_signal_windows_nuttall_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_signbit_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_sin_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_sinc_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_sinh_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_slice_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_slice_scatter_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_softmax_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_softmax_with_dtype_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_sort_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_sparse_mm_reduce_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_sparse_sampled_addmm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_airy_ai_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_bessel_j0_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_bessel_j1_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_bessel_y0_cuda, 
test/test_ops.py::TestCommonCUDA::test_out_warning_special_bessel_y1_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_chebyshev_polynomial_t_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_chebyshev_polynomial_u_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_chebyshev_polynomial_v_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_chebyshev_polynomial_w_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_entr_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_erfcx_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_hermite_polynomial_h_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_hermite_polynomial_he_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_i0e_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_i1_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_i1e_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_laguerre_polynomial_l_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_legendre_polynomial_p_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_log_ndtr_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_modified_bessel_i0_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_modified_bessel_i1_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_modified_bessel_k0_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_modified_bessel_k1_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_ndtr_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_ndtri_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_polygamma_special_polygamma_n_0_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_scaled_modified_bessel_k0_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_scaled_modified_bessel_k1_cuda, 
test/test_ops.py::TestCommonCUDA::test_out_warning_special_shifted_chebyshev_polynomial_t_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_shifted_chebyshev_polynomial_u_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_shifted_chebyshev_polynomial_v_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_shifted_chebyshev_polynomial_w_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_spherical_bessel_j0_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_xlog1py_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_zeta_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_split_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_split_list_args_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_split_with_sizes_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_split_with_sizes_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_sqrt_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_square_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_squeeze_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_squeeze_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_squeeze_multiple_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_stack_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_std_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_std_mean_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_std_mean_unbiased_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_std_unbiased_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_stft_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_sub_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_sum_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_sum_to_size_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_svd_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_svd_lowrank_cuda, 
test/test_ops.py::TestCommonCUDA::test_out_warning_t_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_t_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_take_along_dim_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_take_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_tan_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_tanh_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_tensor_split_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_tensordot_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_tile_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_to_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_to_sparse_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_topk_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_torch__scaled_mm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_torch_ops_aten__efficient_attention_forward_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_torch_ops_aten__flash_attention_forward_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_torch_ops_aten__safe_softmax_default_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_trace_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_transpose_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_transpose_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_trapezoid_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_trapz_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_triangular_solve_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_tril_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_tril_indices_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_triu_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_triu_indices_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_true_divide_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_trunc_cuda, 
test/test_ops.py::TestCommonCUDA::test_out_warning_unbind_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_unbind_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_unflatten_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_unfold_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_unfold_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_uniform_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_unique_consecutive_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_unique_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_unravel_index_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_unsafe_chunk_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_unsafe_split_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_unsqueeze_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_unsqueeze_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_var_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_var_mean_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_var_mean_unbiased_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_var_unbiased_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_vdot_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_view_as_complex_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_view_as_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_view_as_real_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_view_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_view_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_vsplit_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_vstack_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_where_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_xlogy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_zero__cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_zeros_cuda, 
test/test_ops.py::TestCommonCUDA::test_out_warning_zeros_like_cuda, test/test_ops.py::TestCommonCUDA::test_out_where_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_xlogy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_zero__cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_zeros_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_zeros_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_pointwise_tag_coverage_cuda, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float___rdiv___cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float___rdiv___cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float___rdiv___cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float___rdiv___cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float___rdiv___cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float___rdiv___cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_acos_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_acos_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_acos_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_acos_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_acos_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_acos_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_acosh_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_acosh_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_acosh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_acosh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_acosh_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_acosh_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_asin_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_asin_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_asin_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_asin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_asin_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_asin_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_asinh_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_asinh_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_asinh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_asinh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_asinh_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_asinh_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_atan2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_atan2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_atan2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_atan2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_atan2_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_atan2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_atan_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_atan_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_atan_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_atan_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_atan_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_atan_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_atanh_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_atanh_cuda_int16, 
test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_atanh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_atanh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_atanh_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_atanh_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_copysign_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_copysign_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_copysign_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_copysign_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_copysign_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_copysign_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_cos_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_cos_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_cos_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_cos_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_cos_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_cos_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_cosh_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_cosh_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_cosh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_cosh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_cosh_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_cosh_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_deg2rad_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_deg2rad_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_deg2rad_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_deg2rad_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_deg2rad_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_deg2rad_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_digamma_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_digamma_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_digamma_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_digamma_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_digamma_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_digamma_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_div_no_rounding_mode_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_div_no_rounding_mode_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_div_no_rounding_mode_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_div_no_rounding_mode_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_div_no_rounding_mode_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_div_no_rounding_mode_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_erf_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_erf_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_erf_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_erf_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_erf_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_erf_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_erfc_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_erfc_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_erfc_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_erfc_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_erfc_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_erfc_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_erfinv_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_erfinv_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_erfinv_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_erfinv_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_erfinv_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_erfinv_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_exp2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_exp2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_exp2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_exp2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_exp2_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_exp2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_exp_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_exp_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_exp_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_exp_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_exp_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_exp_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_expm1_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_expm1_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_expm1_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_expm1_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_expm1_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_expm1_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_float_power_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_float_power_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_float_power_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_float_power_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_float_power_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_float_power_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_i0_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_i0_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_i0_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_i0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_i0_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_i0_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_ldexp_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_ldexp_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_ldexp_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_ldexp_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_ldexp_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_ldexp_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_lgamma_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_lgamma_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_lgamma_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_lgamma_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_lgamma_cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_lgamma_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_log10_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_log10_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_log10_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_log10_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_log10_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_log10_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_log1p_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_log1p_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_log1p_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_log1p_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_log1p_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_log1p_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_log2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_log2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_log2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_log2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_log2_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_log2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_log_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_log_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_log_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_log_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_log_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_log_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_logit_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_logit_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_logit_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_logit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_logit_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_logit_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_masked_std_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_masked_std_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_masked_std_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_masked_std_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_masked_std_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_masked_var_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_masked_var_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_masked_var_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_masked_var_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_masked_var_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_mvlgamma_mvlgamma_p_1_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_mvlgamma_mvlgamma_p_1_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_mvlgamma_mvlgamma_p_1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_mvlgamma_mvlgamma_p_1_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_mvlgamma_mvlgamma_p_1_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_mvlgamma_mvlgamma_p_3_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_mvlgamma_mvlgamma_p_3_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_mvlgamma_mvlgamma_p_3_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_mvlgamma_mvlgamma_p_3_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_mvlgamma_mvlgamma_p_3_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_mvlgamma_mvlgamma_p_5_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_mvlgamma_mvlgamma_p_5_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_mvlgamma_mvlgamma_p_5_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_mvlgamma_mvlgamma_p_5_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_mvlgamma_mvlgamma_p_5_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_0_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_0_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_0_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_0_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_0_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_1_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_1_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_1_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_1_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_1_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_2_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_3_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_3_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_3_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_3_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_3_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_3_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_4_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_4_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_4_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_4_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_4_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_4_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_rad2deg_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_rad2deg_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_rad2deg_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_rad2deg_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_rad2deg_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_rad2deg_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_reciprocal_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_reciprocal_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_reciprocal_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_reciprocal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_reciprocal_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_reciprocal_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_rsqrt_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_rsqrt_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_rsqrt_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_rsqrt_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_rsqrt_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_rsqrt_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sigmoid_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sigmoid_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sigmoid_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sigmoid_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sigmoid_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sigmoid_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sin_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sin_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sin_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sin_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sin_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sinc_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sinc_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sinc_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sinc_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sinc_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sinc_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sinh_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sinh_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sinh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sinh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sinh_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sinh_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_t_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_t_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_t_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_t_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_t_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_t_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_u_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_u_cuda_int16, 
test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_u_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_u_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_u_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_u_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_v_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_v_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_v_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_v_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_v_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_v_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_w_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_w_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_w_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_w_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_w_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_w_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_hermite_polynomial_h_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_hermite_polynomial_h_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_hermite_polynomial_h_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_hermite_polynomial_h_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_hermite_polynomial_h_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_hermite_polynomial_h_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_hermite_polynomial_he_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_hermite_polynomial_he_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_hermite_polynomial_he_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_hermite_polynomial_he_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_hermite_polynomial_he_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_hermite_polynomial_he_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_laguerre_polynomial_l_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_laguerre_polynomial_l_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_laguerre_polynomial_l_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_laguerre_polynomial_l_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_laguerre_polynomial_l_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_laguerre_polynomial_l_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_legendre_polynomial_p_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_legendre_polynomial_p_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_legendre_polynomial_p_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_legendre_polynomial_p_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_legendre_polynomial_p_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_legendre_polynomial_p_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_t_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_t_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_t_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_t_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_t_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_t_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_u_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_u_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_u_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_u_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_u_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_u_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_v_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_v_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_v_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_v_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_v_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_v_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_w_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_w_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_w_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_w_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_w_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_w_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_xlog1py_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_xlog1py_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_xlog1py_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_xlog1py_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_xlog1py_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_xlog1py_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_zeta_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_zeta_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_zeta_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_zeta_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_zeta_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_zeta_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sqrt_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sqrt_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sqrt_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sqrt_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sqrt_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sqrt_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_tan_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_tan_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_tan_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_tan_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_tan_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_tan_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_tanh_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_tanh_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_tanh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_tanh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_tanh_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_tanh_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_true_divide_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_true_divide_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_true_divide_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_true_divide_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_true_divide_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_true_divide_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_xlogy_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_xlogy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_xlogy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_xlogy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_xlogy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_xlogy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_T_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_T_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_T_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_T_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_T_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_T_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_T_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_T_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_T_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_T_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_T_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_T_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_T_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_bfloat16_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_bfloat16_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_bfloat16_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_bfloat16_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_bfloat16_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_bfloat16_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_bfloat16_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_bfloat16_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_bfloat16_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_bfloat16_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_bfloat16_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_bfloat16_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_bfloat16_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_bool_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_bool_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_bool_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_bool_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_bool_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_bool_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_bool_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_bool_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_bool_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_bool_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_bool_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_bool_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_bool_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_byte_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_byte_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_byte_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_byte_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_byte_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_byte_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_byte_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_byte_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_byte_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_byte_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_byte_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_byte_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cdouble_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cdouble_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cdouble_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cdouble_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cdouble_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cdouble_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cdouble_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cdouble_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cdouble_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cdouble_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cdouble_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cdouble_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cdouble_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cfloat_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cfloat_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cfloat_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cfloat_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cfloat_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cfloat_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cfloat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cfloat_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cfloat_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cfloat_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cfloat_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cfloat_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cfloat_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_chalf_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_chalf_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_chalf_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_chalf_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_chalf_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_chalf_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_chalf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_chalf_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_chalf_cuda_int16, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_chalf_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_chalf_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_chalf_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_chalf_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_char_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_char_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_char_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_char_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_char_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_char_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_char_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_char_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_char_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_char_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_char_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_char_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_char_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_complex_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_complex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_complex_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_double_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_double_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_double_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_double_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_double_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_double_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_double_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_double_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_double_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_double_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_double_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_double_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_double_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_float_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_float_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_float_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_float_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_float_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_float_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_float_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_float_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_float_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_float_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_float_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_float_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_float_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_half_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_half_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_half_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_half_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_half_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_half_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_half_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_half_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_half_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_half_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_half_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_half_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_int_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_int_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_int_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_int_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_int_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_int_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_int_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_int_cuda_int16, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_int_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_int_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_int_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_int_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_long_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_long_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_long_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_long_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_long_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_long_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_long_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_long_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_long_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_long_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_long_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_long_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_long_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_polar_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_polar_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_short_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_short_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_short_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_short_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_short_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_short_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_short_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_short_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_short_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_short_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_short_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_short_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_abs_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_abs_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_abs_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_abs_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_abs_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_abs_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_abs_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_abs_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_abs_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_abs_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_abs_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_abs_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_abs_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acos_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acos_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acos_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acos_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acos_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acos_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acos_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acos_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acos_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acos_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acos_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acos_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acos_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acosh_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acosh_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acosh_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acosh_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acosh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acosh_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acosh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acosh_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acosh_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acosh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acosh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acosh_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acosh_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_add_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_add_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_add_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_add_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_add_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_add_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_add_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_add_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_add_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_add_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_add_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_add_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_add_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addcdiv_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addcdiv_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addcdiv_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addcdiv_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addcdiv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addcdiv_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addcmul_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addcmul_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addcmul_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addcmul_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addcmul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addcmul_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addcmul_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addcmul_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addcmul_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addcmul_cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addcmul_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addr_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addr_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addr_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addr_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addr_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addr_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addr_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addr_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addr_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addr_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addr_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_alias_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_alias_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_alias_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_alias_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_alias_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_alias_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_alias_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_alias_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_alias_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_alias_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_alias_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_alias_copy_cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_alias_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_all_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_all_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_all_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_all_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_all_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_all_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_all_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_all_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_all_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_all_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_all_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_all_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_allclose_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_allclose_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_allclose_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_allclose_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_allclose_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_allclose_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_amax_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_amax_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_amax_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_amax_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_amax_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_amax_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_amax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_amax_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_amax_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_amin_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_amin_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_amin_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_amin_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_amin_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_amin_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_amin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_amin_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_amin_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_any_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_any_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_any_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_any_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_any_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_any_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_any_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_any_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_any_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_any_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_any_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_any_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_arange_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_arange_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_arange_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_arange_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_arange_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_arange_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_arange_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_arange_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_arange_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_partial_views_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_partial_views_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_partial_views_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_partial_views_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_partial_views_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_partial_views_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_partial_views_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_partial_views_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_partial_views_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_partial_views_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_partial_views_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_partial_views_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_scatter_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_scatter_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_scatter_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_scatter_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_scatter_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_scatter_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_scatter_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_scatter_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_scatter_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_scatter_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_scatter_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_scatter_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_asin_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_asin_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_asin_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_asin_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_asin_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_asin_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_asin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_asin_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_asin_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_asin_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_asin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_asin_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_asin_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_asinh_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_asinh_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_asinh_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_asinh_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_asinh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_asinh_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_asinh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_asinh_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_asinh_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_asinh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_asinh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_asinh_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_asinh_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atan2_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atan2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atan2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atan2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atan2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atan2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atan2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atan2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atan2_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atan2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atan_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atan_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atan_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atan_cuda_complex32, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atan_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atan_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atan_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atan_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atan_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atan_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atan_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atan_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atan_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atanh_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atanh_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atanh_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atanh_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atanh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atanh_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atanh_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atanh_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atanh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atanh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atanh_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atanh_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_1d_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_1d_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_1d_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_1d_cuda_complex32, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_1d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_1d_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_1d_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_1d_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_1d_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_1d_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_1d_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_1d_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_2d_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_2d_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_2d_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_2d_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_2d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_2d_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_2d_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_2d_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_2d_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_2d_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_2d_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_2d_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_3d_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_3d_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_3d_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_3d_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_3d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_3d_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_3d_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_3d_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_3d_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_3d_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_3d_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_3d_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_and_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_and_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_and_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_and_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_and_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_and_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_left_shift_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_left_shift_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_left_shift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_left_shift_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_left_shift_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_not_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_not_cuda_int16, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_not_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_not_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_not_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_not_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_or_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_or_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_or_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_or_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_or_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_or_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_right_shift_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_right_shift_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_right_shift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_right_shift_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_right_shift_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_xor_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_xor_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_xor_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_xor_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_xor_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_xor_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_block_diag_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_block_diag_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_block_diag_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_block_diag_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_block_diag_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_block_diag_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_block_diag_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_block_diag_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_block_diag_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_block_diag_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_block_diag_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_block_diag_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_block_diag_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_shapes_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_tensors_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_tensors_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_tensors_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_tensors_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_tensors_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_tensors_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_tensors_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_tensors_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_tensors_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_tensors_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_tensors_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_to_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_to_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_to_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_to_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_to_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_to_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_to_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_to_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_to_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_to_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_to_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_to_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bucketize_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bucketize_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bucketize_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bucketize_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bucketize_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bucketize_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bucketize_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bucketize_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bucketize_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cat_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cat_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cat_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cat_cuda_complex32, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cat_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cat_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cat_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cat_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cat_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cat_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cat_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cat_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cauchy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cauchy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cauchy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cauchy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ceil_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ceil_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ceil_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ceil_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ceil_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ceil_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ceil_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ceil_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ceil_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_chunk_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_chunk_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_chunk_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_chunk_cuda_complex32, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_chunk_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_chunk_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_chunk_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_chunk_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_chunk_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_chunk_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_chunk_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_chunk_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_chunk_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_max_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_max_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_max_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_max_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_max_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_max_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_max_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_max_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_max_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_max_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_min_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_min_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_min_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_min_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_min_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_min_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_min_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_min_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_min_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_min_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clone_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clone_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clone_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clone_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clone_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clone_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clone_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clone_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clone_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clone_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clone_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clone_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clone_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_column_stack_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_column_stack_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_column_stack_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_column_stack_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_column_stack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_column_stack_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_column_stack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_column_stack_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_column_stack_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_column_stack_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_column_stack_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_column_stack_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_column_stack_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_physical_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_physical_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_physical_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_physical_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_physical_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_physical_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_physical_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_physical_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_physical_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_physical_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_physical_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_physical_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_physical_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_constant_pad_nd_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_constant_pad_nd_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_constant_pad_nd_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_constant_pad_nd_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_constant_pad_nd_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_constant_pad_nd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_constant_pad_nd_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_constant_pad_nd_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_constant_pad_nd_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_constant_pad_nd_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_constant_pad_nd_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_constant_pad_nd_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_contiguous_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_contiguous_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_contiguous_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_contiguous_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_contiguous_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_contiguous_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_contiguous_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_contiguous_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_contiguous_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_contiguous_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_contiguous_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_contiguous_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_contiguous_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_copysign_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_copysign_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_copysign_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_copysign_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_copysign_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_copysign_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_copysign_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_copysign_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_copysign_cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_copysign_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cos_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cos_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cos_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cos_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cos_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cos_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cos_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cos_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cos_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cos_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cos_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cos_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cos_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cosh_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cosh_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cosh_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cosh_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cosh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cosh_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cosh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cosh_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cosh_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cosh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cosh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cosh_cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cosh_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_count_nonzero_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_count_nonzero_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_count_nonzero_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_count_nonzero_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_count_nonzero_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_count_nonzero_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_count_nonzero_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_count_nonzero_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_count_nonzero_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_count_nonzero_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_count_nonzero_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_count_nonzero_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cumprod_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cumprod_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cumprod_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cumprod_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cumprod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cumprod_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cumprod_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cumprod_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cumprod_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cumprod_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cumprod_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cumsum_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cumsum_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cumsum_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cumsum_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cumsum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cumsum_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cumsum_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cumsum_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cumsum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cumsum_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cumsum_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_deg2rad_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_deg2rad_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_deg2rad_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_deg2rad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_deg2rad_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_deg2rad_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_deg2rad_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_deg2rad_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_deg2rad_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_deg2rad_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_embed_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_embed_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_embed_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_embed_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_embed_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_embed_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_embed_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_embed_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_embed_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_embed_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_embed_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_embed_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_embed_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_copy_cuda_complex32, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_scatter_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_scatter_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_scatter_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_scatter_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_scatter_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_scatter_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_scatter_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_scatter_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_scatter_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_scatter_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_scatter_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_digamma_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_digamma_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_digamma_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_digamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_digamma_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_digamma_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_digamma_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_digamma_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_digamma_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_digamma_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_floor_rounding_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_floor_rounding_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_floor_rounding_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_floor_rounding_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_floor_rounding_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_floor_rounding_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_floor_rounding_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_floor_rounding_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_floor_rounding_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_no_rounding_mode_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_no_rounding_mode_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_no_rounding_mode_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_no_rounding_mode_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_no_rounding_mode_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_no_rounding_mode_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_no_rounding_mode_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_no_rounding_mode_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_no_rounding_mode_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_no_rounding_mode_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_no_rounding_mode_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_no_rounding_mode_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_no_rounding_mode_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_trunc_rounding_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_trunc_rounding_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_trunc_rounding_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_trunc_rounding_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_trunc_rounding_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_trunc_rounding_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_trunc_rounding_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_trunc_rounding_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_trunc_rounding_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dot_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dot_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dot_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dot_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dot_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dsplit_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dsplit_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dsplit_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dsplit_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dsplit_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dsplit_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dsplit_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dsplit_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dsplit_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dsplit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dsplit_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dsplit_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dstack_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dstack_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dstack_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dstack_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dstack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dstack_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dstack_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dstack_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dstack_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dstack_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dstack_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dstack_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_like_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_like_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_like_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_like_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_like_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_like_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_like_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_like_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_like_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_like_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_like_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_like_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_strided_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_strided_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_strided_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_strided_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_strided_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_strided_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_strided_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_strided_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_strided_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_strided_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_strided_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eq_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eq_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eq_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eq_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eq_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eq_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eq_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eq_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eq_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eq_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eq_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eq_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eq_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_equal_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_equal_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_equal_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_equal_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_equal_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_equal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_equal_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_equal_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_equal_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_equal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_equal_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_equal_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erf_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erf_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erf_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erf_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erf_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erf_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erf_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erf_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erf_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erfc_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erfc_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erfc_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erfc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erfc_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erfc_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erfc_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erfc_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erfc_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erfc_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erfinv_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erfinv_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erfinv_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erfinv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erfinv_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erfinv_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erfinv_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erfinv_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erfinv_cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erfinv_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_exp2_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_exp2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_exp2_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_exp2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_exp2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_exp2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_exp2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_exp2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_exp2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_exp2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_exp2_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_exp2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_exp_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_exp_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_exp_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_exp_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_exp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_exp_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_exp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_exp_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_exp_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_exp_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_exp_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_exp_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_exp_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_as_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_as_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_as_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_as_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_as_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_as_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_as_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_as_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_as_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_as_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_as_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_copy_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expm1_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expm1_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expm1_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expm1_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expm1_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expm1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expm1_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expm1_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expm1_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expm1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expm1_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expm1_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_exponential_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_exponential_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_exponential_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_exponential_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eye_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eye_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eye_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eye_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eye_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eye_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eye_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eye_cuda_float8_e4m3fn, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eye_cuda_float8_e4m3fnuz, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eye_cuda_float8_e5m2, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eye_cuda_float8_e5m2fnuz, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eye_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eye_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eye_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eye_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eye_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft2_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft2_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft2_cuda_int16, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft2_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftn_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftn_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftn_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftn_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftn_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftn_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftn_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftshift_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftshift_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftshift_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftshift_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftshift_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftshift_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftshift_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftshift_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftshift_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftshift_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftshift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftshift_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftshift_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft2_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft2_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft2_cuda_int16, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft2_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfftn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfftn_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfftn_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfftn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfftn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfftn_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfftn_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfftn_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfftn_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfftn_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifft2_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifft2_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifft2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifft2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifft2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifft2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifft2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifft2_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifft2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifft_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifft_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifft_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifft_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifft_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifft_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifft_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifft_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifft_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifft_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifftn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifftn_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifftn_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifftn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifftn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifftn_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifftn_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifftn_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifftn_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifftn_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifftshift_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifftshift_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifftshift_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifftshift_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifftshift_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifftshift_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifftshift_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifftshift_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifftshift_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifftshift_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifftshift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifftshift_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifftshift_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfft2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfft2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfft2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfft2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfft2_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfft2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfft_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfft_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfft_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfft_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfft_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfft_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfft_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfft_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfftn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfftn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfftn_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfftn_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfftn_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfftn_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfftn_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfft2_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfft2_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfft2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfft2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfft2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfft2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfft2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfft2_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfft2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfft_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfft_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfft_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfft_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfft_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfft_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfft_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfft_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfft_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfft_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfftn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfftn_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfftn_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfftn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfftn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfftn_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfftn_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfftn_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfftn_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfftn_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_rfft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_rfft2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_rfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_rfft2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_rfft2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_rfft2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_rfft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_rfft2_cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_rfft2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_rfft_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_rfft_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_rfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_rfft_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_rfft_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_rfft_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_rfft_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_rfft_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_rfft_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_rfftn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_rfftn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_rfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_rfftn_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_rfftn_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_rfftn_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_rfftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_rfftn_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_rfftn_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fill_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fill_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fill_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fill_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fill_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fill_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fill_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fill_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fill_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fill_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fill_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fill_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flatten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flatten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flatten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flatten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flatten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flatten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flatten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flatten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flatten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flatten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flatten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flatten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flatten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flip_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flip_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flip_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flip_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flip_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flip_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flip_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flip_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flip_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flip_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flip_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flip_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fliplr_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fliplr_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fliplr_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fliplr_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fliplr_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fliplr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fliplr_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fliplr_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fliplr_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fliplr_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fliplr_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fliplr_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flipud_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flipud_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flipud_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flipud_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flipud_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flipud_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flipud_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flipud_cuda_int16, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flipud_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flipud_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flipud_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flipud_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_float_power_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_float_power_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_float_power_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_float_power_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_float_power_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_float_power_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_float_power_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_float_power_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_float_power_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_float_power_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_float_power_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_float_power_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_floor_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_floor_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_floor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_floor_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_floor_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_floor_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_floor_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_floor_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_floor_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_floor_divide_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_floor_divide_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_floor_divide_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_floor_divide_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_floor_divide_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_floor_divide_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_floor_divide_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_floor_divide_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_floor_divide_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmax_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmax_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmax_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmax_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmax_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmax_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmax_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmax_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmin_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmin_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmin_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmin_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmin_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmin_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmin_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmin_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmod_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmod_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmod_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmod_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmod_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmod_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmod_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmod_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_frac_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_frac_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_frac_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_frac_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_frexp_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_frexp_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_frexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_frexp_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_gcd_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_gcd_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_gcd_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_gcd_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_gcd_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ge_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ge_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ge_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ge_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ge_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ge_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ge_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ge_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ge_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ge_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_geometric_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_geometric_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_geometric_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_geometric_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_geometric_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_geometric_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_geometric_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_geometric_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_geometric_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_gt_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_gt_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_gt_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_gt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_gt_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_gt_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_gt_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_gt_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_gt_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_gt_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_heaviside_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_heaviside_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_heaviside_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_heaviside_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_heaviside_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_heaviside_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_heaviside_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_heaviside_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_heaviside_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_heaviside_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hsplit_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hsplit_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hsplit_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hsplit_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hsplit_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hsplit_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hsplit_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hsplit_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hsplit_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hsplit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hsplit_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hsplit_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hstack_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hstack_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hstack_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hstack_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hstack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hstack_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hstack_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hstack_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hstack_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hstack_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hstack_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hstack_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hypot_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hypot_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hypot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hypot_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_i0_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_i0_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_i0_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_i0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_i0_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_i0_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_i0_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_i0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_i0_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_i0_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_igamma_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_igamma_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_igammac_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_igammac_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_imag_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_imag_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_imag_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_add_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_add_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_add_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_add_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_add_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_add_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_add_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_add_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_add_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_add_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_add_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_add_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_add_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_copy_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_fill_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_fill_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_fill_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_fill_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_fill_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_fill_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_fill_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_fill_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_fill_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_fill_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_fill_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_fill_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_select_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_select_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_select_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_select_cuda_complex32, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_select_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_select_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_select_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_select_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_select_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_select_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_select_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_select_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_select_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isclose_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isclose_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isclose_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isclose_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isclose_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isclose_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isclose_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isclose_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isclose_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isclose_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isclose_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isclose_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isfinite_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isfinite_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isfinite_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isfinite_cuda_complex32, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isfinite_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isfinite_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isfinite_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isfinite_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isfinite_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isfinite_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isfinite_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isfinite_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isfinite_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isinf_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isinf_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isinf_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isinf_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isinf_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isinf_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isinf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isinf_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isinf_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isinf_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isinf_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isinf_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isinf_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isnan_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isnan_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isnan_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isnan_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isnan_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isnan_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isnan_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isnan_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isnan_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isnan_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isnan_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isnan_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isneginf_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isneginf_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isneginf_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isneginf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isneginf_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isneginf_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isneginf_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isneginf_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isneginf_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isneginf_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isposinf_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isposinf_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isposinf_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isposinf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isposinf_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isposinf_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isposinf_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isposinf_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isposinf_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isposinf_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isreal_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isreal_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isreal_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isreal_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isreal_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isreal_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isreal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isreal_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isreal_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isreal_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isreal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isreal_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isreal_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_istft_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_istft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_item_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_item_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_item_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_item_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_item_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_item_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_item_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_item_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_item_cuda_int16, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_item_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_item_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_item_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_item_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lcm_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lcm_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lcm_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lcm_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lcm_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_le_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_le_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_le_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_le_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_le_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_le_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_le_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_le_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_le_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_le_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lerp_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lerp_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lerp_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lerp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lerp_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lerp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lerp_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lgamma_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lgamma_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lgamma_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lgamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lgamma_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lgamma_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lgamma_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lgamma_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lgamma_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lgamma_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_cross_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_cross_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_cross_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_cross_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_cross_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_cross_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_cross_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_cross_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_cross_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_cross_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_cross_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_diagonal_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_diagonal_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_diagonal_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_diagonal_cuda_complex32, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_diagonal_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_diagonal_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_diagonal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_diagonal_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_diagonal_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_diagonal_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_diagonal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_diagonal_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_diagonal_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_matrix_norm_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_matrix_norm_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_matrix_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_matrix_norm_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_matrix_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_matrix_norm_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_norm_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_norm_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_norm_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_norm_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_svd_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_svd_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_svd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_svd_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_svdvals_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_svdvals_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_svdvals_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_svdvals_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_vecdot_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_vecdot_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_vecdot_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_vecdot_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_vecdot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_vecdot_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_vector_norm_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_vector_norm_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_vector_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_vector_norm_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_vector_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_vector_norm_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linspace_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linspace_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linspace_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linspace_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linspace_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linspace_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linspace_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linspace_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linspace_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linspace_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linspace_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linspace_tensor_overload_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linspace_tensor_overload_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linspace_tensor_overload_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linspace_tensor_overload_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linspace_tensor_overload_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linspace_tensor_overload_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linspace_tensor_overload_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linspace_tensor_overload_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linspace_tensor_overload_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linspace_tensor_overload_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linspace_tensor_overload_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log10_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log10_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log10_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log10_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log10_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log10_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log10_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log10_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log10_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log10_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log10_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log10_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log1p_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log1p_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log1p_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log1p_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log1p_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log1p_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log1p_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log1p_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log1p_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log1p_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log1p_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log1p_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log2_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log2_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log2_cuda_int16, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log2_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_normal_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_normal_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_normal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_normal_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_softmax_with_dtype_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_softmax_with_dtype_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_softmax_with_dtype_cuda_complex32, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_softmax_with_dtype_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_softmax_with_dtype_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_softmax_with_dtype_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_softmax_with_dtype_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_softmax_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_softmax_with_dtype_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_softmax_with_dtype_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logaddexp2_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logaddexp2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logaddexp2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logaddexp2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logaddexp_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logaddexp_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logaddexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logaddexp_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_and_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_and_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_and_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_and_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_and_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_and_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_and_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_and_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_and_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_and_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_and_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_and_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_not_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_not_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_not_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_not_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_not_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_not_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_not_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_not_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_not_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_not_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_not_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_not_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_or_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_or_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_or_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_or_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_or_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_or_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_or_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_or_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_or_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_or_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_or_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_or_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_xor_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_xor_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_xor_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_xor_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_xor_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_xor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_xor_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_xor_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_xor_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_xor_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_xor_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_xor_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_tensor_overload_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_tensor_overload_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_tensor_overload_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_tensor_overload_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_tensor_overload_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_tensor_overload_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_tensor_overload_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_tensor_overload_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_tensor_overload_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_tensor_overload_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_tensor_overload_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logsumexp_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logsumexp_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logsumexp_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logsumexp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logsumexp_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logsumexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logsumexp_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logsumexp_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logsumexp_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logsumexp_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logsumexp_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logsumexp_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lt_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lt_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lt_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lt_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lt_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lt_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lt_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lt_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lt_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_masked_fill_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_masked_fill_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_masked_fill_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_masked_fill_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_masked_fill_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_masked_fill_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_masked_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_masked_fill_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_masked_fill_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_masked_fill_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_masked_fill_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_masked_fill_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_masked_fill_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_maximum_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_maximum_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_maximum_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_maximum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_maximum_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_maximum_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_maximum_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_maximum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_maximum_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_maximum_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_mean_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_mean_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_mean_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_mean_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_mean_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_meshgrid_list_of_tensors_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_meshgrid_list_of_tensors_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_meshgrid_list_of_tensors_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_meshgrid_list_of_tensors_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_meshgrid_list_of_tensors_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_meshgrid_list_of_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_meshgrid_list_of_tensors_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_meshgrid_list_of_tensors_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_meshgrid_list_of_tensors_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_meshgrid_list_of_tensors_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_meshgrid_list_of_tensors_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_meshgrid_list_of_tensors_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_meshgrid_variadic_tensors_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_meshgrid_variadic_tensors_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_meshgrid_variadic_tensors_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_meshgrid_variadic_tensors_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_meshgrid_variadic_tensors_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_meshgrid_variadic_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_meshgrid_variadic_tensors_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_meshgrid_variadic_tensors_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_meshgrid_variadic_tensors_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_meshgrid_variadic_tensors_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_meshgrid_variadic_tensors_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_meshgrid_variadic_tensors_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_minimum_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_minimum_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_minimum_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_minimum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_minimum_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_minimum_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_minimum_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_minimum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_minimum_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_minimum_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_movedim_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_movedim_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_movedim_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_movedim_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_movedim_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_movedim_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_movedim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_movedim_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_movedim_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_movedim_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_movedim_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_movedim_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_movedim_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_mul_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_mul_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_mul_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_mul_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_mul_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_mul_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_mul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_mul_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_mul_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_mul_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_mul_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_mul_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_mul_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nan_to_num_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nan_to_num_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nan_to_num_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nan_to_num_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nan_to_num_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nan_to_num_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nan_to_num_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nan_to_num_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nan_to_num_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nan_to_num_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_copy_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_native_layer_norm_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_native_layer_norm_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_native_layer_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_native_layer_norm_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ne_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ne_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ne_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ne_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ne_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ne_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ne_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ne_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ne_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ne_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ne_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ne_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_neg_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_neg_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_neg_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_neg_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_neg_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_neg_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_neg_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_neg_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_neg_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_neg_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_neg_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_neg_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_cuda_complex32, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_strided_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_strided_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_strided_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_strided_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_strided_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_strided_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_strided_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_strided_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_strided_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_strided_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_strided_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_strided_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_full_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_full_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_full_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_full_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_full_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_full_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_full_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_full_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_full_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_full_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_full_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_full_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_full_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_ones_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_ones_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_ones_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_ones_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_ones_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_ones_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_ones_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_ones_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_ones_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_ones_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_ones_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_ones_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_ones_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_zeros_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_zeros_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_zeros_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_zeros_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_zeros_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_zeros_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_zeros_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_zeros_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_zeros_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_zeros_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_zeros_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_zeros_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_zeros_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nextafter_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nextafter_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nextafter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nextafter_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_alpha_dropout_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_alpha_dropout_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_alpha_dropout_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_alpha_dropout_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_celu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_celu_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_celu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_celu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_channel_shuffle_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_channel_shuffle_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_channel_shuffle_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_channel_shuffle_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_channel_shuffle_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_channel_shuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_channel_shuffle_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_channel_shuffle_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_channel_shuffle_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_channel_shuffle_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_channel_shuffle_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_channel_shuffle_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_dropout_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_dropout_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_dropout_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_dropout_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_elu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_elu_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_elu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_elu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_gelu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_gelu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_gelu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_gelu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_glu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_glu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_glu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_glu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_group_norm_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_group_norm_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_group_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_group_norm_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_hardshrink_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_hardshrink_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_hardshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_hardshrink_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_hardtanh_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_hardtanh_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_hardtanh_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_hardtanh_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_hardtanh_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_hardtanh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_hardtanh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_hardtanh_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_hinge_embedding_loss_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_hinge_embedding_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_hinge_embedding_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_hinge_embedding_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_huber_loss_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_huber_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_huber_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_huber_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_l1_loss_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_l1_loss_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_l1_loss_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_l1_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_l1_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_l1_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_layer_norm_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_layer_norm_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_layer_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_layer_norm_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_leaky_relu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_leaky_relu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_leaky_relu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_leaky_relu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_log_softmax_with_dtype_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_log_softmax_with_dtype_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_log_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_log_softmax_with_dtype_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_log_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_log_softmax_with_dtype_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_log_softmax_with_dtype_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_log_softmax_with_dtype_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_log_softmax_with_dtype_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_log_softmax_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_log_softmax_with_dtype_cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_log_softmax_with_dtype_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_margin_ranking_loss_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_margin_ranking_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_margin_ranking_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_margin_ranking_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_margin_ranking_loss_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_margin_ranking_loss_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_margin_ranking_loss_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_margin_ranking_loss_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_margin_ranking_loss_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_mish_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_mish_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_mish_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_mish_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_mse_loss_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_mse_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_mse_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_mse_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_nll_loss_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_nll_loss_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_nll_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pairwise_distance_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pairwise_distance_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pairwise_distance_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pairwise_distance_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pairwise_distance_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pairwise_distance_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pairwise_distance_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pairwise_distance_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pairwise_distance_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pairwise_distance_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pairwise_distance_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pdist_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pdist_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pixel_shuffle_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pixel_shuffle_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pixel_shuffle_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pixel_shuffle_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pixel_shuffle_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pixel_shuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pixel_shuffle_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pixel_shuffle_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pixel_shuffle_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pixel_shuffle_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pixel_shuffle_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pixel_shuffle_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pixel_unshuffle_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pixel_unshuffle_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pixel_unshuffle_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pixel_unshuffle_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pixel_unshuffle_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pixel_unshuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pixel_unshuffle_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pixel_unshuffle_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pixel_unshuffle_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pixel_unshuffle_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pixel_unshuffle_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pixel_unshuffle_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_poisson_nll_loss_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_poisson_nll_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_poisson_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_poisson_nll_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_poisson_nll_loss_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_poisson_nll_loss_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_poisson_nll_loss_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_poisson_nll_loss_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_poisson_nll_loss_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_prelu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_prelu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_prelu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_prelu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_relu6_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_relu6_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_relu6_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_relu6_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_relu6_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_relu6_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_relu6_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_relu6_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_relu6_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_relu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_relu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_relu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_relu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_relu_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_relu_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_relu_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_relu_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_relu_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_selu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_selu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_selu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_selu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_smooth_l1_loss_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_smooth_l1_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_smooth_l1_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_smooth_l1_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softmax_with_dtype_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softmax_with_dtype_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softmax_with_dtype_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softmax_with_dtype_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softmax_with_dtype_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softmax_with_dtype_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softmax_with_dtype_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softmax_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softmax_with_dtype_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softmax_with_dtype_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softmin_with_dtype_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softmin_with_dtype_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softmin_with_dtype_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softmin_with_dtype_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softmin_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softmin_with_dtype_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softmin_with_dtype_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softmin_with_dtype_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softmin_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softmin_with_dtype_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softmin_with_dtype_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softplus_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softplus_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softplus_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softplus_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softshrink_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softshrink_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softshrink_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_tanhshrink_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_tanhshrink_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_tanhshrink_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_tanhshrink_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_tanhshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_tanhshrink_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_tanhshrink_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_tanhshrink_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_tanhshrink_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_tanhshrink_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_tanhshrink_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_threshold_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_threshold_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_threshold_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_threshold_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_threshold_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_threshold_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_threshold_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_threshold_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_threshold_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_triplet_margin_loss_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_triplet_margin_loss_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_triplet_margin_loss_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_triplet_margin_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_triplet_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_triplet_margin_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_triplet_margin_loss_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_triplet_margin_loss_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_triplet_margin_loss_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_triplet_margin_loss_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_triplet_margin_loss_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_norm_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_norm_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_norm_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_norm_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_normal__in_place_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_normal__in_place_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_normal__in_place_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_normal__in_place_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_normal__in_place_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_normal__in_place_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_normal_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_normal_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_normal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_normal_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_normal_number_mean_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_normal_number_mean_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_normal_number_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_normal_number_mean_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ones_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ones_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ones_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ones_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ones_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ones_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ones_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ones_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ones_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ones_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ones_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ones_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ones_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_permute_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_permute_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_permute_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_permute_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_permute_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_permute_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_permute_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_permute_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_permute_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_permute_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_permute_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_permute_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_permute_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_permute_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_permute_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_permute_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_permute_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_permute_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_permute_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_permute_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_permute_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_permute_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_permute_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_permute_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_permute_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_permute_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_positive_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_positive_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_positive_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_positive_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_positive_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_positive_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_positive_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_positive_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_positive_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_positive_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_positive_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_positive_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_pow_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_pow_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_pow_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_pow_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_pow_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_pow_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_pow_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_pow_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_pow_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_pow_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_pow_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_pow_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_prod_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_prod_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_prod_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_prod_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_prod_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_prod_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_prod_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_prod_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_prod_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_prod_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_prod_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_prod_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rad2deg_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rad2deg_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rad2deg_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rad2deg_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rad2deg_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rad2deg_cuda_int16, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rad2deg_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rad2deg_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rad2deg_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rad2deg_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_randn_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_randn_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_randn_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_randn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_randn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_randn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_randn_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ravel_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ravel_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ravel_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ravel_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ravel_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ravel_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ravel_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ravel_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ravel_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ravel_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ravel_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ravel_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ravel_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_real_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_real_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_real_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_real_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_real_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_real_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_real_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_real_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_real_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_real_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_real_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_real_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_real_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reciprocal_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reciprocal_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reciprocal_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reciprocal_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reciprocal_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reciprocal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reciprocal_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reciprocal_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reciprocal_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reciprocal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reciprocal_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reciprocal_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_remainder_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_remainder_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_remainder_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_remainder_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_remainder_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_remainder_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_remainder_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_remainder_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_remainder_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_renorm_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_renorm_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_renorm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_renorm_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_renorm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_renorm_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_repeat_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_repeat_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_repeat_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_repeat_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_repeat_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_repeat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_repeat_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_repeat_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_repeat_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_repeat_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_repeat_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_repeat_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_as_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_as_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_as_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_as_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_as_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_as_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_as_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_as_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_as_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_as_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_as_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_as_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_roll_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_roll_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_roll_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_roll_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_roll_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_roll_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_roll_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_roll_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_roll_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_roll_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_roll_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_roll_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_roll_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rot90_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rot90_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rot90_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rot90_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rot90_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rot90_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rot90_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rot90_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rot90_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rot90_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rot90_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rot90_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_round_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_round_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_round_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_round_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_round_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_round_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_round_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_round_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_round_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsqrt_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsqrt_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsqrt_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsqrt_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsqrt_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsqrt_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsqrt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsqrt_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsqrt_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsqrt_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsqrt_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsqrt_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsqrt_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsub_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsub_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsub_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsub_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsub_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsub_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsub_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsub_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsub_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsub_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsub_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_select_scatter_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_select_scatter_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_select_scatter_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_select_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_select_scatter_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_select_scatter_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_select_scatter_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_select_scatter_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_select_scatter_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_select_scatter_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sgn_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sgn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sgn_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sgn_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sgn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sgn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sgn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sgn_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sgn_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sgn_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sgn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sgn_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sgn_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sigmoid_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sigmoid_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sigmoid_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sigmoid_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sigmoid_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sigmoid_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sigmoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sigmoid_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sigmoid_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sigmoid_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sigmoid_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sigmoid_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sigmoid_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sign_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sign_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sign_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sign_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sign_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sign_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sign_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sign_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sign_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sign_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_signbit_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_signbit_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_signbit_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_signbit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_signbit_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_signbit_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_signbit_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_signbit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_signbit_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_signbit_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sin_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sin_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sin_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sin_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sin_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sin_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sin_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sin_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sin_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sin_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sin_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinc_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinc_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinc_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinc_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinc_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinc_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinc_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinc_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinc_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinc_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinc_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinh_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinh_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinh_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinh_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinh_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinh_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinh_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinh_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinh_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_softmax_with_dtype_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_softmax_with_dtype_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_softmax_with_dtype_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_softmax_with_dtype_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_softmax_with_dtype_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_softmax_with_dtype_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_softmax_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_softmax_with_dtype_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_softmax_with_dtype_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_bessel_j0_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_bessel_j0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_bessel_j0_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_bessel_j0_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_bessel_j0_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_bessel_j0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_bessel_j0_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_bessel_j0_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_bessel_j1_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_bessel_j1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_bessel_j1_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_bessel_j1_cuda_int16, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_bessel_j1_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_bessel_j1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_bessel_j1_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_bessel_j1_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_entr_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_entr_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_entr_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_entr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_entr_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_entr_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_entr_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_entr_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_entr_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_entr_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_erfcx_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_erfcx_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_erfcx_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_erfcx_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_erfcx_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_erfcx_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_erfcx_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_erfcx_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i0e_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i0e_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i0e_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i0e_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i0e_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i0e_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i0e_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i0e_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i0e_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i0e_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i1_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i1_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i1_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i1_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i1_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i1_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i1_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i1_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i1e_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i1e_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i1e_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i1e_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i1e_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i1e_cuda_int16, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i1e_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i1e_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i1e_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i1e_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_log_ndtr_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_log_ndtr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_log_ndtr_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_log_ndtr_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_log_ndtr_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_log_ndtr_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_log_ndtr_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_log_ndtr_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_log_softmax_with_dtype_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_log_softmax_with_dtype_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_log_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_log_softmax_with_dtype_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_log_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_log_softmax_with_dtype_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_log_softmax_with_dtype_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_log_softmax_with_dtype_cuda_int16, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_log_softmax_with_dtype_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_log_softmax_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_log_softmax_with_dtype_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_log_softmax_with_dtype_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_logit_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_logit_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_logit_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_logit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_logit_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_logit_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_logit_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_logit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_logit_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_logit_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_1_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_1_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_1_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_1_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_1_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_1_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_1_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_1_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_3_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_3_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_3_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_3_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_3_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_3_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_3_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_3_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_3_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_5_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_5_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_5_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_5_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_5_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_5_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_5_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_5_cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_5_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_ndtr_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_ndtr_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_ndtr_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_ndtr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_ndtr_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_ndtr_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_ndtr_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_ndtr_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_ndtr_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_ndtr_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_ndtri_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_ndtri_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_ndtri_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_ndtri_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_ndtri_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_ndtri_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_ndtri_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_ndtri_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_softmax_with_dtype_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_softmax_with_dtype_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_softmax_with_dtype_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_softmax_with_dtype_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_softmax_with_dtype_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_softmax_with_dtype_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_softmax_with_dtype_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_softmax_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_softmax_with_dtype_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_softmax_with_dtype_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_spherical_bessel_j0_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_spherical_bessel_j0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_spherical_bessel_j0_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_spherical_bessel_j0_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_spherical_bessel_j0_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_spherical_bessel_j0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_spherical_bessel_j0_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_spherical_bessel_j0_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_xlog1py_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_xlog1py_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_xlog1py_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_xlog1py_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_xlog1py_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_xlog1py_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_xlog1py_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_xlog1py_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_xlog1py_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_xlog1py_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_zeta_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_zeta_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_zeta_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_zeta_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_zeta_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_zeta_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_zeta_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_zeta_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_split_with_sizes_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_split_with_sizes_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_split_with_sizes_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_split_with_sizes_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_split_with_sizes_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_split_with_sizes_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_split_with_sizes_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_split_with_sizes_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_split_with_sizes_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_split_with_sizes_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_split_with_sizes_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_split_with_sizes_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_split_with_sizes_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sqrt_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sqrt_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sqrt_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sqrt_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sqrt_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sqrt_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sqrt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sqrt_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sqrt_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sqrt_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sqrt_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sqrt_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sqrt_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_square_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_square_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_square_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_square_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_square_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_square_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_square_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_square_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_square_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_square_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_square_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_square_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_cuda_int16, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_multiple_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_multiple_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_multiple_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_multiple_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_multiple_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_multiple_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_multiple_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_multiple_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_multiple_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_multiple_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_multiple_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_multiple_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_multiple_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_stack_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_stack_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_stack_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_stack_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_stack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_stack_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_stack_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_stack_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_stack_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_stack_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_stack_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_stack_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_stack_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_std_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_std_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_std_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_std_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_std_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_std_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_std_mean_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_std_mean_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_std_mean_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_std_mean_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_std_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_std_mean_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_stft_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_stft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_stft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_stft_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sub_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sub_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sub_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sub_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sub_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sub_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sub_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sub_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sub_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sub_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sub_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sub_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_to_size_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_to_size_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_to_size_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_to_size_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_to_size_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_to_size_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_to_size_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_to_size_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_to_size_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_to_size_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_to_size_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_to_size_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_t_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_t_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_t_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_t_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_t_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_t_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_t_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_t_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_t_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_t_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_t_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_t_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_t_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_t_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_t_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_t_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_t_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_t_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_t_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_t_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_t_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_t_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_t_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_t_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_take_along_dim_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_take_along_dim_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_take_along_dim_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_take_along_dim_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_take_along_dim_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_take_along_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_take_along_dim_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_take_along_dim_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_take_along_dim_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_take_along_dim_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_take_along_dim_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_take_along_dim_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tan_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tan_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tan_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tan_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tan_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tan_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tan_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tan_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tan_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tan_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tan_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tan_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tan_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tanh_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tanh_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tanh_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tanh_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tanh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tanh_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tanh_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tanh_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tanh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tanh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tanh_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tanh_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tensor_split_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tensor_split_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tensor_split_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tensor_split_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tensor_split_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tensor_split_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tensor_split_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tensor_split_cuda_int16, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tensor_split_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tensor_split_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tensor_split_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tensor_split_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_to_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_to_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_to_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_to_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_to_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_to_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_to_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_to_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_to_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_to_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_to_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_to_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_trace_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_trace_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_trace_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_trace_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_trace_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_trace_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_trace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_trace_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_trace_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_trace_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_trace_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_trace_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_trace_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tril_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tril_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tril_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tril_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tril_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tril_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tril_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tril_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tril_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tril_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tril_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tril_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tril_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tril_indices_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tril_indices_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_triu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_triu_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_triu_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_triu_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_triu_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_triu_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_triu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_triu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_triu_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_triu_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_triu_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_triu_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_triu_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_triu_indices_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_triu_indices_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_true_divide_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_true_divide_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_true_divide_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_true_divide_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_true_divide_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_true_divide_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_true_divide_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_true_divide_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_true_divide_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_true_divide_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_true_divide_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_true_divide_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_true_divide_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_trunc_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_trunc_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_trunc_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_trunc_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_trunc_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_trunc_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_trunc_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_trunc_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_trunc_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unbind_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unbind_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unbind_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unbind_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unbind_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unbind_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unbind_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unbind_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unbind_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unbind_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unbind_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unbind_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unbind_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unbind_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unbind_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unbind_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unbind_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unbind_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unbind_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unbind_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unbind_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unbind_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unbind_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unbind_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unbind_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unbind_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unflatten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unflatten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unflatten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unflatten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unflatten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unflatten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unflatten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unflatten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unflatten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unflatten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unflatten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unflatten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unflatten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_copy_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unsqueeze_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unsqueeze_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unsqueeze_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unsqueeze_copy_cuda_complex32, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unsqueeze_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unsqueeze_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unsqueeze_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unsqueeze_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unsqueeze_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unsqueeze_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unsqueeze_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unsqueeze_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unsqueeze_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unsqueeze_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unsqueeze_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unsqueeze_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unsqueeze_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unsqueeze_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unsqueeze_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unsqueeze_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unsqueeze_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unsqueeze_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unsqueeze_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unsqueeze_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unsqueeze_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unsqueeze_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_var_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_var_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_var_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_var_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_var_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_var_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_var_mean_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_var_mean_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_var_mean_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_var_mean_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_var_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_var_mean_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vdot_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vdot_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vdot_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vdot_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vdot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vdot_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_as_complex_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_as_complex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_as_complex_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_as_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_as_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_as_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_as_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_as_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_as_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_as_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_as_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_as_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_as_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_as_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_as_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vsplit_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vsplit_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vsplit_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vsplit_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vsplit_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vsplit_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vsplit_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vsplit_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vsplit_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vsplit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vsplit_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vsplit_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vstack_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vstack_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vstack_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vstack_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vstack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vstack_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vstack_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vstack_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vstack_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vstack_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vstack_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vstack_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_where_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_where_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_where_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_where_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_where_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_where_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_where_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_where_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_where_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_where_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_where_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_where_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_where_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_xlogy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_xlogy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_xlogy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_xlogy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_xlogy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_xlogy_cuda_int16, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_xlogy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_xlogy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_xlogy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_xlogy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_zeros_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_zeros_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_zeros_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_zeros_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_zeros_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_zeros_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_zeros_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_zeros_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_zeros_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_zeros_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_zeros_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_zeros_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_zeros_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_T_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs__conversions_complex_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs__conversions_polar_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_add_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_amax_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_amin_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_arange_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_as_strided_scatter_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_atan2_cuda, 
test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_bitwise_and_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_bitwise_left_shift_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_bitwise_or_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_bitwise_right_shift_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_bitwise_xor_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_bucketize_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_cat_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_cauchy_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_clamp_max_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_clamp_min_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_copysign_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_diag_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_diag_embed_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_diagonal_copy_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_diagonal_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_div_floor_rounding_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_div_no_rounding_mode_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_div_trunc_rounding_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_dot_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_dsplit_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_dstack_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_eq_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_exponential_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_eye_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_fft_fft2_cuda, 
test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_fft_fft_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_fft_fftn_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_fft_hfft2_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_fft_hfft_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_fft_hfftn_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_fft_ifft2_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_fft_ifft_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_fft_ifftn_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_fft_ihfft2_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_fft_ihfft_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_fft_ihfftn_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_fft_irfft2_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_fft_irfft_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_fft_irfftn_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_fft_rfft2_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_fft_rfft_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_fft_rfftn_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_fliplr_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_flipud_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_float_power_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_floor_divide_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_fmax_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_fmin_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_fmod_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_gcd_cuda, 
test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_ge_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_geometric_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_gt_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_heaviside_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_hsplit_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_hstack_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_hypot_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_igamma_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_igammac_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_index_add_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_index_select_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_isclose_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_item_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_lcm_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_le_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_linalg_cross_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_linalg_diagonal_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_linspace_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_linspace_tensor_overload_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_log_normal_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_logaddexp_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_logical_and_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_logical_or_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_logical_xor_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_logspace_cuda, 
test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_logspace_tensor_overload_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_lt_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_masked_fill_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_maximum_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_mean_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_minimum_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_movedim_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_mul_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_narrow_copy_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_narrow_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_native_layer_norm_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_ne_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_neg_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_nextafter_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_nn_functional_gelu_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_nn_functional_group_norm_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_nn_functional_hardtanh_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_nn_functional_hinge_embedding_loss_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_nn_functional_huber_loss_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_nn_functional_l1_loss_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_nn_functional_margin_ranking_loss_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_nn_functional_poisson_nll_loss_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_nn_functional_prelu_cuda, 
test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_nn_functional_softshrink_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_nn_functional_triplet_margin_loss_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_normal__in_place_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_pow_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_remainder_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_renorm_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_reshape_as_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_reshape_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_roll_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_rot90_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_rsub_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_special_xlog1py_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_special_zeta_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_sub_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_sum_to_size_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_t_copy_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_t_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_trace_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_tril_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_triu_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_true_divide_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_unbind_copy_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_unbind_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_vdot_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_view_as_cuda, 
test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_view_copy_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_view_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_vsplit_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_vstack_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_where_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_xlogy_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_T_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_T_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_T_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_T_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_T_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_T_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_T_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_T_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_T_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_T_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_T_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_T_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_T_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bfloat16_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bfloat16_executor_aten_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bfloat16_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bfloat16_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bfloat16_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bfloat16_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bfloat16_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bfloat16_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bfloat16_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bfloat16_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bfloat16_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bfloat16_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bfloat16_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bool_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bool_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bool_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bool_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bool_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bool_executor_aten_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bool_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bool_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bool_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bool_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bool_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bool_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bool_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_byte_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_byte_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_byte_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_byte_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_byte_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_byte_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_byte_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_byte_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_byte_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_byte_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_byte_executor_aten_cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_byte_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cdouble_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cdouble_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cdouble_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cdouble_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cdouble_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cdouble_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cdouble_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cdouble_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cdouble_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cdouble_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cdouble_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cdouble_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cdouble_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cfloat_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cfloat_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cfloat_executor_aten_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cfloat_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cfloat_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cfloat_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cfloat_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cfloat_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cfloat_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cfloat_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cfloat_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cfloat_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cfloat_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_chalf_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_chalf_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_chalf_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_chalf_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_chalf_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_chalf_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_chalf_executor_aten_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_chalf_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_chalf_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_chalf_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_chalf_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_chalf_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_chalf_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_char_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_char_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_char_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_char_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_char_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_char_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_char_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_char_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_char_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_char_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_char_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_char_executor_aten_cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_char_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_complex_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_complex_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_complex_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_double_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_double_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_double_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_double_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_double_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_double_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_double_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_double_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_double_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_double_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_double_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_double_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_double_executor_aten_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_float_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_float_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_float_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_float_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_float_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_float_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_float_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_float_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_float_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_float_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_float_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_float_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_float_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_half_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_half_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_half_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_half_executor_aten_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_half_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_half_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_half_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_half_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_half_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_half_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_half_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_half_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_int_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_int_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_int_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_int_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_int_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_int_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_int_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_int_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_int_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_int_executor_aten_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_int_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_int_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_long_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_long_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_long_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_long_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_long_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_long_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_long_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_long_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_long_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_long_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_long_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_long_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_long_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_polar_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_polar_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_short_executor_aten_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_short_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_short_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_short_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_short_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_short_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_short_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_short_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_short_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_short_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_short_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_short_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_abs_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_abs_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_abs_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_abs_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_abs_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_abs_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_abs_executor_aten_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_abs_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_abs_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_abs_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_abs_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_abs_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_abs_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acos_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acos_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acos_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acos_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acos_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acos_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acos_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acos_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acos_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acos_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acos_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acos_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acos_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acosh_executor_aten_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acosh_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acosh_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acosh_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acosh_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acosh_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acosh_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acosh_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acosh_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acosh_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acosh_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acosh_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acosh_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_add_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_add_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_add_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_add_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_add_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_add_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_add_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_add_executor_aten_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_add_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_add_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_add_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_add_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_add_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addcdiv_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addcdiv_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addcdiv_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addcdiv_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addcdiv_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addcdiv_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addcmul_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addcmul_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addcmul_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addcmul_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addcmul_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addcmul_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addcmul_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addcmul_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addcmul_executor_aten_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addcmul_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addcmul_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addr_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addr_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addr_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addr_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addr_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addr_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addr_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addr_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addr_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addr_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addr_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addr_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_alias_copy_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_alias_copy_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_alias_copy_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_alias_copy_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_alias_copy_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_alias_copy_executor_aten_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_alias_copy_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_alias_copy_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_alias_copy_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_alias_copy_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_alias_copy_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_alias_copy_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_alias_copy_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_all_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_all_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_all_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_all_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_all_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_all_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_all_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_all_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_all_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_all_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_all_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_all_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_allclose_executor_aten_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_allclose_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_allclose_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_allclose_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_allclose_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_allclose_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_amax_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_amax_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_amax_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_amax_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_amax_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_amax_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_amax_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_amax_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_amax_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_amax_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_amin_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_amin_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_amin_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_amin_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_amin_executor_aten_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_amin_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_amin_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_amin_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_amin_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_amin_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_any_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_any_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_any_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_any_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_any_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_any_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_any_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_any_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_any_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_any_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_any_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_any_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_arange_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_arange_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_arange_executor_aten_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_arange_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_arange_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_arange_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_arange_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_arange_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_arange_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_copy_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_copy_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_copy_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_copy_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_copy_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_copy_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_copy_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_copy_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_copy_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_copy_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_copy_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_copy_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_executor_aten_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_partial_views_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_partial_views_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_partial_views_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_partial_views_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_partial_views_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_partial_views_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_partial_views_executor_aten_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_partial_views_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_partial_views_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_partial_views_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_partial_views_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_partial_views_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_scatter_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_scatter_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_scatter_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_scatter_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_scatter_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_scatter_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_scatter_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_scatter_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_scatter_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_scatter_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_scatter_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_scatter_executor_aten_cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_scatter_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asin_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asin_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asin_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asin_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asin_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asin_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asin_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asin_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asin_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asin_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asin_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asin_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asin_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asinh_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asinh_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asinh_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asinh_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asinh_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asinh_executor_aten_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asinh_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asinh_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asinh_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asinh_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asinh_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asinh_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asinh_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atan2_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atan2_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atan2_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atan2_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atan2_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atan2_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atan2_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atan2_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atan2_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atan2_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atan_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atan_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atan_executor_aten_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atan_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atan_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atan_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atan_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atan_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atan_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atan_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atan_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atan_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atan_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atanh_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atanh_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atanh_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atanh_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atanh_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atanh_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atanh_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atanh_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atanh_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atanh_executor_aten_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atanh_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atanh_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atanh_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_1d_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_1d_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_1d_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_1d_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_1d_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_1d_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_1d_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_1d_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_1d_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_1d_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_1d_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_1d_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_1d_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_2d_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_2d_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_2d_executor_aten_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_2d_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_2d_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_2d_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_2d_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_2d_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_2d_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_2d_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_2d_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_2d_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_2d_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_3d_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_3d_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_3d_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_3d_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_3d_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_3d_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_3d_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_3d_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_3d_executor_aten_cuda_int16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_3d_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_3d_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_3d_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_3d_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_and_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_and_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_and_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_and_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_and_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_and_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_left_shift_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_left_shift_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_left_shift_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_left_shift_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_left_shift_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_not_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_not_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_not_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_not_executor_aten_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_not_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_not_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_or_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_or_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_or_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_or_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_or_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_or_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_right_shift_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_right_shift_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_right_shift_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_right_shift_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_right_shift_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_xor_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_xor_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_xor_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_xor_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_xor_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_xor_executor_aten_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_block_diag_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_block_diag_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_block_diag_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_block_diag_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_block_diag_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_block_diag_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_block_diag_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_block_diag_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_block_diag_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_block_diag_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_block_diag_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_block_diag_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_block_diag_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_broadcast_shapes_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_broadcast_tensors_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_broadcast_tensors_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_broadcast_tensors_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_broadcast_tensors_executor_aten_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_broadcast_tensors_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_broadcast_tensors_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_broadcast_tensors_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_broadcast_tensors_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_broadcast_tensors_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_broadcast_tensors_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_broadcast_tensors_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_broadcast_tensors_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_broadcast_to_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_broadcast_to_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_broadcast_to_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_broadcast_to_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_broadcast_to_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_broadcast_to_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_broadcast_to_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_broadcast_to_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_broadcast_to_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_broadcast_to_executor_aten_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_broadcast_to_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_broadcast_to_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bucketize_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bucketize_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bucketize_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bucketize_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bucketize_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bucketize_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bucketize_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bucketize_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bucketize_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cat_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cat_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cat_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cat_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cat_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cat_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cat_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cat_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cat_executor_aten_cuda_int16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cat_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cat_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cat_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cat_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cauchy_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cauchy_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cauchy_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cauchy_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ceil_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ceil_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ceil_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ceil_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ceil_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ceil_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ceil_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ceil_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ceil_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_chunk_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_chunk_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_chunk_executor_aten_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_chunk_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_chunk_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_chunk_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_chunk_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_chunk_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_chunk_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_chunk_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_chunk_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_chunk_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_chunk_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_max_executor_aten_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_max_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_max_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_max_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_max_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_max_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_max_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_max_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_max_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_max_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_min_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_min_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_min_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_min_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_min_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_min_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_min_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_min_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_min_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_min_executor_aten_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clone_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clone_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clone_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clone_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clone_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clone_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clone_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clone_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clone_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clone_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clone_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clone_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clone_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_column_stack_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_column_stack_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_column_stack_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_column_stack_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_column_stack_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_column_stack_executor_aten_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_column_stack_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_column_stack_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_column_stack_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_column_stack_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_column_stack_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_column_stack_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_column_stack_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_executor_aten_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_physical_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_physical_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_physical_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_physical_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_physical_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_physical_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_physical_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_physical_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_physical_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_physical_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_physical_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_physical_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_physical_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_constant_pad_nd_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_constant_pad_nd_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_constant_pad_nd_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_constant_pad_nd_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_constant_pad_nd_executor_aten_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_constant_pad_nd_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_constant_pad_nd_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_constant_pad_nd_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_constant_pad_nd_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_constant_pad_nd_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_constant_pad_nd_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_constant_pad_nd_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_contiguous_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_contiguous_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_contiguous_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_contiguous_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_contiguous_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_contiguous_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_contiguous_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_contiguous_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_contiguous_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_contiguous_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_contiguous_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_contiguous_executor_aten_cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_contiguous_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_copysign_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_copysign_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_copysign_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_copysign_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_copysign_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_copysign_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_copysign_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_copysign_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_copysign_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_copysign_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cos_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cos_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cos_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cos_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cos_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cos_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cos_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cos_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cos_executor_aten_cuda_int16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cos_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cos_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cos_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cos_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cosh_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cosh_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cosh_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cosh_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cosh_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cosh_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cosh_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cosh_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cosh_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cosh_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cosh_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cosh_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cosh_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_count_nonzero_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_count_nonzero_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_count_nonzero_executor_aten_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_count_nonzero_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_count_nonzero_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_count_nonzero_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_count_nonzero_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_count_nonzero_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_count_nonzero_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_count_nonzero_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_count_nonzero_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_count_nonzero_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumprod_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumprod_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumprod_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumprod_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumprod_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumprod_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumprod_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumprod_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumprod_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumprod_executor_aten_cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumprod_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumsum_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumsum_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumsum_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumsum_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumsum_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumsum_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumsum_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumsum_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumsum_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumsum_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumsum_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_deg2rad_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_deg2rad_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_deg2rad_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_deg2rad_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_deg2rad_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_deg2rad_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_deg2rad_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_deg2rad_executor_aten_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_deg2rad_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_deg2rad_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_embed_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_embed_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_embed_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_embed_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_embed_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_embed_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_embed_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_embed_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_embed_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_embed_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_embed_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_embed_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_embed_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_executor_aten_cuda_complex32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_copy_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_copy_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_copy_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_copy_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_copy_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_copy_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_copy_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_copy_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_copy_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_copy_executor_aten_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_copy_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_copy_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_copy_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_scatter_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_scatter_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_scatter_executor_aten_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_scatter_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_scatter_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_scatter_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_scatter_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_scatter_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_scatter_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_scatter_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_scatter_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_scatter_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_digamma_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_digamma_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_digamma_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_digamma_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_digamma_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_digamma_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_digamma_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_digamma_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_digamma_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_digamma_executor_aten_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_floor_rounding_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_floor_rounding_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_floor_rounding_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_floor_rounding_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_floor_rounding_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_floor_rounding_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_floor_rounding_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_floor_rounding_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_floor_rounding_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_no_rounding_mode_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_no_rounding_mode_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_no_rounding_mode_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_no_rounding_mode_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_no_rounding_mode_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_no_rounding_mode_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_no_rounding_mode_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_no_rounding_mode_executor_aten_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_no_rounding_mode_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_no_rounding_mode_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_no_rounding_mode_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_no_rounding_mode_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_no_rounding_mode_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_trunc_rounding_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_trunc_rounding_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_trunc_rounding_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_trunc_rounding_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_trunc_rounding_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_trunc_rounding_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_trunc_rounding_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_trunc_rounding_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_trunc_rounding_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dot_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dot_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dot_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dot_executor_aten_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dot_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dot_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dsplit_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dsplit_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dsplit_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dsplit_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dsplit_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dsplit_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dsplit_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dsplit_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dsplit_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dsplit_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dsplit_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dsplit_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dsplit_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dstack_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dstack_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dstack_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dstack_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dstack_executor_aten_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dstack_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dstack_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dstack_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dstack_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dstack_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dstack_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dstack_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dstack_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_executor_aten_cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_like_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_like_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_like_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_like_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_like_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_like_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_like_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_like_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_like_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_like_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_like_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_like_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_like_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_strided_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_strided_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_strided_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_strided_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_strided_executor_aten_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_strided_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_strided_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_strided_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_strided_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_strided_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_strided_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_strided_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eq_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eq_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eq_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eq_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eq_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eq_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eq_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eq_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eq_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eq_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eq_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eq_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eq_executor_aten_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_equal_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_equal_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_equal_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_equal_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_equal_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_equal_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_equal_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_equal_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_equal_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_equal_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_equal_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_equal_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erf_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erf_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erf_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erf_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erf_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erf_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erf_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erf_executor_aten_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erf_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erf_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erfc_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erfc_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erfc_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erfc_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erfc_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erfc_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erfc_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erfc_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erfc_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erfc_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erfinv_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erfinv_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erfinv_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erfinv_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erfinv_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erfinv_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erfinv_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erfinv_executor_aten_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erfinv_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erfinv_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp2_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp2_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp2_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp2_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp2_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp2_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp2_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp2_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp2_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp2_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp2_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp2_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp_executor_aten_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_as_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_as_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_as_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_as_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_as_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_as_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_as_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_as_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_as_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_as_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_as_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_as_executor_aten_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_copy_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_copy_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_copy_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_copy_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_copy_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_copy_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_copy_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_copy_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_copy_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_copy_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_copy_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_copy_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_executor_aten_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expm1_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expm1_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expm1_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expm1_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expm1_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expm1_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expm1_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expm1_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expm1_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expm1_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expm1_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expm1_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exponential_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exponential_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exponential_executor_aten_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exponential_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eye_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eye_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eye_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eye_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eye_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eye_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eye_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eye_executor_aten_cuda_float8_e4m3fn, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eye_executor_aten_cuda_float8_e4m3fnuz, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eye_executor_aten_cuda_float8_e5m2, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eye_executor_aten_cuda_float8_e5m2fnuz, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eye_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eye_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eye_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eye_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eye_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fft2_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fft2_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fft2_executor_aten_cuda_complex32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fft2_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fft2_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fft2_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fft2_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fft2_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fft2_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fft2_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fft2_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fft2_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fft_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fft_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fft_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fft_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fft_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fft_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fft_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fft_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fft_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fft_executor_aten_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fft_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fft_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftn_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftn_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftn_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftn_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftn_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftn_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftn_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftn_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftn_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftn_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftn_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftn_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftshift_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftshift_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftshift_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftshift_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftshift_executor_aten_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftshift_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftshift_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftshift_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftshift_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftshift_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftshift_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftshift_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftshift_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfft2_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfft2_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfft2_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfft2_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfft2_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfft2_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfft2_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfft2_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfft2_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfft2_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfft2_executor_aten_cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfft2_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfft_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfft_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfft_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfft_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfft_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfft_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfft_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfft_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfft_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfft_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfft_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfft_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfftn_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfftn_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfftn_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfftn_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfftn_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfftn_executor_aten_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfftn_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfftn_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfftn_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfftn_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfftn_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfftn_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifft2_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifft2_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifft2_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifft2_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifft2_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifft2_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifft2_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifft2_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifft2_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifft2_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifft2_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifft2_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifft_executor_aten_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifft_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifft_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifft_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifft_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifft_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifft_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifft_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifft_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifft_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifft_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifft_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftn_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftn_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftn_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftn_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftn_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftn_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftn_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftn_executor_aten_cuda_int16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftn_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftn_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftn_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftn_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftshift_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftshift_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftshift_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftshift_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftshift_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftshift_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftshift_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftshift_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftshift_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftshift_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftshift_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftshift_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftshift_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfft2_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfft2_executor_aten_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfft2_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfft2_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfft2_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfft2_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfft2_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfft2_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfft2_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfft_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfft_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfft_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfft_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfft_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfft_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfft_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfft_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfft_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfftn_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfftn_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfftn_executor_aten_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfftn_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfftn_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfftn_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfftn_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfftn_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfftn_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfft2_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfft2_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfft2_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfft2_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfft2_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfft2_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfft2_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfft2_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfft2_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfft2_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfft2_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfft2_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfft_executor_aten_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfft_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfft_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfft_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfft_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfft_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfft_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfft_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfft_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfft_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfft_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfft_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfftn_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfftn_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfftn_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfftn_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfftn_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfftn_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfftn_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfftn_executor_aten_cuda_int16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfftn_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfftn_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfftn_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfftn_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfft2_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfft2_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfft2_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfft2_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfft2_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfft2_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfft2_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfft2_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfft2_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfft_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfft_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfft_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfft_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfft_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfft_executor_aten_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfft_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfft_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfft_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfftn_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfftn_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfftn_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfftn_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfftn_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfftn_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfftn_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfftn_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfftn_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fill_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fill_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fill_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fill_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fill_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fill_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fill_executor_aten_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fill_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fill_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fill_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fill_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fill_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fill_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flatten_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flatten_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flatten_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flatten_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flatten_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flatten_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flatten_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flatten_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flatten_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flatten_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flatten_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flatten_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flatten_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flip_executor_aten_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flip_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flip_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flip_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flip_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flip_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flip_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flip_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flip_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flip_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flip_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flip_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fliplr_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fliplr_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fliplr_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fliplr_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fliplr_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fliplr_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fliplr_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fliplr_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fliplr_executor_aten_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fliplr_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fliplr_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fliplr_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flipud_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flipud_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flipud_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flipud_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flipud_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flipud_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flipud_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flipud_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flipud_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flipud_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flipud_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flipud_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_float_power_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_float_power_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_float_power_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_float_power_executor_aten_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_float_power_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_float_power_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_float_power_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_float_power_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_float_power_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_float_power_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_float_power_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_float_power_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_floor_divide_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_floor_divide_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_floor_divide_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_floor_divide_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_floor_divide_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_floor_divide_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_floor_divide_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_floor_divide_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_floor_divide_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_floor_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_floor_executor_aten_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_floor_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_floor_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_floor_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_floor_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_floor_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_floor_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_floor_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmax_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmax_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmax_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmax_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmax_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmax_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmax_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmax_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmax_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmax_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmin_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmin_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmin_executor_aten_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmin_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmin_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmin_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmin_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmin_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmin_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmin_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmod_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmod_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmod_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmod_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmod_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmod_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmod_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmod_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmod_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_frac_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_frac_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_frac_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_frac_executor_aten_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_frexp_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_frexp_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_frexp_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_frexp_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_gcd_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_gcd_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_gcd_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_gcd_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_gcd_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ge_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ge_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ge_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ge_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ge_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ge_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ge_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ge_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ge_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ge_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_geometric_executor_aten_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_geometric_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_geometric_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_geometric_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_geometric_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_geometric_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_geometric_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_geometric_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_geometric_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_gt_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_gt_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_gt_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_gt_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_gt_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_gt_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_gt_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_gt_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_gt_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_gt_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_heaviside_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_heaviside_executor_aten_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_heaviside_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_heaviside_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_heaviside_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_heaviside_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_heaviside_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_heaviside_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_heaviside_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_heaviside_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hsplit_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hsplit_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hsplit_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hsplit_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hsplit_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hsplit_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hsplit_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hsplit_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hsplit_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hsplit_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hsplit_executor_aten_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hsplit_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hsplit_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hstack_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hstack_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hstack_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hstack_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hstack_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hstack_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hstack_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hstack_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hstack_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hstack_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hstack_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hstack_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hstack_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hypot_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hypot_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hypot_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hypot_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_i0_executor_aten_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_i0_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_i0_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_i0_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_i0_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_i0_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_i0_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_i0_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_i0_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_i0_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_igamma_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_igamma_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_igammac_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_igammac_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_imag_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_imag_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_imag_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_add_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_add_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_add_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_add_executor_aten_cuda_complex32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_add_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_add_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_add_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_add_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_add_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_add_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_add_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_add_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_add_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_copy_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_copy_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_copy_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_copy_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_copy_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_copy_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_copy_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_copy_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_copy_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_copy_executor_aten_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_copy_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_copy_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_copy_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_fill_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_fill_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_fill_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_fill_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_fill_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_fill_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_fill_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_fill_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_fill_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_fill_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_fill_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_fill_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_fill_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_select_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_select_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_select_executor_aten_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_select_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_select_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_select_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_select_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_select_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_select_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_select_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_select_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_select_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_select_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isclose_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isclose_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isclose_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isclose_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isclose_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isclose_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isclose_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isclose_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isclose_executor_aten_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isclose_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isclose_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isclose_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isfinite_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isfinite_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isfinite_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isfinite_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isfinite_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isfinite_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isfinite_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isfinite_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isfinite_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isfinite_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isfinite_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isfinite_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isfinite_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isinf_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isinf_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isinf_executor_aten_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isinf_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isinf_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isinf_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isinf_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isinf_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isinf_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isinf_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isinf_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isinf_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isinf_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isnan_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isnan_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isnan_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isnan_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isnan_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isnan_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isnan_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isnan_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isnan_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isnan_executor_aten_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isnan_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isnan_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isneginf_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isneginf_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isneginf_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isneginf_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isneginf_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isneginf_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isneginf_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isneginf_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isneginf_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isneginf_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isposinf_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isposinf_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isposinf_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isposinf_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isposinf_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isposinf_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isposinf_executor_aten_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isposinf_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isposinf_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isposinf_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isreal_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isreal_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isreal_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isreal_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isreal_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isreal_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isreal_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isreal_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isreal_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isreal_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isreal_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isreal_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isreal_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_istft_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_istft_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_item_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_item_executor_aten_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_item_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_item_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_item_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_item_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_item_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_item_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_item_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_item_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_item_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_item_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_item_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lcm_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lcm_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lcm_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lcm_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lcm_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_le_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_le_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_le_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_le_executor_aten_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_le_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_le_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_le_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_le_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_le_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_le_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lerp_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lerp_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lerp_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lerp_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lerp_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lerp_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lerp_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lgamma_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lgamma_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lgamma_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lgamma_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lgamma_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lgamma_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lgamma_executor_aten_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lgamma_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lgamma_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lgamma_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_cross_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_cross_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_cross_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_cross_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_cross_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_cross_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_cross_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_cross_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_cross_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_cross_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_cross_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_diagonal_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_diagonal_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_diagonal_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_diagonal_executor_aten_cuda_complex32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_diagonal_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_diagonal_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_diagonal_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_diagonal_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_diagonal_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_diagonal_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_diagonal_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_diagonal_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_diagonal_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_matrix_norm_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_matrix_norm_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_matrix_norm_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_matrix_norm_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_matrix_norm_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_matrix_norm_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_norm_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_norm_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_norm_executor_aten_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_norm_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_norm_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_norm_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_svd_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_svd_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_svd_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_svd_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_svdvals_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_svdvals_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_svdvals_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_svdvals_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_vecdot_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_vecdot_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_vecdot_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_vecdot_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_vecdot_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_vecdot_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_vector_norm_executor_aten_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_vector_norm_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_vector_norm_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_vector_norm_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_vector_norm_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_vector_norm_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linspace_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linspace_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linspace_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linspace_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linspace_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linspace_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linspace_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linspace_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linspace_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linspace_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linspace_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linspace_tensor_overload_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linspace_tensor_overload_executor_aten_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linspace_tensor_overload_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linspace_tensor_overload_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linspace_tensor_overload_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linspace_tensor_overload_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linspace_tensor_overload_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linspace_tensor_overload_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linspace_tensor_overload_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linspace_tensor_overload_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linspace_tensor_overload_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log10_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log10_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log10_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log10_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log10_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log10_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log10_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log10_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log10_executor_aten_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log10_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log10_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log10_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log1p_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log1p_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log1p_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log1p_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log1p_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log1p_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log1p_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log1p_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log1p_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log1p_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log1p_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log1p_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log2_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log2_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log2_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log2_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log2_executor_aten_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log2_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log2_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log2_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log2_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log2_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log2_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log2_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_executor_aten_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_normal_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_normal_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_normal_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_normal_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_softmax_with_dtype_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_softmax_with_dtype_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_softmax_with_dtype_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_softmax_with_dtype_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_softmax_with_dtype_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_softmax_with_dtype_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_softmax_with_dtype_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_softmax_with_dtype_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_softmax_with_dtype_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_softmax_with_dtype_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_softmax_with_dtype_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_softmax_with_dtype_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_softmax_with_dtype_executor_aten_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logaddexp2_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logaddexp2_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logaddexp2_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logaddexp2_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logaddexp_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logaddexp_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logaddexp_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logaddexp_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_and_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_and_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_and_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_and_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_and_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_and_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_and_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_and_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_and_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_and_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_and_executor_aten_cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_and_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_not_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_not_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_not_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_not_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_not_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_not_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_not_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_not_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_not_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_not_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_not_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_not_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_or_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_or_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_or_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_or_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_or_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_or_executor_aten_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_or_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_or_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_or_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_or_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_or_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_or_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_xor_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_xor_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_xor_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_xor_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_xor_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_xor_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_xor_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_xor_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_xor_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_xor_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_xor_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_xor_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_executor_aten_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_tensor_overload_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_tensor_overload_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_tensor_overload_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_tensor_overload_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_tensor_overload_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_tensor_overload_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_tensor_overload_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_tensor_overload_executor_aten_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_tensor_overload_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_tensor_overload_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_tensor_overload_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logsumexp_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logsumexp_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logsumexp_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logsumexp_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logsumexp_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logsumexp_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logsumexp_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logsumexp_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logsumexp_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logsumexp_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logsumexp_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logsumexp_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lt_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lt_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lt_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lt_executor_aten_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lt_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lt_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lt_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lt_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lt_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lt_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_masked_fill_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_masked_fill_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_masked_fill_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_masked_fill_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_masked_fill_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_masked_fill_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_masked_fill_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_masked_fill_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_masked_fill_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_masked_fill_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_masked_fill_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_masked_fill_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_masked_fill_executor_aten_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_maximum_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_maximum_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_maximum_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_maximum_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_maximum_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_maximum_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_maximum_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_maximum_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_maximum_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_maximum_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_mean_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_mean_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_mean_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_mean_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_mean_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_mean_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_list_of_tensors_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_list_of_tensors_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_list_of_tensors_executor_aten_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_list_of_tensors_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_list_of_tensors_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_list_of_tensors_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_list_of_tensors_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_list_of_tensors_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_list_of_tensors_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_list_of_tensors_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_list_of_tensors_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_list_of_tensors_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_variadic_tensors_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_variadic_tensors_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_variadic_tensors_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_variadic_tensors_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_variadic_tensors_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_variadic_tensors_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_variadic_tensors_executor_aten_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_variadic_tensors_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_variadic_tensors_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_variadic_tensors_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_variadic_tensors_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_variadic_tensors_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_minimum_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_minimum_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_minimum_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_minimum_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_minimum_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_minimum_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_minimum_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_minimum_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_minimum_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_minimum_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_movedim_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_movedim_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_movedim_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_movedim_executor_aten_cuda_complex32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_movedim_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_movedim_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_movedim_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_movedim_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_movedim_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_movedim_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_movedim_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_movedim_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_movedim_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_mul_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_mul_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_mul_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_mul_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_mul_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_mul_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_mul_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_mul_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_mul_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_mul_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_mul_executor_aten_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_mul_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_mul_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nan_to_num_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nan_to_num_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nan_to_num_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nan_to_num_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nan_to_num_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nan_to_num_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nan_to_num_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nan_to_num_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nan_to_num_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nan_to_num_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_copy_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_copy_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_copy_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_copy_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_copy_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_copy_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_copy_executor_aten_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_copy_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_copy_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_copy_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_copy_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_copy_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_copy_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_executor_aten_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_native_layer_norm_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_native_layer_norm_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_native_layer_norm_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_native_layer_norm_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ne_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ne_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ne_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ne_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ne_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ne_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ne_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ne_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ne_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ne_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ne_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ne_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_neg_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_neg_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_neg_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_neg_executor_aten_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_neg_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_neg_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_neg_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_neg_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_neg_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_neg_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_neg_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_neg_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_executor_aten_cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_strided_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_strided_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_strided_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_strided_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_strided_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_strided_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_strided_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_strided_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_strided_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_strided_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_strided_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_strided_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_strided_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_full_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_full_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_full_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_full_executor_aten_cuda_complex32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_full_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_full_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_full_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_full_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_full_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_full_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_full_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_full_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_full_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_ones_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_ones_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_ones_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_ones_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_ones_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_ones_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_ones_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_ones_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_ones_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_ones_executor_aten_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_ones_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_ones_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_ones_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_zeros_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_zeros_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_zeros_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_zeros_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_zeros_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_zeros_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_zeros_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_zeros_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_zeros_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_zeros_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_zeros_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_zeros_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_zeros_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nextafter_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nextafter_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nextafter_executor_aten_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nextafter_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_alpha_dropout_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_alpha_dropout_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_alpha_dropout_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_alpha_dropout_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_celu_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_celu_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_celu_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_celu_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_channel_shuffle_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_channel_shuffle_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_channel_shuffle_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_channel_shuffle_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_channel_shuffle_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_channel_shuffle_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_channel_shuffle_executor_aten_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_channel_shuffle_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_channel_shuffle_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_channel_shuffle_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_channel_shuffle_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_channel_shuffle_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_dropout_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_dropout_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_dropout_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_dropout_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_elu_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_elu_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_elu_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_elu_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_gelu_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_gelu_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_gelu_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_gelu_executor_aten_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_glu_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_glu_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_glu_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_glu_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_group_norm_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_group_norm_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_group_norm_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_group_norm_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_hardshrink_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_hardshrink_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_hardshrink_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_hardshrink_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_hardtanh_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_hardtanh_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_hardtanh_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_hardtanh_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_hardtanh_executor_aten_cuda_int16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_hardtanh_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_hardtanh_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_hardtanh_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_hinge_embedding_loss_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_hinge_embedding_loss_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_hinge_embedding_loss_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_hinge_embedding_loss_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_huber_loss_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_huber_loss_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_huber_loss_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_huber_loss_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_l1_loss_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_l1_loss_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_l1_loss_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_l1_loss_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_l1_loss_executor_aten_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_l1_loss_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_layer_norm_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_layer_norm_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_layer_norm_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_layer_norm_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_leaky_relu_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_leaky_relu_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_leaky_relu_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_leaky_relu_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_log_softmax_with_dtype_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_log_softmax_with_dtype_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_log_softmax_with_dtype_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_log_softmax_with_dtype_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_log_softmax_with_dtype_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_log_softmax_with_dtype_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_log_softmax_with_dtype_executor_aten_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_log_softmax_with_dtype_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_log_softmax_with_dtype_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_log_softmax_with_dtype_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_log_softmax_with_dtype_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_log_softmax_with_dtype_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_log_softmax_with_dtype_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_margin_ranking_loss_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_margin_ranking_loss_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_margin_ranking_loss_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_margin_ranking_loss_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_margin_ranking_loss_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_margin_ranking_loss_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_margin_ranking_loss_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_margin_ranking_loss_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_margin_ranking_loss_executor_aten_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_mish_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_mish_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_mish_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_mish_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_mse_loss_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_mse_loss_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_mse_loss_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_mse_loss_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_nll_loss_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_nll_loss_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_nll_loss_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_nll_loss_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pairwise_distance_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pairwise_distance_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pairwise_distance_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pairwise_distance_executor_aten_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pairwise_distance_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pairwise_distance_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pairwise_distance_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pairwise_distance_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pairwise_distance_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pairwise_distance_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pairwise_distance_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pdist_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pdist_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pixel_shuffle_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pixel_shuffle_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pixel_shuffle_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pixel_shuffle_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pixel_shuffle_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pixel_shuffle_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pixel_shuffle_executor_aten_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pixel_shuffle_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pixel_shuffle_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pixel_shuffle_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pixel_shuffle_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pixel_shuffle_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pixel_unshuffle_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pixel_unshuffle_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pixel_unshuffle_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pixel_unshuffle_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pixel_unshuffle_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pixel_unshuffle_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pixel_unshuffle_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pixel_unshuffle_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pixel_unshuffle_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pixel_unshuffle_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pixel_unshuffle_executor_aten_cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pixel_unshuffle_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_poisson_nll_loss_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_poisson_nll_loss_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_poisson_nll_loss_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_poisson_nll_loss_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_poisson_nll_loss_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_poisson_nll_loss_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_poisson_nll_loss_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_poisson_nll_loss_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_poisson_nll_loss_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_prelu_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_prelu_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_prelu_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_prelu_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_relu6_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_relu6_executor_aten_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_relu6_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_relu6_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_relu6_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_relu6_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_relu6_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_relu6_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_relu6_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_relu_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_relu_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_relu_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_relu_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_relu_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_relu_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_relu_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_relu_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_relu_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_selu_executor_aten_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_selu_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_selu_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_selu_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_smooth_l1_loss_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_smooth_l1_loss_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_smooth_l1_loss_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_smooth_l1_loss_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmax_with_dtype_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmax_with_dtype_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmax_with_dtype_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmax_with_dtype_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmax_with_dtype_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmax_with_dtype_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmax_with_dtype_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmax_with_dtype_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmax_with_dtype_executor_aten_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmax_with_dtype_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmax_with_dtype_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmax_with_dtype_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmin_with_dtype_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmin_with_dtype_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmin_with_dtype_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmin_with_dtype_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmin_with_dtype_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmin_with_dtype_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmin_with_dtype_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmin_with_dtype_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmin_with_dtype_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmin_with_dtype_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmin_with_dtype_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softplus_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softplus_executor_aten_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softplus_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softplus_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softshrink_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softshrink_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softshrink_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softshrink_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_tanhshrink_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_tanhshrink_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_tanhshrink_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_tanhshrink_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_tanhshrink_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_tanhshrink_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_tanhshrink_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_tanhshrink_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_tanhshrink_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_tanhshrink_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_tanhshrink_executor_aten_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_threshold_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_threshold_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_threshold_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_threshold_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_threshold_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_threshold_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_threshold_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_threshold_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_threshold_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_triplet_margin_loss_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_triplet_margin_loss_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_triplet_margin_loss_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_triplet_margin_loss_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_triplet_margin_loss_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_triplet_margin_loss_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_triplet_margin_loss_executor_aten_cuda_int16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_triplet_margin_loss_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_triplet_margin_loss_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_triplet_margin_loss_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_triplet_margin_loss_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_norm_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_norm_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_norm_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_norm_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_norm_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_norm_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_normal__in_place_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_normal__in_place_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_normal__in_place_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_normal__in_place_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_normal__in_place_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_normal__in_place_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_normal_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_normal_executor_aten_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_normal_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_normal_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_normal_number_mean_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_normal_number_mean_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_normal_number_mean_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_normal_number_mean_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ones_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ones_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ones_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ones_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ones_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ones_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ones_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ones_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ones_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ones_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ones_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ones_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ones_executor_aten_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_copy_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_copy_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_copy_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_copy_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_copy_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_copy_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_copy_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_copy_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_copy_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_copy_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_copy_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_copy_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_copy_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_executor_aten_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_positive_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_positive_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_positive_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_positive_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_positive_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_positive_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_positive_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_positive_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_positive_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_positive_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_positive_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_positive_executor_aten_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_pow_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_pow_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_pow_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_pow_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_pow_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_pow_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_pow_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_pow_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_pow_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_pow_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_pow_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_pow_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_prod_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_prod_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_prod_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_prod_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_prod_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_prod_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_prod_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_prod_executor_aten_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_prod_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_prod_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_prod_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_prod_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_prod_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rad2deg_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rad2deg_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rad2deg_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rad2deg_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rad2deg_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rad2deg_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rad2deg_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rad2deg_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rad2deg_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rad2deg_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_randn_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_randn_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_randn_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_randn_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_randn_executor_aten_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_randn_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_randn_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ravel_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ravel_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ravel_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ravel_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ravel_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ravel_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ravel_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ravel_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ravel_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ravel_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ravel_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ravel_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ravel_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_real_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_real_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_real_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_real_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_real_executor_aten_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_real_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_real_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_real_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_real_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_real_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_real_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_real_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_real_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reciprocal_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reciprocal_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reciprocal_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reciprocal_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reciprocal_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reciprocal_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reciprocal_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reciprocal_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reciprocal_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reciprocal_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reciprocal_executor_aten_cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reciprocal_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_remainder_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_remainder_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_remainder_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_remainder_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_remainder_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_remainder_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_remainder_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_remainder_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_remainder_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_renorm_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_renorm_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_renorm_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_renorm_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_renorm_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_renorm_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_repeat_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_repeat_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_repeat_executor_aten_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_repeat_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_repeat_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_repeat_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_repeat_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_repeat_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_repeat_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_repeat_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_repeat_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_repeat_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_as_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_as_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_as_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_as_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_as_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_as_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_as_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_as_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_as_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_as_executor_aten_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_as_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_as_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_as_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_roll_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_roll_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_roll_executor_aten_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_roll_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_roll_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_roll_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_roll_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_roll_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_roll_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_roll_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_roll_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_roll_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_roll_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rot90_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rot90_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rot90_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rot90_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rot90_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rot90_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rot90_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rot90_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rot90_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rot90_executor_aten_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rot90_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rot90_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_round_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_round_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_round_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_round_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_round_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_round_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_round_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_round_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_round_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rsqrt_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rsqrt_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rsqrt_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rsqrt_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rsqrt_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rsqrt_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rsqrt_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rsqrt_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rsqrt_executor_aten_cuda_int16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rsqrt_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rsqrt_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rsqrt_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rsqrt_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rsub_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rsub_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rsub_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rsub_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rsub_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rsub_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rsub_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rsub_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rsub_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rsub_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rsub_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_select_scatter_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_select_scatter_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_select_scatter_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_select_scatter_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_select_scatter_executor_aten_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_select_scatter_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_select_scatter_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_select_scatter_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_select_scatter_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_select_scatter_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sgn_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sgn_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sgn_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sgn_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sgn_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sgn_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sgn_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sgn_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sgn_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sgn_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sgn_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sgn_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sgn_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sigmoid_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sigmoid_executor_aten_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sigmoid_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sigmoid_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sigmoid_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sigmoid_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sigmoid_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sigmoid_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sigmoid_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sigmoid_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sigmoid_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sigmoid_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sigmoid_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sign_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sign_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sign_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sign_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sign_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sign_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sign_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sign_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sign_executor_aten_cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sign_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_signbit_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_signbit_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_signbit_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_signbit_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_signbit_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_signbit_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_signbit_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_signbit_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_signbit_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_signbit_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sin_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sin_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sin_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sin_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sin_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sin_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sin_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sin_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sin_executor_aten_cuda_int16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sin_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sin_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sin_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sin_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinc_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinc_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinc_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinc_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinc_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinc_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinc_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinc_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinc_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinc_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinc_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinc_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinh_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinh_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinh_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinh_executor_aten_cuda_complex32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinh_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinh_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinh_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinh_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinh_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinh_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinh_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinh_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinh_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_softmax_with_dtype_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_softmax_with_dtype_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_softmax_with_dtype_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_softmax_with_dtype_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_softmax_with_dtype_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_softmax_with_dtype_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_softmax_with_dtype_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_softmax_with_dtype_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_softmax_with_dtype_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_softmax_with_dtype_executor_aten_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_softmax_with_dtype_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_softmax_with_dtype_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_bessel_j0_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_bessel_j0_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_bessel_j0_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_bessel_j0_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_bessel_j0_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_bessel_j0_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_bessel_j0_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_bessel_j0_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_bessel_j1_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_bessel_j1_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_bessel_j1_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_bessel_j1_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_bessel_j1_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_bessel_j1_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_bessel_j1_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_bessel_j1_executor_aten_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_entr_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_entr_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_entr_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_entr_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_entr_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_entr_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_entr_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_entr_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_entr_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_entr_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_erfcx_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_erfcx_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_erfcx_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_erfcx_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_erfcx_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_erfcx_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_erfcx_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_erfcx_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i0e_executor_aten_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i0e_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i0e_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i0e_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i0e_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i0e_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i0e_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i0e_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i0e_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i0e_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i1_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i1_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i1_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i1_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i1_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i1_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i1_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i1_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i1_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i1_executor_aten_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i1e_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i1e_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i1e_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i1e_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i1e_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i1e_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i1e_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i1e_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i1e_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i1e_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_log_ndtr_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_log_ndtr_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_log_ndtr_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_log_ndtr_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_log_ndtr_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_log_ndtr_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_log_ndtr_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_log_ndtr_executor_aten_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_log_softmax_with_dtype_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_log_softmax_with_dtype_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_log_softmax_with_dtype_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_log_softmax_with_dtype_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_log_softmax_with_dtype_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_log_softmax_with_dtype_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_log_softmax_with_dtype_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_log_softmax_with_dtype_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_log_softmax_with_dtype_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_log_softmax_with_dtype_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_log_softmax_with_dtype_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_log_softmax_with_dtype_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_log_softmax_with_dtype_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_logit_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_logit_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_logit_executor_aten_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_logit_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_logit_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_logit_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_logit_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_logit_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_logit_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_logit_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_1_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_1_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_1_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_1_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_1_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_1_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_1_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_1_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_1_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_3_executor_aten_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_3_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_3_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_3_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_3_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_3_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_3_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_3_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_3_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_5_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_5_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_5_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_5_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_5_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_5_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_5_executor_aten_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_5_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_5_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_ndtr_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_ndtr_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_ndtr_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_ndtr_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_ndtr_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_ndtr_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_ndtr_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_ndtr_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_ndtr_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_ndtr_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_ndtri_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_ndtri_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_ndtri_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_ndtri_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_ndtri_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_ndtri_executor_aten_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_ndtri_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_ndtri_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_softmax_with_dtype_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_softmax_with_dtype_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_softmax_with_dtype_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_softmax_with_dtype_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_softmax_with_dtype_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_softmax_with_dtype_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_softmax_with_dtype_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_softmax_with_dtype_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_softmax_with_dtype_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_softmax_with_dtype_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_softmax_with_dtype_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_softmax_with_dtype_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_spherical_bessel_j0_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_spherical_bessel_j0_executor_aten_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_spherical_bessel_j0_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_spherical_bessel_j0_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_spherical_bessel_j0_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_spherical_bessel_j0_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_spherical_bessel_j0_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_spherical_bessel_j0_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_xlog1py_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_xlog1py_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_xlog1py_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_xlog1py_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_xlog1py_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_xlog1py_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_xlog1py_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_xlog1py_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_xlog1py_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_xlog1py_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_zeta_executor_aten_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_zeta_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_zeta_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_zeta_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_zeta_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_zeta_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_zeta_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_zeta_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_split_with_sizes_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_split_with_sizes_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_split_with_sizes_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_split_with_sizes_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_split_with_sizes_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_split_with_sizes_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_split_with_sizes_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_split_with_sizes_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_split_with_sizes_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_split_with_sizes_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_split_with_sizes_executor_aten_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_split_with_sizes_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_split_with_sizes_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sqrt_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sqrt_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sqrt_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sqrt_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sqrt_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sqrt_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sqrt_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sqrt_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sqrt_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sqrt_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sqrt_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sqrt_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sqrt_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_square_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_square_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_square_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_square_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_square_executor_aten_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_square_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_square_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_square_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_square_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_square_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_square_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_square_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_copy_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_copy_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_copy_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_copy_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_copy_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_copy_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_copy_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_copy_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_copy_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_copy_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_copy_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_copy_executor_aten_cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_copy_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_multiple_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_multiple_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_multiple_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_multiple_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_multiple_executor_aten_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_multiple_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_multiple_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_multiple_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_multiple_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_multiple_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_multiple_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_multiple_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_multiple_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_stack_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_stack_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_stack_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_stack_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_stack_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_stack_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_stack_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_stack_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_stack_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_stack_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_stack_executor_aten_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_stack_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_stack_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_std_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_std_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_std_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_std_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_std_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_std_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_std_mean_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_std_mean_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_std_mean_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_std_mean_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_std_mean_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_std_mean_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_stft_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_stft_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_stft_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_stft_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sub_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sub_executor_aten_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sub_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sub_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sub_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sub_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sub_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sub_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sub_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sub_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sub_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sub_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_executor_aten_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_to_size_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_to_size_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_to_size_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_to_size_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_to_size_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_to_size_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_to_size_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_to_size_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_to_size_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_to_size_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_to_size_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_to_size_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_copy_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_copy_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_copy_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_copy_executor_aten_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_copy_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_copy_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_copy_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_copy_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_copy_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_copy_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_copy_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_copy_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_executor_aten_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_take_along_dim_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_take_along_dim_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_take_along_dim_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_take_along_dim_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_take_along_dim_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_take_along_dim_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_take_along_dim_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_take_along_dim_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_take_along_dim_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_take_along_dim_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_take_along_dim_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_take_along_dim_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tan_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tan_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tan_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tan_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tan_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tan_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tan_executor_aten_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tan_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tan_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tan_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tan_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tan_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tan_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tanh_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tanh_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tanh_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tanh_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tanh_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tanh_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tanh_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tanh_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tanh_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tanh_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tanh_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tanh_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tanh_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tensor_split_executor_aten_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tensor_split_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tensor_split_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tensor_split_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tensor_split_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tensor_split_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tensor_split_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tensor_split_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tensor_split_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tensor_split_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tensor_split_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tensor_split_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_to_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_to_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_to_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_to_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_to_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_to_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_to_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_to_executor_aten_cuda_int16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_to_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_to_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_to_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_to_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trace_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trace_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trace_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trace_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trace_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trace_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trace_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trace_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trace_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trace_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trace_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trace_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trace_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_copy_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_copy_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_copy_executor_aten_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_copy_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_copy_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_copy_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_copy_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_copy_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_copy_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_copy_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_copy_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_copy_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_copy_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_executor_aten_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tril_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tril_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tril_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tril_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tril_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tril_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tril_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tril_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tril_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tril_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tril_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tril_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tril_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tril_indices_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tril_indices_executor_aten_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_triu_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_triu_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_triu_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_triu_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_triu_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_triu_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_triu_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_triu_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_triu_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_triu_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_triu_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_triu_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_triu_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_triu_indices_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_triu_indices_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_true_divide_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_true_divide_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_true_divide_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_true_divide_executor_aten_cuda_complex32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_true_divide_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_true_divide_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_true_divide_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_true_divide_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_true_divide_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_true_divide_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_true_divide_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_true_divide_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_true_divide_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trunc_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trunc_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trunc_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trunc_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trunc_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trunc_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trunc_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trunc_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trunc_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_copy_executor_aten_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_copy_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_copy_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_copy_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_copy_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_copy_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_copy_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_copy_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_copy_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_copy_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_copy_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_copy_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_copy_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_executor_aten_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unflatten_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unflatten_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unflatten_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unflatten_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unflatten_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unflatten_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unflatten_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unflatten_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unflatten_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unflatten_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unflatten_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unflatten_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unflatten_executor_aten_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_copy_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_copy_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_copy_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_copy_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_copy_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_copy_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_copy_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_copy_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_copy_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_copy_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_copy_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_copy_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_copy_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_executor_aten_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_copy_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_copy_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_copy_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_copy_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_copy_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_copy_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_copy_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_copy_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_copy_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_copy_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_copy_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_copy_executor_aten_cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_copy_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_var_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_var_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_var_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_var_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_var_executor_aten_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_var_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_var_mean_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_var_mean_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_var_mean_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_var_mean_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_var_mean_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_var_mean_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vdot_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vdot_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vdot_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vdot_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vdot_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vdot_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_as_complex_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_as_complex_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_as_complex_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_as_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_as_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_as_executor_aten_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_as_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_as_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_as_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_as_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_as_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_as_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_as_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_as_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_as_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_as_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_copy_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_copy_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_copy_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_copy_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_copy_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_copy_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_copy_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_copy_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_copy_executor_aten_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_copy_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_copy_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_copy_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vsplit_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vsplit_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vsplit_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vsplit_executor_aten_cuda_complex32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vsplit_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vsplit_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vsplit_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vsplit_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vsplit_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vsplit_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vsplit_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vsplit_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vsplit_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vstack_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vstack_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vstack_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vstack_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vstack_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vstack_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vstack_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vstack_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vstack_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vstack_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vstack_executor_aten_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vstack_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vstack_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_where_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_where_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_where_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_where_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_where_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_where_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_where_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_where_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_where_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_where_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_where_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_where_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_where_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_xlogy_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_xlogy_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_xlogy_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_xlogy_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_xlogy_executor_aten_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_xlogy_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_xlogy_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_xlogy_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_xlogy_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_xlogy_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_zeros_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_zeros_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_zeros_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_zeros_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_zeros_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_zeros_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_zeros_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_zeros_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_zeros_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_zeros_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_zeros_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_zeros_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_zeros_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_T_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_T_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_T_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_T_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_T_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_T_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_T_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_T_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_T_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_T_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_T_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_T_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_T_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bfloat16_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bfloat16_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bfloat16_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bfloat16_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bfloat16_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bfloat16_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bfloat16_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bfloat16_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bfloat16_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bfloat16_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bfloat16_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bfloat16_cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bfloat16_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bool_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bool_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bool_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bool_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bool_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bool_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bool_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bool_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bool_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bool_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bool_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bool_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bool_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_byte_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_byte_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_byte_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_byte_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_byte_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_byte_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_byte_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_byte_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_byte_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_byte_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_byte_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_byte_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cdouble_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cdouble_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cdouble_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cdouble_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cdouble_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cdouble_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cdouble_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cdouble_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cdouble_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cdouble_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cdouble_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cdouble_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cdouble_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cfloat_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cfloat_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cfloat_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cfloat_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cfloat_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cfloat_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cfloat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cfloat_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cfloat_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cfloat_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cfloat_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cfloat_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cfloat_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_chalf_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_chalf_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_chalf_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_chalf_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_chalf_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_chalf_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_chalf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_chalf_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_chalf_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_chalf_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_chalf_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_chalf_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_chalf_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_char_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_char_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_char_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_char_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_char_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_char_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_char_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_char_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_char_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_char_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_char_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_char_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_char_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_complex_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_complex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_complex_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_double_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_double_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_double_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_double_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_double_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_double_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_double_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_double_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_double_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_double_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_double_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_double_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_double_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_float_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_float_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_float_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_float_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_float_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_float_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_float_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_float_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_float_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_float_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_float_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_float_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_float_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_half_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_half_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_half_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_half_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_half_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_half_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_half_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_half_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_half_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_half_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_half_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_half_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_int_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_int_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_int_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_int_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_int_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_int_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_int_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_int_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_int_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_int_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_int_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_int_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_long_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_long_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_long_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_long_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_long_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_long_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_long_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_long_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_long_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_long_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_long_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_long_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_long_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_polar_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_polar_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_short_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_short_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_short_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_short_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_short_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_short_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_short_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_short_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_short_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_short_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_short_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_short_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_abs_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_abs_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_abs_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_abs_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_abs_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_abs_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_abs_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_abs_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_abs_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_abs_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_abs_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_abs_cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_abs_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acos_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acos_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acos_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acos_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acos_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acos_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acos_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acos_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acos_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acos_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acos_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acos_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acos_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acosh_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acosh_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acosh_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acosh_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acosh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acosh_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acosh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acosh_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acosh_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acosh_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acosh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acosh_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acosh_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_add_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_add_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_add_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_add_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_add_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_add_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_add_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_add_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_add_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_add_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_add_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_add_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_add_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addcdiv_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addcdiv_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addcdiv_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addcdiv_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addcdiv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addcdiv_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addcmul_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addcmul_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addcmul_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addcmul_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addcmul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addcmul_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addcmul_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addcmul_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addcmul_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addcmul_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addcmul_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addr_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addr_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addr_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addr_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addr_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addr_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addr_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addr_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addr_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addr_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addr_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_alias_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_alias_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_alias_copy_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_alias_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_alias_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_alias_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_alias_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_alias_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_alias_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_alias_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_alias_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_alias_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_alias_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_all_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_all_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_all_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_all_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_all_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_all_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_all_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_all_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_all_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_all_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_all_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_all_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_allclose_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_allclose_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_allclose_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_allclose_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_allclose_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_allclose_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_amax_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_amax_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_amax_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_amax_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_amax_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_amax_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_amax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_amax_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_amax_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_amin_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_amin_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_amin_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_amin_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_amin_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_amin_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_amin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_amin_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_amin_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_any_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_any_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_any_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_any_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_any_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_any_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_any_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_any_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_any_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_any_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_any_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_any_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_arange_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_arange_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_arange_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_arange_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_arange_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_arange_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_arange_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_arange_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_arange_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_copy_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_partial_views_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_partial_views_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_partial_views_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_partial_views_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_partial_views_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_partial_views_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_partial_views_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_partial_views_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_partial_views_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_partial_views_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_partial_views_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_partial_views_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_scatter_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_scatter_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_scatter_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_scatter_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_scatter_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_scatter_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_scatter_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_scatter_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_scatter_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_scatter_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_scatter_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_scatter_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asin_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asin_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asin_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asin_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asin_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asin_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asin_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asin_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asin_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asin_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asin_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asinh_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asinh_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asinh_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asinh_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asinh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asinh_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asinh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asinh_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asinh_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asinh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asinh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asinh_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asinh_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan2_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan2_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atanh_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atanh_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atanh_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atanh_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atanh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atanh_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atanh_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atanh_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atanh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atanh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atanh_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atanh_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_1d_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_1d_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_1d_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_1d_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_1d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_1d_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_1d_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_1d_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_1d_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_1d_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_1d_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_1d_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_2d_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_2d_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_2d_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_2d_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_2d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_2d_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_2d_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_2d_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_2d_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_2d_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_2d_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_2d_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_3d_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_3d_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_3d_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_3d_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_3d_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_3d_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_3d_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_3d_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_3d_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_3d_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_3d_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_3d_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_and_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_and_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_and_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_and_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_and_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_and_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_left_shift_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_left_shift_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_left_shift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_left_shift_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_left_shift_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_not_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_not_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_not_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_not_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_not_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_not_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_or_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_or_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_or_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_or_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_or_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_or_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_right_shift_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_right_shift_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_right_shift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_right_shift_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_right_shift_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_xor_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_xor_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_xor_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_xor_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_xor_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_xor_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_block_diag_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_block_diag_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_block_diag_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_block_diag_cuda_complex32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_block_diag_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_block_diag_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_block_diag_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_block_diag_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_block_diag_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_block_diag_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_block_diag_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_block_diag_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_block_diag_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_shapes_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_tensors_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_tensors_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_tensors_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_tensors_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_tensors_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_tensors_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_tensors_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_tensors_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_tensors_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_tensors_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_tensors_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_to_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_to_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_to_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_to_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_to_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_to_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_to_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_to_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_to_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_to_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_to_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_to_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bucketize_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bucketize_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bucketize_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bucketize_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bucketize_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bucketize_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bucketize_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bucketize_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bucketize_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cat_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cat_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cat_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cat_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cat_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cat_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cat_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cat_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cat_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cat_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cat_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cat_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cauchy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cauchy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cauchy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cauchy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ceil_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ceil_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ceil_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ceil_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ceil_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ceil_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ceil_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ceil_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ceil_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_chunk_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_chunk_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_chunk_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_chunk_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_chunk_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_chunk_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_chunk_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_chunk_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_chunk_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_chunk_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_chunk_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_chunk_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_chunk_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_max_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_max_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_max_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_max_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_max_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_max_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_max_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_max_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_max_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_max_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_min_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_min_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_min_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_min_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_min_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_min_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_min_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_min_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_min_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_min_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clone_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clone_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clone_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clone_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clone_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clone_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clone_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clone_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clone_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clone_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clone_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clone_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clone_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_column_stack_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_column_stack_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_column_stack_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_column_stack_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_column_stack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_column_stack_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_column_stack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_column_stack_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_column_stack_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_column_stack_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_column_stack_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_column_stack_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_column_stack_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_conj_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_conj_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_conj_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_conj_cuda_complex32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_conj_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_conj_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_conj_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_conj_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_conj_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_conj_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_conj_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_conj_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_conj_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_conj_physical_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_conj_physical_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_conj_physical_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_conj_physical_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_conj_physical_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_conj_physical_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_conj_physical_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_conj_physical_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_conj_physical_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_conj_physical_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_conj_physical_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_conj_physical_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_conj_physical_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_constant_pad_nd_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_constant_pad_nd_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_constant_pad_nd_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_constant_pad_nd_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_constant_pad_nd_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_constant_pad_nd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_constant_pad_nd_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_constant_pad_nd_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_constant_pad_nd_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_constant_pad_nd_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_constant_pad_nd_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_constant_pad_nd_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_contiguous_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_contiguous_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_contiguous_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_contiguous_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_contiguous_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_contiguous_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_contiguous_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_contiguous_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_contiguous_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_contiguous_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_contiguous_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_contiguous_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_contiguous_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_copysign_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_copysign_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_copysign_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_copysign_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_copysign_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_copysign_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_copysign_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_copysign_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_copysign_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_copysign_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cos_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cos_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cos_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cos_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cos_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cos_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cos_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cos_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cos_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cos_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cos_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cos_cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cos_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cosh_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cosh_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cosh_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cosh_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cosh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cosh_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cosh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cosh_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cosh_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cosh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cosh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cosh_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cosh_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_count_nonzero_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_count_nonzero_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_count_nonzero_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_count_nonzero_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_count_nonzero_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_count_nonzero_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_count_nonzero_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_count_nonzero_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_count_nonzero_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_count_nonzero_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_count_nonzero_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_count_nonzero_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cumprod_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cumprod_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cumprod_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cumprod_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cumprod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cumprod_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cumprod_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cumprod_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cumprod_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cumprod_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cumprod_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cumsum_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cumsum_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cumsum_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cumsum_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cumsum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cumsum_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cumsum_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cumsum_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cumsum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cumsum_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cumsum_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_deg2rad_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_deg2rad_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_deg2rad_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_deg2rad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_deg2rad_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_deg2rad_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_deg2rad_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_deg2rad_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_deg2rad_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_deg2rad_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_embed_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_embed_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_embed_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_embed_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_embed_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_embed_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_embed_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_embed_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_embed_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_embed_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_embed_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_embed_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_embed_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_copy_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_scatter_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_scatter_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_scatter_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_scatter_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_scatter_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_scatter_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_scatter_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_scatter_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_scatter_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_scatter_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_scatter_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_digamma_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_digamma_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_digamma_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_digamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_digamma_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_digamma_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_digamma_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_digamma_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_digamma_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_digamma_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_floor_rounding_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_floor_rounding_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_floor_rounding_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_floor_rounding_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_floor_rounding_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_floor_rounding_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_floor_rounding_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_floor_rounding_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_floor_rounding_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_no_rounding_mode_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_no_rounding_mode_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_no_rounding_mode_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_no_rounding_mode_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_no_rounding_mode_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_no_rounding_mode_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_no_rounding_mode_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_no_rounding_mode_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_no_rounding_mode_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_no_rounding_mode_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_no_rounding_mode_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_no_rounding_mode_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_no_rounding_mode_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_trunc_rounding_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_trunc_rounding_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_trunc_rounding_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_trunc_rounding_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_trunc_rounding_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_trunc_rounding_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_trunc_rounding_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_trunc_rounding_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_trunc_rounding_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dot_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dot_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dot_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dot_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dot_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dsplit_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dsplit_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dsplit_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dsplit_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dsplit_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dsplit_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dsplit_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dsplit_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dsplit_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dsplit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dsplit_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dsplit_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dstack_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dstack_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dstack_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dstack_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dstack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dstack_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dstack_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dstack_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dstack_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dstack_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dstack_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dstack_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_like_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_like_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_like_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_like_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_like_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_like_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_like_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_like_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_like_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_like_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_like_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_like_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_strided_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_strided_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_strided_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_strided_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_strided_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_strided_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_strided_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_strided_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_strided_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_strided_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_strided_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eq_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eq_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eq_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eq_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eq_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eq_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eq_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eq_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eq_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eq_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eq_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eq_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eq_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_equal_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_equal_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_equal_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_equal_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_equal_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_equal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_equal_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_equal_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_equal_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_equal_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_equal_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_equal_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erf_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erf_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erf_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erf_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erf_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erf_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erf_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erf_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erf_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erfc_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erfc_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erfc_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erfc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erfc_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erfc_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erfc_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erfc_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erfc_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erfc_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erfinv_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erfinv_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erfinv_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erfinv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erfinv_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erfinv_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erfinv_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erfinv_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erfinv_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erfinv_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp2_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp2_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp2_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_as_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_as_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_as_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_as_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_as_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_as_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_as_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_as_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_as_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_as_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_as_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_copy_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expm1_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expm1_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expm1_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expm1_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expm1_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expm1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expm1_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expm1_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expm1_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expm1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expm1_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expm1_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exponential_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exponential_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exponential_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exponential_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eye_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eye_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eye_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eye_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eye_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eye_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eye_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eye_cuda_float8_e4m3fn, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eye_cuda_float8_e4m3fnuz, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eye_cuda_float8_e5m2, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eye_cuda_float8_e5m2fnuz, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eye_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eye_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eye_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eye_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eye_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fft2_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fft2_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fft2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fft2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fft2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fft2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fft2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fft2_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fft2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fft_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fft_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fft_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fft_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fft_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fft_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fft_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fft_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fft_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fft_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftn_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftn_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftn_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftn_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftn_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftn_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftn_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftshift_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftshift_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftshift_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftshift_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftshift_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftshift_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftshift_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftshift_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftshift_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftshift_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftshift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftshift_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftshift_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfft2_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfft2_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfft2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfft2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfft2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfft2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfft2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfft2_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfft2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfft_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfft_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfft_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfft_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfft_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfft_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfft_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfft_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfft_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfft_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfft_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfftn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfftn_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfftn_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfftn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfftn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfftn_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfftn_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfftn_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfftn_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfftn_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft2_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft2_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft2_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft2_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftn_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftn_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftn_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftn_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftn_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftn_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftn_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftn_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftshift_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftshift_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftshift_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftshift_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftshift_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftshift_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftshift_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftshift_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftshift_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftshift_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftshift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftshift_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftshift_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfft2_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfft2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfft2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfft2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfft2_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfft2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfft_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfft_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfft_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfft_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfft_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfft_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfft_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfft_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfftn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfftn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfftn_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfftn_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfftn_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfftn_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfftn_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfftn_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft2_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft2_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft2_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfftn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfftn_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfftn_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfftn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfftn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfftn_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfftn_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfftn_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfftn_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfftn_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfft2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfft2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfft2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfft2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfft2_cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfft2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfft_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfft_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfft_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfft_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfft_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfft_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfft_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfft_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfftn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfftn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfftn_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfftn_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfftn_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfftn_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfftn_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fill_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fill_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fill_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fill_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fill_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fill_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fill_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fill_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fill_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fill_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fill_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fill_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flatten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flatten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flatten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flatten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flatten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flatten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flatten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flatten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flatten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flatten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flatten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flatten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flatten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flip_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flip_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flip_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flip_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flip_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flip_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flip_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flip_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flip_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flip_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flip_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flip_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fliplr_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fliplr_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fliplr_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fliplr_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fliplr_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fliplr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fliplr_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fliplr_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fliplr_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fliplr_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fliplr_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fliplr_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flipud_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flipud_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flipud_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flipud_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flipud_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flipud_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flipud_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flipud_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flipud_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flipud_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flipud_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flipud_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_float_power_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_float_power_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_float_power_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_float_power_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_float_power_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_float_power_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_float_power_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_float_power_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_float_power_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_float_power_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_float_power_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_float_power_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_floor_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_floor_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_floor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_floor_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_floor_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_floor_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_floor_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_floor_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_floor_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_floor_divide_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_floor_divide_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_floor_divide_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_floor_divide_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_floor_divide_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_floor_divide_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_floor_divide_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_floor_divide_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_floor_divide_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmax_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmax_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmax_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmax_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmax_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmax_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmax_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmax_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmax_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmin_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmin_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmin_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmin_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmin_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmin_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmin_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmin_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmod_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmod_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmod_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmod_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmod_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmod_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmod_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmod_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_frac_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_frac_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_frac_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_frac_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_frexp_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_frexp_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_frexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_frexp_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_gcd_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_gcd_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_gcd_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_gcd_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_gcd_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ge_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ge_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ge_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ge_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ge_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ge_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ge_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ge_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ge_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ge_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_geometric_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_geometric_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_geometric_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_geometric_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_geometric_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_geometric_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_geometric_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_geometric_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_geometric_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_gt_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_gt_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_gt_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_gt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_gt_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_gt_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_gt_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_gt_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_gt_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_gt_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_heaviside_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_heaviside_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_heaviside_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_heaviside_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_heaviside_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_heaviside_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_heaviside_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_heaviside_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_heaviside_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_heaviside_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hsplit_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hsplit_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hsplit_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hsplit_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hsplit_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hsplit_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hsplit_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hsplit_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hsplit_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hsplit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hsplit_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hsplit_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hstack_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hstack_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hstack_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hstack_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hstack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hstack_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hstack_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hstack_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hstack_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hstack_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hstack_cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hstack_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hypot_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hypot_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hypot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hypot_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_i0_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_i0_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_i0_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_i0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_i0_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_i0_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_i0_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_i0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_i0_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_i0_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_igamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_igamma_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_igammac_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_igammac_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_imag_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_imag_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_imag_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_add_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_add_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_add_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_add_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_add_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_add_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_add_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_add_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_add_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_add_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_add_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_add_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_add_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_copy_cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_fill_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_fill_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_fill_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_fill_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_fill_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_fill_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_fill_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_fill_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_fill_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_fill_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_fill_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_fill_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_select_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_select_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_select_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_select_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_select_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_select_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_select_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_select_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_select_cuda_int16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_select_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_select_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_select_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_select_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isclose_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isclose_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isclose_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isclose_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isclose_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isclose_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isclose_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isclose_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isclose_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isclose_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isclose_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isclose_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isfinite_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isfinite_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isfinite_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isfinite_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isfinite_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isfinite_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isfinite_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isfinite_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isfinite_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isfinite_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isfinite_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isfinite_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isfinite_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isinf_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isinf_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isinf_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isinf_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isinf_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isinf_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isinf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isinf_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isinf_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isinf_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isinf_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isinf_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isinf_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isnan_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isnan_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isnan_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isnan_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isnan_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isnan_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isnan_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isnan_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isnan_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isnan_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isnan_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isnan_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isneginf_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isneginf_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isneginf_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isneginf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isneginf_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isneginf_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isneginf_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isneginf_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isneginf_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isneginf_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isposinf_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isposinf_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isposinf_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isposinf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isposinf_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isposinf_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isposinf_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isposinf_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isposinf_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isposinf_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isreal_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isreal_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isreal_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isreal_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isreal_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isreal_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isreal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isreal_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isreal_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isreal_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isreal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isreal_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isreal_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_istft_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_istft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_item_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_item_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_item_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_item_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_item_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_item_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_item_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_item_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_item_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_item_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_item_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_item_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_item_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lcm_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lcm_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lcm_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lcm_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lcm_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_le_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_le_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_le_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_le_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_le_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_le_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_le_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_le_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_le_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_le_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lerp_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lerp_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lerp_cuda_complex32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lerp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lerp_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lerp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lerp_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lgamma_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lgamma_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lgamma_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lgamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lgamma_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lgamma_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lgamma_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lgamma_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lgamma_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lgamma_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_cross_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_cross_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_cross_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_cross_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_cross_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_cross_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_cross_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_cross_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_cross_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_cross_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_cross_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_diagonal_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_diagonal_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_diagonal_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_diagonal_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_diagonal_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_diagonal_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_diagonal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_diagonal_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_diagonal_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_diagonal_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_diagonal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_diagonal_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_diagonal_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_matrix_norm_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_matrix_norm_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_matrix_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_matrix_norm_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_matrix_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_matrix_norm_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_norm_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_norm_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_norm_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_norm_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_svd_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_svd_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_svd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_svd_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_svdvals_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_svdvals_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_svdvals_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_svdvals_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_vecdot_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_vecdot_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_vecdot_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_vecdot_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_vecdot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_vecdot_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_vector_norm_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_vector_norm_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_vector_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_vector_norm_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_vector_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_vector_norm_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_tensor_overload_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_tensor_overload_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_tensor_overload_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_tensor_overload_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_tensor_overload_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_tensor_overload_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_tensor_overload_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_tensor_overload_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_tensor_overload_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_tensor_overload_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_tensor_overload_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log10_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log10_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log10_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log10_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log10_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log10_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log10_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log10_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log10_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log10_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log10_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log10_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log1p_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log1p_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log1p_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log1p_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log1p_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log1p_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log1p_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log1p_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log1p_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log1p_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log1p_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log1p_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log2_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log2_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log2_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_normal_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_normal_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_normal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_normal_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_softmax_with_dtype_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_softmax_with_dtype_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_softmax_with_dtype_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_softmax_with_dtype_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_softmax_with_dtype_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_softmax_with_dtype_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_softmax_with_dtype_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_softmax_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_softmax_with_dtype_cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_softmax_with_dtype_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logaddexp2_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logaddexp2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logaddexp2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logaddexp2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logaddexp_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logaddexp_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logaddexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logaddexp_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_and_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_and_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_and_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_and_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_and_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_and_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_and_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_and_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_and_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_and_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_and_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_and_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_not_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_not_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_not_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_not_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_not_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_not_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_not_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_not_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_not_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_not_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_not_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_not_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_or_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_or_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_or_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_or_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_or_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_or_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_or_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_or_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_or_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_or_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_or_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_or_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_xor_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_xor_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_xor_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_xor_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_xor_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_xor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_xor_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_xor_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_xor_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_xor_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_xor_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_xor_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logspace_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logspace_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logspace_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logspace_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logspace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logspace_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logspace_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logspace_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logspace_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logspace_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logspace_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logspace_tensor_overload_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logspace_tensor_overload_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logspace_tensor_overload_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logspace_tensor_overload_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logspace_tensor_overload_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logspace_tensor_overload_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logspace_tensor_overload_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logspace_tensor_overload_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logspace_tensor_overload_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logspace_tensor_overload_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logspace_tensor_overload_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logsumexp_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logsumexp_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logsumexp_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logsumexp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logsumexp_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logsumexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logsumexp_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logsumexp_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logsumexp_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logsumexp_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logsumexp_cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logsumexp_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lt_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lt_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lt_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lt_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lt_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lt_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lt_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lt_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lt_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_masked_fill_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_masked_fill_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_masked_fill_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_masked_fill_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_masked_fill_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_masked_fill_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_masked_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_masked_fill_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_masked_fill_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_masked_fill_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_masked_fill_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_masked_fill_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_masked_fill_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_maximum_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_maximum_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_maximum_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_maximum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_maximum_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_maximum_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_maximum_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_maximum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_maximum_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_maximum_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_mean_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_mean_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_mean_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_mean_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_mean_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_list_of_tensors_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_list_of_tensors_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_list_of_tensors_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_list_of_tensors_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_list_of_tensors_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_list_of_tensors_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_list_of_tensors_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_list_of_tensors_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_list_of_tensors_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_list_of_tensors_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_list_of_tensors_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_list_of_tensors_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_variadic_tensors_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_variadic_tensors_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_variadic_tensors_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_variadic_tensors_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_variadic_tensors_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_variadic_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_variadic_tensors_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_variadic_tensors_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_variadic_tensors_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_variadic_tensors_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_variadic_tensors_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_variadic_tensors_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_minimum_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_minimum_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_minimum_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_minimum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_minimum_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_minimum_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_minimum_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_minimum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_minimum_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_minimum_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_movedim_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_movedim_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_movedim_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_movedim_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_movedim_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_movedim_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_movedim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_movedim_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_movedim_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_movedim_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_movedim_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_movedim_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_movedim_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_mul_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_mul_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_mul_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_mul_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_mul_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_mul_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_mul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_mul_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_mul_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_mul_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_mul_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_mul_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_mul_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nan_to_num_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nan_to_num_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nan_to_num_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nan_to_num_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nan_to_num_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nan_to_num_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nan_to_num_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nan_to_num_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nan_to_num_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nan_to_num_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_copy_cuda_complex32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_native_layer_norm_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_native_layer_norm_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_native_layer_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_native_layer_norm_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ne_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ne_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ne_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ne_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ne_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ne_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ne_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ne_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ne_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ne_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ne_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ne_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_neg_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_neg_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_neg_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_neg_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_neg_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_neg_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_neg_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_neg_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_neg_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_neg_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_neg_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_neg_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_strided_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_strided_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_strided_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_strided_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_strided_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_strided_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_strided_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_strided_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_strided_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_strided_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_strided_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_strided_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_strided_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_full_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_full_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_full_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_full_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_full_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_full_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_full_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_full_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_full_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_full_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_full_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_full_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_full_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_ones_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_ones_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_ones_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_ones_cuda_complex32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_ones_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_ones_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_ones_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_ones_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_ones_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_ones_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_ones_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_ones_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_ones_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_zeros_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_zeros_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_zeros_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_zeros_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_zeros_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_zeros_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_zeros_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_zeros_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_zeros_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_zeros_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_zeros_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_zeros_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_zeros_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nextafter_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nextafter_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nextafter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nextafter_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_alpha_dropout_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_alpha_dropout_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_alpha_dropout_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_alpha_dropout_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_celu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_celu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_celu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_celu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_channel_shuffle_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_channel_shuffle_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_channel_shuffle_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_channel_shuffle_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_channel_shuffle_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_channel_shuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_channel_shuffle_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_channel_shuffle_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_channel_shuffle_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_channel_shuffle_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_channel_shuffle_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_channel_shuffle_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_dropout_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_dropout_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_dropout_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_dropout_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_elu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_elu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_elu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_elu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_gelu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_gelu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_gelu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_gelu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_glu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_glu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_glu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_glu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_group_norm_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_group_norm_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_group_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_group_norm_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_hardshrink_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_hardshrink_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_hardshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_hardshrink_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_hardtanh_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_hardtanh_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_hardtanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_hardtanh_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_hardtanh_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_hardtanh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_hardtanh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_hardtanh_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_hinge_embedding_loss_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_hinge_embedding_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_hinge_embedding_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_hinge_embedding_loss_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_huber_loss_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_huber_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_huber_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_huber_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_l1_loss_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_l1_loss_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_l1_loss_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_l1_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_l1_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_l1_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_layer_norm_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_layer_norm_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_layer_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_layer_norm_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_leaky_relu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_leaky_relu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_leaky_relu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_leaky_relu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_log_softmax_with_dtype_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_log_softmax_with_dtype_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_log_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_log_softmax_with_dtype_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_log_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_log_softmax_with_dtype_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_log_softmax_with_dtype_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_log_softmax_with_dtype_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_log_softmax_with_dtype_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_log_softmax_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_log_softmax_with_dtype_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_log_softmax_with_dtype_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_margin_ranking_loss_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_margin_ranking_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_margin_ranking_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_margin_ranking_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_margin_ranking_loss_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_margin_ranking_loss_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_margin_ranking_loss_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_margin_ranking_loss_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_margin_ranking_loss_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_mish_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_mish_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_mish_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_mish_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_mse_loss_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_mse_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_mse_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_mse_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_nll_loss_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_nll_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_nll_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pairwise_distance_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pairwise_distance_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pairwise_distance_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pairwise_distance_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pairwise_distance_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pairwise_distance_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pairwise_distance_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pairwise_distance_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pairwise_distance_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pairwise_distance_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pairwise_distance_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pdist_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pdist_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pixel_shuffle_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pixel_shuffle_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pixel_shuffle_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pixel_shuffle_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pixel_shuffle_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pixel_shuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pixel_shuffle_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pixel_shuffle_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pixel_shuffle_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pixel_shuffle_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pixel_shuffle_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pixel_shuffle_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pixel_unshuffle_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pixel_unshuffle_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pixel_unshuffle_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pixel_unshuffle_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pixel_unshuffle_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pixel_unshuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pixel_unshuffle_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pixel_unshuffle_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pixel_unshuffle_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pixel_unshuffle_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pixel_unshuffle_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pixel_unshuffle_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_poisson_nll_loss_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_poisson_nll_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_poisson_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_poisson_nll_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_poisson_nll_loss_cuda_int16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_poisson_nll_loss_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_poisson_nll_loss_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_poisson_nll_loss_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_poisson_nll_loss_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_prelu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_prelu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_prelu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_prelu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_relu6_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_relu6_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_relu6_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_relu6_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_relu6_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_relu6_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_relu6_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_relu6_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_relu6_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_relu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_relu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_relu_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_relu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_relu_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_relu_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_relu_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_relu_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_relu_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_selu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_selu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_selu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_selu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_smooth_l1_loss_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_smooth_l1_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_smooth_l1_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_smooth_l1_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softmax_with_dtype_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softmax_with_dtype_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softmax_with_dtype_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softmax_with_dtype_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softmax_with_dtype_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softmax_with_dtype_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softmax_with_dtype_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softmax_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softmax_with_dtype_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softmax_with_dtype_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softmin_with_dtype_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softmin_with_dtype_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softmin_with_dtype_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softmin_with_dtype_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softmin_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softmin_with_dtype_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softmin_with_dtype_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softmin_with_dtype_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softmin_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softmin_with_dtype_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softmin_with_dtype_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softplus_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softplus_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softplus_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softplus_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softshrink_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softshrink_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softshrink_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_tanhshrink_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_tanhshrink_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_tanhshrink_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_tanhshrink_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_tanhshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_tanhshrink_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_tanhshrink_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_tanhshrink_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_tanhshrink_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_tanhshrink_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_tanhshrink_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_threshold_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_threshold_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_threshold_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_threshold_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_threshold_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_threshold_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_threshold_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_threshold_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_threshold_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_triplet_margin_loss_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_triplet_margin_loss_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_triplet_margin_loss_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_triplet_margin_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_triplet_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_triplet_margin_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_triplet_margin_loss_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_triplet_margin_loss_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_triplet_margin_loss_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_triplet_margin_loss_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_triplet_margin_loss_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_norm_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_norm_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_norm_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_norm_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_normal__in_place_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_normal__in_place_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_normal__in_place_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_normal__in_place_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_normal__in_place_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_normal__in_place_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_normal_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_normal_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_normal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_normal_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_normal_number_mean_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_normal_number_mean_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_normal_number_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_normal_number_mean_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ones_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ones_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ones_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ones_cuda_complex32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ones_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ones_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ones_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ones_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ones_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ones_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ones_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ones_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ones_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_permute_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_permute_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_permute_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_permute_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_permute_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_permute_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_permute_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_permute_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_permute_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_permute_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_permute_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_permute_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_permute_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_permute_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_permute_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_permute_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_permute_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_permute_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_permute_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_permute_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_permute_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_permute_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_permute_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_permute_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_permute_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_permute_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_positive_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_positive_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_positive_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_positive_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_positive_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_positive_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_positive_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_positive_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_positive_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_positive_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_positive_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_positive_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_pow_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_pow_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_pow_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_pow_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_pow_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_pow_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_pow_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_pow_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_pow_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_pow_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_pow_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_pow_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_prod_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_prod_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_prod_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_prod_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_prod_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_prod_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_prod_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_prod_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_prod_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_prod_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_prod_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_prod_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rad2deg_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rad2deg_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rad2deg_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rad2deg_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rad2deg_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rad2deg_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rad2deg_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rad2deg_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rad2deg_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rad2deg_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_randn_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_randn_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_randn_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_randn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_randn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_randn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_randn_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ravel_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ravel_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ravel_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ravel_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ravel_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ravel_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ravel_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ravel_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ravel_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ravel_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ravel_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ravel_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ravel_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_real_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_real_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_real_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_real_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_real_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_real_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_real_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_real_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_real_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_real_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_real_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_real_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_real_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reciprocal_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reciprocal_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reciprocal_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reciprocal_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reciprocal_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reciprocal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reciprocal_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reciprocal_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reciprocal_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reciprocal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reciprocal_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reciprocal_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_remainder_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_remainder_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_remainder_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_remainder_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_remainder_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_remainder_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_remainder_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_remainder_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_remainder_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_renorm_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_renorm_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_renorm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_renorm_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_renorm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_renorm_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_repeat_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_repeat_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_repeat_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_repeat_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_repeat_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_repeat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_repeat_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_repeat_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_repeat_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_repeat_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_repeat_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_repeat_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reshape_as_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reshape_as_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reshape_as_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reshape_as_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reshape_as_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reshape_as_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reshape_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reshape_as_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reshape_as_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reshape_as_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reshape_as_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reshape_as_cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reshape_as_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reshape_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reshape_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reshape_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reshape_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reshape_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reshape_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reshape_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reshape_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reshape_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reshape_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reshape_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reshape_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reshape_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_roll_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_roll_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_roll_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_roll_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_roll_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_roll_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_roll_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_roll_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_roll_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_roll_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_roll_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_roll_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_roll_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rot90_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rot90_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rot90_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rot90_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rot90_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rot90_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rot90_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rot90_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rot90_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rot90_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rot90_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rot90_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_round_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_round_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_round_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_round_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_round_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_round_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_round_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_round_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_round_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsqrt_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsqrt_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsqrt_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsqrt_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsqrt_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsqrt_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsqrt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsqrt_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsqrt_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsqrt_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsqrt_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsqrt_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsqrt_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsub_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsub_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsub_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsub_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsub_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsub_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsub_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsub_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsub_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsub_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsub_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_select_scatter_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_select_scatter_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_select_scatter_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_select_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_select_scatter_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_select_scatter_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_select_scatter_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_select_scatter_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_select_scatter_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_select_scatter_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sgn_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sgn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sgn_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sgn_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sgn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sgn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sgn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sgn_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sgn_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sgn_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sgn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sgn_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sgn_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sigmoid_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sigmoid_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sigmoid_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sigmoid_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sigmoid_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sigmoid_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sigmoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sigmoid_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sigmoid_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sigmoid_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sigmoid_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sigmoid_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sigmoid_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sign_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sign_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sign_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sign_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sign_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sign_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sign_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sign_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sign_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sign_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_signbit_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_signbit_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_signbit_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_signbit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_signbit_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_signbit_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_signbit_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_signbit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_signbit_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_signbit_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sin_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sin_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sin_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sin_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sin_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sin_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sin_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sin_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sin_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sin_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sin_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinc_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinc_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinc_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinc_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinc_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinc_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinc_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinc_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinc_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinc_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinc_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinh_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinh_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinh_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinh_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinh_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinh_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinh_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinh_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinh_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_softmax_with_dtype_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_softmax_with_dtype_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_softmax_with_dtype_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_softmax_with_dtype_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_softmax_with_dtype_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_softmax_with_dtype_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_softmax_with_dtype_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_softmax_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_softmax_with_dtype_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_softmax_with_dtype_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_bessel_j0_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_bessel_j0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_bessel_j0_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_bessel_j0_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_bessel_j0_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_bessel_j0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_bessel_j0_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_bessel_j0_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_bessel_j1_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_bessel_j1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_bessel_j1_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_bessel_j1_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_bessel_j1_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_bessel_j1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_bessel_j1_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_bessel_j1_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_entr_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_entr_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_entr_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_entr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_entr_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_entr_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_entr_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_entr_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_entr_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_entr_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_erfcx_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_erfcx_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_erfcx_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_erfcx_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_erfcx_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_erfcx_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_erfcx_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_erfcx_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i0e_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i0e_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i0e_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i0e_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i0e_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i0e_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i0e_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i0e_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i0e_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i0e_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i1_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i1_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i1_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i1_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i1_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i1_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i1_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i1_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i1e_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i1e_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i1e_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i1e_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i1e_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i1e_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i1e_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i1e_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i1e_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i1e_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_log_ndtr_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_log_ndtr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_log_ndtr_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_log_ndtr_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_log_ndtr_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_log_ndtr_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_log_ndtr_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_log_ndtr_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_log_softmax_with_dtype_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_log_softmax_with_dtype_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_log_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_log_softmax_with_dtype_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_log_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_log_softmax_with_dtype_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_log_softmax_with_dtype_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_log_softmax_with_dtype_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_log_softmax_with_dtype_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_log_softmax_with_dtype_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_log_softmax_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_log_softmax_with_dtype_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_log_softmax_with_dtype_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_logit_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_logit_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_logit_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_logit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_logit_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_logit_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_logit_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_logit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_logit_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_logit_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_1_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_1_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_1_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_1_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_1_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_1_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_1_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_3_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_3_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_3_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_3_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_3_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_3_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_3_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_3_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_3_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_5_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_5_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_5_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_5_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_5_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_5_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_5_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_5_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_5_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_ndtr_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_ndtr_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_ndtr_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_ndtr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_ndtr_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_ndtr_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_ndtr_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_ndtr_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_ndtr_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_ndtr_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_ndtri_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_ndtri_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_ndtri_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_ndtri_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_ndtri_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_ndtri_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_ndtri_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_ndtri_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_softmax_with_dtype_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_softmax_with_dtype_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_softmax_with_dtype_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_softmax_with_dtype_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_softmax_with_dtype_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_softmax_with_dtype_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_softmax_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_softmax_with_dtype_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_softmax_with_dtype_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_spherical_bessel_j0_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_spherical_bessel_j0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_spherical_bessel_j0_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_spherical_bessel_j0_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_spherical_bessel_j0_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_spherical_bessel_j0_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_spherical_bessel_j0_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_spherical_bessel_j0_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_xlog1py_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_xlog1py_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_xlog1py_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_xlog1py_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_xlog1py_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_xlog1py_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_xlog1py_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_xlog1py_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_xlog1py_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_xlog1py_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_zeta_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_zeta_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_zeta_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_zeta_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_zeta_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_zeta_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_zeta_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_zeta_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_split_with_sizes_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_split_with_sizes_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_split_with_sizes_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_split_with_sizes_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_split_with_sizes_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_split_with_sizes_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_split_with_sizes_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_split_with_sizes_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_split_with_sizes_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_split_with_sizes_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_split_with_sizes_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_split_with_sizes_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_split_with_sizes_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sqrt_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sqrt_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sqrt_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sqrt_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sqrt_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sqrt_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sqrt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sqrt_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sqrt_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sqrt_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sqrt_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sqrt_cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sqrt_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_square_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_square_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_square_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_square_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_square_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_square_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_square_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_square_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_square_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_square_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_square_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_square_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_copy_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_multiple_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_multiple_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_multiple_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_multiple_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_multiple_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_multiple_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_multiple_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_multiple_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_multiple_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_multiple_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_multiple_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_multiple_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_multiple_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_stack_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_stack_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_stack_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_stack_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_stack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_stack_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_stack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_stack_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_stack_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_stack_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_stack_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_stack_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_stack_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_std_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_std_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_std_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_std_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_std_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_std_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_std_mean_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_std_mean_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_std_mean_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_std_mean_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_std_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_std_mean_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_stft_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_stft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_stft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_stft_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sub_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sub_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sub_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sub_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sub_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sub_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sub_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sub_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sub_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sub_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sub_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sub_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sum_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sum_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sum_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sum_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sum_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sum_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sum_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sum_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sum_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sum_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sum_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sum_to_size_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sum_to_size_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sum_to_size_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sum_to_size_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sum_to_size_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sum_to_size_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sum_to_size_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sum_to_size_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sum_to_size_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sum_to_size_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sum_to_size_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sum_to_size_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_take_along_dim_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_take_along_dim_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_take_along_dim_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_take_along_dim_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_take_along_dim_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_take_along_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_take_along_dim_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_take_along_dim_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_take_along_dim_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_take_along_dim_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_take_along_dim_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_take_along_dim_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tan_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tan_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tan_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tan_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tan_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tan_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tan_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tan_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tan_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tan_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tan_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tan_cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tan_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tanh_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tanh_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tanh_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tanh_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tanh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tanh_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tanh_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tanh_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tanh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tanh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tanh_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tanh_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tensor_split_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tensor_split_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tensor_split_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tensor_split_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tensor_split_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tensor_split_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tensor_split_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tensor_split_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tensor_split_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tensor_split_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tensor_split_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tensor_split_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_to_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_to_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_to_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_to_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_to_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_to_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_to_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_to_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_to_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_to_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_to_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_to_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trace_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trace_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trace_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trace_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trace_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trace_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trace_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trace_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trace_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trace_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trace_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trace_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tril_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tril_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tril_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tril_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tril_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tril_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tril_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tril_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tril_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tril_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tril_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tril_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tril_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tril_indices_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tril_indices_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_triu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_triu_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_triu_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_triu_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_triu_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_triu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_triu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_triu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_triu_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_triu_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_triu_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_triu_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_triu_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_triu_indices_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_triu_indices_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_true_divide_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_true_divide_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_true_divide_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_true_divide_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_true_divide_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_true_divide_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_true_divide_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_true_divide_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_true_divide_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_true_divide_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_true_divide_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_true_divide_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_true_divide_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trunc_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trunc_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trunc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trunc_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trunc_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trunc_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trunc_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trunc_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trunc_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_copy_cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unflatten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unflatten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unflatten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unflatten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unflatten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unflatten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unflatten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unflatten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unflatten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unflatten_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unflatten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unflatten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unflatten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unfold_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unfold_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unfold_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unfold_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unfold_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unfold_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unfold_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unfold_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unfold_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unfold_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unfold_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unfold_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unfold_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unfold_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unfold_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unfold_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unfold_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unfold_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unfold_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unfold_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unfold_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unfold_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unfold_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unfold_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unfold_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unfold_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_cuda_complex32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_var_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_var_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_var_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_var_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_var_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_var_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_var_mean_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_var_mean_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_var_mean_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_var_mean_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_var_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_var_mean_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vdot_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vdot_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vdot_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vdot_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vdot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vdot_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_as_complex_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_as_complex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_as_complex_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_as_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_as_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_as_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_as_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_as_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_as_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_as_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_as_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_as_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_as_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_as_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_as_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_copy_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vsplit_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vsplit_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vsplit_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vsplit_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vsplit_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vsplit_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vsplit_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vsplit_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vsplit_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vsplit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vsplit_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vsplit_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vstack_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vstack_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vstack_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vstack_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vstack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vstack_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vstack_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vstack_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vstack_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vstack_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vstack_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vstack_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_where_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_where_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_where_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_where_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_where_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_where_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_where_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_where_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_where_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_where_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_where_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_where_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_where_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_xlogy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_xlogy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_xlogy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_xlogy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_xlogy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_xlogy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_xlogy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_xlogy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_xlogy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_xlogy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_zeros_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_zeros_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_zeros_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_zeros_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_zeros_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_zeros_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_zeros_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_zeros_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_zeros_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_zeros_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_zeros_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_zeros_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_zeros_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_T_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_T_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_T_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_T_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_T_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_T_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_T_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_T_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_T_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_T_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_T_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_T_cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_T_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bfloat16_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bfloat16_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bfloat16_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bfloat16_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bfloat16_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bfloat16_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bfloat16_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bfloat16_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bfloat16_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bfloat16_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bfloat16_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bfloat16_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bfloat16_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bool_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bool_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bool_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bool_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bool_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bool_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bool_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bool_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bool_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bool_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bool_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bool_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bool_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_byte_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_byte_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_byte_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_byte_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_byte_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_byte_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_byte_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_byte_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_byte_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_byte_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_byte_cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_byte_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cdouble_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cdouble_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cdouble_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cdouble_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cdouble_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cdouble_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cdouble_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cdouble_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cdouble_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cdouble_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cdouble_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cdouble_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cdouble_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cfloat_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cfloat_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cfloat_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cfloat_cuda_complex32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cfloat_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cfloat_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cfloat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cfloat_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cfloat_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cfloat_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cfloat_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cfloat_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cfloat_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_chalf_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_chalf_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_chalf_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_chalf_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_chalf_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_chalf_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_chalf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_chalf_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_chalf_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_chalf_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_chalf_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_chalf_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_chalf_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_char_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_char_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_char_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_char_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_char_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_char_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_char_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_char_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_char_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_char_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_char_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_char_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_char_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_complex_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_complex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_complex_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_double_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_double_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_double_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_double_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_double_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_double_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_double_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_double_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_double_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_double_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_double_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_double_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_double_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_float_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_float_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_float_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_float_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_float_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_float_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_float_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_float_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_float_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_float_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_float_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_float_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_float_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_half_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_half_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_half_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_half_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_half_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_half_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_half_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_half_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_half_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_half_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_half_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_half_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_int_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_int_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_int_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_int_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_int_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_int_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_int_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_int_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_int_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_int_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_int_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_int_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_long_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_long_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_long_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_long_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_long_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_long_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_long_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_long_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_long_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_long_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_long_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_long_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_long_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_polar_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_polar_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_short_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_short_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_short_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_short_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_short_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_short_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_short_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_short_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_short_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_short_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_short_cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_short_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_abs_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_abs_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_abs_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_abs_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_abs_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_abs_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_abs_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_abs_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_abs_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_abs_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_abs_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_abs_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_abs_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acos_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acos_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acos_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acos_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acos_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acos_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acos_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acos_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acos_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acos_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acos_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acos_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acos_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acosh_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acosh_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acosh_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acosh_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acosh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acosh_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acosh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acosh_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acosh_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acosh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acosh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acosh_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acosh_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_add_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_add_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_add_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_add_cuda_complex32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_add_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_add_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_add_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_add_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_add_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_add_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_add_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_add_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_add_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addcdiv_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addcdiv_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addcdiv_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addcdiv_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addcdiv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addcdiv_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addcmul_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addcmul_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addcmul_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addcmul_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addcmul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addcmul_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addcmul_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addcmul_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addcmul_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addcmul_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addcmul_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addr_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addr_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addr_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addr_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addr_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addr_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addr_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addr_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addr_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addr_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addr_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_alias_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_alias_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_alias_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_alias_copy_cuda_complex32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_alias_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_alias_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_alias_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_alias_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_alias_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_alias_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_alias_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_alias_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_alias_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_all_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_all_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_all_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_all_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_all_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_all_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_all_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_all_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_all_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_all_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_all_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_all_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_allclose_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_allclose_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_allclose_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_allclose_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_allclose_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_allclose_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_amax_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_amax_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_amax_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_amax_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_amax_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_amax_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_amax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_amax_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_amax_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_amin_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_amin_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_amin_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_amin_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_amin_cuda_int16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_amin_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_amin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_amin_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_amin_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_any_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_any_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_any_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_any_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_any_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_any_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_any_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_any_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_any_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_any_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_any_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_any_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_arange_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_arange_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_arange_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_arange_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_arange_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_arange_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_arange_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_arange_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_arange_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_partial_views_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_partial_views_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_partial_views_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_partial_views_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_partial_views_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_partial_views_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_partial_views_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_partial_views_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_partial_views_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_partial_views_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_partial_views_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_partial_views_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_scatter_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_scatter_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_scatter_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_scatter_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_scatter_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_scatter_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_scatter_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_scatter_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_scatter_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_scatter_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_scatter_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_scatter_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asin_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asin_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asin_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asin_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asin_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asin_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asin_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asin_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asin_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asin_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asin_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asin_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asinh_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asinh_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asinh_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asinh_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asinh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asinh_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asinh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asinh_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asinh_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asinh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asinh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asinh_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asinh_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atan2_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atan2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atan2_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atan2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atan2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atan2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atan2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atan2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atan2_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atan2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atan_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atan_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atan_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atan_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atan_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atan_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atan_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atan_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atan_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atan_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atan_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atan_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atan_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atanh_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atanh_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atanh_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atanh_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atanh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atanh_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atanh_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atanh_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atanh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atanh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atanh_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atanh_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_1d_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_1d_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_1d_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_1d_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_1d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_1d_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_1d_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_1d_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_1d_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_1d_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_1d_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_1d_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_2d_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_2d_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_2d_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_2d_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_2d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_2d_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_2d_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_2d_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_2d_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_2d_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_2d_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_2d_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_3d_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_3d_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_3d_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_3d_cuda_complex32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_3d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_3d_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_3d_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_3d_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_3d_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_3d_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_3d_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_3d_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_and_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_and_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_and_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_and_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_and_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_and_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_left_shift_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_left_shift_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_left_shift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_left_shift_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_left_shift_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_not_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_not_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_not_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_not_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_not_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_not_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_or_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_or_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_or_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_or_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_or_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_or_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_right_shift_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_right_shift_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_right_shift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_right_shift_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_right_shift_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_xor_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_xor_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_xor_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_xor_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_xor_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_xor_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_block_diag_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_block_diag_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_block_diag_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_block_diag_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_block_diag_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_block_diag_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_block_diag_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_block_diag_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_block_diag_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_block_diag_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_block_diag_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_block_diag_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_block_diag_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_broadcast_shapes_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_broadcast_tensors_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_broadcast_tensors_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_broadcast_tensors_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_broadcast_tensors_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_broadcast_tensors_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_broadcast_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_broadcast_tensors_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_broadcast_tensors_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_broadcast_tensors_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_broadcast_tensors_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_broadcast_tensors_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_broadcast_tensors_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_broadcast_to_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_broadcast_to_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_broadcast_to_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_broadcast_to_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_broadcast_to_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_broadcast_to_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_broadcast_to_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_broadcast_to_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_broadcast_to_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_broadcast_to_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_broadcast_to_cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_broadcast_to_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bucketize_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bucketize_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bucketize_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bucketize_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bucketize_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bucketize_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bucketize_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bucketize_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bucketize_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cat_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cat_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cat_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cat_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cat_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cat_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cat_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cat_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cat_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cat_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cat_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cat_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cauchy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cauchy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cauchy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cauchy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ceil_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ceil_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ceil_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ceil_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ceil_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ceil_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ceil_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ceil_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ceil_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_chunk_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_chunk_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_chunk_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_chunk_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_chunk_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_chunk_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_chunk_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_chunk_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_chunk_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_chunk_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_chunk_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_chunk_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_chunk_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_max_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_max_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_max_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_max_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_max_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_max_cuda_int16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_max_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_max_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_max_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_max_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_min_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_min_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_min_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_min_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_min_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_min_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_min_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_min_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_min_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_min_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clone_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clone_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clone_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clone_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clone_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clone_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clone_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clone_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clone_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clone_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clone_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clone_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clone_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_column_stack_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_column_stack_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_column_stack_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_column_stack_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_column_stack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_column_stack_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_column_stack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_column_stack_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_column_stack_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_column_stack_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_column_stack_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_column_stack_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_column_stack_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_physical_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_physical_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_physical_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_physical_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_physical_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_physical_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_physical_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_physical_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_physical_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_physical_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_physical_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_physical_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_physical_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_constant_pad_nd_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_constant_pad_nd_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_constant_pad_nd_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_constant_pad_nd_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_constant_pad_nd_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_constant_pad_nd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_constant_pad_nd_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_constant_pad_nd_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_constant_pad_nd_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_constant_pad_nd_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_constant_pad_nd_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_constant_pad_nd_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_contiguous_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_contiguous_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_contiguous_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_contiguous_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_contiguous_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_contiguous_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_contiguous_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_contiguous_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_contiguous_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_contiguous_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_contiguous_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_contiguous_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_contiguous_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_copysign_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_copysign_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_copysign_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_copysign_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_copysign_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_copysign_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_copysign_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_copysign_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_copysign_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_copysign_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cos_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cos_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cos_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cos_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cos_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cos_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cos_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cos_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cos_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cos_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cos_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cos_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cos_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cosh_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cosh_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cosh_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cosh_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cosh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cosh_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cosh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cosh_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cosh_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cosh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cosh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cosh_cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cosh_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_count_nonzero_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_count_nonzero_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_count_nonzero_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_count_nonzero_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_count_nonzero_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_count_nonzero_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_count_nonzero_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_count_nonzero_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_count_nonzero_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_count_nonzero_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_count_nonzero_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_count_nonzero_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cumprod_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cumprod_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cumprod_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cumprod_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cumprod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cumprod_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cumprod_cuda_int16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cumprod_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cumprod_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cumprod_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cumprod_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cumsum_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cumsum_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cumsum_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cumsum_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cumsum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cumsum_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cumsum_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cumsum_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cumsum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cumsum_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cumsum_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_deg2rad_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_deg2rad_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_deg2rad_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_deg2rad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_deg2rad_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_deg2rad_cuda_int16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_deg2rad_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_deg2rad_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_deg2rad_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_deg2rad_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_embed_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_embed_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_embed_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_embed_cuda_complex32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_embed_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_embed_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_embed_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_embed_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_embed_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_embed_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_embed_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_embed_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_embed_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_copy_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_scatter_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_scatter_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_scatter_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_scatter_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_scatter_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_scatter_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_scatter_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_scatter_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_scatter_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_scatter_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_scatter_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_digamma_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_digamma_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_digamma_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_digamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_digamma_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_digamma_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_digamma_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_digamma_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_digamma_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_digamma_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_floor_rounding_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_floor_rounding_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_floor_rounding_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_floor_rounding_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_floor_rounding_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_floor_rounding_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_floor_rounding_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_floor_rounding_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_floor_rounding_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_no_rounding_mode_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_no_rounding_mode_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_no_rounding_mode_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_no_rounding_mode_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_no_rounding_mode_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_no_rounding_mode_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_no_rounding_mode_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_no_rounding_mode_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_no_rounding_mode_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_no_rounding_mode_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_no_rounding_mode_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_no_rounding_mode_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_no_rounding_mode_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_trunc_rounding_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_trunc_rounding_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_trunc_rounding_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_trunc_rounding_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_trunc_rounding_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_trunc_rounding_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_trunc_rounding_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_trunc_rounding_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_trunc_rounding_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dot_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dot_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dot_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dot_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dot_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dsplit_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dsplit_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dsplit_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dsplit_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dsplit_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dsplit_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dsplit_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dsplit_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dsplit_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dsplit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dsplit_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dsplit_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dstack_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dstack_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dstack_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dstack_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dstack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dstack_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dstack_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dstack_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dstack_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dstack_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dstack_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dstack_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_like_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_like_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_like_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_like_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_like_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_like_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_like_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_like_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_like_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_like_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_like_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_like_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_strided_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_strided_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_strided_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_strided_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_strided_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_strided_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_strided_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_strided_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_strided_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_strided_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_strided_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eq_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eq_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eq_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eq_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eq_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eq_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eq_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eq_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eq_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eq_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eq_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eq_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eq_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_equal_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_equal_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_equal_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_equal_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_equal_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_equal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_equal_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_equal_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_equal_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_equal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_equal_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_equal_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erf_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erf_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erf_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erf_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erf_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erf_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erf_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erf_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erf_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erfc_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erfc_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erfc_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erfc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erfc_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erfc_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erfc_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erfc_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erfc_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erfc_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erfinv_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erfinv_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erfinv_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erfinv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erfinv_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erfinv_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erfinv_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erfinv_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erfinv_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erfinv_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exp2_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exp2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exp2_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exp2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exp2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exp2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exp2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exp2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exp2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exp2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exp2_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exp2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exp_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exp_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exp_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exp_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exp_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exp_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exp_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exp_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exp_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exp_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exp_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_as_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_as_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_as_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_as_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_as_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_as_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_as_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_as_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_as_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_as_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_as_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expm1_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expm1_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expm1_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expm1_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expm1_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expm1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expm1_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expm1_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expm1_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expm1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expm1_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expm1_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exponential_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exponential_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exponential_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exponential_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eye_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eye_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eye_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eye_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eye_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eye_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eye_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eye_cuda_float8_e4m3fn, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eye_cuda_float8_e4m3fnuz, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eye_cuda_float8_e5m2, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eye_cuda_float8_e5m2fnuz, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eye_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eye_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eye_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eye_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eye_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft2_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft2_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft2_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft2_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftn_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftn_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftn_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftn_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftn_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftn_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftn_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftshift_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftshift_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftshift_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftshift_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftshift_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftshift_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftshift_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftshift_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftshift_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftshift_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftshift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftshift_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftshift_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft2_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft2_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft2_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfftn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfftn_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfftn_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfftn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfftn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfftn_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfftn_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfftn_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfftn_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfftn_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifft2_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifft2_cuda_complex32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifft2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifft2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifft2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifft2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifft2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifft2_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifft2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifft_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifft_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifft_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifft_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifft_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifft_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifft_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifft_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifft_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifft_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftn_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftn_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftn_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftn_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftn_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftn_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftn_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftshift_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftshift_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftshift_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftshift_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftshift_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftshift_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftshift_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftshift_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftshift_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftshift_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftshift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftshift_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftshift_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ihfft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ihfft2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ihfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ihfft2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ihfft2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ihfft2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ihfft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ihfft2_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ihfft2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ihfft_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ihfft_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ihfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ihfft_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ihfft_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ihfft_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ihfft_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ihfft_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ihfft_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ihfftn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ihfftn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ihfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ihfftn_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ihfftn_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ihfftn_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ihfftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ihfftn_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ihfftn_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft2_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft2_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft2_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft2_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfftn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfftn_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfftn_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfftn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfftn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfftn_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfftn_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfftn_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfftn_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfftn_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfft2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfft2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfft2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfft2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfft2_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfft2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfft_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfft_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfft_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfft_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfft_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfft_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfft_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfft_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfftn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfftn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfftn_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfftn_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfftn_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfftn_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfftn_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fill_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fill_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fill_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fill_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fill_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fill_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fill_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fill_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fill_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fill_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fill_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fill_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flatten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flatten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flatten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flatten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flatten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flatten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flatten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flatten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flatten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flatten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flatten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flatten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flatten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flip_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flip_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flip_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flip_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flip_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flip_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flip_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flip_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flip_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flip_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flip_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flip_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fliplr_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fliplr_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fliplr_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fliplr_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fliplr_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fliplr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fliplr_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fliplr_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fliplr_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fliplr_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fliplr_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fliplr_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flipud_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flipud_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flipud_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flipud_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flipud_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flipud_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flipud_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flipud_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flipud_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flipud_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flipud_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flipud_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_float_power_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_float_power_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_float_power_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_float_power_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_float_power_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_float_power_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_float_power_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_float_power_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_float_power_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_float_power_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_float_power_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_float_power_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_floor_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_floor_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_floor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_floor_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_floor_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_floor_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_floor_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_floor_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_floor_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_floor_divide_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_floor_divide_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_floor_divide_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_floor_divide_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_floor_divide_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_floor_divide_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_floor_divide_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_floor_divide_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_floor_divide_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fmax_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fmax_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fmax_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fmax_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fmax_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fmax_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fmax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fmax_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fmax_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fmin_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fmin_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fmin_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fmin_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fmin_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fmin_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fmin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fmin_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fmin_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fmod_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fmod_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fmod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fmod_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fmod_cuda_int16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fmod_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fmod_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fmod_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fmod_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_frac_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_frac_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_frac_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_frac_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_frexp_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_frexp_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_frexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_frexp_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_gcd_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_gcd_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_gcd_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_gcd_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_gcd_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ge_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ge_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ge_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ge_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ge_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ge_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ge_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ge_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ge_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ge_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_geometric_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_geometric_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_geometric_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_geometric_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_geometric_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_geometric_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_geometric_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_geometric_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_geometric_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_gt_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_gt_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_gt_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_gt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_gt_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_gt_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_gt_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_gt_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_gt_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_gt_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_heaviside_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_heaviside_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_heaviside_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_heaviside_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_heaviside_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_heaviside_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_heaviside_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_heaviside_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_heaviside_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_heaviside_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hsplit_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hsplit_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hsplit_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hsplit_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hsplit_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hsplit_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hsplit_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hsplit_cuda_int16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hsplit_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hsplit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hsplit_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hsplit_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hstack_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hstack_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hstack_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hstack_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hstack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hstack_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hstack_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hstack_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hstack_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hstack_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hstack_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hstack_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hypot_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hypot_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hypot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hypot_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_i0_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_i0_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_i0_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_i0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_i0_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_i0_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_i0_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_i0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_i0_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_i0_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_igamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_igamma_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_igammac_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_igammac_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_imag_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_imag_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_imag_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_add_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_add_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_add_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_add_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_add_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_add_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_add_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_add_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_add_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_add_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_add_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_add_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_add_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_copy_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_fill_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_fill_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_fill_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_fill_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_fill_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_fill_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_fill_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_fill_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_fill_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_fill_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_fill_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_fill_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_select_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_select_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_select_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_select_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_select_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_select_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_select_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_select_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_select_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_select_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_select_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_select_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_select_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isclose_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isclose_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isclose_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isclose_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isclose_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isclose_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isclose_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isclose_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isclose_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isclose_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isclose_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isclose_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isfinite_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isfinite_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isfinite_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isfinite_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isfinite_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isfinite_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isfinite_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isfinite_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isfinite_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isfinite_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isfinite_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isfinite_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isfinite_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isinf_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isinf_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isinf_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isinf_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isinf_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isinf_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isinf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isinf_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isinf_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isinf_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isinf_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isinf_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isinf_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isnan_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isnan_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isnan_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isnan_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isnan_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isnan_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isnan_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isnan_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isnan_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isnan_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isnan_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isnan_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isneginf_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isneginf_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isneginf_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isneginf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isneginf_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isneginf_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isneginf_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isneginf_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isneginf_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isneginf_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isposinf_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isposinf_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isposinf_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isposinf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isposinf_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isposinf_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isposinf_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isposinf_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isposinf_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isposinf_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isreal_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isreal_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isreal_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isreal_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isreal_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isreal_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isreal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isreal_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isreal_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isreal_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isreal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isreal_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isreal_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_istft_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_istft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_item_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_item_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_item_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_item_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_item_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_item_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_item_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_item_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_item_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_item_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_item_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_item_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_item_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lcm_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lcm_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lcm_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lcm_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lcm_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_le_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_le_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_le_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_le_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_le_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_le_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_le_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_le_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_le_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_le_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lerp_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lerp_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lerp_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lerp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lerp_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lerp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lerp_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lgamma_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lgamma_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lgamma_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lgamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lgamma_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lgamma_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lgamma_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lgamma_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lgamma_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lgamma_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_cross_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_cross_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_cross_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_cross_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_cross_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_cross_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_cross_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_cross_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_cross_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_cross_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_cross_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_diagonal_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_diagonal_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_diagonal_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_diagonal_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_diagonal_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_diagonal_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_diagonal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_diagonal_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_diagonal_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_diagonal_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_diagonal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_diagonal_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_diagonal_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_matrix_norm_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_matrix_norm_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_matrix_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_matrix_norm_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_matrix_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_matrix_norm_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_norm_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_norm_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_norm_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_norm_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_svd_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_svd_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_svd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_svd_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_svdvals_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_svdvals_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_svdvals_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_svdvals_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_vecdot_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_vecdot_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_vecdot_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_vecdot_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_vecdot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_vecdot_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_vector_norm_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_vector_norm_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_vector_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_vector_norm_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_vector_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_vector_norm_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linspace_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linspace_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linspace_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linspace_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linspace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linspace_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linspace_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linspace_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linspace_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linspace_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linspace_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linspace_tensor_overload_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linspace_tensor_overload_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linspace_tensor_overload_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linspace_tensor_overload_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linspace_tensor_overload_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linspace_tensor_overload_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linspace_tensor_overload_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linspace_tensor_overload_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linspace_tensor_overload_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linspace_tensor_overload_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linspace_tensor_overload_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log10_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log10_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log10_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log10_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log10_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log10_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log10_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log10_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log10_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log10_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log10_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log10_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log1p_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log1p_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log1p_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log1p_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log1p_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log1p_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log1p_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log1p_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log1p_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log1p_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log1p_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log1p_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log2_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log2_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log2_cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_normal_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_normal_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_normal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_normal_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_softmax_with_dtype_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_softmax_with_dtype_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_softmax_with_dtype_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_softmax_with_dtype_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_softmax_with_dtype_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_softmax_with_dtype_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_softmax_with_dtype_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_softmax_with_dtype_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_softmax_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_softmax_with_dtype_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_softmax_with_dtype_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logaddexp2_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logaddexp2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logaddexp2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logaddexp2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logaddexp_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logaddexp_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logaddexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logaddexp_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_and_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_and_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_and_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_and_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_and_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_and_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_and_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_and_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_and_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_and_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_and_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_and_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_not_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_not_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_not_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_not_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_not_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_not_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_not_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_not_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_not_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_not_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_not_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_not_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_or_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_or_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_or_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_or_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_or_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_or_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_or_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_or_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_or_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_or_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_or_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_or_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_xor_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_xor_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_xor_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_xor_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_xor_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_xor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_xor_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_xor_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_xor_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_xor_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_xor_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_xor_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logspace_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logspace_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logspace_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logspace_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logspace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logspace_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logspace_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logspace_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logspace_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logspace_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logspace_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logspace_tensor_overload_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logspace_tensor_overload_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logspace_tensor_overload_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logspace_tensor_overload_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logspace_tensor_overload_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logspace_tensor_overload_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logspace_tensor_overload_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logspace_tensor_overload_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logspace_tensor_overload_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logspace_tensor_overload_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logspace_tensor_overload_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logsumexp_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logsumexp_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logsumexp_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logsumexp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logsumexp_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logsumexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logsumexp_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logsumexp_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logsumexp_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logsumexp_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logsumexp_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logsumexp_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lt_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lt_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lt_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lt_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lt_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lt_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lt_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lt_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lt_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_masked_fill_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_masked_fill_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_masked_fill_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_masked_fill_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_masked_fill_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_masked_fill_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_masked_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_masked_fill_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_masked_fill_cuda_int16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_masked_fill_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_masked_fill_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_masked_fill_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_masked_fill_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_maximum_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_maximum_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_maximum_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_maximum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_maximum_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_maximum_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_maximum_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_maximum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_maximum_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_maximum_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_mean_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_mean_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_mean_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_mean_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_mean_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_list_of_tensors_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_list_of_tensors_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_list_of_tensors_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_list_of_tensors_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_list_of_tensors_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_list_of_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_list_of_tensors_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_list_of_tensors_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_list_of_tensors_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_list_of_tensors_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_list_of_tensors_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_list_of_tensors_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_variadic_tensors_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_variadic_tensors_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_variadic_tensors_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_variadic_tensors_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_variadic_tensors_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_variadic_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_variadic_tensors_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_variadic_tensors_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_variadic_tensors_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_variadic_tensors_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_variadic_tensors_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_variadic_tensors_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_minimum_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_minimum_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_minimum_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_minimum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_minimum_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_minimum_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_minimum_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_minimum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_minimum_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_minimum_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_movedim_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_movedim_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_movedim_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_movedim_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_movedim_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_movedim_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_movedim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_movedim_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_movedim_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_movedim_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_movedim_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_movedim_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_movedim_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_mul_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_mul_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_mul_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_mul_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_mul_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_mul_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_mul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_mul_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_mul_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_mul_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_mul_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_mul_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_mul_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nan_to_num_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nan_to_num_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nan_to_num_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nan_to_num_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nan_to_num_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nan_to_num_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nan_to_num_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nan_to_num_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nan_to_num_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nan_to_num_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_copy_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_native_layer_norm_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_native_layer_norm_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_native_layer_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_native_layer_norm_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ne_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ne_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ne_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ne_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ne_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ne_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ne_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ne_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ne_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ne_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ne_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ne_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_neg_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_neg_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_neg_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_neg_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_neg_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_neg_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_neg_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_neg_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_neg_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_neg_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_neg_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_neg_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_strided_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_strided_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_strided_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_strided_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_strided_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_strided_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_strided_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_strided_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_strided_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_strided_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_strided_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_strided_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_strided_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_full_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_full_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_full_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_full_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_full_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_full_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_full_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_full_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_full_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_full_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_full_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_full_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_full_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_ones_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_ones_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_ones_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_ones_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_ones_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_ones_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_ones_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_ones_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_ones_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_ones_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_ones_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_ones_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_ones_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_zeros_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_zeros_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_zeros_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_zeros_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_zeros_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_zeros_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_zeros_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_zeros_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_zeros_cuda_int16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_zeros_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_zeros_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_zeros_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_zeros_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nextafter_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nextafter_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nextafter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nextafter_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_alpha_dropout_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_alpha_dropout_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_alpha_dropout_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_alpha_dropout_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_celu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_celu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_celu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_celu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_channel_shuffle_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_channel_shuffle_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_channel_shuffle_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_channel_shuffle_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_channel_shuffle_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_channel_shuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_channel_shuffle_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_channel_shuffle_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_channel_shuffle_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_channel_shuffle_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_channel_shuffle_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_channel_shuffle_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_dropout_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_dropout_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_dropout_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_dropout_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_elu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_elu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_elu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_elu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_gelu_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_gelu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_gelu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_gelu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_glu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_glu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_glu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_glu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_group_norm_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_group_norm_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_group_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_group_norm_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_hardshrink_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_hardshrink_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_hardshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_hardshrink_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_hardtanh_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_hardtanh_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_hardtanh_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_hardtanh_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_hardtanh_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_hardtanh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_hardtanh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_hardtanh_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_hinge_embedding_loss_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_hinge_embedding_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_hinge_embedding_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_hinge_embedding_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_huber_loss_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_huber_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_huber_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_huber_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_l1_loss_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_l1_loss_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_l1_loss_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_l1_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_l1_loss_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_l1_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_layer_norm_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_layer_norm_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_layer_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_layer_norm_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_leaky_relu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_leaky_relu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_leaky_relu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_leaky_relu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_log_softmax_with_dtype_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_log_softmax_with_dtype_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_log_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_log_softmax_with_dtype_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_log_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_log_softmax_with_dtype_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_log_softmax_with_dtype_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_log_softmax_with_dtype_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_log_softmax_with_dtype_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_log_softmax_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_log_softmax_with_dtype_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_log_softmax_with_dtype_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_margin_ranking_loss_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_margin_ranking_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_margin_ranking_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_margin_ranking_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_margin_ranking_loss_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_margin_ranking_loss_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_margin_ranking_loss_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_margin_ranking_loss_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_margin_ranking_loss_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_mish_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_mish_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_mish_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_mish_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_mse_loss_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_mse_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_mse_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_mse_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_nll_loss_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_nll_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_nll_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pairwise_distance_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pairwise_distance_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pairwise_distance_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pairwise_distance_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pairwise_distance_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pairwise_distance_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pairwise_distance_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pairwise_distance_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pairwise_distance_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pairwise_distance_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pairwise_distance_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pdist_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pdist_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pixel_shuffle_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pixel_shuffle_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pixel_shuffle_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pixel_shuffle_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pixel_shuffle_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pixel_shuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pixel_shuffle_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pixel_shuffle_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pixel_shuffle_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pixel_shuffle_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pixel_shuffle_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pixel_shuffle_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pixel_unshuffle_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pixel_unshuffle_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pixel_unshuffle_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pixel_unshuffle_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pixel_unshuffle_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pixel_unshuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pixel_unshuffle_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pixel_unshuffle_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pixel_unshuffle_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pixel_unshuffle_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pixel_unshuffle_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pixel_unshuffle_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_poisson_nll_loss_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_poisson_nll_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_poisson_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_poisson_nll_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_poisson_nll_loss_cuda_int16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_poisson_nll_loss_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_poisson_nll_loss_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_poisson_nll_loss_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_poisson_nll_loss_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_prelu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_prelu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_prelu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_prelu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_relu6_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_relu6_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_relu6_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_relu6_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_relu6_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_relu6_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_relu6_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_relu6_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_relu6_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_relu_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_relu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_relu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_relu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_relu_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_relu_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_relu_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_relu_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_relu_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_selu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_selu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_selu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_selu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_smooth_l1_loss_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_smooth_l1_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_smooth_l1_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_smooth_l1_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softmax_with_dtype_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softmax_with_dtype_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softmax_with_dtype_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softmax_with_dtype_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softmax_with_dtype_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softmax_with_dtype_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softmax_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softmax_with_dtype_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softmax_with_dtype_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softmin_with_dtype_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softmin_with_dtype_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softmin_with_dtype_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softmin_with_dtype_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softmin_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softmin_with_dtype_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softmin_with_dtype_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softmin_with_dtype_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softmin_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softmin_with_dtype_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softmin_with_dtype_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softplus_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softplus_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softplus_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softplus_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softshrink_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softshrink_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softshrink_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_tanhshrink_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_tanhshrink_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_tanhshrink_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_tanhshrink_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_tanhshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_tanhshrink_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_tanhshrink_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_tanhshrink_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_tanhshrink_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_tanhshrink_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_tanhshrink_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_threshold_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_threshold_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_threshold_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_threshold_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_threshold_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_threshold_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_threshold_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_threshold_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_threshold_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_triplet_margin_loss_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_triplet_margin_loss_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_triplet_margin_loss_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_triplet_margin_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_triplet_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_triplet_margin_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_triplet_margin_loss_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_triplet_margin_loss_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_triplet_margin_loss_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_triplet_margin_loss_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_triplet_margin_loss_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_norm_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_norm_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_norm_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_norm_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_normal__in_place_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_normal__in_place_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_normal__in_place_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_normal__in_place_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_normal__in_place_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_normal__in_place_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_normal_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_normal_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_normal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_normal_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_normal_number_mean_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_normal_number_mean_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_normal_number_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_normal_number_mean_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ones_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ones_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ones_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ones_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ones_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ones_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ones_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ones_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ones_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ones_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ones_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ones_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ones_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_permute_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_permute_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_permute_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_permute_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_permute_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_permute_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_permute_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_permute_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_permute_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_permute_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_permute_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_permute_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_permute_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_permute_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_permute_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_permute_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_permute_cuda_complex32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_permute_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_permute_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_permute_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_permute_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_permute_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_permute_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_permute_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_permute_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_permute_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_positive_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_positive_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_positive_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_positive_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_positive_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_positive_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_positive_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_positive_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_positive_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_positive_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_positive_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_positive_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_pow_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_pow_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_pow_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_pow_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_pow_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_pow_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_pow_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_pow_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_pow_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_pow_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_pow_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_pow_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_prod_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_prod_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_prod_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_prod_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_prod_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_prod_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_prod_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_prod_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_prod_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_prod_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_prod_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_prod_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rad2deg_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rad2deg_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rad2deg_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rad2deg_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rad2deg_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rad2deg_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rad2deg_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rad2deg_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rad2deg_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rad2deg_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_randn_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_randn_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_randn_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_randn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_randn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_randn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_randn_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ravel_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ravel_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ravel_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ravel_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ravel_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ravel_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ravel_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ravel_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ravel_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ravel_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ravel_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ravel_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ravel_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_real_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_real_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_real_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_real_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_real_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_real_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_real_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_real_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_real_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_real_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_real_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_real_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_real_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reciprocal_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reciprocal_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reciprocal_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reciprocal_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reciprocal_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reciprocal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reciprocal_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reciprocal_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reciprocal_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reciprocal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reciprocal_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reciprocal_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_remainder_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_remainder_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_remainder_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_remainder_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_remainder_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_remainder_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_remainder_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_remainder_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_remainder_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_renorm_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_renorm_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_renorm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_renorm_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_renorm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_renorm_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_repeat_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_repeat_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_repeat_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_repeat_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_repeat_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_repeat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_repeat_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_repeat_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_repeat_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_repeat_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_repeat_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_repeat_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_as_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_as_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_as_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_as_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_as_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_as_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_as_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_as_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_as_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_as_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_as_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_as_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_roll_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_roll_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_roll_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_roll_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_roll_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_roll_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_roll_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_roll_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_roll_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_roll_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_roll_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_roll_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_roll_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rot90_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rot90_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rot90_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rot90_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rot90_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rot90_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rot90_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rot90_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rot90_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rot90_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rot90_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rot90_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_round_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_round_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_round_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_round_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_round_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_round_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_round_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_round_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_round_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsqrt_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsqrt_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsqrt_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsqrt_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsqrt_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsqrt_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsqrt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsqrt_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsqrt_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsqrt_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsqrt_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsqrt_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsqrt_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsub_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsub_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsub_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsub_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsub_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsub_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsub_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsub_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsub_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsub_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsub_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_select_scatter_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_select_scatter_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_select_scatter_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_select_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_select_scatter_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_select_scatter_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_select_scatter_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_select_scatter_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_select_scatter_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_select_scatter_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sgn_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sgn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sgn_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sgn_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sgn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sgn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sgn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sgn_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sgn_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sgn_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sgn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sgn_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sgn_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sigmoid_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sigmoid_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sigmoid_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sigmoid_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sigmoid_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sigmoid_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sigmoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sigmoid_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sigmoid_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sigmoid_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sigmoid_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sigmoid_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sigmoid_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sign_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sign_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sign_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sign_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sign_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sign_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sign_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sign_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sign_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sign_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_signbit_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_signbit_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_signbit_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_signbit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_signbit_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_signbit_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_signbit_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_signbit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_signbit_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_signbit_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sin_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sin_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sin_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sin_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sin_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sin_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sin_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sin_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sin_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sin_cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sin_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinc_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinc_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinc_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinc_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinc_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinc_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinc_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinc_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinc_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinc_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinc_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinh_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinh_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinh_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinh_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinh_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinh_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinh_cuda_int16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinh_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinh_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_softmax_with_dtype_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_softmax_with_dtype_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_softmax_with_dtype_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_softmax_with_dtype_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_softmax_with_dtype_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_softmax_with_dtype_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_softmax_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_softmax_with_dtype_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_softmax_with_dtype_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_bessel_j0_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_bessel_j0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_bessel_j0_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_bessel_j0_cuda_int16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_bessel_j0_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_bessel_j0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_bessel_j0_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_bessel_j0_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_bessel_j1_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_bessel_j1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_bessel_j1_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_bessel_j1_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_bessel_j1_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_bessel_j1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_bessel_j1_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_bessel_j1_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_entr_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_entr_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_entr_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_entr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_entr_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_entr_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_entr_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_entr_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_entr_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_entr_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_erfcx_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_erfcx_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_erfcx_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_erfcx_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_erfcx_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_erfcx_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_erfcx_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_erfcx_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i0e_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i0e_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i0e_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i0e_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i0e_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i0e_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i0e_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i0e_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i0e_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i0e_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i1_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i1_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i1_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i1_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i1_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i1_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i1_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i1_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i1e_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i1e_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i1e_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i1e_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i1e_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i1e_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i1e_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i1e_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i1e_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i1e_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_log_ndtr_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_log_ndtr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_log_ndtr_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_log_ndtr_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_log_ndtr_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_log_ndtr_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_log_ndtr_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_log_ndtr_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_log_softmax_with_dtype_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_log_softmax_with_dtype_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_log_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_log_softmax_with_dtype_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_log_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_log_softmax_with_dtype_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_log_softmax_with_dtype_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_log_softmax_with_dtype_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_log_softmax_with_dtype_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_log_softmax_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_log_softmax_with_dtype_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_log_softmax_with_dtype_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_logit_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_logit_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_logit_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_logit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_logit_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_logit_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_logit_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_logit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_logit_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_logit_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_1_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_1_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_1_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_1_cuda_int16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_1_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_1_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_1_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_3_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_3_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_3_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_3_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_3_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_3_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_3_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_3_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_3_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_5_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_5_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_5_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_5_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_5_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_5_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_5_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_5_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_5_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_ndtr_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_ndtr_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_ndtr_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_ndtr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_ndtr_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_ndtr_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_ndtr_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_ndtr_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_ndtr_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_ndtr_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_ndtri_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_ndtri_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_ndtri_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_ndtri_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_ndtri_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_ndtri_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_ndtri_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_ndtri_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_softmax_with_dtype_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_softmax_with_dtype_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_softmax_with_dtype_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_softmax_with_dtype_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_softmax_with_dtype_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_softmax_with_dtype_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_softmax_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_softmax_with_dtype_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_softmax_with_dtype_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_spherical_bessel_j0_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_spherical_bessel_j0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_spherical_bessel_j0_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_spherical_bessel_j0_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_spherical_bessel_j0_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_spherical_bessel_j0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_spherical_bessel_j0_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_spherical_bessel_j0_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_xlog1py_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_xlog1py_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_xlog1py_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_xlog1py_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_xlog1py_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_xlog1py_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_xlog1py_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_xlog1py_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_xlog1py_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_xlog1py_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_zeta_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_zeta_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_zeta_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_zeta_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_zeta_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_zeta_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_zeta_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_zeta_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_split_with_sizes_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_split_with_sizes_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_split_with_sizes_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_split_with_sizes_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_split_with_sizes_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_split_with_sizes_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_split_with_sizes_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_split_with_sizes_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_split_with_sizes_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_split_with_sizes_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_split_with_sizes_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_split_with_sizes_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_split_with_sizes_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sqrt_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sqrt_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sqrt_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sqrt_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sqrt_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sqrt_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sqrt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sqrt_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sqrt_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sqrt_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sqrt_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sqrt_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sqrt_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_square_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_square_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_square_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_square_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_square_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_square_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_square_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_square_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_square_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_square_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_square_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_square_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_multiple_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_multiple_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_multiple_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_multiple_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_multiple_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_multiple_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_multiple_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_multiple_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_multiple_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_multiple_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_multiple_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_multiple_cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_multiple_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_stack_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_stack_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_stack_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_stack_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_stack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_stack_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_stack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_stack_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_stack_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_stack_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_stack_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_stack_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_stack_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_std_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_std_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_std_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_std_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_std_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_std_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_std_mean_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_std_mean_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_std_mean_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_std_mean_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_std_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_std_mean_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_stft_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_stft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_stft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_stft_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sub_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sub_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sub_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sub_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sub_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sub_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sub_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sub_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sub_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sub_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sub_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sub_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sum_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sum_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sum_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sum_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sum_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sum_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sum_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sum_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sum_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sum_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sum_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sum_to_size_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sum_to_size_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sum_to_size_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sum_to_size_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sum_to_size_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sum_to_size_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sum_to_size_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sum_to_size_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sum_to_size_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sum_to_size_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sum_to_size_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sum_to_size_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_take_along_dim_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_take_along_dim_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_take_along_dim_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_take_along_dim_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_take_along_dim_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_take_along_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_take_along_dim_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_take_along_dim_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_take_along_dim_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_take_along_dim_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_take_along_dim_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_take_along_dim_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tan_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tan_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tan_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tan_cuda_complex32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tan_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tan_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tan_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tan_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tan_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tan_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tan_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tan_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tan_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tanh_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tanh_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tanh_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tanh_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tanh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tanh_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tanh_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tanh_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tanh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tanh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tanh_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tanh_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tensor_split_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tensor_split_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tensor_split_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tensor_split_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tensor_split_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tensor_split_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tensor_split_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tensor_split_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tensor_split_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tensor_split_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tensor_split_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tensor_split_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_to_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_to_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_to_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_to_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_to_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_to_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_to_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_to_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_to_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_to_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_to_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_to_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_trace_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_trace_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_trace_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_trace_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_trace_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_trace_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_trace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_trace_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_trace_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_trace_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_trace_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_trace_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_trace_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_transpose_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_transpose_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_transpose_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_transpose_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_transpose_copy_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_transpose_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_transpose_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_transpose_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_transpose_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_transpose_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_transpose_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_transpose_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_transpose_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_transpose_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_transpose_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_transpose_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_transpose_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_transpose_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_transpose_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_transpose_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_transpose_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_transpose_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_transpose_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_transpose_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_transpose_cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_transpose_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tril_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tril_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tril_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tril_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tril_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tril_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tril_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tril_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tril_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tril_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tril_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tril_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tril_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tril_indices_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tril_indices_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_triu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_triu_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_triu_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_triu_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_triu_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_triu_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_triu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_triu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_triu_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_triu_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_triu_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_triu_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_triu_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_triu_indices_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_triu_indices_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_true_divide_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_true_divide_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_true_divide_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_true_divide_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_true_divide_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_true_divide_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_true_divide_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_true_divide_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_true_divide_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_true_divide_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_true_divide_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_true_divide_cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_true_divide_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_trunc_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_trunc_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_trunc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_trunc_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_trunc_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_trunc_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_trunc_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_trunc_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_trunc_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unbind_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unbind_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unbind_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unbind_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unbind_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unbind_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unbind_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unbind_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unbind_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unbind_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unbind_copy_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unbind_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unbind_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unbind_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unbind_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unbind_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unbind_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unbind_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unbind_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unbind_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unbind_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unbind_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unbind_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unbind_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unbind_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unbind_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unflatten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unflatten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unflatten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unflatten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unflatten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unflatten_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unflatten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unflatten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unflatten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unflatten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unflatten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unflatten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unflatten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_copy_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_copy_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_var_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_var_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_var_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_var_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_var_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_var_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_var_mean_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_var_mean_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_var_mean_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_var_mean_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_var_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_var_mean_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vdot_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vdot_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vdot_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vdot_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vdot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vdot_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_as_complex_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_as_complex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_as_complex_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_as_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_as_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_as_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_as_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_as_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_as_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_as_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_as_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_as_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_as_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_as_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_as_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_copy_cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vsplit_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vsplit_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vsplit_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vsplit_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vsplit_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vsplit_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vsplit_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vsplit_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vsplit_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vsplit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vsplit_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vsplit_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vstack_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vstack_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vstack_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vstack_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vstack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vstack_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vstack_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vstack_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vstack_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vstack_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vstack_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vstack_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_where_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_where_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_where_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_where_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_where_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_where_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_where_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_where_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_where_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_where_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_where_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_where_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_where_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_xlogy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_xlogy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_xlogy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_xlogy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_xlogy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_xlogy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_xlogy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_xlogy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_xlogy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_xlogy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_zeros_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_zeros_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_zeros_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_zeros_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_zeros_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_zeros_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_zeros_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_zeros_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_zeros_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_zeros_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_zeros_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_zeros_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_zeros_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_H_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_H_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_T_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_T_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager___getitem___cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager___getitem___cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager___radd___cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager___radd___cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager___rdiv___cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager___rdiv___cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager___rmatmul___cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager___rmatmul___cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager___rmod___cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager___rmul___cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager___rmul___cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager___rpow___cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager___rpow___cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager___rsub___cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager___rsub___cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager__batch_norm_with_update_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager__chunk_cat_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager__chunk_cat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager__native_batch_norm_legit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager__segment_reduce_lengths_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager__segment_reduce_offsets_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager__softmax_backward_data_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager__unsafe_masked_index_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager__unsafe_masked_index_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager__unsafe_masked_index_put_accumulate_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager__unsafe_masked_index_put_accumulate_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager__upsample_bilinear2d_aa_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_abs_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_abs_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_acos_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_acos_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_acosh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_acosh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_add_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_add_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_addbmm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_addbmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_addcdiv_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_addcdiv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_addcmul_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_addcmul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_addmm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_addmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_addmm_decomposed_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_addmm_decomposed_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_addmv_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_addmv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_addr_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_addr_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_alias_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_alias_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_all_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_all_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_allclose_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_allclose_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_aminmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_angle_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_angle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_any_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_any_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_arange_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_argmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_argmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_argsort_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_argwhere_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_argwhere_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_as_strided_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_as_strided_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_as_strided_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_as_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_as_strided_partial_views_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_as_strided_partial_views_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_as_strided_scatter_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_as_strided_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_asin_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_asin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_asinh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_asinh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_atan2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_atan_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_atan_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_atanh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_atanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_atleast_1d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_atleast_1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_atleast_2d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_atleast_2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_atleast_3d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_atleast_3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_baddbmm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_baddbmm_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_bernoulli_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_bfloat16_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_bfloat16_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_block_diag_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_block_diag_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_bmm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_bmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_bool_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_bool_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_broadcast_shapes_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_broadcast_tensors_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_broadcast_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_broadcast_to_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_broadcast_to_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_bucketize_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_byte_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_byte_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cartesian_prod_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cartesian_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cat_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cauchy_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cdist_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cdouble_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cdouble_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_ceil_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cfloat_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cfloat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_chalf_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_chalf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_char_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_char_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cholesky_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cholesky_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cholesky_inverse_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cholesky_inverse_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cholesky_solve_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cholesky_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_chunk_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_chunk_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_clamp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_clamp_max_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_clamp_min_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_clone_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_clone_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_column_stack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_column_stack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_combinations_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_combinations_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_complex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_conj_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_conj_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_conj_physical_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_conj_physical_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_constant_pad_nd_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_constant_pad_nd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_contiguous_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_contiguous_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_copysign_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_corrcoef_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_corrcoef_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cos_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cos_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cosh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cosh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_count_nonzero_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_count_nonzero_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cov_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cov_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cross_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cross_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cummax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cummin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cumprod_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cumprod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cumsum_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cumsum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cumulative_trapezoid_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cumulative_trapezoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_deg2rad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_diag_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_diag_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_diag_embed_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_diag_embed_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_diagflat_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_diagflat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_diagonal_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_diagonal_copy_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_diagonal_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_diagonal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_diagonal_scatter_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_diagonal_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_diff_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_diff_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_digamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_dist_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_dist_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_div_floor_rounding_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_div_no_rounding_mode_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_div_no_rounding_mode_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_div_trunc_rounding_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_dot_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_dot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_double_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_double_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_dsplit_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_dsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_dstack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_dstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_einsum_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_einsum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_empty_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_empty_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_empty_like_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_empty_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_empty_permuted_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_empty_permuted_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_empty_strided_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_empty_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_eq_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_eq_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_equal_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_equal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_erf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_erfc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_erfinv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_exp2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_exp2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_exp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_exp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_expand_as_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_expand_as_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_expand_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_expand_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_expand_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_expand_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_expm1_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_expm1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_exponential_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_eye_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_eye_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_fft2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_fft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_fft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_fft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_fftn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_fftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_fftshift_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_fftshift_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_hfft2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_hfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_hfft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_hfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_hfftn_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_hfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_ifft2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_ifft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_ifft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_ifft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_ifftn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_ifftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_ifftshift_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_ifftshift_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_ihfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_ihfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_ihfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_irfft2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_irfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_irfft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_irfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_irfftn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_irfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_rfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_rfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_rfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fill_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_flatten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_flatten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_flip_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_flip_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fliplr_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fliplr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_flipud_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_flipud_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_float_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_float_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_float_power_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_float_power_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_floor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_floor_divide_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fmod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_frac_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_frexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_full_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_full_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_full_like_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_full_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_gather_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_gather_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_ge_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_geometric_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_geqrf_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_geqrf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_gradient_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_gradient_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_grid_sampler_2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_grid_sampler_3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_gt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_half_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_half_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_hash_tensor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_heaviside_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_histc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_hsplit_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_hsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_hstack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_hstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_hypot_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_i0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_igamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_igammac_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_imag_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_index_add_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_index_add_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_index_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_index_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_index_fill_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_index_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_index_put_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_index_put_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_index_reduce_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_index_reduce_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_index_reduce_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_index_reduce_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_index_select_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_index_select_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_inner_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_inner_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_int_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_int_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_isclose_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_isclose_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_isfinite_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_isfinite_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_isin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_isinf_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_isinf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_isnan_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_isnan_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_isneginf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_isposinf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_isreal_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_isreal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_istft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_item_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_item_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_jiterator_2inputs_2outputs_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_jiterator_2inputs_2outputs_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_jiterator_4inputs_with_extra_args_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_jiterator_4inputs_with_extra_args_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_jiterator_binary_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_jiterator_binary_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_jiterator_binary_return_by_ref_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_jiterator_binary_return_by_ref_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_jiterator_unary_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_jiterator_unary_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_kron_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_kron_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_kthvalue_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_ldexp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_ldexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_le_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_lerp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_lerp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_lgamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_cholesky_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_cholesky_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_cholesky_ex_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_cholesky_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_cond_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_cond_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_cross_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_cross_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_det_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_det_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_diagonal_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_diagonal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_eig_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_eig_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_eigh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_eigh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_eigvals_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_eigvals_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_eigvalsh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_eigvalsh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_householder_product_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_householder_product_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_inv_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_inv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_inv_ex_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_inv_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_ldl_factor_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_ldl_factor_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_ldl_factor_ex_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_ldl_factor_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_ldl_solve_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_ldl_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_lstsq_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_lstsq_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_lstsq_grad_oriented_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_lstsq_grad_oriented_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_lu_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_lu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_lu_factor_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_lu_factor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_lu_factor_ex_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_lu_factor_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_lu_solve_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_lu_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_matrix_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_matrix_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_matrix_power_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_matrix_power_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_matrix_rank_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_matrix_rank_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_matrix_rank_hermitian_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_matrix_rank_hermitian_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_multi_dot_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_multi_dot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_norm_subgradients_at_zero_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_norm_subgradients_at_zero_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_pinv_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_pinv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_pinv_hermitian_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_pinv_hermitian_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_pinv_singular_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_pinv_singular_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_qr_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_qr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_slogdet_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_slogdet_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_solve_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_solve_ex_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_solve_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_solve_triangular_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_solve_triangular_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_svd_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_svd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_svdvals_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_svdvals_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_tensorinv_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_tensorinv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_tensorsolve_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_tensorsolve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_vander_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_vander_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_vecdot_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_vecdot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_vector_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_vector_norm_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linspace_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linspace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linspace_tensor_overload_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linspace_tensor_overload_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_log10_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_log10_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_log1p_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_log1p_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_log2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_log2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_log_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_log_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_log_normal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_log_softmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_log_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_logaddexp2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_logaddexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_logcumsumexp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_logcumsumexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_logdet_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_logdet_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_logical_and_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_logical_and_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_logical_not_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_logical_not_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_logical_or_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_logical_or_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_logical_xor_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_logical_xor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_logit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_logspace_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_logspace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_logspace_tensor_overload_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_logspace_tensor_overload_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_logsumexp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_logsumexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_long_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_long_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_lt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_lu_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_lu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_lu_solve_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_lu_solve_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_lu_unpack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_lu_unpack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_mH_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_mH_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_mT_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_mT_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_argmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_argmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_cumprod_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_cumprod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_cumsum_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_cumsum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_fill_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_log_softmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_logaddexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_logsumexp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_logsumexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_mean_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_median_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_normalize_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_normalize_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_prod_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_scatter_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_select_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_select_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_softmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_softmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_std_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_std_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_sum_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_sum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_var_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_var_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_matmul_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_matmul_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_matrix_exp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_matrix_exp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_max_binary_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_max_pool2d_with_indices_backward_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_max_reduction_no_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_max_reduction_with_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_maximum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_mean_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_median_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_meshgrid_list_of_tensors_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_meshgrid_list_of_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_meshgrid_variadic_tensors_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_meshgrid_variadic_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_min_binary_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_min_reduction_no_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_min_reduction_with_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_minimum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_mm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_mm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_mode_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_movedim_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_movedim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_msort_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_mul_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_mul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_multinomial_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_mv_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_mv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nan_to_num_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nanmean_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nanmean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nanmedian_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nanquantile_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nansum_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nansum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_narrow_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_narrow_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_narrow_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_narrow_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_native_batch_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_native_dropout_backward_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_native_layer_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_ne_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_ne_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_neg_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_neg_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_new_empty_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_new_empty_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_new_empty_strided_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_new_empty_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_new_full_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_new_full_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_new_ones_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_new_ones_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_new_zeros_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_new_zeros_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nextafter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_adaptive_avg_pool3d_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_alpha_dropout_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_avg_pool1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_avg_pool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_avg_pool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_batch_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_batch_norm_without_cudnn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_bilinear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_binary_cross_entropy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_celu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_channel_shuffle_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_channel_shuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_conv1d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_conv1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_conv2d_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_conv2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_conv3d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_conv3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_conv_transpose1d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_conv_transpose1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_conv_transpose2d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_conv_transpose2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_conv_transpose3d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_conv_transpose3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_cosine_embedding_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_cosine_similarity_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_cross_entropy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_ctc_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_dropout2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_dropout3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_dropout_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_elu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_embedding_bag_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_embedding_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_feature_alpha_dropout_without_train_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_fractional_max_pool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_fractional_max_pool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_gaussian_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_gelu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_glu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_grid_sample_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_group_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_hardshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_hardsigmoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_hardswish_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_hardtanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_hinge_embedding_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_huber_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_instance_norm_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_interpolate_area_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_interpolate_bicubic_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_interpolate_bilinear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_interpolate_linear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_interpolate_nearest_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_interpolate_trilinear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_kl_div_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_l1_loss_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_l1_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_layer_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_leaky_relu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_linear_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_linear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_local_response_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_logsigmoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_margin_ranking_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_max_pool1d_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_max_pool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_max_pool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_max_unpool1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_max_unpool1d_grad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_max_unpool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_max_unpool2d_grad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_max_unpool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_max_unpool3d_grad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_mish_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_mse_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_multi_head_attention_forward_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_multi_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_multilabel_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_normalize_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_normalize_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_pad_circular_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_pad_circular_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_pad_constant_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_pad_constant_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_pad_reflect_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_pad_reflect_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_pad_replicate_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_pad_replicate_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_pad_replicate_negative_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_pad_replicate_negative_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_pairwise_distance_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_pairwise_distance_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_pdist_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_pixel_shuffle_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_pixel_shuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_pixel_unshuffle_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_pixel_unshuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_poisson_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_prelu_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_relu6_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_relu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_rms_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_rms_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_rrelu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_selu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_silu_complex_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_silu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_smooth_l1_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_soft_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_softmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_softmin_with_dtype_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_softmin_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_softplus_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_softshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_softsign_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_softsign_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_tanhshrink_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_tanhshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_threshold_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_triplet_margin_loss_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_triplet_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_triplet_margin_with_distance_loss_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_unfold_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_unfold_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_upsample_bilinear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_upsample_nearest_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nonzero_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nonzero_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nonzero_static_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nonzero_static_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_norm_fro_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_norm_fro_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_norm_inf_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_norm_inf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_norm_nuc_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_norm_nuc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_normal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_normal_in_place_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_normal_in_place_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_normal_number_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_ones_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_ones_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_ones_like_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_ones_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_ormqr_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_ormqr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_outer_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_outer_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_pca_lowrank_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_pca_lowrank_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_permute_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_permute_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_permute_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_permute_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_pinverse_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_pinverse_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_polar_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_polygamma_polygamma_n_0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_polygamma_polygamma_n_1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_polygamma_polygamma_n_2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_polygamma_polygamma_n_3_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_polygamma_polygamma_n_4_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_positive_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_positive_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_pow_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_pow_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_prod_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_put_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_put_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_qr_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_qr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_quantile_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_rad2deg_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_rand_like_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_rand_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_randint_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_randint_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_randn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_randn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_randn_like_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_randn_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_ravel_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_ravel_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_real_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_real_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_reciprocal_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_reciprocal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_remainder_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_renorm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_renorm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_repeat_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_repeat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_repeat_interleave_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_repeat_interleave_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_reshape_as_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_reshape_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_reshape_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_reshape_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_resize__cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_resize__cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_resize_as__cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_resize_as__cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_resolve_conj_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_resolve_conj_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_resolve_neg_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_resolve_neg_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_roll_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_roll_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_rot90_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_rot90_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_round_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_round_decimals_0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_round_decimals_3_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_round_decimals_neg_3_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_rsqrt_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_rsqrt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_rsub_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_rsub_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_scalar_tensor_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_scalar_tensor_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_scatter_add_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_scatter_add_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_scatter_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_scatter_reduce_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_scatter_reduce_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_scatter_reduce_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_scatter_reduce_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_scatter_reduce_sum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_searchsorted_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_select_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_select_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_select_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_sgn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_sgn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_short_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_short_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_sigmoid_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_sigmoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_sign_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_signal_windows_bartlett_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_signal_windows_blackman_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_signal_windows_cosine_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_signal_windows_exponential_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_signal_windows_gaussian_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_signal_windows_general_cosine_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_signal_windows_general_hamming_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_signal_windows_hamming_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_signal_windows_hann_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_signal_windows_kaiser_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_signal_windows_nuttall_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_signbit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_sin_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_sin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_sinc_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_sinc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_sinh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_sinh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_slice_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_slice_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_slice_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_softmax_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_sort_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_sparse_mm_reduce_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_sparse_sampled_addmm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_sparse_sampled_addmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_airy_ai_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_bessel_j0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_bessel_j1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_bessel_y0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_bessel_y1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_chebyshev_polynomial_v_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_entr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_erfcx_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_hermite_polynomial_h_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_hermite_polynomial_he_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_i0e_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_i1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_i1e_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_laguerre_polynomial_l_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_legendre_polynomial_p_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_log_ndtr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_modified_bessel_i0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_modified_bessel_i1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_modified_bessel_k0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_modified_bessel_k1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_ndtr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_ndtri_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_scaled_modified_bessel_k0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_scaled_modified_bessel_k1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_shifted_chebyshev_polynomial_w_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_spherical_bessel_j0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_xlog1py_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_zeta_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_split_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_split_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_split_list_args_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_split_list_args_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_split_with_sizes_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_split_with_sizes_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_split_with_sizes_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_split_with_sizes_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_sqrt_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_sqrt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_square_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_square_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_squeeze_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_squeeze_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_squeeze_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_squeeze_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_squeeze_multiple_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_squeeze_multiple_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_stack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_stack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_std_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_std_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_std_mean_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_std_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_std_mean_unbiased_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_std_mean_unbiased_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_std_unbiased_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_std_unbiased_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_stft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_stft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_sub_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_sub_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_sum_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_sum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_sum_to_size_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_sum_to_size_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_svd_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_svd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_svd_lowrank_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_svd_lowrank_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_t_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_t_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_t_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_t_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_take_along_dim_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_take_along_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_take_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_take_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_tan_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_tan_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_tanh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_tanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_tensor_split_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_tensor_split_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_tensordot_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_tensordot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_tile_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_tile_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_to_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_to_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_to_sparse_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_to_sparse_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_topk_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_torch_ops_aten__efficient_attention_forward_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_trace_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_trace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_transpose_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_transpose_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_transpose_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_transpose_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_trapezoid_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_trapezoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_trapz_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_trapz_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_triangular_solve_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_triangular_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_tril_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_tril_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_triu_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_triu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_true_divide_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_true_divide_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_trunc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_unbind_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_unbind_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_unbind_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_unbind_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_unflatten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_unflatten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_unfold_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_unfold_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_unfold_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_unfold_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_uniform_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_uniform_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_unique_consecutive_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_unique_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_unsafe_chunk_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_unsafe_chunk_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_unsafe_split_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_unsafe_split_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_unsqueeze_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_unsqueeze_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_unsqueeze_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_unsqueeze_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_var_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_var_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_var_mean_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_var_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_var_mean_unbiased_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_var_mean_unbiased_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_var_unbiased_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_var_unbiased_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_vdot_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_vdot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_view_as_complex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_view_as_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_view_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_view_as_real_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_view_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_view_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_view_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_view_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_vsplit_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_vsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_vstack_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_vstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_where_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_where_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_xlogy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_zero__cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_zero__cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_zeros_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_zeros_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_zeros_like_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_zeros_like_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_H_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_T_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward___getitem___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward___radd___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward___rdiv___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward___rmatmul___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward___rmod___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward___rmul___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward___rpow___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward___rsub___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward__batch_norm_with_update_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward__native_batch_norm_legit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward__segment_reduce_lengths_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_backward__segment_reduce_offsets_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward__softmax_backward_data_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward__unsafe_masked_index_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward__unsafe_masked_index_put_accumulate_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward__upsample_bilinear2d_aa_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_abs_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_acos_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_acosh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_add_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_addbmm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_addcdiv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_addcmul_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_addmm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_addmm_decomposed_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_addmv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_addr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_alias_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_amax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_amin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_angle_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_as_strided_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_as_strided_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_backward_as_strided_partial_views_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_as_strided_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_asin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_asinh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_atan2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_atan_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_atanh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_atleast_1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_atleast_2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_atleast_3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_baddbmm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_bernoulli_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_bfloat16_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_block_diag_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_bmm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_broadcast_tensors_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_broadcast_to_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_cartesian_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_cat_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_cdist_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_cdouble_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_ceil_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_cfloat_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_backward_chalf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_cholesky_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_cholesky_inverse_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_cholesky_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_chunk_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_clamp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_clamp_max_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_clamp_min_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_clone_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_column_stack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_combinations_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_complex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_conj_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_conj_physical_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_constant_pad_nd_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_contiguous_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_copysign_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_corrcoef_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_cos_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_cosh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_cov_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_cross_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_cummax_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_backward_cummin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_cumprod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_cumsum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_cumulative_trapezoid_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_deg2rad_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_diag_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_diag_embed_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_diagflat_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_diagonal_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_diagonal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_diagonal_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_diff_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_digamma_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_dist_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_div_floor_rounding_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_div_no_rounding_mode_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_div_trunc_rounding_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_dot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_double_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_dsplit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_dstack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_einsum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_erf_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_backward_erfc_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_erfinv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_exp2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_exp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_expand_as_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_expand_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_expand_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_expm1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fft_fft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fft_fft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fft_fftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fft_fftshift_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fft_hfft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fft_hfft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fft_hfftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fft_ifft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fft_ifft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fft_ifftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fft_ifftshift_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fft_ihfft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fft_ihfft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fft_ihfftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fft_irfft2_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fft_irfft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fft_irfftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fft_rfft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fft_rfft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fft_rfftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fill_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_flatten_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_flip_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fliplr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_flipud_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_float_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_float_power_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_floor_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fmin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fmod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_frac_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_frexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_gather_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_gradient_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_grid_sampler_2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_grid_sampler_3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_half_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_hsplit_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_backward_hstack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_hypot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_i0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_index_add_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_index_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_index_fill_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_index_put_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_index_reduce_amax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_index_reduce_amin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_index_reduce_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_index_reduce_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_index_select_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_inner_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_kron_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_kthvalue_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_ldexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_lerp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_lgamma_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_cholesky_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_cholesky_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_cond_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_cross_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_det_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_diagonal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_eig_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_eigh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_eigvals_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_eigvalsh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_householder_product_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_inv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_inv_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_lstsq_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_lstsq_grad_oriented_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_lu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_lu_factor_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_lu_factor_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_lu_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_matrix_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_matrix_power_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_multi_dot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_norm_subgradients_at_zero_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_pinv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_pinv_hermitian_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_pinv_singular_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_qr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_slogdet_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_solve_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_solve_triangular_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_svd_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_svdvals_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_tensorinv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_tensorsolve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_vander_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_vecdot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_vector_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_log10_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_log1p_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_log2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_log_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_log_softmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_logaddexp2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_logaddexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_logcumsumexp_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_backward_logdet_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_logit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_logsumexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_lu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_lu_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_lu_unpack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_mH_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_mT_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_masked_amax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_masked_amin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_masked_cumprod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_masked_cumsum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_masked_fill_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_masked_log_softmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_masked_logaddexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_masked_logsumexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_masked_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_masked_median_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_masked_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_masked_normalize_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_masked_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_masked_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_masked_select_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_backward_masked_softmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_masked_softmin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_masked_std_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_masked_sum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_masked_var_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_matmul_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_matrix_exp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_max_binary_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_max_pool2d_with_indices_backward_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_max_reduction_no_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_max_reduction_with_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_maximum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_median_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_meshgrid_list_of_tensors_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_meshgrid_variadic_tensors_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_min_binary_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_min_reduction_no_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_min_reduction_with_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_minimum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_mm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_mode_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_backward_movedim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_msort_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_mul_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_mv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nan_to_num_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nanmean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nanmedian_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nanquantile_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nansum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_narrow_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_native_batch_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_native_dropout_backward_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_native_layer_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_neg_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_adaptive_max_pool1d_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_alpha_dropout_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_avg_pool1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_avg_pool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_avg_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_batch_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_batch_norm_without_cudnn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_bilinear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_binary_cross_entropy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_celu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_channel_shuffle_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_conv1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_conv2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_conv3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_conv_transpose1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_conv_transpose2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_conv_transpose3d_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_cosine_embedding_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_cosine_similarity_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_cross_entropy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_ctc_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_dropout2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_dropout3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_dropout_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_elu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_embedding_bag_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_embedding_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_fractional_max_pool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_fractional_max_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_gaussian_nll_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_gelu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_glu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_grid_sample_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_group_norm_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_hardshrink_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_hardsigmoid_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_hardswish_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_hardtanh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_hinge_embedding_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_huber_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_instance_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_interpolate_area_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_interpolate_bicubic_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_interpolate_bilinear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_interpolate_linear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_interpolate_nearest_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_interpolate_trilinear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_kl_div_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_l1_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_layer_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_leaky_relu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_linear_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_local_response_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_logsigmoid_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_margin_ranking_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_max_pool1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_max_pool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_max_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_max_unpool1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_max_unpool1d_grad_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_max_unpool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_max_unpool2d_grad_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_max_unpool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_max_unpool3d_grad_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_mish_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_mse_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_multi_head_attention_forward_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_multi_margin_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_multilabel_margin_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_nll_loss_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_normalize_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_pad_circular_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_pad_constant_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_pad_reflect_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_pad_replicate_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_pad_replicate_negative_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_pairwise_distance_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_pdist_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_pixel_shuffle_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_pixel_unshuffle_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_poisson_nll_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_prelu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_relu6_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_relu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_rms_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_rrelu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_selu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_silu_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_smooth_l1_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_soft_margin_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_softmin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_softmin_with_dtype_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_softplus_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_softshrink_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_softsign_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_tanhshrink_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_threshold_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_triplet_margin_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_unfold_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_upsample_bilinear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_upsample_nearest_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_norm_fro_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_norm_inf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_norm_nuc_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_normal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_normal_number_mean_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_backward_ormqr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_outer_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_pca_lowrank_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_permute_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_permute_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_pinverse_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_polar_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_polygamma_polygamma_n_0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_polygamma_polygamma_n_1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_polygamma_polygamma_n_2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_polygamma_polygamma_n_3_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_polygamma_polygamma_n_4_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_positive_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_pow_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_put_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_qr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_quantile_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_rad2deg_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_ravel_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_real_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_reciprocal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_remainder_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_backward_renorm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_repeat_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_repeat_interleave_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_reshape_as_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_reshape_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_resolve_conj_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_resolve_neg_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_roll_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_rot90_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_round_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_round_decimals_0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_round_decimals_3_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_round_decimals_neg_3_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_rsqrt_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_rsub_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_scatter_add_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_scatter_reduce_amax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_scatter_reduce_amin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_scatter_reduce_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_scatter_reduce_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_scatter_reduce_sum_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_backward_select_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_select_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_sgn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_sigmoid_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_sign_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_sin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_sinc_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_sinh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_slice_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_slice_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_softmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_sort_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_sparse_mm_reduce_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_sparse_sampled_addmm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_special_entr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_special_erfcx_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_special_i0e_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_special_i1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_special_i1e_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_special_log_ndtr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_special_ndtr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_special_ndtri_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_backward_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_special_xlog1py_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_split_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_split_list_args_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_split_with_sizes_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_split_with_sizes_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_sqrt_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_square_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_squeeze_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_squeeze_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_squeeze_multiple_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_stack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_std_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_std_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_std_mean_unbiased_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_std_unbiased_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_stft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_sub_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_sum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_sum_to_size_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_svd_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_svd_lowrank_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_t_copy_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_backward_t_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_take_along_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_take_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_tan_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_tanh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_tensor_split_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_tensordot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_tile_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_to_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_to_sparse_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_topk_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_torch_ops_aten__efficient_attention_forward_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_trace_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_transpose_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_transpose_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_trapezoid_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_trapz_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_triangular_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_tril_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_triu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_true_divide_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_trunc_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_backward_unbind_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_unbind_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_unflatten_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_unfold_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_unfold_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_unsafe_chunk_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_unsafe_split_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_unsqueeze_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_unsqueeze_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_var_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_var_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_var_mean_unbiased_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_var_unbiased_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_vdot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_view_as_complex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_view_as_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_view_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_view_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_vsplit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_vstack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_where_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_xlogy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_zero__cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_H_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_T_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input___getitem___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input___radd___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input___rdiv___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input___rmatmul___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input___rmod___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input___rmul___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input___rpow___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input___rsub___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input__batch_norm_with_update_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input__chunk_cat_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input__native_batch_norm_legit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input__segment_reduce_lengths_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input__segment_reduce_offsets_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input__softmax_backward_data_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input__unsafe_masked_index_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input__unsafe_masked_index_put_accumulate_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input__upsample_bilinear2d_aa_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_abs_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_acos_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_acosh_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_add_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_addbmm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_addcdiv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_addcmul_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_addmm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_addmm_decomposed_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_addmv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_addr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_alias_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_all_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_allclose_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_amax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_amin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_aminmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_angle_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_any_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_arange_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_argmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_argmin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_argsort_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_argwhere_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_as_strided_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_as_strided_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_as_strided_partial_views_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_as_strided_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_asin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_asinh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_atan2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_atan_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_atanh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_atleast_1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_atleast_2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_atleast_3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_baddbmm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_bernoulli_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_bfloat16_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_block_diag_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_bmm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_bool_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_broadcast_shapes_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_broadcast_tensors_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_broadcast_to_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_bucketize_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_byte_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_cartesian_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_cat_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_cauchy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_cdist_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_cdouble_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_ceil_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_cfloat_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_chalf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_char_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_cholesky_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_cholesky_inverse_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_cholesky_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_chunk_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_clamp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_clamp_max_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_clamp_min_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_clone_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_column_stack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_combinations_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_complex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_conj_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_conj_physical_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_constant_pad_nd_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_contiguous_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_copysign_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_corrcoef_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_cos_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_cosh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_count_nonzero_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_cov_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_cross_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_cummax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_cummin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_cumprod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_cumsum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_cumulative_trapezoid_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_deg2rad_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_diag_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_diag_embed_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_diagflat_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_diagonal_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_diagonal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_diagonal_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_diff_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_digamma_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_dist_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_div_floor_rounding_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_div_no_rounding_mode_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_div_trunc_rounding_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_dot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_double_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_dsplit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_dstack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_einsum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_empty_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_empty_like_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_empty_permuted_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_empty_strided_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_eq_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_equal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_erf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_erfc_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_erfinv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_exp2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_exp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_expand_as_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_expand_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_expand_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_expm1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_exponential_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_eye_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fft_fft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fft_fft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fft_fftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fft_fftshift_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fft_hfft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fft_hfft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fft_hfftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fft_ifft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fft_ifft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fft_ifftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fft_ifftshift_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fft_ihfft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fft_ihfft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fft_ihfftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fft_irfft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fft_irfft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fft_irfftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fft_rfft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fft_rfft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fft_rfftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fill_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_flatten_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_flip_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fliplr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_flipud_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_float_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_float_power_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_floor_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_floor_divide_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fmin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fmod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_frac_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_frexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_full_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_full_like_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_gather_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_ge_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_geometric_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_geqrf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_gradient_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_grid_sampler_2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_grid_sampler_3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_gt_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_half_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_hash_tensor_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_heaviside_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_histc_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_hsplit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_hstack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_hypot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_i0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_igamma_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_igammac_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_index_add_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_index_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_index_fill_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_index_put_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_index_reduce_amax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_index_reduce_amin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_index_reduce_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_index_reduce_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_index_select_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_inner_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_int_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_isclose_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_isfinite_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_isin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_isinf_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_isnan_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_isneginf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_isposinf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_isreal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_item_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_jiterator_2inputs_2outputs_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_jiterator_4inputs_with_extra_args_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_jiterator_binary_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_jiterator_binary_return_by_ref_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_jiterator_unary_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_kron_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_kthvalue_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_ldexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_le_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_lerp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_lgamma_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_cholesky_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_cholesky_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_cond_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_cross_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_det_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_diagonal_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_eig_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_eigh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_eigvals_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_eigvalsh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_householder_product_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_inv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_inv_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_ldl_factor_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_ldl_factor_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_ldl_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_lstsq_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_lstsq_grad_oriented_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_lu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_lu_factor_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_lu_factor_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_lu_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_matrix_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_matrix_power_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_matrix_rank_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_matrix_rank_hermitian_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_multi_dot_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_norm_subgradients_at_zero_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_pinv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_pinv_hermitian_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_pinv_singular_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_qr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_slogdet_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_solve_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_solve_triangular_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_svd_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_svdvals_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_tensorinv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_tensorsolve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_vander_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_vecdot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_vector_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linspace_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linspace_tensor_overload_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_log10_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_log1p_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_log2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_log_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_log_normal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_log_softmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_logaddexp2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_logaddexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_logcumsumexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_logdet_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_logical_and_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_logical_not_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_logical_or_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_logical_xor_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_logit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_logspace_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_logspace_tensor_overload_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_logsumexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_long_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_lt_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_lu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_lu_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_lu_unpack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_mH_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_mT_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_amax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_amin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_argmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_argmin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_cumprod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_cumsum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_fill_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_log_softmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_logaddexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_logsumexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_median_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_normalize_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_select_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_softmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_softmin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_std_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_sum_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_var_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_matmul_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_matrix_exp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_max_binary_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_max_pool2d_with_indices_backward_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_max_reduction_no_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_max_reduction_with_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_maximum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_median_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_meshgrid_list_of_tensors_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_meshgrid_variadic_tensors_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_min_binary_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_min_reduction_no_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_min_reduction_with_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_minimum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_mm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_mode_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_movedim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_msort_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_mul_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_multinomial_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_mv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nan_to_num_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nanmean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nanmedian_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nanquantile_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nansum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_narrow_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_narrow_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_native_batch_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_native_dropout_backward_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_native_layer_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_ne_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_neg_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_new_empty_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_new_empty_strided_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_new_full_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_new_ones_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_new_zeros_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nextafter_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_alpha_dropout_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_avg_pool1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_avg_pool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_avg_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_batch_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_batch_norm_without_cudnn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_bilinear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_binary_cross_entropy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_celu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_channel_shuffle_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_conv1d_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_conv2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_conv3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_conv_transpose1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_conv_transpose2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_conv_transpose3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_cosine_embedding_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_cosine_similarity_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_cross_entropy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_ctc_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_dropout2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_dropout3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_dropout_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_elu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_embedding_bag_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_embedding_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_fractional_max_pool2d_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_fractional_max_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_gaussian_nll_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_gelu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_glu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_grid_sample_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_group_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_hardshrink_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_hardsigmoid_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_hardswish_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_hardtanh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_hinge_embedding_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_huber_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_instance_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_interpolate_area_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_interpolate_bicubic_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_interpolate_bilinear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_interpolate_linear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_interpolate_nearest_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_interpolate_trilinear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_kl_div_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_l1_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_layer_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_leaky_relu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_linear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_local_response_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_logsigmoid_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_margin_ranking_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_max_pool1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_max_pool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_max_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_max_unpool1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_max_unpool1d_grad_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_max_unpool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_max_unpool2d_grad_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_max_unpool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_max_unpool3d_grad_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_mish_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_mse_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_multi_head_attention_forward_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_multi_margin_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_multilabel_margin_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_nll_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_normalize_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_pad_circular_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_pad_constant_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_pad_reflect_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_pad_replicate_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_pad_replicate_negative_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_pairwise_distance_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_pdist_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_pixel_shuffle_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_pixel_unshuffle_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_poisson_nll_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_prelu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_relu6_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_relu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_rms_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_rrelu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_selu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_silu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_smooth_l1_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_soft_margin_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_softmin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_softmin_with_dtype_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_softplus_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_softshrink_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_softsign_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_tanhshrink_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_threshold_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_triplet_margin_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_unfold_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_upsample_bilinear_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_upsample_nearest_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nonzero_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nonzero_static_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_norm_fro_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_norm_inf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_norm_nuc_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_normal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_normal_in_place_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_normal_number_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_ones_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_ones_like_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_ormqr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_outer_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_pca_lowrank_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_permute_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_permute_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_pinverse_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_polar_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_polygamma_polygamma_n_0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_polygamma_polygamma_n_1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_polygamma_polygamma_n_2_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_polygamma_polygamma_n_3_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_polygamma_polygamma_n_4_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_positive_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_pow_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_put_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_qr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_quantile_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_rad2deg_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_rand_like_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_randint_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_randint_like_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_randn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_randn_like_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_ravel_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_real_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_reciprocal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_remainder_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_renorm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_repeat_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_repeat_interleave_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_reshape_as_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_reshape_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_resize__cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_resize_as__cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_resolve_conj_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_resolve_neg_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_roll_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_rot90_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_round_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_round_decimals_0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_round_decimals_3_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_round_decimals_neg_3_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_rsqrt_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_rsub_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_scalar_tensor_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_scatter_add_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_scatter_reduce_amax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_scatter_reduce_amin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_scatter_reduce_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_scatter_reduce_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_scatter_reduce_sum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_searchsorted_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_select_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_select_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_sgn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_short_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_sigmoid_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_sign_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_signal_windows_bartlett_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_signal_windows_blackman_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_signal_windows_cosine_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_signal_windows_exponential_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_signal_windows_gaussian_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_signal_windows_general_cosine_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_signal_windows_general_hamming_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_signal_windows_hamming_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_signal_windows_hann_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_signal_windows_kaiser_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_signal_windows_nuttall_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_signbit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_sin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_sinc_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_sinh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_slice_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_slice_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_softmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_sort_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_sparse_mm_reduce_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_sparse_sampled_addmm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_airy_ai_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_bessel_j0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_bessel_j1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_bessel_y0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_bessel_y1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_chebyshev_polynomial_v_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_entr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_erfcx_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_hermite_polynomial_h_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_hermite_polynomial_he_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_i0e_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_i1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_i1e_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_laguerre_polynomial_l_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_legendre_polynomial_p_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_log_ndtr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_modified_bessel_i0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_modified_bessel_i1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_modified_bessel_k0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_modified_bessel_k1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_ndtr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_ndtri_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_scaled_modified_bessel_k0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_scaled_modified_bessel_k1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_spherical_bessel_j0_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_xlog1py_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_zeta_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_split_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_split_list_args_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_split_with_sizes_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_split_with_sizes_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_sqrt_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_square_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_squeeze_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_squeeze_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_squeeze_multiple_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_stack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_std_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_std_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_std_mean_unbiased_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_std_unbiased_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_stft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_sub_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_sum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_sum_to_size_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_svd_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_svd_lowrank_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_t_copy_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_t_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_take_along_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_take_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_tan_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_tanh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_tensor_split_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_tensordot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_tile_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_to_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_to_sparse_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_topk_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_torch_ops_aten__efficient_attention_forward_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_trace_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_transpose_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_transpose_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_trapezoid_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_trapz_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_triangular_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_tril_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_triu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_true_divide_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_trunc_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_unbind_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_unbind_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_unflatten_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_unfold_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_unfold_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_uniform_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_unique_consecutive_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_unique_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_unsafe_chunk_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_unsafe_split_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_unsqueeze_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_unsqueeze_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_var_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_var_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_var_mean_unbiased_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_var_unbiased_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_vdot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_view_as_complex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_view_as_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_view_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_view_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_vsplit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_vstack_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_where_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_xlogy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_zero__cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_zeros_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_zeros_like_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_H_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_T_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad___getitem___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad___radd___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad___rdiv___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad___rmatmul___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad___rmod___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad___rmul___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad___rpow___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad___rsub___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad__batch_norm_with_update_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad__chunk_cat_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad__native_batch_norm_legit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad__segment_reduce_lengths_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad__segment_reduce_offsets_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad__softmax_backward_data_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad__unsafe_masked_index_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad__unsafe_masked_index_put_accumulate_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad__upsample_bilinear2d_aa_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_abs_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_acos_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_acosh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_add_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_addbmm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_addcdiv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_addcmul_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_addmm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_addmm_decomposed_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_addmv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_addr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_alias_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_all_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_allclose_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_amax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_amin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_aminmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_angle_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_any_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_arange_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_argmax_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_argmin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_argsort_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_argwhere_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_as_strided_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_as_strided_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_as_strided_partial_views_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_as_strided_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_asin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_asinh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_atan2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_atan_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_atanh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_atleast_1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_atleast_2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_atleast_3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_baddbmm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_bernoulli_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_bfloat16_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_block_diag_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_bmm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_bool_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_broadcast_shapes_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_broadcast_tensors_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_broadcast_to_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_bucketize_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_byte_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_cartesian_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_cat_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_cauchy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_cdist_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_cdouble_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_ceil_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_cfloat_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_chalf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_char_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_cholesky_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_cholesky_inverse_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_cholesky_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_chunk_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_clamp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_clamp_max_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_clamp_min_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_clone_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_column_stack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_combinations_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_complex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_conj_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_conj_physical_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_constant_pad_nd_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_contiguous_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_copysign_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_corrcoef_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_cos_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_cosh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_count_nonzero_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_cov_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_cross_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_cummax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_cummin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_cumprod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_cumsum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_cumulative_trapezoid_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_deg2rad_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_diag_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_diag_embed_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_diagflat_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_diagonal_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_diagonal_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_diagonal_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_diff_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_digamma_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_dist_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_div_floor_rounding_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_div_no_rounding_mode_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_div_trunc_rounding_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_dot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_double_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_dsplit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_dstack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_einsum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_empty_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_empty_like_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_empty_permuted_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_empty_strided_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_eq_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_equal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_erf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_erfc_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_erfinv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_exp2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_exp_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_expand_as_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_expand_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_expand_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_expm1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_exponential_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_eye_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_fft_fft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_fft_fft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_fft_fftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_fft_fftshift_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_fft_hfft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_fft_hfft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_fft_hfftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_fft_ifft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_fft_ifft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_fft_ifftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_fft_ifftshift_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_fft_ihfft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_fft_ihfft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_fft_ihfftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_fft_irfft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_fft_irfft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_fft_irfftn_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_fft_rfft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_fft_rfft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_fft_rfftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_fill_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_flatten_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_flip_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_fliplr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_flipud_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_float_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_float_power_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_floor_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_floor_divide_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_fmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_fmin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_fmod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_frac_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_frexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_full_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_full_like_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_gather_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_ge_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_geometric_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_geqrf_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_gradient_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_grid_sampler_2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_grid_sampler_3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_gt_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_half_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_hash_tensor_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_heaviside_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_histc_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_hsplit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_hstack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_hypot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_i0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_igamma_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_igammac_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_index_add_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_index_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_index_fill_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_index_put_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_index_reduce_amax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_index_reduce_amin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_index_reduce_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_index_reduce_prod_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_index_select_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_inner_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_int_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_isclose_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_isfinite_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_isin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_isinf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_isnan_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_isneginf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_isposinf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_isreal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_item_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_jiterator_2inputs_2outputs_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_jiterator_4inputs_with_extra_args_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_jiterator_binary_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_jiterator_binary_return_by_ref_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_jiterator_unary_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_kron_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_kthvalue_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_ldexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_le_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_lerp_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_lgamma_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_cholesky_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_cholesky_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_cond_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_cross_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_det_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_diagonal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_eig_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_eigh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_eigvals_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_eigvalsh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_householder_product_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_inv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_inv_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_ldl_factor_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_ldl_factor_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_ldl_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_lstsq_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_lstsq_grad_oriented_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_lu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_lu_factor_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_lu_factor_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_lu_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_matrix_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_matrix_power_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_matrix_rank_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_matrix_rank_hermitian_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_multi_dot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_norm_subgradients_at_zero_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_pinv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_pinv_hermitian_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_pinv_singular_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_qr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_slogdet_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_solve_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_solve_triangular_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_svd_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_svdvals_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_tensorinv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_tensorsolve_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_vander_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_vecdot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_vector_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linspace_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linspace_tensor_overload_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_log10_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_log1p_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_log2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_log_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_log_normal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_log_softmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_logaddexp2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_logaddexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_logcumsumexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_logdet_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_logical_and_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_logical_not_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_logical_or_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_logical_xor_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_logit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_logspace_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_logspace_tensor_overload_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_logsumexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_long_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_lt_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_lu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_lu_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_lu_unpack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_mH_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_mT_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_masked_amax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_masked_amin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_masked_argmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_masked_argmin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_masked_cumprod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_masked_cumsum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_masked_fill_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_masked_log_softmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_masked_logaddexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_masked_logsumexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_masked_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_masked_median_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_masked_norm_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_masked_normalize_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_masked_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_masked_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_masked_select_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_masked_softmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_masked_softmin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_masked_std_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_masked_sum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_masked_var_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_matmul_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_matrix_exp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_max_binary_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_max_pool2d_with_indices_backward_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_max_reduction_no_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_max_reduction_with_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_maximum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_median_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_meshgrid_list_of_tensors_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_meshgrid_variadic_tensors_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_min_binary_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_min_reduction_no_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_min_reduction_with_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_minimum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_mm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_mode_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_movedim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_msort_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_mul_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_multinomial_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_mv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nan_to_num_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nanmean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nanmedian_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nanquantile_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nansum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_narrow_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_narrow_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_native_batch_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_native_dropout_backward_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_native_layer_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_ne_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_neg_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_new_empty_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_new_empty_strided_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_new_full_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_new_ones_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_new_zeros_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nextafter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_alpha_dropout_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_avg_pool1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_avg_pool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_avg_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_batch_norm_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_batch_norm_without_cudnn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_bilinear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_binary_cross_entropy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_celu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_channel_shuffle_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_conv1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_conv2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_conv3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_conv_transpose1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_conv_transpose2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_conv_transpose3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_cosine_embedding_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_cosine_similarity_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_cross_entropy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_ctc_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_dropout2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_dropout3d_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_dropout_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_elu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_embedding_bag_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_embedding_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_fractional_max_pool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_fractional_max_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_gaussian_nll_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_gelu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_glu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_grid_sample_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_group_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_hardshrink_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_hardsigmoid_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_hardswish_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_hardtanh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_hinge_embedding_loss_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_huber_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_instance_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_interpolate_area_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_interpolate_bicubic_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_interpolate_bilinear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_interpolate_linear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_interpolate_nearest_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_interpolate_trilinear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_kl_div_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_l1_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_layer_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_leaky_relu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_linear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_local_response_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_logsigmoid_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_margin_ranking_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_max_pool1d_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_max_pool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_max_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_max_unpool1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_max_unpool1d_grad_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_max_unpool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_max_unpool2d_grad_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_max_unpool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_max_unpool3d_grad_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_mish_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_mse_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_multi_head_attention_forward_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_multi_margin_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_multilabel_margin_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_nll_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_normalize_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_pad_circular_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_pad_constant_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_pad_reflect_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_pad_replicate_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_pad_replicate_negative_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_pairwise_distance_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_pdist_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_pixel_shuffle_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_pixel_unshuffle_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_poisson_nll_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_prelu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_relu6_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_relu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_rms_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_rrelu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_selu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_silu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_smooth_l1_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_soft_margin_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_softmin_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_softmin_with_dtype_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_softplus_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_softshrink_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_softsign_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_tanhshrink_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_threshold_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_triplet_margin_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_unfold_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_upsample_bilinear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_upsample_nearest_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nonzero_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nonzero_static_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_norm_fro_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_norm_inf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_norm_nuc_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_normal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_normal_in_place_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_normal_number_mean_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_ones_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_ones_like_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_ormqr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_outer_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_pca_lowrank_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_permute_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_permute_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_pinverse_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_polar_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_polygamma_polygamma_n_0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_polygamma_polygamma_n_1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_polygamma_polygamma_n_2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_polygamma_polygamma_n_3_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_polygamma_polygamma_n_4_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_positive_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_pow_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_put_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_qr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_quantile_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_rad2deg_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_rand_like_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_randint_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_randint_like_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_randn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_randn_like_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_ravel_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_real_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_reciprocal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_remainder_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_renorm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_repeat_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_repeat_interleave_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_reshape_as_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_reshape_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_resize__cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_resize_as__cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_resolve_conj_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_resolve_neg_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_roll_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_rot90_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_round_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_round_decimals_0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_round_decimals_3_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_round_decimals_neg_3_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_rsqrt_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_rsub_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_scalar_tensor_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_scatter_add_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_scatter_reduce_amax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_scatter_reduce_amin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_scatter_reduce_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_scatter_reduce_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_scatter_reduce_sum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_searchsorted_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_select_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_select_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_sgn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_short_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_sigmoid_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_sign_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_signal_windows_bartlett_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_signal_windows_blackman_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_signal_windows_cosine_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_signal_windows_exponential_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_signal_windows_gaussian_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_signal_windows_general_cosine_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_signal_windows_general_hamming_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_signal_windows_hamming_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_signal_windows_hann_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_signal_windows_kaiser_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_signal_windows_nuttall_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_signbit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_sin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_sinc_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_sinh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_slice_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_slice_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_softmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_sort_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_sparse_mm_reduce_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_sparse_sampled_addmm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_airy_ai_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_bessel_j0_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_bessel_j1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_bessel_y0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_bessel_y1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_chebyshev_polynomial_v_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_entr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_erfcx_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_hermite_polynomial_h_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_hermite_polynomial_he_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_i0e_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_i1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_i1e_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_laguerre_polynomial_l_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_legendre_polynomial_p_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_log_ndtr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_modified_bessel_i0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_modified_bessel_i1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_modified_bessel_k0_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_modified_bessel_k1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_ndtr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_ndtri_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_scaled_modified_bessel_k0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_scaled_modified_bessel_k1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_spherical_bessel_j0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_xlog1py_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_zeta_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_split_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_split_list_args_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_split_with_sizes_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_split_with_sizes_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_sqrt_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_square_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_squeeze_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_squeeze_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_squeeze_multiple_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_stack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_std_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_std_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_std_mean_unbiased_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_std_unbiased_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_stft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_sub_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_sum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_sum_to_size_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_svd_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_svd_lowrank_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_t_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_t_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_take_along_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_take_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_tan_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_tanh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_tensor_split_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_tensordot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_tile_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_to_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_to_sparse_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_topk_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_torch_ops_aten__efficient_attention_forward_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_trace_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_transpose_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_transpose_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_trapezoid_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_trapz_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_triangular_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_tril_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_triu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_true_divide_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_trunc_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_unbind_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_unbind_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_unflatten_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_unfold_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_unfold_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_uniform_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_unique_consecutive_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_unique_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_unsafe_chunk_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_unsafe_split_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_unsqueeze_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_unsqueeze_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_var_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_var_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_var_mean_unbiased_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_var_unbiased_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_vdot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_view_as_complex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_view_as_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_view_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_view_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_vsplit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_vstack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_where_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_xlogy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_zero__cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_zeros_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_zeros_like_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_H_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_T_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_operator___getitem___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator___radd___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator___rdiv___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator___rmatmul___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator___rmod___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator___rmul___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator___rpow___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator___rsub___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator__batch_norm_with_update_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator__chunk_cat_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator__native_batch_norm_legit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator__segment_reduce_lengths_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator__segment_reduce_offsets_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator__softmax_backward_data_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator__unsafe_masked_index_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator__unsafe_masked_index_put_accumulate_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator__upsample_bilinear2d_aa_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_abs_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_acos_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_acosh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_add_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_addbmm_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_operator_addcdiv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_addcmul_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_addmm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_addmm_decomposed_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_addmv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_addr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_alias_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_all_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_allclose_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_amax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_amin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_aminmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_angle_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_any_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_arange_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_argmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_argmin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_argsort_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_argwhere_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_as_strided_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_as_strided_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_as_strided_partial_views_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_as_strided_scatter_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_operator_asin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_asinh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_atan2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_atan_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_atanh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_atleast_1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_atleast_2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_atleast_3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_baddbmm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_bernoulli_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_bfloat16_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_block_diag_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_bmm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_bool_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_broadcast_shapes_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_broadcast_tensors_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_broadcast_to_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_bucketize_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_byte_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_cartesian_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_cat_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_cauchy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_cdist_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_operator_cdouble_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_ceil_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_cfloat_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_chalf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_char_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_cholesky_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_cholesky_inverse_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_cholesky_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_chunk_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_clamp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_clamp_max_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_clamp_min_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_clone_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_column_stack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_combinations_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_complex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_conj_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_conj_physical_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_constant_pad_nd_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_contiguous_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_copysign_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_corrcoef_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_cos_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_operator_cosh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_count_nonzero_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_cov_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_cross_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_cummax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_cummin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_cumprod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_cumsum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_cumulative_trapezoid_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_deg2rad_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_diag_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_diag_embed_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_diagflat_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_diagonal_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_diagonal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_diagonal_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_diff_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_digamma_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_dist_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_div_floor_rounding_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_div_no_rounding_mode_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_div_trunc_rounding_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_dot_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_operator_double_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_dsplit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_dstack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_einsum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_empty_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_empty_like_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_empty_permuted_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_empty_strided_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_eq_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_equal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_erf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_erfc_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_erfinv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_exp2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_exp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_expand_as_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_expand_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_expand_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_expm1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_exponential_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_eye_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fft_fft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fft_fft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fft_fftn_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fft_fftshift_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fft_hfft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fft_hfft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fft_hfftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fft_ifft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fft_ifft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fft_ifftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fft_ifftshift_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fft_ihfft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fft_ihfft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fft_ihfftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fft_irfft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fft_irfft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fft_irfftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fft_rfft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fft_rfft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fft_rfftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fill_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_flatten_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_flip_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fliplr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_flipud_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_float_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_operator_float_power_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_floor_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_floor_divide_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fmin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fmod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_frac_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_frexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_full_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_full_like_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_gather_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_ge_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_geometric_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_geqrf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_gradient_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_grid_sampler_2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_grid_sampler_3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_gt_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_half_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_hash_tensor_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_heaviside_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_histc_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_hsplit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_hstack_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_operator_hypot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_i0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_igamma_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_igammac_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_index_add_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_index_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_index_fill_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_index_put_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_index_reduce_amax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_index_reduce_amin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_index_reduce_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_index_reduce_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_index_select_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_inner_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_int_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_isclose_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_isfinite_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_isin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_isinf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_isnan_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_isneginf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_isposinf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_isreal_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_operator_item_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_jiterator_2inputs_2outputs_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_jiterator_4inputs_with_extra_args_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_jiterator_binary_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_jiterator_binary_return_by_ref_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_jiterator_unary_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_kron_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_kthvalue_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_ldexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_le_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_lerp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_lgamma_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_cholesky_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_cholesky_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_cond_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_cross_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_det_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_diagonal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_eig_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_eigh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_eigvals_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_eigvalsh_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_householder_product_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_inv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_inv_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_ldl_factor_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_ldl_factor_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_ldl_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_lstsq_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_lstsq_grad_oriented_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_lu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_lu_factor_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_lu_factor_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_lu_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_matrix_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_matrix_power_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_matrix_rank_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_matrix_rank_hermitian_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_multi_dot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_norm_subgradients_at_zero_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_pinv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_pinv_hermitian_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_pinv_singular_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_qr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_slogdet_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_solve_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_solve_triangular_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_svd_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_svdvals_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_tensorinv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_tensorsolve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_vander_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_vecdot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_vector_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linspace_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linspace_tensor_overload_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_log10_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_log1p_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_log2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_log_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_log_normal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_log_softmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_log_softmax_with_dtype_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_operator_logaddexp2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_logaddexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_logcumsumexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_logdet_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_logical_and_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_logical_not_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_logical_or_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_logical_xor_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_logit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_logspace_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_logspace_tensor_overload_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_logsumexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_long_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_lt_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_lu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_lu_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_lu_unpack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_mH_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_mT_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_masked_amax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_masked_amin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_masked_argmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_masked_argmin_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_operator_masked_cumprod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_masked_cumsum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_masked_fill_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_masked_log_softmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_masked_logaddexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_masked_logsumexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_masked_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_masked_median_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_masked_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_masked_normalize_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_masked_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_masked_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_masked_select_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_masked_softmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_masked_softmin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_masked_std_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_masked_sum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_masked_var_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_matmul_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_matrix_exp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_max_binary_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_max_pool2d_with_indices_backward_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_operator_max_reduction_no_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_max_reduction_with_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_maximum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_median_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_meshgrid_list_of_tensors_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_meshgrid_variadic_tensors_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_min_binary_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_min_reduction_no_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_min_reduction_with_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_minimum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_mm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_mode_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_movedim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_msort_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_mul_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_multinomial_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_mv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nan_to_num_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nanmean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nanmedian_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nanquantile_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nansum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_narrow_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_narrow_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_native_batch_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_native_dropout_backward_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_native_layer_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_ne_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_neg_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_new_empty_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_new_empty_strided_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_new_full_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_new_ones_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_new_zeros_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nextafter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_adaptive_max_pool1d_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_alpha_dropout_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_avg_pool1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_avg_pool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_avg_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_batch_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_batch_norm_without_cudnn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_bilinear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_binary_cross_entropy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_celu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_channel_shuffle_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_conv1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_conv2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_conv3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_conv_transpose1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_conv_transpose2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_conv_transpose3d_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_cosine_embedding_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_cosine_similarity_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_cross_entropy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_ctc_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_dropout2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_dropout3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_dropout_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_elu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_embedding_bag_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_embedding_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_fractional_max_pool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_fractional_max_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_gaussian_nll_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_gelu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_glu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_grid_sample_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_group_norm_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_hardshrink_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_hardsigmoid_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_hardswish_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_hardtanh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_hinge_embedding_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_huber_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_instance_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_interpolate_area_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_interpolate_bicubic_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_interpolate_bilinear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_interpolate_linear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_interpolate_nearest_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_interpolate_trilinear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_kl_div_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_l1_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_layer_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_leaky_relu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_linear_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_local_response_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_logsigmoid_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_margin_ranking_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_max_pool1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_max_pool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_max_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_max_unpool1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_max_unpool1d_grad_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_max_unpool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_max_unpool2d_grad_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_max_unpool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_max_unpool3d_grad_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_mish_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_mse_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_multi_head_attention_forward_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_multi_margin_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_multilabel_margin_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_nll_loss_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_normalize_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_pad_circular_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_pad_constant_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_pad_reflect_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_pad_replicate_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_pad_replicate_negative_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_pairwise_distance_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_pdist_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_pixel_shuffle_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_pixel_unshuffle_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_poisson_nll_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_prelu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_relu6_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_relu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_rms_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_rrelu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_selu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_silu_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_smooth_l1_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_soft_margin_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_softmin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_softmin_with_dtype_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_softplus_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_softshrink_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_softsign_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_tanhshrink_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_threshold_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_triplet_margin_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_unfold_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_upsample_bilinear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_upsample_nearest_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nonzero_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nonzero_static_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_norm_fro_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_norm_inf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_norm_nuc_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_operator_normal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_normal_in_place_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_normal_number_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_ones_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_ones_like_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_ormqr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_outer_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_pca_lowrank_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_permute_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_permute_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_pinverse_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_polar_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_polygamma_polygamma_n_0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_polygamma_polygamma_n_1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_polygamma_polygamma_n_2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_polygamma_polygamma_n_3_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_polygamma_polygamma_n_4_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_positive_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_pow_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_put_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_qr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_quantile_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_operator_rad2deg_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_rand_like_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_randint_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_randint_like_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_randn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_randn_like_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_ravel_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_real_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_reciprocal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_remainder_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_renorm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_repeat_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_repeat_interleave_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_reshape_as_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_reshape_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_resize__cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_resize_as__cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_resolve_conj_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_resolve_neg_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_roll_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_rot90_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_round_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_round_decimals_0_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_operator_round_decimals_3_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_round_decimals_neg_3_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_rsqrt_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_rsub_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_scalar_tensor_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_scatter_add_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_scatter_reduce_amax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_scatter_reduce_amin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_scatter_reduce_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_scatter_reduce_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_scatter_reduce_sum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_searchsorted_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_select_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_select_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_sgn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_short_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_sigmoid_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_sign_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_signal_windows_bartlett_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_signal_windows_blackman_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_signal_windows_cosine_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_operator_signal_windows_exponential_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_signal_windows_gaussian_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_signal_windows_general_cosine_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_signal_windows_general_hamming_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_signal_windows_hamming_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_signal_windows_hann_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_signal_windows_kaiser_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_signal_windows_nuttall_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_signbit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_sin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_sinc_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_sinh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_slice_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_slice_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_softmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_sort_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_sparse_mm_reduce_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_sparse_sampled_addmm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_airy_ai_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_bessel_j0_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_bessel_j1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_bessel_y0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_bessel_y1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_chebyshev_polynomial_v_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_entr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_erfcx_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_hermite_polynomial_h_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_hermite_polynomial_he_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_i0e_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_i1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_i1e_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_laguerre_polynomial_l_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_legendre_polynomial_p_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_log_ndtr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_modified_bessel_i0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_modified_bessel_i1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_modified_bessel_k0_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_modified_bessel_k1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_ndtr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_ndtri_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_scaled_modified_bessel_k0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_scaled_modified_bessel_k1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_spherical_bessel_j0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_xlog1py_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_zeta_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_split_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_split_list_args_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_split_with_sizes_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_split_with_sizes_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_sqrt_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_square_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_squeeze_copy_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_operator_squeeze_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_squeeze_multiple_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_stack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_std_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_std_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_std_mean_unbiased_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_std_unbiased_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_stft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_sub_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_sum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_sum_to_size_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_svd_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_svd_lowrank_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_t_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_t_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_take_along_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_take_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_tan_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_tanh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_tensor_split_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_tensordot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_tile_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_to_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_to_sparse_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_operator_topk_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_torch_ops_aten__efficient_attention_forward_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_trace_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_transpose_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_transpose_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_trapezoid_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_trapz_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_triangular_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_tril_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_triu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_true_divide_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_trunc_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_unbind_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_unbind_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_unflatten_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_unfold_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_unfold_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_uniform_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_unique_consecutive_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_unique_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_unsafe_chunk_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_operator_unsafe_split_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_unsqueeze_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_unsqueeze_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_var_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_var_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_var_mean_unbiased_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_var_unbiased_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_vdot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_view_as_complex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_view_as_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_view_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_view_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_vsplit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_vstack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_where_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_xlogy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_zero__cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_zeros_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_zeros_like_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_H_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_T_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay___getitem___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay___radd___cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay___rdiv___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay___rmatmul___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay___rmod___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay___rmul___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay___rpow___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay___rsub___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay__batch_norm_with_update_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay__chunk_cat_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay__native_batch_norm_legit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay__segment_reduce_lengths_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay__segment_reduce_offsets_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay__softmax_backward_data_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay__unsafe_masked_index_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay__unsafe_masked_index_put_accumulate_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay__upsample_bilinear2d_aa_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_abs_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_acos_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_acosh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_add_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_addbmm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_addcdiv_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_addcmul_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_addmm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_addmm_decomposed_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_addmv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_addr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_alias_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_all_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_allclose_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_amax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_amin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_aminmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_angle_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_any_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_arange_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_argmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_argmin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_argsort_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_argwhere_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_as_strided_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_as_strided_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_as_strided_partial_views_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_as_strided_scatter_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_asin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_asinh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_atan2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_atan_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_atanh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_atleast_1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_atleast_2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_atleast_3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_baddbmm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_bernoulli_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_bfloat16_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_block_diag_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_bmm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_bool_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_broadcast_shapes_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_broadcast_tensors_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_broadcast_to_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_bucketize_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_byte_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_cartesian_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_cat_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_cauchy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_cdist_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_cdouble_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_ceil_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_cfloat_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_chalf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_char_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_cholesky_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_cholesky_inverse_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_cholesky_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_chunk_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_clamp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_clamp_max_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_clamp_min_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_clone_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_column_stack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_combinations_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_complex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_conj_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_conj_physical_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_constant_pad_nd_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_contiguous_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_copysign_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_corrcoef_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_cos_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_cosh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_count_nonzero_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_cov_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_cross_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_cummax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_cummin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_cumprod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_cumsum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_cumulative_trapezoid_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_deg2rad_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_diag_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_diag_embed_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_diagflat_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_diagonal_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_diagonal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_diagonal_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_diff_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_digamma_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_dist_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_div_floor_rounding_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_div_no_rounding_mode_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_div_trunc_rounding_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_dot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_double_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_dsplit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_dstack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_einsum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_empty_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_empty_like_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_empty_permuted_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_empty_strided_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_eq_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_equal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_erf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_erfc_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_erfinv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_exp2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_exp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_expand_as_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_expand_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_expand_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_expm1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_exponential_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_eye_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_fft_fft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_fft_fft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_fft_fftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_fft_fftshift_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_fft_hfft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_fft_hfft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_fft_hfftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_fft_ifft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_fft_ifft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_fft_ifftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_fft_ifftshift_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_fft_ihfft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_fft_ihfft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_fft_ihfftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_fft_irfft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_fft_irfft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_fft_irfftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_fft_rfft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_fft_rfft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_fft_rfftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_fill_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_flatten_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_flip_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_fliplr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_flipud_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_float_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_float_power_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_floor_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_floor_divide_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_fmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_fmin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_fmod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_frac_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_frexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_full_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_full_like_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_gather_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_ge_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_geometric_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_geqrf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_gradient_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_grid_sampler_2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_grid_sampler_3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_gt_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_half_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_hash_tensor_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_heaviside_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_histc_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_hsplit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_hstack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_hypot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_i0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_igamma_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_igammac_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_index_add_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_index_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_index_fill_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_index_put_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_index_reduce_amax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_index_reduce_amin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_index_reduce_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_index_reduce_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_index_select_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_inner_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_int_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_isclose_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_isfinite_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_isin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_isinf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_isnan_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_isneginf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_isposinf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_isreal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_item_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_jiterator_2inputs_2outputs_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_jiterator_4inputs_with_extra_args_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_jiterator_binary_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_jiterator_binary_return_by_ref_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_jiterator_unary_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_kron_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_kthvalue_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_ldexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_le_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_lerp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_lgamma_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_cholesky_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_cholesky_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_cond_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_cross_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_det_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_diagonal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_eig_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_eigh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_eigvals_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_eigvalsh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_householder_product_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_inv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_inv_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_ldl_factor_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_ldl_factor_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_ldl_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_lstsq_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_lstsq_grad_oriented_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_lu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_lu_factor_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_lu_factor_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_lu_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_matrix_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_matrix_power_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_matrix_rank_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_matrix_rank_hermitian_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_multi_dot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_norm_subgradients_at_zero_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_pinv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_pinv_hermitian_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_pinv_singular_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_qr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_slogdet_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_solve_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_solve_triangular_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_svd_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_svdvals_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_tensorinv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_tensorsolve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_vander_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_vecdot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_vector_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linspace_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linspace_tensor_overload_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_log10_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_log1p_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_log2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_log_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_log_normal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_log_softmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_logaddexp2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_logaddexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_logcumsumexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_logdet_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_logical_and_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_logical_not_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_logical_or_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_logical_xor_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_logit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_logspace_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_logspace_tensor_overload_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_logsumexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_long_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_lt_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_lu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_lu_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_lu_unpack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_mH_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_mT_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_masked_amax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_masked_amin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_masked_argmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_masked_argmin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_masked_cumprod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_masked_cumsum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_masked_fill_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_masked_log_softmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_masked_logaddexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_masked_logsumexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_masked_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_masked_median_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_masked_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_masked_normalize_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_masked_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_masked_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_masked_select_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_masked_softmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_masked_softmin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_masked_std_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_masked_sum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_masked_var_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_matmul_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_matrix_exp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_max_binary_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_max_pool2d_with_indices_backward_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_max_reduction_no_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_max_reduction_with_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_maximum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_median_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_meshgrid_list_of_tensors_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_meshgrid_variadic_tensors_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_min_binary_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_min_reduction_no_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_min_reduction_with_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_minimum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_mm_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_mode_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_movedim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_msort_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_mul_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_multinomial_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_mv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nan_to_num_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nanmean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nanmedian_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nanquantile_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nansum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_narrow_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_narrow_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_native_batch_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_native_dropout_backward_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_native_layer_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_ne_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_neg_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_new_empty_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_new_empty_strided_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_new_full_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_new_ones_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_new_zeros_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nextafter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_alpha_dropout_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_avg_pool1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_avg_pool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_avg_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_batch_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_batch_norm_without_cudnn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_bilinear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_binary_cross_entropy_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_celu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_channel_shuffle_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_conv1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_conv2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_conv3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_conv_transpose1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_conv_transpose2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_conv_transpose3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_cosine_embedding_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_cosine_similarity_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_cross_entropy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_ctc_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_dropout2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_dropout3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_dropout_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_elu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_embedding_bag_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_embedding_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_fractional_max_pool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_fractional_max_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_gaussian_nll_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_gelu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_glu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_grid_sample_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_group_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_hardshrink_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_hardsigmoid_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_hardswish_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_hardtanh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_hinge_embedding_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_huber_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_instance_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_interpolate_area_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_interpolate_bicubic_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_interpolate_bilinear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_interpolate_linear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_interpolate_nearest_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_interpolate_trilinear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_kl_div_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_l1_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_layer_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_leaky_relu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_linear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_local_response_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_logsigmoid_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_margin_ranking_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_max_pool1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_max_pool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_max_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_max_unpool1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_max_unpool1d_grad_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_max_unpool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_max_unpool2d_grad_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_max_unpool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_max_unpool3d_grad_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_mish_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_mse_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_multi_head_attention_forward_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_multi_margin_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_multilabel_margin_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_nll_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_normalize_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_pad_circular_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_pad_constant_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_pad_reflect_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_pad_replicate_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_pad_replicate_negative_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_pairwise_distance_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_pdist_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_pixel_shuffle_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_pixel_unshuffle_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_poisson_nll_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_prelu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_relu6_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_relu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_rms_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_rrelu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_selu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_silu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_smooth_l1_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_soft_margin_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_softmin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_softmin_with_dtype_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_softplus_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_softshrink_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_softsign_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_tanhshrink_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_threshold_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_triplet_margin_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_unfold_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_upsample_bilinear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_upsample_nearest_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nonzero_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nonzero_static_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_norm_fro_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_norm_inf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_norm_nuc_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_normal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_normal_in_place_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_normal_number_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_ones_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_ones_like_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_ormqr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_outer_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_pca_lowrank_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_permute_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_permute_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_pinverse_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_polar_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_polygamma_polygamma_n_0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_polygamma_polygamma_n_1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_polygamma_polygamma_n_2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_polygamma_polygamma_n_3_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_polygamma_polygamma_n_4_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_positive_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_pow_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_put_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_qr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_quantile_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_rad2deg_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_rand_like_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_randint_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_randint_like_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_randn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_randn_like_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_ravel_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_real_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_reciprocal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_remainder_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_renorm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_repeat_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_repeat_interleave_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_reshape_as_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_reshape_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_resize__cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_resize_as__cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_resolve_conj_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_resolve_neg_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_roll_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_rot90_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_round_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_round_decimals_0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_round_decimals_3_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_round_decimals_neg_3_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_rsqrt_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_rsub_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_scalar_tensor_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_scatter_add_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_scatter_reduce_amax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_scatter_reduce_amin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_scatter_reduce_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_scatter_reduce_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_scatter_reduce_sum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_searchsorted_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_select_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_select_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_sgn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_short_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_sigmoid_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_sign_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_signal_windows_bartlett_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_signal_windows_blackman_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_signal_windows_cosine_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_signal_windows_exponential_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_signal_windows_gaussian_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_signal_windows_general_cosine_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_signal_windows_general_hamming_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_signal_windows_hamming_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_signal_windows_hann_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_signal_windows_kaiser_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_signal_windows_nuttall_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_signbit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_sin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_sinc_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_sinh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_slice_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_slice_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_softmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_sort_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_sparse_mm_reduce_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_sparse_sampled_addmm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_airy_ai_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_bessel_j0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_bessel_j1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_bessel_y0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_bessel_y1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_chebyshev_polynomial_t_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_chebyshev_polynomial_v_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_entr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_erfcx_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_hermite_polynomial_h_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_hermite_polynomial_he_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_i0e_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_i1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_i1e_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_laguerre_polynomial_l_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_legendre_polynomial_p_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_log_ndtr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_modified_bessel_i0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_modified_bessel_i1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_modified_bessel_k0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_modified_bessel_k1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_ndtr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_ndtri_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_scaled_modified_bessel_k0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_scaled_modified_bessel_k1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_spherical_bessel_j0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_xlog1py_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_zeta_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_split_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_split_list_args_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_split_with_sizes_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_split_with_sizes_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_sqrt_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_square_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_squeeze_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_squeeze_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_squeeze_multiple_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_stack_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_std_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_std_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_std_mean_unbiased_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_std_unbiased_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_stft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_sub_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_sum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_sum_to_size_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_svd_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_svd_lowrank_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_t_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_t_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_take_along_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_take_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_tan_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_tanh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_tensor_split_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_tensordot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_tile_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_to_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_to_sparse_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_topk_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_torch_ops_aten__efficient_attention_forward_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_trace_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_transpose_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_transpose_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_trapezoid_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_trapz_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_triangular_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_tril_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_triu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_true_divide_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_trunc_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_unbind_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_unbind_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_unflatten_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_unfold_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_unfold_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_uniform_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_unique_consecutive_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_unique_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_unsafe_chunk_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_unsafe_split_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_unsqueeze_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_unsqueeze_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_var_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_var_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_var_mean_unbiased_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_var_unbiased_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_vdot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_view_as_complex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_view_as_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_view_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_view_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_vsplit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_vstack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_where_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_xlogy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_zero__cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_zeros_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_zeros_like_cuda_float32, test/test_ops.py::TestMathBitsCUDA::test_conj_view_H_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_T_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view___getitem___cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view___radd___cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view___rdiv___cuda_complex64, 
test/test_ops.py::TestMathBitsCUDA::test_conj_view___rmatmul___cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view___rmul___cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view___rpow___cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view___rsub___cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__chunk_cat_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_T_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs__conversions_bfloat16_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs__conversions_bool_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs__conversions_byte_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs__conversions_cdouble_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs__conversions_cfloat_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs__conversions_chalf_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs__conversions_char_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs__conversions_double_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs__conversions_float_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs__conversions_half_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs__conversions_int_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs__conversions_long_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs__conversions_short_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_abs_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_acos_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_acosh_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_add_cuda_complex64, 
test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_addcdiv_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_addcmul_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_addr_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_alias_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_all_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_allclose_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_any_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_as_strided_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_as_strided_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_as_strided_partial_views_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_as_strided_scatter_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_asin_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_asinh_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_atan_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_atanh_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_atleast_1d_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_atleast_2d_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_atleast_3d_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_block_diag_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_broadcast_tensors_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_broadcast_to_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_cat_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_chunk_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_clone_cuda_complex64, 
test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_column_stack_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_conj_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_conj_physical_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_constant_pad_nd_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_contiguous_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_cos_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_cosh_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_count_nonzero_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_cumprod_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_cumsum_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_diag_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_diag_embed_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_diagonal_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_diagonal_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_diagonal_scatter_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_div_no_rounding_mode_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_dot_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_dsplit_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_dstack_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_empty_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_empty_like_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_empty_strided_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_eq_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_equal_cuda_complex64, 
test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_exp2_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_exp_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_expand_as_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_expand_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_expand_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_expm1_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_eye_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_fft_fft2_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_fft_fft_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_fft_fftn_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_fft_fftshift_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_fft_hfft2_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_fft_hfft_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_fft_hfftn_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_fft_ifft2_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_fft_ifft_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_fft_ifftn_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_fft_ifftshift_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_fft_irfft2_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_fft_irfft_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_fft_irfftn_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_fill_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_flatten_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_flip_cuda_complex64, 
test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_fliplr_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_flipud_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_float_power_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_hsplit_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_hstack_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_imag_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_index_add_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_index_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_index_fill_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_index_select_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_isclose_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_isfinite_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_isinf_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_isnan_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_isreal_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_istft_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_item_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_lerp_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_linalg_cross_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_linalg_diagonal_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_linalg_matrix_norm_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_linalg_norm_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_linalg_svd_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_linalg_svdvals_cuda_complex64, 
test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_linalg_vecdot_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_linalg_vector_norm_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_linspace_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_linspace_tensor_overload_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_log10_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_log1p_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_log2_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_log_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_log_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_logical_and_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_logical_not_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_logical_or_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_logical_xor_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_logspace_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_logspace_tensor_overload_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_logsumexp_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_masked_fill_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_mean_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_meshgrid_list_of_tensors_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_meshgrid_variadic_tensors_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_movedim_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_mul_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_narrow_copy_cuda_complex64, 
test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_narrow_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_ne_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_neg_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_new_empty_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_new_empty_strided_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_new_full_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_new_ones_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_new_zeros_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_nn_functional_channel_shuffle_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_nn_functional_l1_loss_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_nn_functional_log_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_nn_functional_pairwise_distance_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_nn_functional_pixel_shuffle_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_nn_functional_pixel_unshuffle_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_nn_functional_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_nn_functional_softmin_with_dtype_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_nn_functional_tanhshrink_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_nn_functional_triplet_margin_loss_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_norm_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_normal__in_place_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_ones_cuda_complex64, 
test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_permute_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_permute_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_positive_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_pow_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_prod_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_randn_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_ravel_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_real_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_reciprocal_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_renorm_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_repeat_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_reshape_as_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_reshape_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_roll_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_rot90_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_rsqrt_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_rsub_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_sgn_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_sigmoid_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_sin_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_sinc_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_sinh_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_special_log_softmax_with_dtype_cuda_complex64, 
test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_special_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_split_with_sizes_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_sqrt_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_square_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_squeeze_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_squeeze_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_squeeze_multiple_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_stack_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_std_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_std_mean_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_stft_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_sub_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_sum_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_sum_to_size_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_t_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_t_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_take_along_dim_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_tan_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_tanh_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_tensor_split_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_to_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_trace_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_transpose_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_transpose_cuda_complex64, 
test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_tril_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_triu_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_true_divide_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_unbind_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_unbind_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_unflatten_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_unfold_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_unfold_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_unsqueeze_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_unsqueeze_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_var_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_var_mean_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_vdot_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_view_as_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_view_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_view_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_vsplit_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_vstack_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_where_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_zeros_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__unsafe_masked_index_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__unsafe_masked_index_put_accumulate_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_abs_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_acos_cuda_complex64, 
test/test_ops.py::TestMathBitsCUDA::test_conj_view_acosh_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_add_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_addbmm_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_addcdiv_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_addcmul_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_addmm_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_addmm_decomposed_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_addmv_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_addr_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_alias_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_all_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_allclose_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_angle_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_any_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_argwhere_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_as_strided_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_as_strided_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_as_strided_partial_views_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_as_strided_scatter_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_asin_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_asinh_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_atan_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_atanh_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_atleast_1d_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_atleast_2d_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_atleast_3d_cuda_complex64, 
test/test_ops.py::TestMathBitsCUDA::test_conj_view_baddbmm_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_bfloat16_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_block_diag_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_bmm_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_bool_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_broadcast_tensors_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_broadcast_to_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_byte_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_cartesian_prod_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_cat_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_cdouble_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_cfloat_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_chalf_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_char_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_cholesky_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_cholesky_inverse_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_cholesky_solve_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_chunk_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_clone_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_column_stack_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_combinations_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_conj_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_conj_physical_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_constant_pad_nd_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_contiguous_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_corrcoef_cuda_complex64, 
test/test_ops.py::TestMathBitsCUDA::test_conj_view_cos_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_cosh_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_count_nonzero_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_cov_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_cross_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_cumprod_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_cumsum_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_cumulative_trapezoid_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_diag_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_diag_embed_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_diagflat_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_diagonal_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_diagonal_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_diagonal_scatter_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_diff_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_dist_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_div_no_rounding_mode_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_dot_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_double_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_dsplit_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_dstack_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_einsum_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_empty_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_empty_like_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_empty_permuted_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_empty_strided_cuda_complex64, 
test/test_ops.py::TestMathBitsCUDA::test_conj_view_eq_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_equal_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_exp2_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_exp_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_expand_as_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_expand_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_expand_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_expm1_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_eye_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_fft_fft2_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_fft_fft_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_fft_fftn_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_fft_fftshift_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_fft_hfft2_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_fft_hfft_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_fft_hfftn_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_fft_ifft2_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_fft_ifft_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_fft_ifftn_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_fft_ifftshift_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_fft_irfft2_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_fft_irfft_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_fft_irfftn_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_fill_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_flatten_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_flip_cuda_complex64, 
test/test_ops.py::TestMathBitsCUDA::test_conj_view_fliplr_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_flipud_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_float_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_float_power_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_full_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_full_like_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_gather_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_geqrf_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_gradient_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_half_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_hsplit_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_hstack_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_imag_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_index_add_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_index_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_index_fill_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_index_put_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_index_select_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_inner_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_int_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_isclose_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_isfinite_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_isinf_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_isnan_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_isreal_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_istft_cuda_complex64, 
test/test_ops.py::TestMathBitsCUDA::test_conj_view_item_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_jiterator_2inputs_2outputs_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_jiterator_4inputs_with_extra_args_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_jiterator_binary_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_jiterator_binary_return_by_ref_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_jiterator_unary_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_kron_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_ldexp_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_lerp_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_cholesky_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_cholesky_ex_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_cond_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_cross_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_det_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_diagonal_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_eig_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_eigh_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_eigvals_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_eigvalsh_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_householder_product_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_inv_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_inv_ex_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_ldl_factor_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_ldl_factor_ex_cuda_complex64, 
test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_ldl_solve_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_lstsq_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_lstsq_grad_oriented_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_lu_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_lu_factor_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_lu_factor_ex_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_lu_solve_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_matrix_norm_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_matrix_power_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_matrix_rank_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_matrix_rank_hermitian_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_multi_dot_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_norm_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_norm_subgradients_at_zero_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_pinv_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_pinv_hermitian_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_pinv_singular_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_qr_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_slogdet_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_solve_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_solve_ex_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_solve_triangular_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_svd_cuda_complex64, 
test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_svdvals_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_tensorinv_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_tensorsolve_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_vander_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_vecdot_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_vector_norm_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linspace_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linspace_tensor_overload_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_log10_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_log1p_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_log2_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_log_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_log_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_logcumsumexp_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_logdet_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_logical_and_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_logical_not_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_logical_or_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_logical_xor_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_logspace_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_logspace_tensor_overload_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_logsumexp_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_long_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_lu_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_lu_solve_cuda_complex64, 
test/test_ops.py::TestMathBitsCUDA::test_conj_view_lu_unpack_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_mH_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_mT_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_masked_cumprod_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_masked_cumsum_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_masked_fill_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_masked_logsumexp_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_masked_mean_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_masked_normalize_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_masked_prod_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_masked_scatter_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_masked_select_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_masked_std_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_masked_sum_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_masked_var_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_matmul_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_matrix_exp_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_mean_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_meshgrid_list_of_tensors_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_meshgrid_variadic_tensors_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_mm_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_movedim_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_mul_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_mv_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nanmean_cuda_complex64, 
test/test_ops.py::TestMathBitsCUDA::test_conj_view_nansum_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_narrow_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_narrow_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_ne_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_neg_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_new_empty_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_new_empty_strided_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_new_full_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_new_ones_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_new_zeros_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_channel_shuffle_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_conv1d_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_conv2d_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_conv3d_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_conv_transpose1d_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_conv_transpose2d_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_conv_transpose3d_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_feature_alpha_dropout_without_train_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_l1_loss_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_linear_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_normalize_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_pad_circular_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_pad_constant_cuda_complex64, 
test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_pad_reflect_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_pad_replicate_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_pad_replicate_negative_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_pairwise_distance_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_pixel_shuffle_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_pixel_unshuffle_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_rms_norm_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_silu_complex_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_softmin_with_dtype_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_softsign_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_tanhshrink_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_triplet_margin_loss_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_triplet_margin_with_distance_loss_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_unfold_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nonzero_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nonzero_static_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_norm_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_norm_fro_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_norm_inf_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_norm_nuc_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_normal_in_place_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_ones_cuda_complex64, 
test/test_ops.py::TestMathBitsCUDA::test_conj_view_ones_like_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_ormqr_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_outer_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_pca_lowrank_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_permute_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_permute_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_pinverse_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_positive_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_pow_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_prod_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_put_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_qr_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_rand_like_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_randn_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_randn_like_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_ravel_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_real_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_reciprocal_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_renorm_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_repeat_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_repeat_interleave_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_reshape_as_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_reshape_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_resize__cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_resize_as__cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_resolve_conj_cuda_complex64, 
test/test_ops.py::TestMathBitsCUDA::test_conj_view_resolve_neg_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_roll_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_rot90_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_rsqrt_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_rsub_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_scalar_tensor_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_scatter_add_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_scatter_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_select_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_sgn_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_short_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_sigmoid_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_sin_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_sinc_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_sinh_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_slice_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_sparse_sampled_addmm_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_split_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_split_list_args_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_split_with_sizes_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_split_with_sizes_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_sqrt_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_square_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_squeeze_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_squeeze_cuda_complex64, 
test/test_ops.py::TestMathBitsCUDA::test_conj_view_squeeze_multiple_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_stack_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_std_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_std_mean_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_std_mean_unbiased_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_std_unbiased_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_stft_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_sub_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_sum_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_sum_to_size_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_svd_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_svd_lowrank_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_t_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_t_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_take_along_dim_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_take_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_tan_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_tanh_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_tensor_split_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_tensordot_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_tile_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_to_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_to_sparse_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_trace_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_transpose_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_transpose_cuda_complex64, 
test/test_ops.py::TestMathBitsCUDA::test_conj_view_trapezoid_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_trapz_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_triangular_solve_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_tril_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_triu_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_true_divide_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_unbind_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_unbind_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_unflatten_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_unfold_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_unfold_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_uniform_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_unsafe_chunk_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_unsafe_split_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_unsqueeze_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_unsqueeze_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_var_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_var_mean_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_var_mean_unbiased_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_var_unbiased_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_vdot_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_view_as_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_view_as_real_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_view_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_view_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_vsplit_cuda_complex64, 
test/test_ops.py::TestMathBitsCUDA::test_conj_view_vstack_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_where_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_zero__cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_zeros_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_zeros_like_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_H_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_T_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view___getitem___cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view___radd___cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view___rdiv___cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view___rmatmul___cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view___rmul___cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view___rpow___cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view___rsub___cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__chunk_cat_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_T_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs__conversions_bfloat16_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs__conversions_bool_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs__conversions_byte_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs__conversions_cdouble_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs__conversions_cfloat_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs__conversions_chalf_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs__conversions_char_cuda_complex128, 
test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs__conversions_double_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs__conversions_float_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs__conversions_half_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs__conversions_int_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs__conversions_long_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs__conversions_short_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_abs_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_acos_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_acosh_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_add_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_addcdiv_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_addcmul_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_addr_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_alias_copy_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_all_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_allclose_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_any_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_as_strided_copy_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_as_strided_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_as_strided_partial_views_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_as_strided_scatter_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_asin_cuda_complex128, 
test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_asinh_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_atan_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_atanh_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_atleast_1d_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_atleast_2d_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_atleast_3d_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_block_diag_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_broadcast_tensors_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_broadcast_to_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_cat_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_chunk_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_clone_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_column_stack_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_conj_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_conj_physical_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_constant_pad_nd_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_contiguous_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_cos_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_cosh_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_count_nonzero_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_cumprod_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_cumsum_cuda_complex128, 
test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_diag_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_diag_embed_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_diagonal_copy_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_diagonal_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_diagonal_scatter_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_div_no_rounding_mode_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_dot_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_dsplit_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_dstack_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_empty_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_empty_like_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_empty_strided_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_eq_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_equal_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_exp2_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_exp_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_expand_as_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_expand_copy_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_expand_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_expm1_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_eye_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_fft_fft2_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_fft_fft_cuda_complex128, 
test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_fft_fftn_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_fft_fftshift_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_fft_hfft2_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_fft_hfft_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_fft_hfftn_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_fft_ifft2_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_fft_ifft_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_fft_ifftn_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_fft_ifftshift_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_fft_irfft2_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_fft_irfft_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_fft_irfftn_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_fill_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_flatten_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_flip_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_fliplr_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_flipud_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_float_power_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_hsplit_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_hstack_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_imag_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_index_add_cuda_complex128, 
test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_index_copy_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_index_fill_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_index_select_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_isclose_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_isfinite_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_isinf_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_isnan_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_isreal_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_istft_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_item_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_lerp_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_linalg_cross_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_linalg_diagonal_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_linalg_matrix_norm_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_linalg_norm_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_linalg_svd_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_linalg_svdvals_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_linalg_vecdot_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_linalg_vector_norm_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_linspace_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_linspace_tensor_overload_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_log10_cuda_complex128, 
test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_log1p_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_log2_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_log_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_log_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_logical_and_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_logical_not_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_logical_or_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_logical_xor_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_logspace_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_logspace_tensor_overload_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_logsumexp_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_masked_fill_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_mean_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_meshgrid_list_of_tensors_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_meshgrid_variadic_tensors_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_movedim_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_mul_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_narrow_copy_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_narrow_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_ne_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_neg_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_new_empty_cuda_complex128, 
test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_new_empty_strided_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_new_full_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_new_ones_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_new_zeros_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_nn_functional_channel_shuffle_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_nn_functional_l1_loss_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_nn_functional_log_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_nn_functional_pairwise_distance_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_nn_functional_pixel_shuffle_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_nn_functional_pixel_unshuffle_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_nn_functional_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_nn_functional_softmin_with_dtype_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_nn_functional_tanhshrink_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_nn_functional_triplet_margin_loss_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_norm_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_normal__in_place_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_ones_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_permute_copy_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_permute_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_positive_cuda_complex128, 
test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_pow_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_prod_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_randn_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_ravel_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_real_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_reciprocal_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_renorm_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_repeat_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_reshape_as_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_reshape_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_roll_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_rot90_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_rsqrt_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_rsub_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_sgn_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_sigmoid_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_sin_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_sinc_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_sinh_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_special_log_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_special_softmax_with_dtype_cuda_complex128, 
test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_split_with_sizes_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_sqrt_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_square_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_squeeze_copy_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_squeeze_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_squeeze_multiple_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_stack_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_std_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_std_mean_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_stft_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_sub_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_sum_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_sum_to_size_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_t_copy_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_t_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_take_along_dim_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_tan_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_tanh_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_tensor_split_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_to_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_trace_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_transpose_copy_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_transpose_cuda_complex128, 
test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_tril_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_triu_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_true_divide_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_unbind_copy_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_unbind_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_unflatten_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_unfold_copy_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_unfold_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_unsqueeze_copy_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_unsqueeze_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_var_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_var_mean_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_vdot_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_view_as_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_view_copy_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_view_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_vsplit_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_vstack_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_where_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_zeros_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__unsafe_masked_index_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__unsafe_masked_index_put_accumulate_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_abs_cuda_complex128, 
test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_acos_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_acosh_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_add_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_addbmm_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_addcdiv_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_addcmul_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_addmm_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_addmm_decomposed_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_addmv_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_addr_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_alias_copy_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_all_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_allclose_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_angle_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_any_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_argwhere_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_as_strided_copy_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_as_strided_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_as_strided_partial_views_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_as_strided_scatter_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_asin_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_asinh_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_atan_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_atanh_cuda_complex128, 
test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_atleast_1d_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_atleast_2d_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_atleast_3d_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_baddbmm_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_bfloat16_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_block_diag_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_bmm_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_bool_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_broadcast_tensors_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_broadcast_to_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_byte_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_cartesian_prod_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_cat_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_cdouble_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_cfloat_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_chalf_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_char_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_cholesky_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_cholesky_inverse_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_cholesky_solve_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_chunk_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_clone_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_column_stack_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_combinations_cuda_complex128, 
test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_conj_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_conj_physical_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_constant_pad_nd_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_contiguous_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_corrcoef_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_cos_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_cosh_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_count_nonzero_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_cov_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_cross_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_cumprod_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_cumsum_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_cumulative_trapezoid_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_diag_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_diag_embed_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_diagflat_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_diagonal_copy_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_diagonal_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_diagonal_scatter_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_diff_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_dist_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_div_no_rounding_mode_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_dot_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_double_cuda_complex128, 
test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_dsplit_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_dstack_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_einsum_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_empty_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_empty_like_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_empty_permuted_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_empty_strided_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_eq_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_equal_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_exp2_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_exp_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_expand_as_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_expand_copy_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_expand_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_expm1_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_eye_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_fft_fft2_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_fft_fft_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_fft_fftn_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_fft_fftshift_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_fft_hfft2_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_fft_hfft_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_fft_hfftn_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_fft_ifft2_cuda_complex128, 
test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_fft_ifft_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_fft_ifftn_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_fft_ifftshift_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_fft_irfft2_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_fft_irfft_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_fft_irfftn_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_fill_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_flatten_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_flip_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_fliplr_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_flipud_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_float_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_float_power_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_full_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_full_like_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_gather_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_geqrf_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_gradient_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_half_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_hsplit_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_hstack_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_imag_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_index_add_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_index_copy_cuda_complex128, 
test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_index_fill_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_index_put_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_index_select_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_inner_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_int_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_isclose_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_isfinite_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_isinf_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_isnan_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_isreal_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_istft_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_item_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_jiterator_2inputs_2outputs_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_jiterator_4inputs_with_extra_args_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_jiterator_binary_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_jiterator_binary_return_by_ref_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_jiterator_unary_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_kron_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_ldexp_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_lerp_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_cholesky_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_cholesky_ex_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_cond_cuda_complex128, 
test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_cross_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_det_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_diagonal_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_eig_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_eigh_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_eigvals_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_eigvalsh_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_householder_product_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_inv_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_inv_ex_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_ldl_factor_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_ldl_factor_ex_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_ldl_solve_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_lstsq_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_lstsq_grad_oriented_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_lu_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_lu_factor_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_lu_factor_ex_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_lu_solve_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_matrix_norm_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_matrix_power_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_matrix_rank_cuda_complex128, 
test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_matrix_rank_hermitian_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_multi_dot_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_norm_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_norm_subgradients_at_zero_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_pinv_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_pinv_hermitian_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_pinv_singular_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_qr_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_slogdet_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_solve_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_solve_ex_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_solve_triangular_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_svd_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_svdvals_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_tensorinv_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_tensorsolve_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_vander_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_vecdot_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_vector_norm_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linspace_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linspace_tensor_overload_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_log10_cuda_complex128, 
test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_log1p_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_log2_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_log_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_log_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_logcumsumexp_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_logdet_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_logical_and_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_logical_not_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_logical_or_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_logical_xor_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_logspace_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_logspace_tensor_overload_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_logsumexp_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_long_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_lu_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_lu_solve_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_lu_unpack_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_mH_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_mT_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_masked_cumprod_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_masked_cumsum_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_masked_fill_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_masked_logsumexp_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_masked_mean_cuda_complex128, 
test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_masked_normalize_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_masked_prod_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_masked_scatter_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_masked_select_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_masked_std_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_masked_sum_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_masked_var_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_matmul_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_matrix_exp_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_mean_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_meshgrid_list_of_tensors_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_meshgrid_variadic_tensors_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_mm_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_movedim_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_mul_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_mv_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nanmean_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nansum_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_narrow_copy_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_narrow_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_ne_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_neg_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_new_empty_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_new_empty_strided_cuda_complex128, 
test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_new_full_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_new_ones_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_new_zeros_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_channel_shuffle_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_conv1d_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_conv2d_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_conv3d_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_conv_transpose1d_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_conv_transpose2d_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_conv_transpose3d_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_feature_alpha_dropout_without_train_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_l1_loss_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_linear_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_normalize_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_pad_circular_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_pad_constant_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_pad_reflect_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_pad_replicate_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_pad_replicate_negative_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_pairwise_distance_cuda_complex128, 
test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_pixel_shuffle_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_pixel_unshuffle_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_rms_norm_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_silu_complex_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_softmin_with_dtype_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_softsign_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_tanhshrink_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_triplet_margin_loss_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_triplet_margin_with_distance_loss_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_unfold_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nonzero_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nonzero_static_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_norm_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_norm_fro_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_norm_inf_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_norm_nuc_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_normal_in_place_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_ones_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_ones_like_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_ormqr_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_outer_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_pca_lowrank_cuda_complex128, 
test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_permute_copy_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_permute_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_pinverse_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_positive_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_pow_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_prod_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_put_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_qr_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_rand_like_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_randn_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_randn_like_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_ravel_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_real_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_reciprocal_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_renorm_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_repeat_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_repeat_interleave_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_reshape_as_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_reshape_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_resize__cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_resize_as__cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_resolve_conj_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_resolve_neg_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_roll_cuda_complex128, 
test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_rot90_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_rsqrt_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_rsub_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_scalar_tensor_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_scatter_add_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_scatter_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_select_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_sgn_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_short_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_sigmoid_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_sin_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_sinc_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_sinh_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_slice_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_sparse_sampled_addmm_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_split_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_split_list_args_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_split_with_sizes_copy_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_split_with_sizes_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_sqrt_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_square_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_squeeze_copy_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_squeeze_cuda_complex128, 
test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_squeeze_multiple_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_stack_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_std_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_std_mean_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_std_mean_unbiased_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_std_unbiased_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_stft_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_sub_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_sum_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_sum_to_size_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_svd_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_svd_lowrank_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_t_copy_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_t_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_take_along_dim_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_take_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_tan_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_tanh_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_tensor_split_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_tensordot_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_tile_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_to_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_to_sparse_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_trace_cuda_complex128, 
test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_transpose_copy_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_transpose_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_trapezoid_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_trapz_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_triangular_solve_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_tril_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_triu_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_true_divide_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_unbind_copy_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_unbind_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_unflatten_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_unfold_copy_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_unfold_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_uniform_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_unsafe_chunk_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_unsafe_split_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_unsqueeze_copy_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_unsqueeze_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_var_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_var_mean_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_var_mean_unbiased_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_var_unbiased_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_vdot_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_view_as_cuda_complex128, 
test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_view_as_real_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_view_copy_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_view_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_vsplit_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_vstack_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_where_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_zero__cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_zeros_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_zeros_like_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_view_H_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_T_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view___getitem___cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view___radd___cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view___rdiv___cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view___rmatmul___cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view___rmod___cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view___rmul___cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view___rpow___cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view___rsub___cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__batch_norm_with_update_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__chunk_cat_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__native_batch_norm_legit_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_T_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs__conversions_bfloat16_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs__conversions_bool_cuda_float64, 
test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs__conversions_byte_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs__conversions_cdouble_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs__conversions_cfloat_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs__conversions_chalf_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs__conversions_char_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs__conversions_complex_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs__conversions_double_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs__conversions_float_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs__conversions_half_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs__conversions_int_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs__conversions_long_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs__conversions_polar_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs__conversions_short_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_abs_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_acos_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_acosh_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_add_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_addcdiv_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_addcmul_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_addr_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_alias_copy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_all_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_allclose_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_amax_cuda_float64, 
test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_amin_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_any_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_arange_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_as_strided_copy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_as_strided_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_as_strided_partial_views_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_as_strided_scatter_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_asin_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_asinh_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_atan2_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_atan_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_atanh_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_atleast_1d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_atleast_2d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_atleast_3d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_block_diag_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_broadcast_tensors_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_broadcast_to_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_bucketize_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_cat_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_cauchy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_ceil_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_chunk_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_clamp_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_clamp_max_cuda_float64, 
test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_clamp_min_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_clone_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_column_stack_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_conj_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_conj_physical_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_constant_pad_nd_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_contiguous_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_copysign_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_cos_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_cosh_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_count_nonzero_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_cumprod_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_cumsum_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_deg2rad_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_diag_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_diag_embed_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_diagonal_copy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_diagonal_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_diagonal_scatter_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_digamma_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_div_floor_rounding_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_div_no_rounding_mode_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_div_trunc_rounding_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_dot_cuda_float64, 
test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_dsplit_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_dstack_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_empty_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_empty_like_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_empty_strided_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_eq_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_equal_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_erf_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_erfc_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_erfinv_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_exp2_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_exp_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_expand_as_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_expand_copy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_expand_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_expm1_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_exponential_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_eye_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_fft2_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_fft_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_fftn_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_fftshift_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_hfft2_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_hfft_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_hfftn_cuda_float64, 
test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_ifft2_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_ifft_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_ifftn_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_ifftshift_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_ihfft2_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_ihfft_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_ihfftn_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_irfft2_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_irfft_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_irfftn_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_rfft2_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_rfft_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_rfftn_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fill_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_flatten_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_flip_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fliplr_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_flipud_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_float_power_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_floor_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_floor_divide_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fmax_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fmin_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fmod_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_frac_cuda_float64, 
test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_frexp_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_ge_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_geometric_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_gt_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_heaviside_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_hsplit_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_hstack_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_hypot_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_i0_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_igamma_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_igammac_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_index_add_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_index_copy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_index_fill_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_index_select_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_isclose_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_isfinite_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_isinf_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_isnan_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_isneginf_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_isposinf_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_isreal_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_item_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_le_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_lerp_cuda_float64, 
test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_lgamma_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_linalg_cross_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_linalg_diagonal_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_linalg_matrix_norm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_linalg_norm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_linalg_svd_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_linalg_svdvals_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_linalg_vecdot_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_linalg_vector_norm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_linspace_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_linspace_tensor_overload_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_log10_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_log1p_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_log2_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_log_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_log_normal_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_log_softmax_with_dtype_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_logaddexp2_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_logaddexp_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_logical_and_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_logical_not_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_logical_or_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_logical_xor_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_logspace_cuda_float64, 
test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_logspace_tensor_overload_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_logsumexp_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_lt_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_masked_fill_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_maximum_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_mean_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_meshgrid_list_of_tensors_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_meshgrid_variadic_tensors_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_minimum_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_movedim_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_mul_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nan_to_num_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_narrow_copy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_narrow_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_native_layer_norm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_ne_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_neg_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_new_empty_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_new_empty_strided_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_new_full_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_new_ones_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_new_zeros_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nextafter_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_alpha_dropout_cuda_float64, 
test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_celu_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_channel_shuffle_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_dropout_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_elu_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_gelu_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_glu_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_group_norm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_hardshrink_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_hardtanh_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_hinge_embedding_loss_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_huber_loss_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_l1_loss_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_layer_norm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_leaky_relu_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_log_softmax_with_dtype_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_margin_ranking_loss_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_mish_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_mse_loss_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_nll_loss_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_pairwise_distance_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_pdist_cuda_float64, 
test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_pixel_shuffle_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_pixel_unshuffle_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_poisson_nll_loss_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_prelu_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_relu6_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_relu_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_selu_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_smooth_l1_loss_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_softmax_with_dtype_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_softmin_with_dtype_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_softplus_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_softshrink_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_tanhshrink_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_threshold_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_triplet_margin_loss_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_norm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_normal__in_place_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_normal_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_normal_number_mean_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_ones_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_permute_copy_cuda_float64, 
test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_permute_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_positive_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_pow_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_prod_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_rad2deg_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_randn_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_ravel_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_real_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_reciprocal_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_remainder_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_renorm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_repeat_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_reshape_as_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_reshape_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_roll_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_rot90_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_round_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_rsqrt_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_rsub_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_select_scatter_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_sgn_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_sigmoid_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_sign_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_signbit_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_sin_cuda_float64, 
test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_sinc_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_sinh_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_softmax_with_dtype_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_special_bessel_j0_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_special_bessel_j1_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_special_entr_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_special_erfcx_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_special_i0e_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_special_i1_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_special_i1e_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_special_log_ndtr_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_special_log_softmax_with_dtype_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_special_logit_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_special_multigammaln_mvlgamma_p_1_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_special_multigammaln_mvlgamma_p_3_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_special_multigammaln_mvlgamma_p_5_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_special_ndtr_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_special_ndtri_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_special_softmax_with_dtype_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_special_spherical_bessel_j0_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_special_xlog1py_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_special_zeta_cuda_float64, 
test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_split_with_sizes_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_sqrt_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_square_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_squeeze_copy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_squeeze_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_squeeze_multiple_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_stack_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_std_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_std_mean_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_stft_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_sub_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_sum_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_sum_to_size_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_t_copy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_t_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_take_along_dim_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_tan_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_tanh_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_tensor_split_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_to_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_trace_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_transpose_copy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_transpose_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_tril_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_triu_cuda_float64, 
test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_true_divide_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_trunc_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_unbind_copy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_unbind_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_unflatten_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_unfold_copy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_unfold_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_unsqueeze_copy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_unsqueeze_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_var_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_var_mean_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_vdot_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_view_as_complex_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_view_as_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_view_copy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_view_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_vsplit_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_vstack_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_where_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_xlogy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_zeros_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__segment_reduce_lengths_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__segment_reduce_offsets_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__softmax_backward_data_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__unsafe_masked_index_cuda_float64, 
test/test_ops.py::TestMathBitsCUDA::test_neg_view__unsafe_masked_index_put_accumulate_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__upsample_bilinear2d_aa_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_abs_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_acos_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_acosh_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_add_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_addbmm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_addcdiv_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_addcmul_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_addmm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_addmm_decomposed_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_addmv_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_addr_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_alias_copy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_all_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_allclose_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_amax_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_amin_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_aminmax_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_angle_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_any_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_arange_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_argmax_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_argmin_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_argsort_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_argwhere_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_as_strided_copy_cuda_float64, 
test/test_ops.py::TestMathBitsCUDA::test_neg_view_as_strided_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_as_strided_partial_views_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_as_strided_scatter_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_asin_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_asinh_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_atan2_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_atan_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_atanh_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_atleast_1d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_atleast_2d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_atleast_3d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_baddbmm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_bernoulli_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_bfloat16_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_block_diag_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_bmm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_bool_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_broadcast_tensors_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_broadcast_to_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_bucketize_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_byte_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_cartesian_prod_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_cat_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_cauchy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_cdist_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_cdouble_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_ceil_cuda_float64, 
test/test_ops.py::TestMathBitsCUDA::test_neg_view_cfloat_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_chalf_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_char_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_cholesky_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_cholesky_inverse_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_cholesky_solve_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_chunk_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_clamp_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_clamp_max_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_clamp_min_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_clone_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_column_stack_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_combinations_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_complex_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_conj_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_conj_physical_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_constant_pad_nd_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_contiguous_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_copysign_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_corrcoef_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_cos_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_cosh_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_count_nonzero_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_cov_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_cross_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_cummax_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_cummin_cuda_float64, 
test/test_ops.py::TestMathBitsCUDA::test_neg_view_cumprod_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_cumsum_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_cumulative_trapezoid_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_deg2rad_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_diag_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_diag_embed_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_diagflat_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_diagonal_copy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_diagonal_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_diagonal_scatter_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_diff_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_digamma_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_dist_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_div_floor_rounding_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_div_no_rounding_mode_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_div_trunc_rounding_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_dot_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_double_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_dsplit_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_dstack_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_einsum_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_empty_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_empty_like_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_empty_permuted_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_empty_strided_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_eq_cuda_float64, 
test/test_ops.py::TestMathBitsCUDA::test_neg_view_equal_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_erf_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_erfc_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_erfinv_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_exp2_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_exp_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_expand_as_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_expand_copy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_expand_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_expm1_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_exponential_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_eye_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_fft_fft2_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_fft_fft_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_fft_fftn_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_fft_fftshift_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_fft_hfft2_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_fft_hfft_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_fft_hfftn_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_fft_ifft2_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_fft_ifft_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_fft_ifftn_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_fft_ifftshift_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_fft_ihfft2_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_fft_ihfft_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_fft_ihfftn_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_fft_irfft2_cuda_float64, 
test/test_ops.py::TestMathBitsCUDA::test_neg_view_fft_irfft_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_fft_irfftn_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_fft_rfft2_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_fft_rfft_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_fft_rfftn_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_fill_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_flatten_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_flip_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_fliplr_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_flipud_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_float_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_float_power_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_floor_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_floor_divide_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_fmax_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_fmin_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_fmod_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_frac_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_frexp_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_full_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_full_like_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_gather_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_ge_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_geometric_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_geqrf_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_gradient_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_grid_sampler_2d_cuda_float64, 
test/test_ops.py::TestMathBitsCUDA::test_neg_view_grid_sampler_3d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_gt_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_half_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_hash_tensor_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_heaviside_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_histc_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_hsplit_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_hstack_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_hypot_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_i0_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_igamma_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_igammac_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_index_add_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_index_copy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_index_fill_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_index_put_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_index_reduce_amax_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_index_reduce_amin_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_index_reduce_mean_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_index_reduce_prod_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_index_select_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_inner_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_int_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_isclose_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_isfinite_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_isin_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_isinf_cuda_float64, 
test/test_ops.py::TestMathBitsCUDA::test_neg_view_isnan_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_isneginf_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_isposinf_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_isreal_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_item_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_jiterator_2inputs_2outputs_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_jiterator_4inputs_with_extra_args_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_jiterator_binary_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_jiterator_binary_return_by_ref_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_jiterator_unary_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_kron_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_kthvalue_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_ldexp_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_le_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_lerp_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_lgamma_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_cholesky_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_cholesky_ex_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_cond_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_cross_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_det_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_diagonal_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_eig_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_eigh_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_eigvals_cuda_float64, 
test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_eigvalsh_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_householder_product_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_inv_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_inv_ex_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_ldl_factor_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_ldl_factor_ex_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_ldl_solve_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_lstsq_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_lstsq_grad_oriented_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_lu_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_lu_factor_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_lu_factor_ex_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_lu_solve_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_matrix_norm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_matrix_power_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_matrix_rank_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_matrix_rank_hermitian_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_multi_dot_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_norm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_norm_subgradients_at_zero_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_pinv_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_pinv_hermitian_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_pinv_singular_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_qr_cuda_float64, 
test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_slogdet_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_solve_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_solve_ex_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_solve_triangular_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_svd_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_svdvals_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_tensorinv_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_tensorsolve_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_vander_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_vecdot_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_vector_norm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linspace_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linspace_tensor_overload_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_log10_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_log1p_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_log2_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_log_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_log_normal_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_log_softmax_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_log_softmax_with_dtype_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_logaddexp2_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_logaddexp_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_logcumsumexp_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_logdet_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_logical_and_cuda_float64, 
test/test_ops.py::TestMathBitsCUDA::test_neg_view_logical_not_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_logical_or_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_logical_xor_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_logit_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_logspace_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_logspace_tensor_overload_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_logsumexp_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_long_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_lt_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_lu_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_lu_solve_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_lu_unpack_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_mH_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_mT_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_masked_amax_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_masked_amin_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_masked_argmax_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_masked_argmin_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_masked_cumprod_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_masked_cumsum_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_masked_fill_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_masked_log_softmax_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_masked_logaddexp_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_masked_logsumexp_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_masked_mean_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_masked_median_cuda_float64, 
test/test_ops.py::TestMathBitsCUDA::test_neg_view_masked_norm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_masked_normalize_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_masked_prod_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_masked_scatter_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_masked_select_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_masked_softmax_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_masked_softmin_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_masked_std_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_masked_sum_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_masked_var_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_matmul_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_matrix_exp_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_max_binary_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_max_pool2d_with_indices_backward_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_max_reduction_no_dim_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_max_reduction_with_dim_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_maximum_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_mean_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_median_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_meshgrid_list_of_tensors_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_meshgrid_variadic_tensors_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_min_binary_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_min_reduction_no_dim_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_min_reduction_with_dim_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_minimum_cuda_float64, 
test/test_ops.py::TestMathBitsCUDA::test_neg_view_mm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_mode_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_movedim_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_msort_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_mul_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_multinomial_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_mv_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_mvlgamma_mvlgamma_p_1_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_mvlgamma_mvlgamma_p_3_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_mvlgamma_mvlgamma_p_5_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nan_to_num_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nanmean_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nanmedian_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nanquantile_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nansum_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_narrow_copy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_narrow_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_native_batch_norm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_native_dropout_backward_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_native_layer_norm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_ne_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_neg_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_new_empty_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_new_empty_strided_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_new_full_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_new_ones_cuda_float64, 
test/test_ops.py::TestMathBitsCUDA::test_neg_view_new_zeros_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nextafter_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_adaptive_avg_pool1d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_adaptive_avg_pool2d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_adaptive_avg_pool3d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_adaptive_max_pool1d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_adaptive_max_pool2d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_adaptive_max_pool3d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_alpha_dropout_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_avg_pool1d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_avg_pool2d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_avg_pool3d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_batch_norm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_batch_norm_without_cudnn_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_bilinear_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_binary_cross_entropy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_binary_cross_entropy_with_logits_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_celu_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_channel_shuffle_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_conv1d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_conv2d_cuda_float64, 
test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_conv3d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_conv_transpose1d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_conv_transpose2d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_conv_transpose3d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_cosine_embedding_loss_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_cosine_similarity_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_cross_entropy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_ctc_loss_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_dropout2d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_dropout3d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_dropout_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_elu_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_embedding_bag_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_embedding_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_feature_alpha_dropout_with_train_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_feature_alpha_dropout_without_train_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_fractional_max_pool2d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_fractional_max_pool3d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_gaussian_nll_loss_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_gelu_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_glu_cuda_float64, 
test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_grid_sample_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_group_norm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_hardshrink_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_hardsigmoid_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_hardswish_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_hardtanh_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_hinge_embedding_loss_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_huber_loss_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_instance_norm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_interpolate_area_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_interpolate_bicubic_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_interpolate_bilinear_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_interpolate_linear_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_interpolate_nearest-exact_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_interpolate_nearest_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_interpolate_trilinear_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_kl_div_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_l1_loss_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_layer_norm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_leaky_relu_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_linear_cuda_float64, 
test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_local_response_norm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_logsigmoid_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_margin_ranking_loss_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_max_pool1d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_max_pool2d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_max_pool3d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_max_unpool1d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_max_unpool1d_grad_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_max_unpool2d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_max_unpool2d_grad_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_max_unpool3d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_max_unpool3d_grad_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_mish_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_mse_loss_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_multi_head_attention_forward_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_multi_margin_loss_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_multilabel_margin_loss_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_multilabel_soft_margin_loss_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_nll_loss_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_normalize_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_pad_circular_cuda_float64, 
test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_pad_constant_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_pad_reflect_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_pad_replicate_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_pad_replicate_negative_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_pairwise_distance_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_pdist_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_pixel_shuffle_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_pixel_unshuffle_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_poisson_nll_loss_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_prelu_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_relu6_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_relu_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_rms_norm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_rrelu_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_scaled_dot_product_attention_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_selu_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_silu_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_smooth_l1_loss_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_soft_margin_loss_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_softmin_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_softmin_with_dtype_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_softplus_cuda_float64, 
test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_softshrink_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_softsign_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_tanhshrink_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_threshold_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_triplet_margin_loss_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_triplet_margin_with_distance_loss_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_unfold_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_upsample_bilinear_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_upsample_nearest_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nonzero_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nonzero_static_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_norm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_norm_fro_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_norm_inf_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_norm_nuc_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_normal_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_normal_in_place_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_normal_number_mean_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_ones_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_ones_like_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_ormqr_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_outer_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_pca_lowrank_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_permute_copy_cuda_float64, 
test/test_ops.py::TestMathBitsCUDA::test_neg_view_permute_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_pinverse_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_polar_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_polygamma_polygamma_n_0_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_polygamma_polygamma_n_1_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_polygamma_polygamma_n_2_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_polygamma_polygamma_n_3_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_polygamma_polygamma_n_4_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_positive_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_pow_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_prod_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_put_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_qr_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_quantile_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_rad2deg_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_rand_like_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_randint_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_randint_like_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_randn_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_randn_like_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_ravel_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_real_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_reciprocal_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_remainder_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_renorm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_repeat_cuda_float64, 
test/test_ops.py::TestMathBitsCUDA::test_neg_view_repeat_interleave_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_reshape_as_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_reshape_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_resize__cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_resize_as__cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_resolve_conj_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_resolve_neg_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_roll_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_rot90_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_round_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_round_decimals_0_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_round_decimals_3_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_round_decimals_neg_3_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_rsqrt_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_rsub_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_scalar_tensor_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_scatter_add_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_scatter_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_scatter_reduce_amax_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_scatter_reduce_amin_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_scatter_reduce_mean_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_scatter_reduce_prod_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_scatter_reduce_sum_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_searchsorted_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_select_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_select_scatter_cuda_float64, 
test/test_ops.py::TestMathBitsCUDA::test_neg_view_sgn_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_short_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_sigmoid_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_sign_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_signal_windows_bartlett_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_signal_windows_blackman_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_signal_windows_cosine_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_signal_windows_exponential_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_signal_windows_gaussian_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_signal_windows_general_cosine_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_signal_windows_general_hamming_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_signal_windows_hamming_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_signal_windows_hann_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_signal_windows_kaiser_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_signal_windows_nuttall_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_signbit_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_sin_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_sinc_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_sinh_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_slice_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_slice_scatter_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_softmax_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_softmax_with_dtype_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_sort_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_sparse_mm_reduce_cuda_float64, 
test/test_ops.py::TestMathBitsCUDA::test_neg_view_sparse_sampled_addmm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_airy_ai_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_bessel_j0_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_bessel_j1_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_bessel_y0_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_bessel_y1_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_chebyshev_polynomial_t_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_chebyshev_polynomial_u_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_chebyshev_polynomial_v_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_chebyshev_polynomial_w_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_entr_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_erfcx_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_hermite_polynomial_h_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_hermite_polynomial_he_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_i0e_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_i1_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_i1e_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_laguerre_polynomial_l_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_legendre_polynomial_p_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_log_ndtr_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_modified_bessel_i0_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_modified_bessel_i1_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_modified_bessel_k0_cuda_float64, 
test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_modified_bessel_k1_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_ndtr_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_ndtri_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_polygamma_special_polygamma_n_0_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_scaled_modified_bessel_k0_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_scaled_modified_bessel_k1_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_shifted_chebyshev_polynomial_t_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_shifted_chebyshev_polynomial_u_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_shifted_chebyshev_polynomial_v_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_shifted_chebyshev_polynomial_w_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_spherical_bessel_j0_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_xlog1py_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_zeta_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_split_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_split_list_args_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_split_with_sizes_copy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_split_with_sizes_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_sqrt_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_square_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_squeeze_copy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_squeeze_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_squeeze_multiple_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_stack_cuda_float64, 
test/test_ops.py::TestMathBitsCUDA::test_neg_view_std_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_std_mean_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_std_mean_unbiased_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_std_unbiased_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_stft_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_sub_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_sum_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_sum_to_size_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_svd_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_svd_lowrank_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_t_copy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_t_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_take_along_dim_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_take_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_tan_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_tanh_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_tensor_split_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_tensordot_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_tile_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_to_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_to_sparse_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_topk_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_torch_ops_aten__safe_softmax_default_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_trace_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_transpose_copy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_transpose_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_trapezoid_cuda_float64, 
test/test_ops.py::TestMathBitsCUDA::test_neg_view_trapz_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_triangular_solve_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_tril_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_triu_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_true_divide_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_trunc_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_unbind_copy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_unbind_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_unflatten_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_unfold_copy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_unfold_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_uniform_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_unique_consecutive_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_unique_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_unsafe_chunk_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_unsafe_split_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_unsqueeze_copy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_unsqueeze_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_var_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_var_mean_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_var_mean_unbiased_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_var_unbiased_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_vdot_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_view_as_complex_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_view_as_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_view_copy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_view_cuda_float64, 
test/test_ops.py::TestMathBitsCUDA::test_neg_view_vsplit_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_vstack_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_where_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_xlogy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_zero__cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_zeros_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_zeros_like_cuda_float64, test/test_ops.py::TestFakeTensorCUDA::test_fake_H_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_T_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake___getitem___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake___radd___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake___rand___cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake___rdiv___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake___rmatmul___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake___rmod___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake___rmul___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake___ror___cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake___rpow___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake___rsub___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake___rxor___cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake__batch_norm_with_update_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake__chunk_cat_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake__native_batch_norm_legit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake__segment_reduce_lengths_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake__segment_reduce_offsets_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake__softmax_backward_data_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake__unsafe_masked_index_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake__unsafe_masked_index_put_accumulate_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake__upsample_bilinear2d_aa_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_abs_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_acos_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_acosh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_add_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_addbmm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_addcdiv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_addcmul_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_addmm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_addmm_decomposed_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_addmv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_addr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_alias_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_all_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_allclose_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_amax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_amin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_aminmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_angle_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_any_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_arange_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_argmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_argmin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_argsort_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_argwhere_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_as_strided_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_as_strided_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_as_strided_partial_views_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_as_strided_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_asin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_asinh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_atan2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_atan_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_atanh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_atleast_1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_atleast_2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_atleast_3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_H_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_T_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast___getitem___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast___radd___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast___rand___cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast___rdiv___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast___rmatmul___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast___rmod___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast___rmul___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast___ror___cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast___rpow___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast___rsub___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast___rxor___cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast__batch_norm_with_update_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast__chunk_cat_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast__native_batch_norm_legit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast__segment_reduce_lengths_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast__segment_reduce_offsets_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast__softmax_backward_data_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast__unsafe_masked_index_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast__unsafe_masked_index_put_accumulate_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast__upsample_bilinear2d_aa_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_abs_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_acos_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_acosh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_add_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_addbmm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_addcdiv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_addcmul_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_addmm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_addmm_decomposed_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_addmv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_addr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_alias_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_all_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_allclose_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_amax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_amin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_aminmax_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_angle_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_any_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_arange_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_argmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_argmin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_argsort_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_argwhere_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_as_strided_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_as_strided_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_as_strided_partial_views_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_as_strided_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_asin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_asinh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_atan2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_atan_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_atanh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_atleast_1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_atleast_2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_atleast_3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_baddbmm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_bernoulli_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_bfloat16_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_bincount_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_bitwise_and_cuda_int64, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_bitwise_left_shift_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_bitwise_not_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_bitwise_or_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_bitwise_right_shift_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_bitwise_xor_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_block_diag_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_bmm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_bool_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_broadcast_shapes_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_broadcast_tensors_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_broadcast_to_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_bucketize_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_byte_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_cartesian_prod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_cat_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_cauchy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_cdist_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_cdouble_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_ceil_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_cfloat_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_chalf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_char_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_cholesky_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_cholesky_inverse_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_cholesky_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_chunk_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_clamp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_clamp_max_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_clamp_min_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_clone_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_column_stack_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_combinations_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_complex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_conj_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_conj_physical_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_constant_pad_nd_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_contiguous_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_copysign_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_corrcoef_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_cos_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_cosh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_count_nonzero_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_cov_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_cross_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_cummax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_cummin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_cumprod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_cumsum_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_cumulative_trapezoid_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_deg2rad_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_diag_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_diag_embed_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_diagflat_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_diagonal_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_diagonal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_diagonal_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_diff_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_digamma_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_dist_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_div_floor_rounding_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_div_no_rounding_mode_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_div_trunc_rounding_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_dot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_double_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_dsplit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_dstack_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_einsum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_empty_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_empty_like_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_empty_permuted_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_empty_strided_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_eq_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_equal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_erf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_erfc_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_erfinv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_exp2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_exp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_expand_as_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_expand_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_expand_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_expm1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_exponential_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_eye_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fft_fft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fft_fft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fft_fftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fft_fftshift_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fft_hfft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fft_hfft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fft_hfftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fft_ifft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fft_ifft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fft_ifftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fft_ifftshift_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fft_ihfft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fft_ihfft_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fft_ihfftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fft_irfft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fft_irfft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fft_irfftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fft_rfft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fft_rfft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fft_rfftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fill_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_flatten_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_flip_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fliplr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_flipud_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_float_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_float_power_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_floor_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_floor_divide_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fmin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fmod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_frac_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_frexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_full_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_full_like_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_gather_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_gcd_cuda_int64, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_ge_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_geometric_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_geqrf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_gradient_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_grid_sampler_2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_grid_sampler_3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_gt_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_half_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_hash_tensor_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_heaviside_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_histc_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_hsplit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_hstack_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_hypot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_i0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_igamma_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_igammac_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_imag_cuda_complex64, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_index_add_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_index_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_index_fill_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_index_put_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_index_reduce_amax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_index_reduce_amin_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_index_reduce_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_index_reduce_prod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_index_select_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_inner_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_int_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_isclose_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_isfinite_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_isin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_isinf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_isnan_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_isneginf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_isposinf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_isreal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_istft_cuda_complex64, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_item_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_jiterator_2inputs_2outputs_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_jiterator_4inputs_with_extra_args_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_jiterator_binary_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_jiterator_binary_return_by_ref_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_jiterator_unary_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_kron_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_kthvalue_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_lcm_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_ldexp_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_le_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_lerp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_lgamma_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_cholesky_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_cholesky_ex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_cond_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_cross_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_det_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_diagonal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_eig_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_eigh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_eigvals_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_eigvalsh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_householder_product_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_inv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_inv_ex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_ldl_factor_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_ldl_factor_ex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_ldl_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_lstsq_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_lstsq_grad_oriented_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_lu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_lu_factor_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_lu_factor_ex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_lu_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_matrix_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_matrix_power_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_matrix_rank_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_matrix_rank_hermitian_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_multi_dot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_norm_subgradients_at_zero_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_pinv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_pinv_hermitian_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_pinv_singular_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_qr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_slogdet_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_solve_ex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_solve_triangular_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_svd_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_svdvals_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_tensorinv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_tensorsolve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_vander_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_vecdot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_vector_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linspace_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linspace_tensor_overload_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_log10_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_log1p_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_log2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_log_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_log_normal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_log_softmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_logaddexp2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_logaddexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_logcumsumexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_logdet_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_logical_and_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_logical_not_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_logical_or_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_logical_xor_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_logit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_logspace_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_logspace_tensor_overload_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_logsumexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_long_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_lt_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_lu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_lu_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_lu_unpack_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_mH_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_mT_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_amax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_amin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_argmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_argmin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_cumprod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_cumsum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_fill_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_log_softmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_logaddexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_logsumexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_median_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_normalize_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_prod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_select_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_softmax_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_softmin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_std_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_sum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_var_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_matmul_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_matrix_exp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_max_binary_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_max_pool2d_with_indices_backward_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_max_reduction_no_dim_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_max_reduction_with_dim_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_maximum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_median_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_meshgrid_list_of_tensors_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_meshgrid_variadic_tensors_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_min_binary_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_min_reduction_no_dim_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_min_reduction_with_dim_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_minimum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_mm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_mode_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_movedim_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_msort_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_mul_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_multinomial_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_mv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nan_to_num_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nanmean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nanmedian_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nanquantile_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nansum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_narrow_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_narrow_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_native_batch_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_native_dropout_backward_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_native_layer_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_ne_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_neg_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_new_empty_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_new_empty_strided_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_new_full_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_new_ones_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_new_zeros_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nextafter_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_alpha_dropout_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_avg_pool1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_avg_pool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_avg_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_batch_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_batch_norm_without_cudnn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_bilinear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_binary_cross_entropy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_celu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_channel_shuffle_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_conv1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_conv2d_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_conv3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_conv_transpose1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_conv_transpose2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_conv_transpose3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_cosine_embedding_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_cosine_similarity_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_cross_entropy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_ctc_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_dropout2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_dropout3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_dropout_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_elu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_embedding_bag_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_embedding_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_fractional_max_pool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_fractional_max_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_gaussian_nll_loss_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_gelu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_glu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_grid_sample_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_group_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_hardshrink_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_hardsigmoid_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_hardswish_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_hardtanh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_hinge_embedding_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_huber_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_instance_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_interpolate_area_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_interpolate_bicubic_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_interpolate_bilinear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_interpolate_linear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_interpolate_nearest_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_interpolate_trilinear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_kl_div_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_l1_loss_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_layer_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_leaky_relu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_linear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_local_response_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_logsigmoid_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_margin_ranking_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_max_pool1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_max_pool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_max_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_max_unpool1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_max_unpool1d_grad_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_max_unpool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_max_unpool2d_grad_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_max_unpool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_max_unpool3d_grad_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_mish_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_mse_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_multi_head_attention_forward_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_multi_margin_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_multilabel_margin_loss_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_nll_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_normalize_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_one_hot_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_pad_circular_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_pad_constant_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_pad_reflect_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_pad_replicate_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_pad_replicate_negative_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_pairwise_distance_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_pdist_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_pixel_shuffle_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_pixel_unshuffle_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_poisson_nll_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_prelu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_relu6_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_relu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_rms_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_rrelu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_scaled_dot_product_attention_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_selu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_silu_complex_cuda_complex64, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_silu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_smooth_l1_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_soft_margin_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_softmin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_softmin_with_dtype_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_softplus_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_softshrink_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_softsign_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_tanhshrink_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_threshold_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_triplet_margin_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_unfold_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_upsample_bilinear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_upsample_nearest_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nonzero_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nonzero_static_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_norm_fro_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_norm_inf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_norm_nuc_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_normal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_normal_in_place_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_normal_number_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_ones_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_ones_like_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_ormqr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_outer_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_pca_lowrank_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_permute_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_permute_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_pinverse_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_polar_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_polygamma_polygamma_n_0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_polygamma_polygamma_n_1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_polygamma_polygamma_n_2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_polygamma_polygamma_n_3_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_polygamma_polygamma_n_4_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_positive_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_pow_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_prod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_put_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_qr_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_quantile_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_rad2deg_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_rand_like_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_randint_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_randint_like_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_randn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_randn_like_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_ravel_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_real_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_reciprocal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_remainder_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_renorm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_repeat_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_repeat_interleave_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_reshape_as_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_reshape_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_resize__cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_resize_as__cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_resolve_conj_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_resolve_neg_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_roll_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_rot90_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_round_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_round_decimals_0_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_round_decimals_3_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_round_decimals_neg_3_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_rsqrt_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_rsub_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_scalar_tensor_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_scatter_add_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_scatter_reduce_amax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_scatter_reduce_amin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_scatter_reduce_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_scatter_reduce_prod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_scatter_reduce_sum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_searchsorted_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_select_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_select_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_sgn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_short_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_sigmoid_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_sign_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_signal_windows_bartlett_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_signal_windows_blackman_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_signal_windows_cosine_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_signal_windows_exponential_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_signal_windows_gaussian_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_signal_windows_general_cosine_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_signal_windows_general_hamming_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_signal_windows_hamming_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_signal_windows_hann_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_signal_windows_kaiser_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_signal_windows_nuttall_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_signbit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_sin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_sinc_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_sinh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_slice_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_slice_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_softmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_softmax_with_dtype_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_sort_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_sparse_mm_reduce_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_sparse_sampled_addmm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_airy_ai_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_bessel_j0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_bessel_j1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_bessel_y0_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_bessel_y1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_chebyshev_polynomial_v_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_entr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_erfcx_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_hermite_polynomial_h_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_hermite_polynomial_he_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_i0e_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_i1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_i1e_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_laguerre_polynomial_l_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_legendre_polynomial_p_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_log_ndtr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_modified_bessel_i0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_modified_bessel_i1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_modified_bessel_k0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_modified_bessel_k1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_ndtr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_ndtri_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_scaled_modified_bessel_k0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_scaled_modified_bessel_k1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_spherical_bessel_j0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_xlog1py_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_zeta_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_split_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_split_list_args_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_split_with_sizes_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_split_with_sizes_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_sqrt_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_square_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_squeeze_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_squeeze_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_squeeze_multiple_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_stack_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_std_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_std_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_std_mean_unbiased_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_std_unbiased_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_stft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_sub_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_sum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_sum_to_size_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_svd_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_svd_lowrank_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_t_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_t_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_take_along_dim_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_take_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_tan_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_tanh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_tensor_split_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_tensordot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_tile_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_to_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_to_sparse_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_topk_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_torch__scaled_mm_cuda_float8_e4m3fn, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_torch_ops_aten__efficient_attention_forward_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_torch_ops_aten__flash_attention_forward_cuda_float16, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_trace_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_transpose_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_transpose_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_trapezoid_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_trapz_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_triangular_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_tril_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_tril_indices_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_triu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_triu_indices_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_true_divide_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_trunc_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_unbind_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_unbind_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_unflatten_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_unfold_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_unfold_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_uniform_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_unique_consecutive_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_unique_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_unravel_index_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_unsafe_chunk_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_unsafe_split_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_unsqueeze_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_unsqueeze_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_var_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_var_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_var_mean_unbiased_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_var_unbiased_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_vdot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_view_as_complex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_view_as_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_view_as_real_cuda_complex64, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_view_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_view_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_vsplit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_vstack_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_where_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_xlogy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_zero__cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_zeros_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_zeros_like_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_baddbmm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_bernoulli_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_bfloat16_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_bincount_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_bitwise_and_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_bitwise_left_shift_cuda_int64, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_bitwise_not_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_bitwise_or_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_bitwise_right_shift_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_bitwise_xor_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_block_diag_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_bmm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_bool_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_broadcast_shapes_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_broadcast_tensors_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_broadcast_to_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_bucketize_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_byte_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_cartesian_prod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_cat_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_cauchy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_cdist_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_cdouble_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_ceil_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_cfloat_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_chalf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_char_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_cholesky_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_cholesky_inverse_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_cholesky_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_chunk_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_clamp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_clamp_max_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_clamp_min_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_clone_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_column_stack_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_combinations_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_complex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_conj_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_conj_physical_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_constant_pad_nd_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_contiguous_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_copysign_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_corrcoef_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_cos_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_cosh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_count_nonzero_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_cov_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_cross_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_H_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_T_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp___getitem___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp___radd___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp___rdiv___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp___rmatmul___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp___rmod___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp___rmul___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp___rpow___cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp___rsub___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp__batch_norm_with_update_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp__native_batch_norm_legit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp__segment_reduce_lengths_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp__segment_reduce_offsets_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp__softmax_backward_data_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp__unsafe_masked_index_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp__unsafe_masked_index_put_accumulate_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp__upsample_bilinear2d_aa_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_abs_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_acos_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_acosh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_add_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_addbmm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_addcdiv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_addcmul_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_addmm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_addmm_decomposed_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_addmv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_addr_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_alias_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_amax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_amin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_angle_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_as_strided_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_as_strided_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_as_strided_partial_views_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_as_strided_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_asin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_asinh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_atan2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_atan_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_atanh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_atleast_1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_atleast_2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_atleast_3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_baddbmm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_bernoulli_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_bfloat16_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_block_diag_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_bmm_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_broadcast_tensors_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_broadcast_to_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_cartesian_prod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_cat_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_cdist_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_cdouble_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_ceil_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_cfloat_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_chalf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_cholesky_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_cholesky_inverse_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_cholesky_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_chunk_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_clamp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_clamp_max_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_clamp_min_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_clone_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_column_stack_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_combinations_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_complex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_conj_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_conj_physical_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_constant_pad_nd_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_contiguous_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_copysign_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_corrcoef_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_cos_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_cosh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_cov_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_cross_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_cummax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_cummin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_cumprod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_cumsum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_cumulative_trapezoid_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_deg2rad_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_diag_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_diag_embed_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_diagflat_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_diagonal_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_diagonal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_diagonal_scatter_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_diff_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_digamma_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_dist_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_div_floor_rounding_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_div_no_rounding_mode_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_div_trunc_rounding_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_dot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_double_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_dsplit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_dstack_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_einsum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_erf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_erfc_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_erfinv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_exp2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_exp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_expand_as_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_expand_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_expand_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_expm1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_fft_fft2_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_fft_fft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_fft_fftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_fft_fftshift_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_fft_hfft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_fft_hfft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_fft_hfftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_fft_ifft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_fft_ifft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_fft_ifftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_fft_ifftshift_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_fft_ihfft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_fft_ihfft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_fft_ihfftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_fft_irfft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_fft_irfft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_fft_irfftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_fft_rfft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_fft_rfft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_fft_rfftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_fill_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_flatten_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_flip_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_fliplr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_flipud_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_float_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_float_power_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_floor_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_fmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_fmin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_fmod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_frac_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_frexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_gather_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_gradient_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_grid_sampler_2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_grid_sampler_3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_half_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_hsplit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_hstack_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_hypot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_i0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_index_add_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_index_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_index_fill_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_index_put_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_index_reduce_amax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_index_reduce_amin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_index_reduce_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_index_reduce_prod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_index_select_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_inner_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_kron_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_kthvalue_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_ldexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_lerp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_lgamma_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_cholesky_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_cholesky_ex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_cond_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_cross_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_det_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_diagonal_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_eig_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_eigh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_eigvals_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_eigvalsh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_householder_product_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_inv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_inv_ex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_lstsq_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_lstsq_grad_oriented_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_lu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_lu_factor_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_lu_factor_ex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_lu_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_matrix_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_matrix_power_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_multi_dot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_norm_subgradients_at_zero_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_pinv_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_pinv_hermitian_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_pinv_singular_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_qr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_slogdet_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_solve_ex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_solve_triangular_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_svd_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_svdvals_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_tensorinv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_tensorsolve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_vander_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_vecdot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_vector_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_log10_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_log1p_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_log2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_log_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_log_softmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_log_softmax_with_dtype_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_logaddexp2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_logaddexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_logcumsumexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_logdet_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_logit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_logsumexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_lu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_lu_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_lu_unpack_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_mH_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_mT_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_masked_amax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_masked_amin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_masked_cumprod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_masked_cumsum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_masked_fill_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_masked_log_softmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_masked_logaddexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_masked_logsumexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_masked_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_masked_median_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_masked_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_masked_normalize_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_masked_prod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_masked_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_masked_select_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_masked_softmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_masked_softmin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_masked_std_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_masked_sum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_masked_var_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_matmul_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_matrix_exp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_max_binary_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_max_pool2d_with_indices_backward_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_max_reduction_no_dim_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_max_reduction_with_dim_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_maximum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_median_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_meshgrid_list_of_tensors_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_meshgrid_variadic_tensors_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_min_binary_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_min_reduction_no_dim_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_min_reduction_with_dim_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_minimum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_mm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_mode_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_movedim_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_msort_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_mul_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_mv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nan_to_num_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nanmean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nanmedian_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nanquantile_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nansum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_narrow_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_native_batch_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_native_dropout_backward_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_native_layer_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_neg_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_alpha_dropout_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_avg_pool1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_avg_pool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_avg_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_batch_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_batch_norm_without_cudnn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_bilinear_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_binary_cross_entropy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_celu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_channel_shuffle_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_conv1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_conv2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_conv3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_conv_transpose1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_conv_transpose2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_conv_transpose3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_cosine_embedding_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_cosine_similarity_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_cross_entropy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_ctc_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_dropout2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_dropout3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_dropout_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_elu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_embedding_bag_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_embedding_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_fractional_max_pool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_fractional_max_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_gaussian_nll_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_gelu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_glu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_grid_sample_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_group_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_hardshrink_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_hardsigmoid_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_hardswish_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_hardtanh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_hinge_embedding_loss_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_huber_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_instance_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_interpolate_area_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_interpolate_bicubic_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_interpolate_bilinear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_interpolate_linear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_interpolate_nearest_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_interpolate_trilinear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_kl_div_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_l1_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_layer_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_leaky_relu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_linear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_local_response_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_logsigmoid_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_margin_ranking_loss_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_max_pool1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_max_pool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_max_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_max_unpool1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_max_unpool1d_grad_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_max_unpool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_max_unpool2d_grad_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_max_unpool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_max_unpool3d_grad_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_mish_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_mse_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_multi_head_attention_forward_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_multi_margin_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_multilabel_margin_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_nll_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_normalize_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_pad_circular_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_pad_constant_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_pad_reflect_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_pad_replicate_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_pad_replicate_negative_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_pairwise_distance_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_pdist_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_pixel_shuffle_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_pixel_unshuffle_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_poisson_nll_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_prelu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_relu6_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_relu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_rms_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_rrelu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_selu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_silu_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_smooth_l1_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_soft_margin_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_softmin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_softmin_with_dtype_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_softplus_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_softshrink_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_softsign_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_tanhshrink_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_threshold_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_triplet_margin_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_unfold_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_upsample_bilinear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_upsample_nearest_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_norm_fro_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_norm_inf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_norm_nuc_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_normal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_normal_number_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_ormqr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_outer_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_pca_lowrank_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_permute_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_permute_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_pinverse_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_polar_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_polygamma_polygamma_n_0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_polygamma_polygamma_n_1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_polygamma_polygamma_n_2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_polygamma_polygamma_n_3_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_polygamma_polygamma_n_4_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_positive_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_pow_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_prod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_put_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_qr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_quantile_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_rad2deg_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_ravel_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_real_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_reciprocal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_remainder_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_renorm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_repeat_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_repeat_interleave_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_reshape_as_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_reshape_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_resolve_conj_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_resolve_neg_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_roll_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_rot90_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_round_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_round_decimals_0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_round_decimals_3_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_round_decimals_neg_3_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_rsqrt_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_rsub_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_scatter_add_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_scatter_reduce_amax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_scatter_reduce_amin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_scatter_reduce_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_scatter_reduce_prod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_scatter_reduce_sum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_select_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_select_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_sgn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_sigmoid_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_sign_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_sin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_sinc_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_sinh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_slice_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_slice_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_softmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_softmax_with_dtype_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_sort_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_sparse_mm_reduce_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_sparse_sampled_addmm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_special_entr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_special_erfcx_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_special_i0e_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_special_i1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_special_i1e_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_special_log_ndtr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_special_ndtr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_special_ndtri_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_special_xlog1py_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_split_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_split_list_args_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_split_with_sizes_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_split_with_sizes_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_sqrt_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_square_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_squeeze_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_squeeze_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_squeeze_multiple_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_stack_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_std_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_std_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_std_mean_unbiased_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_std_unbiased_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_stft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_sub_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_sum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_sum_to_size_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_svd_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_svd_lowrank_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_t_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_t_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_take_along_dim_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_take_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_tan_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_tanh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_tensor_split_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_tensordot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_tile_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_to_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_to_sparse_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_topk_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_torch_ops_aten__efficient_attention_forward_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_trace_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_transpose_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_transpose_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_trapezoid_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_trapz_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_triangular_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_tril_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_triu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_true_divide_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_trunc_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_unbind_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_unbind_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_unflatten_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_unfold_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_unfold_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_unsafe_chunk_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_unsafe_split_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_unsqueeze_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_unsqueeze_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_var_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_var_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_var_mean_unbiased_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_var_unbiased_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_vdot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_view_as_complex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_view_as_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_view_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_view_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_vsplit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_vstack_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_where_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_xlogy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_zero__cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_H_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_T_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp___getitem___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp___radd___cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp___rdiv___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp___rmatmul___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp___rmod___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp___rmul___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp___rpow___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp___rsub___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp__batch_norm_with_update_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp__native_batch_norm_legit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp__segment_reduce_lengths_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp__segment_reduce_offsets_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp__softmax_backward_data_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp__unsafe_masked_index_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp__unsafe_masked_index_put_accumulate_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp__upsample_bilinear2d_aa_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_abs_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_acos_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_acosh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_add_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_addbmm_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_addcdiv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_addcmul_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_addmm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_addmm_decomposed_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_addmv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_addr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_alias_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_amax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_amin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_angle_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_as_strided_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_as_strided_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_as_strided_partial_views_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_as_strided_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_asin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_asinh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_atan2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_atan_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_atanh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_atleast_1d_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_atleast_2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_atleast_3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_baddbmm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_bernoulli_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_bfloat16_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_block_diag_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_bmm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_broadcast_tensors_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_broadcast_to_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_cartesian_prod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_cat_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_cdist_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_cdouble_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_ceil_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_cfloat_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_chalf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_cholesky_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_cholesky_inverse_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_cholesky_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_chunk_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_clamp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_clamp_max_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_clamp_min_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_clone_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_column_stack_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_combinations_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_complex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_conj_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_conj_physical_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_constant_pad_nd_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_contiguous_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_copysign_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_corrcoef_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_cos_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_cosh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_cov_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_cross_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_cummax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_cummin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_cumprod_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_cumsum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_cumulative_trapezoid_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_deg2rad_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_diag_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_diag_embed_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_diagflat_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_diagonal_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_diagonal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_diagonal_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_diff_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_digamma_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_dist_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_div_floor_rounding_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_div_no_rounding_mode_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_div_trunc_rounding_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_dot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_double_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_dsplit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_dstack_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_einsum_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_erf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_erfc_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_erfinv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_exp2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_exp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_expand_as_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_expand_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_expand_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_expm1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_fft_fft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_fft_fft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_fft_fftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_fft_fftshift_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_fft_hfft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_fft_hfft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_fft_hfftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_fft_ifft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_fft_ifft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_fft_ifftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_fft_ifftshift_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_fft_ihfft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_fft_ihfft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_fft_ihfftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_fft_irfft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_fft_irfft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_fft_irfftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_fft_rfft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_fft_rfft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_fft_rfftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_fill_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_flatten_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_flip_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_fliplr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_flipud_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_float_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_float_power_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_floor_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_fmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_fmin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_fmod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_frac_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_frexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_gather_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_gradient_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_grid_sampler_2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_grid_sampler_3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_half_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_hsplit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_hstack_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_hypot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_i0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_index_add_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_index_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_index_fill_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_index_put_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_index_reduce_amax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_index_reduce_amin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_index_reduce_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_index_reduce_prod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_index_select_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_inner_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_kron_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_kthvalue_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_ldexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_lerp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_lgamma_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_cholesky_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_cholesky_ex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_cond_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_cross_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_det_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_diagonal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_eig_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_eigh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_eigvals_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_eigvalsh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_householder_product_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_inv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_inv_ex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_lstsq_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_lstsq_grad_oriented_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_lu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_lu_factor_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_lu_factor_ex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_lu_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_matrix_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_matrix_power_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_multi_dot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_norm_subgradients_at_zero_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_pinv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_pinv_hermitian_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_pinv_singular_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_qr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_slogdet_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_solve_ex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_solve_triangular_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_svd_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_svdvals_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_tensorinv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_tensorsolve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_vander_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_vecdot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_vector_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_log10_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_log1p_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_log2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_log_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_log_softmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_logaddexp2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_logaddexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_logcumsumexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_logdet_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_logit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_logsumexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_lu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_lu_solve_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_lu_unpack_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_mH_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_mT_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_masked_amax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_masked_amin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_masked_cumprod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_masked_cumsum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_masked_fill_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_masked_log_softmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_masked_logaddexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_masked_logsumexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_masked_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_masked_median_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_masked_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_masked_normalize_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_masked_prod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_masked_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_masked_select_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_masked_softmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_masked_softmin_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_masked_std_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_masked_sum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_masked_var_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_matmul_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_matrix_exp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_max_binary_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_max_pool2d_with_indices_backward_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_max_reduction_no_dim_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_max_reduction_with_dim_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_maximum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_median_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_meshgrid_list_of_tensors_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_meshgrid_variadic_tensors_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_min_binary_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_min_reduction_no_dim_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_min_reduction_with_dim_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_minimum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_mm_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_mode_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_movedim_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_msort_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_mul_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_mv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nan_to_num_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nanmean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nanmedian_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nanquantile_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nansum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_narrow_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_native_batch_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_native_dropout_backward_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_native_layer_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_neg_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_adaptive_avg_pool1d_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_alpha_dropout_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_avg_pool1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_avg_pool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_avg_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_batch_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_batch_norm_without_cudnn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_bilinear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_binary_cross_entropy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_celu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_channel_shuffle_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_conv1d_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_conv2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_conv3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_conv_transpose1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_conv_transpose2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_conv_transpose3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_cosine_embedding_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_cosine_similarity_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_cross_entropy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_ctc_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_dropout2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_dropout3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_dropout_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_elu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_embedding_bag_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_embedding_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_feature_alpha_dropout_without_train_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_fractional_max_pool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_fractional_max_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_gaussian_nll_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_gelu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_glu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_grid_sample_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_group_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_hardshrink_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_hardsigmoid_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_hardswish_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_hardtanh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_hinge_embedding_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_huber_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_instance_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_interpolate_area_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_interpolate_bicubic_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_interpolate_bilinear_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_interpolate_linear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_interpolate_nearest_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_interpolate_trilinear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_kl_div_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_l1_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_layer_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_leaky_relu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_linear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_local_response_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_logsigmoid_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_margin_ranking_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_max_pool1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_max_pool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_max_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_max_unpool1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_max_unpool1d_grad_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_max_unpool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_max_unpool2d_grad_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_max_unpool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_max_unpool3d_grad_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_mish_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_mse_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_multi_head_attention_forward_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_multi_margin_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_multilabel_margin_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_nll_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_normalize_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_pad_circular_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_pad_constant_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_pad_reflect_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_pad_replicate_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_pad_replicate_negative_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_pairwise_distance_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_pdist_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_pixel_shuffle_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_pixel_unshuffle_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_poisson_nll_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_prelu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_relu6_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_relu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_rms_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_rrelu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_selu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_silu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_smooth_l1_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_soft_margin_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_softmin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_softmin_with_dtype_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_softplus_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_softshrink_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_softsign_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_tanhshrink_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_threshold_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_triplet_margin_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_unfold_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_upsample_bilinear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_upsample_nearest_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_norm_fro_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_norm_inf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_norm_nuc_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_normal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_normal_number_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_ormqr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_outer_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_pca_lowrank_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_permute_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_permute_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_pinverse_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_polar_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_polygamma_polygamma_n_0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_polygamma_polygamma_n_1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_polygamma_polygamma_n_2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_polygamma_polygamma_n_3_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_polygamma_polygamma_n_4_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_positive_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_pow_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_prod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_put_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_qr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_quantile_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_rad2deg_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_ravel_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_real_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_reciprocal_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_remainder_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_renorm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_repeat_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_repeat_interleave_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_reshape_as_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_reshape_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_resolve_conj_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_resolve_neg_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_roll_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_rot90_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_round_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_round_decimals_0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_round_decimals_3_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_round_decimals_neg_3_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_rsqrt_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_rsub_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_scatter_add_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_scatter_reduce_amax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_scatter_reduce_amin_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_scatter_reduce_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_scatter_reduce_prod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_scatter_reduce_sum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_select_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_select_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_sgn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_sigmoid_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_sign_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_sin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_sinc_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_sinh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_slice_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_slice_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_softmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_softmax_with_dtype_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_sort_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_sparse_mm_reduce_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_sparse_sampled_addmm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_special_entr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_special_erfcx_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_special_i0e_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_special_i1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_special_i1e_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_special_log_ndtr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_special_ndtr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_special_ndtri_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_special_xlog1py_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_split_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_split_list_args_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_split_with_sizes_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_split_with_sizes_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_sqrt_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_square_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_squeeze_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_squeeze_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_squeeze_multiple_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_stack_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_std_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_std_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_std_mean_unbiased_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_std_unbiased_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_stft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_sub_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_sum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_sum_to_size_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_svd_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_svd_lowrank_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_t_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_t_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_take_along_dim_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_take_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_tan_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_tanh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_tensor_split_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_tensordot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_tile_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_to_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_to_sparse_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_topk_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_torch_ops_aten__efficient_attention_forward_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_trace_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_transpose_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_transpose_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_trapezoid_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_trapz_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_triangular_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_tril_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_triu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_true_divide_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_trunc_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_unbind_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_unbind_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_unflatten_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_unfold_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_unfold_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_unsafe_chunk_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_unsafe_split_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_unsqueeze_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_unsqueeze_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_var_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_var_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_var_mean_unbiased_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_var_unbiased_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_vdot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_view_as_complex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_view_as_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_view_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_view_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_vsplit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_vstack_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_where_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_xlogy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_zero__cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_cummax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_cummin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_cumprod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_cumsum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_cumulative_trapezoid_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_deg2rad_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_diag_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_diag_embed_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_diagflat_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_diagonal_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_diagonal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_diagonal_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_diff_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_digamma_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_dist_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_div_floor_rounding_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_div_no_rounding_mode_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_div_trunc_rounding_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_dot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_double_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_dsplit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_dstack_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_einsum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_empty_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_empty_like_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_empty_permuted_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_empty_strided_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_eq_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_equal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_erf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_erfc_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_erfinv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_exp2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_exp_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_expand_as_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_expand_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_expand_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_expm1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_exponential_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_eye_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_fft_fft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_fft_fft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_fft_fftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_fft_fftshift_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_fft_hfft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_fft_hfft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_fft_hfftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_fft_ifft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_fft_ifft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_fft_ifftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_fft_ifftshift_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_fft_ihfft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_fft_ihfft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_fft_ihfftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_fft_irfft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_fft_irfft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_fft_irfftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_fft_rfft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_fft_rfft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_fft_rfftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_fill_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_flatten_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_flip_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_fliplr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_flipud_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_float_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_float_power_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_floor_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_floor_divide_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_fmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_fmin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_fmod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_frac_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_frexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_full_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_full_like_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_gather_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_gcd_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_ge_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_geometric_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_geqrf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_gradient_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_grid_sampler_2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_grid_sampler_3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_gt_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_half_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_hash_tensor_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_heaviside_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_histc_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_hsplit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_hstack_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_hypot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_i0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_igamma_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_igammac_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_imag_cuda_complex64, test/test_ops.py::TestFakeTensorCUDA::test_fake_index_add_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_index_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_index_fill_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_index_put_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_index_reduce_amax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_index_reduce_amin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_index_reduce_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_index_reduce_prod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_index_select_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_inner_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_int_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_isclose_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_isfinite_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_isin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_isinf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_isnan_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_isneginf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_isposinf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_isreal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_istft_cuda_complex64, test/test_ops.py::TestFakeTensorCUDA::test_fake_item_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_jiterator_2inputs_2outputs_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_jiterator_4inputs_with_extra_args_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_jiterator_binary_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_jiterator_binary_return_by_ref_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_jiterator_unary_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_kron_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_kthvalue_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_lcm_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_ldexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_le_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_lerp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_lgamma_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_cholesky_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_cholesky_ex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_cond_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_cross_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_det_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_diagonal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_eig_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_eigh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_eigvals_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_eigvalsh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_householder_product_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_inv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_inv_ex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_ldl_factor_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_ldl_factor_ex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_ldl_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_lstsq_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_lstsq_grad_oriented_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_lu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_lu_factor_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_lu_factor_ex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_lu_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_matrix_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_matrix_power_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_matrix_rank_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_matrix_rank_hermitian_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_multi_dot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_norm_subgradients_at_zero_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_pinv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_pinv_hermitian_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_pinv_singular_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_qr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_slogdet_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_solve_ex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_solve_triangular_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_svd_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_svdvals_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_tensorinv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_tensorsolve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_vander_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_vecdot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_vector_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linspace_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linspace_tensor_overload_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_log10_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_log1p_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_log2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_log_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_log_normal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_log_softmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_logaddexp2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_logaddexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_logcumsumexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_logdet_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_logical_and_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_logical_not_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_logical_or_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_logical_xor_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_logit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_logspace_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_logspace_tensor_overload_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_logsumexp_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_long_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_lt_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_lu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_lu_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_lu_unpack_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_mH_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_mT_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_masked_amax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_masked_amin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_masked_argmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_masked_argmin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_masked_cumprod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_masked_cumsum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_masked_fill_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_masked_log_softmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_masked_logaddexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_masked_logsumexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_masked_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_masked_median_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_masked_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_masked_normalize_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_masked_prod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_masked_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_masked_select_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_masked_softmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_masked_softmin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_masked_std_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_masked_sum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_masked_var_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_matmul_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_matrix_exp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_max_binary_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_max_pool2d_with_indices_backward_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_max_reduction_no_dim_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_max_reduction_with_dim_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_maximum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_median_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_meshgrid_list_of_tensors_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_meshgrid_variadic_tensors_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_min_binary_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_min_reduction_no_dim_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_min_reduction_with_dim_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_minimum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_mm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_mode_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_movedim_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_msort_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_mul_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_multinomial_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_mv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_mvlgamma_mvlgamma_p_3_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nan_to_num_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nanmean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nanmedian_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nanquantile_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nansum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_narrow_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_narrow_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_native_batch_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_native_dropout_backward_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_native_layer_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_ne_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_neg_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_new_empty_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_new_empty_strided_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_new_full_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_new_ones_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_new_zeros_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nextafter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_adaptive_max_pool3d_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_alpha_dropout_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_avg_pool1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_avg_pool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_avg_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_batch_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_batch_norm_without_cudnn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_bilinear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_binary_cross_entropy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_celu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_channel_shuffle_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_conv1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_conv2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_conv3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_conv_transpose1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_conv_transpose2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_conv_transpose3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_cosine_embedding_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_cosine_similarity_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_cross_entropy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_ctc_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_dropout2d_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_dropout3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_dropout_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_elu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_embedding_bag_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_embedding_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_fractional_max_pool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_fractional_max_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_gaussian_nll_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_gelu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_glu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_grid_sample_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_group_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_hardshrink_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_hardsigmoid_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_hardswish_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_hardtanh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_hinge_embedding_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_huber_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_instance_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_interpolate_area_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_interpolate_bicubic_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_interpolate_bilinear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_interpolate_linear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_interpolate_nearest_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_interpolate_trilinear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_kl_div_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_l1_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_layer_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_leaky_relu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_linear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_local_response_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_logsigmoid_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_margin_ranking_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_max_pool1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_max_pool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_max_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_max_unpool1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_max_unpool1d_grad_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_max_unpool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_max_unpool2d_grad_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_max_unpool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_max_unpool3d_grad_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_mish_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_mse_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_multi_head_attention_forward_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_multi_margin_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_multilabel_margin_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_nll_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_normalize_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_one_hot_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_pad_circular_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_pad_constant_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_pad_reflect_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_pad_replicate_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_pad_replicate_negative_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_pairwise_distance_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_pdist_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_pixel_shuffle_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_pixel_unshuffle_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_poisson_nll_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_prelu_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_relu6_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_relu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_rms_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_rrelu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_selu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_silu_complex_cuda_complex64, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_silu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_smooth_l1_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_soft_margin_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_softmin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_softmin_with_dtype_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_softplus_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_softshrink_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_softsign_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_tanhshrink_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_threshold_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_triplet_margin_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_unfold_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_upsample_bilinear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_upsample_nearest_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_nonzero_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nonzero_static_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_norm_fro_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_norm_inf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_norm_nuc_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_normal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_normal_in_place_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_normal_number_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_ones_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_ones_like_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_ormqr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_outer_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_pca_lowrank_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_permute_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_permute_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_pinverse_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_polar_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_polygamma_polygamma_n_0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_polygamma_polygamma_n_1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_polygamma_polygamma_n_2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_polygamma_polygamma_n_3_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_polygamma_polygamma_n_4_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_positive_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_pow_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_prod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_put_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_qr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_quantile_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_rad2deg_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_rand_like_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_randint_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_randint_like_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_randn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_randn_like_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_ravel_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_real_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_reciprocal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_remainder_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_renorm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_repeat_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_repeat_interleave_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_reshape_as_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_reshape_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_resize__cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_resize_as__cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_resolve_conj_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_resolve_neg_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_roll_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_rot90_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_round_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_round_decimals_0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_round_decimals_3_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_round_decimals_neg_3_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_rsqrt_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_rsub_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_scalar_tensor_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_scatter_add_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_scatter_reduce_amax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_scatter_reduce_amin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_scatter_reduce_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_scatter_reduce_prod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_scatter_reduce_sum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_searchsorted_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_select_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_select_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_sgn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_short_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_sigmoid_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_sign_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_signal_windows_bartlett_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_signal_windows_blackman_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_signal_windows_cosine_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_signal_windows_exponential_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_signal_windows_gaussian_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_signal_windows_general_cosine_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_signal_windows_general_hamming_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_signal_windows_hamming_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_signal_windows_hann_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_signal_windows_kaiser_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_signal_windows_nuttall_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_signbit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_sin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_sinc_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_sinh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_slice_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_slice_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_softmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_softmax_with_dtype_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_sort_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_sparse_mm_reduce_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_sparse_sampled_addmm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_airy_ai_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_bessel_j0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_bessel_j1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_bessel_y0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_bessel_y1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_chebyshev_polynomial_v_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_entr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_erfcx_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_special_hermite_polynomial_h_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_hermite_polynomial_he_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_i0e_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_i1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_i1e_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_laguerre_polynomial_l_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_legendre_polynomial_p_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_log_ndtr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_modified_bessel_i0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_modified_bessel_i1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_modified_bessel_k0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_modified_bessel_k1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_ndtr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_ndtri_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_scaled_modified_bessel_k0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_scaled_modified_bessel_k1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_spherical_bessel_j0_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_special_xlog1py_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_zeta_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_split_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_split_list_args_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_split_with_sizes_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_split_with_sizes_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_sqrt_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_square_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_squeeze_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_squeeze_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_squeeze_multiple_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_stack_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_std_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_std_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_std_mean_unbiased_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_std_unbiased_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_stft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_sub_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_sum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_sum_to_size_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_svd_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_svd_lowrank_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_t_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_t_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_take_along_dim_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_take_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_tan_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_tanh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_tensor_split_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_tensordot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_tile_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_to_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_to_sparse_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_topk_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_torch__scaled_mm_cuda_float8_e4m3fn, test/test_ops.py::TestFakeTensorCUDA::test_fake_torch_ops_aten__efficient_attention_forward_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_torch_ops_aten__flash_attention_forward_cuda_float16, test/test_ops.py::TestFakeTensorCUDA::test_fake_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_trace_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_transpose_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_transpose_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_trapezoid_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_trapz_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_triangular_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_tril_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_tril_indices_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_triu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_triu_indices_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_true_divide_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_trunc_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_unbind_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_unbind_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_unflatten_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_unfold_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_unfold_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_uniform_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_unique_consecutive_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_unique_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_unravel_index_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_unsafe_chunk_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_unsafe_split_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_unsqueeze_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_unsqueeze_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_var_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_var_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_var_mean_unbiased_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_var_unbiased_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_vdot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_view_as_complex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_view_as_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_view_as_real_cuda_complex64, test/test_ops.py::TestFakeTensorCUDA::test_fake_view_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_view_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_vsplit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_vstack_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_where_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_xlogy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_zero__cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_zeros_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_zeros_like_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_H_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_T_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops___getitem___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops___radd___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops___rand___cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops___rdiv___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops___rmatmul___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops___rmod___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops___rmul___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops___ror___cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops___rpow___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops___rsub___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops___rxor___cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops__batch_norm_with_update_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops__chunk_cat_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops__native_batch_norm_legit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops__segment_reduce_lengths_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops__segment_reduce_offsets_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops__softmax_backward_data_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops__unsafe_masked_index_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops__unsafe_masked_index_put_accumulate_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops__upsample_bilinear2d_aa_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_abs_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_acos_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_acosh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_add_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_addbmm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_addcdiv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_addcmul_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_addmm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_addmm_decomposed_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_addmv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_addr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_alias_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_all_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_allclose_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_amax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_amin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_aminmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_angle_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_any_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_arange_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_argmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_argmin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_argsort_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_argwhere_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_as_strided_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_as_strided_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_as_strided_partial_views_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_as_strided_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_asin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_asinh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_atan2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_atan_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_atanh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_atleast_1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_atleast_2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_atleast_3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_baddbmm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_bernoulli_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_bfloat16_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_bincount_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_bitwise_and_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_bitwise_left_shift_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_bitwise_not_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_bitwise_or_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_bitwise_right_shift_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_bitwise_xor_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_block_diag_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_bmm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_bool_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_broadcast_shapes_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_broadcast_tensors_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_broadcast_to_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_bucketize_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_byte_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_cartesian_prod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_cat_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_cauchy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_cdist_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_cdouble_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_ceil_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_cfloat_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_chalf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_char_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_cholesky_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_cholesky_inverse_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_cholesky_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_chunk_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_clamp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_clamp_max_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_clamp_min_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_clone_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_column_stack_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_combinations_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_complex_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_conj_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_conj_physical_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_constant_pad_nd_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_contiguous_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_copysign_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_corrcoef_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_cos_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_cosh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_count_nonzero_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_cov_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_cross_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_cummax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_cummin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_cumprod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_cumsum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_cumulative_trapezoid_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_deg2rad_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_diag_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_diag_embed_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_diagflat_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_diagonal_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_diagonal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_diagonal_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_diff_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_digamma_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_dist_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_div_floor_rounding_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_div_no_rounding_mode_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_div_trunc_rounding_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_dot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_double_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_dsplit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_dstack_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_einsum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_empty_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_empty_like_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_empty_permuted_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_empty_strided_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_eq_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_equal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_erf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_erfc_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_erfinv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_exp2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_exp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_expand_as_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_expand_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_expand_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_expm1_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_exponential_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_eye_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fft_fft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fft_fft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fft_fftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fft_fftshift_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fft_hfft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fft_hfft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fft_hfftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fft_ifft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fft_ifft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fft_ifftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fft_ifftshift_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fft_ihfft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fft_ihfft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fft_ihfftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fft_irfft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fft_irfft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fft_irfftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fft_rfft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fft_rfft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fft_rfftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fill_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_flatten_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_flip_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fliplr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_flipud_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_float_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_float_power_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_floor_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_floor_divide_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fmin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fmod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_frac_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_frexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_full_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_full_like_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_gather_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_gcd_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_ge_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_geometric_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_geqrf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_gradient_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_grid_sampler_2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_grid_sampler_3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_gt_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_half_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_hash_tensor_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_heaviside_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_histc_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_hsplit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_hstack_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_hypot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_i0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_igamma_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_igammac_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_imag_cuda_complex64, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_index_add_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_index_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_index_fill_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_index_put_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_index_reduce_amax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_index_reduce_amin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_index_reduce_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_index_reduce_prod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_index_select_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_inner_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_int_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_isclose_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_isfinite_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_isin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_isinf_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_isnan_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_isneginf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_isposinf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_isreal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_istft_cuda_complex64, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_item_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_jiterator_2inputs_2outputs_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_jiterator_4inputs_with_extra_args_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_jiterator_binary_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_jiterator_binary_return_by_ref_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_jiterator_unary_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_kron_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_kthvalue_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_lcm_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_ldexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_le_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_lerp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_lgamma_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_cholesky_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_cholesky_ex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_cond_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_cross_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_det_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_diagonal_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_eig_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_eigh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_eigvals_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_eigvalsh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_householder_product_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_inv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_inv_ex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_ldl_factor_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_ldl_factor_ex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_ldl_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_lstsq_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_lstsq_grad_oriented_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_lu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_lu_factor_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_lu_factor_ex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_lu_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_matrix_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_matrix_power_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_matrix_rank_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_matrix_rank_hermitian_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_multi_dot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_norm_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_norm_subgradients_at_zero_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_pinv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_pinv_hermitian_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_pinv_singular_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_qr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_slogdet_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_solve_ex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_solve_triangular_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_svd_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_svdvals_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_tensorinv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_tensorsolve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_vander_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_vecdot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_vector_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linspace_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linspace_tensor_overload_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_log10_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_log1p_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_log2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_log_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_log_normal_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_log_softmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_logaddexp2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_logaddexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_logcumsumexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_logdet_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_logical_and_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_logical_not_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_logical_or_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_logical_xor_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_logit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_logspace_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_logspace_tensor_overload_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_logsumexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_long_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_lt_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_lu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_lu_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_lu_unpack_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_mH_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_mT_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_masked_amax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_masked_amin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_masked_argmax_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_masked_argmin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_masked_cumprod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_masked_cumsum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_masked_fill_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_masked_log_softmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_masked_logaddexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_masked_logsumexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_masked_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_masked_median_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_masked_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_masked_normalize_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_masked_prod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_masked_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_masked_select_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_masked_softmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_masked_softmin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_masked_std_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_masked_sum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_masked_var_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_matmul_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_matrix_exp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_max_binary_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_max_pool2d_with_indices_backward_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_max_reduction_no_dim_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_max_reduction_with_dim_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_maximum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_median_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_meshgrid_list_of_tensors_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_meshgrid_variadic_tensors_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_min_binary_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_min_reduction_no_dim_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_min_reduction_with_dim_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_minimum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_mm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_mode_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_movedim_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_msort_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_mul_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_multinomial_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_mv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nan_to_num_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nanmean_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nanmedian_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nanquantile_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nansum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_narrow_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_narrow_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_native_batch_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_native_dropout_backward_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_native_layer_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_ne_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_neg_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_new_empty_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_new_empty_strided_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_new_full_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_new_ones_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_new_zeros_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nextafter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_adaptive_max_pool3d_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_alpha_dropout_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_avg_pool1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_avg_pool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_avg_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_batch_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_batch_norm_without_cudnn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_bilinear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_binary_cross_entropy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_celu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_channel_shuffle_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_conv1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_conv2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_conv3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_conv_transpose1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_conv_transpose2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_conv_transpose3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_cosine_embedding_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_cosine_similarity_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_cross_entropy_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_ctc_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_dropout2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_dropout3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_dropout_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_elu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_embedding_bag_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_embedding_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_fractional_max_pool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_fractional_max_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_gaussian_nll_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_gelu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_glu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_grid_sample_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_group_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_hardshrink_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_hardsigmoid_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_hardswish_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_hardtanh_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_hinge_embedding_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_huber_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_instance_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_interpolate_area_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_interpolate_bicubic_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_interpolate_bilinear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_interpolate_linear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_interpolate_nearest_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_interpolate_trilinear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_kl_div_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_l1_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_layer_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_leaky_relu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_linear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_local_response_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_logsigmoid_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_margin_ranking_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_max_pool1d_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_max_pool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_max_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_max_unpool1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_max_unpool1d_grad_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_max_unpool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_max_unpool2d_grad_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_max_unpool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_max_unpool3d_grad_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_mish_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_mse_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_multi_head_attention_forward_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_multi_margin_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_multilabel_margin_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_nll_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_normalize_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_one_hot_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_pad_circular_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_pad_constant_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_pad_reflect_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_pad_replicate_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_pad_replicate_negative_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_pairwise_distance_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_pdist_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_pixel_shuffle_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_pixel_unshuffle_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_poisson_nll_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_prelu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_relu6_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_relu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_rms_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_rrelu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_selu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_silu_complex_cuda_complex64, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_silu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_smooth_l1_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_soft_margin_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_softmin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_softmin_with_dtype_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_softplus_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_softshrink_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_softsign_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_tanhshrink_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_threshold_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_triplet_margin_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_unfold_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_upsample_bilinear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_upsample_nearest_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nonzero_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nonzero_static_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_norm_fro_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_norm_inf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_norm_nuc_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_normal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_normal_in_place_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_normal_number_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_ones_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_ones_like_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_ormqr_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_outer_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_pca_lowrank_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_permute_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_permute_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_pinverse_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_polar_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_polygamma_polygamma_n_0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_polygamma_polygamma_n_1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_polygamma_polygamma_n_2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_polygamma_polygamma_n_3_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_polygamma_polygamma_n_4_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_positive_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_pow_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_prod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_put_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_qr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_quantile_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_rad2deg_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_rand_like_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_randint_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_randint_like_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_randn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_randn_like_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_ravel_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_real_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_reciprocal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_remainder_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_renorm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_repeat_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_repeat_interleave_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_reshape_as_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_reshape_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_resize__cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_resize_as__cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_resolve_conj_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_resolve_neg_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_roll_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_rot90_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_round_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_round_decimals_0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_round_decimals_3_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_round_decimals_neg_3_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_rsqrt_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_rsub_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_scalar_tensor_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_scatter_add_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_scatter_reduce_amax_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_scatter_reduce_amin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_scatter_reduce_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_scatter_reduce_prod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_scatter_reduce_sum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_searchsorted_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_select_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_select_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_sgn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_short_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_sigmoid_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_sign_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_signal_windows_bartlett_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_signal_windows_blackman_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_signal_windows_cosine_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_signal_windows_exponential_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_signal_windows_gaussian_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_signal_windows_general_cosine_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_signal_windows_general_hamming_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_signal_windows_hamming_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_signal_windows_hann_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_signal_windows_kaiser_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_signal_windows_nuttall_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_signbit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_sin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_sinc_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_sinh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_slice_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_slice_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_softmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_softmax_with_dtype_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_sort_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_sparse_mm_reduce_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_sparse_sampled_addmm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_airy_ai_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_bessel_j0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_bessel_j1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_bessel_y0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_bessel_y1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_chebyshev_polynomial_v_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_entr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_erfcx_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_hermite_polynomial_h_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_hermite_polynomial_he_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_i0e_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_i1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_i1e_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_laguerre_polynomial_l_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_legendre_polynomial_p_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_log_ndtr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_modified_bessel_i0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_modified_bessel_i1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_modified_bessel_k0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_modified_bessel_k1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_ndtr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_ndtri_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_scaled_modified_bessel_k0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_scaled_modified_bessel_k1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_shifted_chebyshev_polynomial_v_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_spherical_bessel_j0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_xlog1py_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_zeta_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_split_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_split_list_args_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_split_with_sizes_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_split_with_sizes_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_sqrt_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_square_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_squeeze_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_squeeze_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_squeeze_multiple_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_stack_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_std_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_std_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_std_mean_unbiased_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_std_unbiased_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_stft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_sub_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_sum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_sum_to_size_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_svd_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_svd_lowrank_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_t_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_t_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_take_along_dim_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_take_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_tan_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_tanh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_tensor_split_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_tensordot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_tile_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_to_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_to_sparse_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_topk_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_torch__scaled_mm_cuda_float8_e4m3fn, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_torch_ops_aten__efficient_attention_forward_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_torch_ops_aten__flash_attention_forward_cuda_float16, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_trace_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_transpose_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_transpose_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_trapezoid_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_trapz_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_triangular_solve_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_tril_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_tril_indices_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_triu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_triu_indices_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_true_divide_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_trunc_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_unbind_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_unbind_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_unflatten_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_unfold_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_unfold_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_uniform_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_unique_consecutive_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_unique_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_unravel_index_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_unsafe_chunk_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_unsafe_split_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_unsqueeze_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_unsqueeze_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_var_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_var_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_var_mean_unbiased_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_var_unbiased_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_vdot_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_view_as_complex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_view_as_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_view_as_real_cuda_complex64, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_view_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_view_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_vsplit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_vstack_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_where_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_xlogy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_zero__cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_zeros_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_zeros_like_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_arange_cuda_bfloat16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_arange_cuda_float16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_arange_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_arange_cuda_float64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_arange_cuda_int16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_arange_cuda_int32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_arange_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_arange_cuda_int8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_arange_cuda_uint8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_linspace_cuda_bfloat16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_linspace_cuda_complex128, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_linspace_cuda_complex64, 
test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_linspace_cuda_float16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_linspace_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_linspace_cuda_float64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_linspace_cuda_int16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_linspace_cuda_int32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_linspace_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_linspace_cuda_int8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_linspace_cuda_uint8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_linspace_tensor_overload_cuda_bfloat16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_linspace_tensor_overload_cuda_complex128, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_linspace_tensor_overload_cuda_complex64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_linspace_tensor_overload_cuda_float16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_linspace_tensor_overload_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_linspace_tensor_overload_cuda_float64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_linspace_tensor_overload_cuda_int16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_linspace_tensor_overload_cuda_int32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_linspace_tensor_overload_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_linspace_tensor_overload_cuda_int8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_linspace_tensor_overload_cuda_uint8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_logspace_cuda_bfloat16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_logspace_cuda_complex128, 
test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_logspace_cuda_complex64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_logspace_cuda_float16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_logspace_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_logspace_cuda_float64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_logspace_cuda_int16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_logspace_cuda_int32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_logspace_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_logspace_cuda_int8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_logspace_cuda_uint8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_logspace_tensor_overload_cuda_bfloat16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_logspace_tensor_overload_cuda_complex128, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_logspace_tensor_overload_cuda_complex64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_logspace_tensor_overload_cuda_float16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_logspace_tensor_overload_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_logspace_tensor_overload_cuda_float64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_logspace_tensor_overload_cuda_int16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_logspace_tensor_overload_cuda_int32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_logspace_tensor_overload_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_logspace_tensor_overload_cuda_int8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_logspace_tensor_overload_cuda_uint8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_ones_cuda_bfloat16, 
test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_ones_cuda_bool, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_ones_cuda_complex128, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_ones_cuda_complex32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_ones_cuda_complex64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_ones_cuda_float16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_ones_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_ones_cuda_float64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_ones_cuda_int16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_ones_cuda_int32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_ones_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_ones_cuda_int8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_ones_cuda_uint8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_zeros_cuda_bfloat16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_zeros_cuda_bool, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_zeros_cuda_complex128, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_zeros_cuda_complex32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_zeros_cuda_complex64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_zeros_cuda_float16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_zeros_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_zeros_cuda_float64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_zeros_cuda_int16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_zeros_cuda_int32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_zeros_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_zeros_cuda_int8, 
test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_zeros_cuda_uint8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_arange_cuda_bfloat16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_arange_cuda_float16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_arange_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_arange_cuda_float64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_arange_cuda_int16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_arange_cuda_int32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_arange_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_arange_cuda_int8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_arange_cuda_uint8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_full_cuda_bfloat16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_full_cuda_bool, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_full_cuda_complex128, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_full_cuda_complex32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_full_cuda_complex64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_full_cuda_float16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_full_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_full_cuda_float64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_full_cuda_int16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_full_cuda_int32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_full_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_full_cuda_int8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_full_cuda_uint8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_linspace_cuda_bfloat16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_linspace_cuda_complex128, 
test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_linspace_cuda_complex64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_linspace_cuda_float16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_linspace_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_linspace_cuda_float64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_linspace_cuda_int16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_linspace_cuda_int32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_linspace_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_linspace_cuda_int8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_linspace_cuda_uint8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_linspace_tensor_overload_cuda_bfloat16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_linspace_tensor_overload_cuda_complex128, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_linspace_tensor_overload_cuda_complex64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_linspace_tensor_overload_cuda_float16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_linspace_tensor_overload_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_linspace_tensor_overload_cuda_float64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_linspace_tensor_overload_cuda_int16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_linspace_tensor_overload_cuda_int32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_linspace_tensor_overload_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_linspace_tensor_overload_cuda_int8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_linspace_tensor_overload_cuda_uint8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_logspace_cuda_bfloat16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_logspace_cuda_complex128, 
test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_logspace_cuda_complex64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_logspace_cuda_float16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_logspace_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_logspace_cuda_float64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_logspace_cuda_int16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_logspace_cuda_int32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_logspace_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_logspace_cuda_int8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_logspace_cuda_uint8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_logspace_tensor_overload_cuda_bfloat16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_logspace_tensor_overload_cuda_complex128, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_logspace_tensor_overload_cuda_complex64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_logspace_tensor_overload_cuda_float16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_logspace_tensor_overload_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_logspace_tensor_overload_cuda_float64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_logspace_tensor_overload_cuda_int16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_logspace_tensor_overload_cuda_int32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_logspace_tensor_overload_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_logspace_tensor_overload_cuda_int8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_logspace_tensor_overload_cuda_uint8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_ones_cuda_bfloat16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_ones_cuda_bool, 
test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_ones_cuda_complex128, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_ones_cuda_complex32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_ones_cuda_complex64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_ones_cuda_float16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_ones_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_ones_cuda_float64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_ones_cuda_int16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_ones_cuda_int32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_ones_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_ones_cuda_int8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_ones_cuda_uint8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_zeros_cuda_bfloat16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_zeros_cuda_bool, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_zeros_cuda_complex128, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_zeros_cuda_complex32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_zeros_cuda_complex64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_zeros_cuda_float16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_zeros_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_zeros_cuda_float64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_zeros_cuda_int16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_zeros_cuda_int32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_zeros_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_zeros_cuda_int8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_zeros_cuda_uint8, test/test_ops.py::TestTagsCUDA::test_tags_H_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_T_cuda_float32, 
test/test_ops.py::TestTagsCUDA::test_tags___getitem___cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags___radd___cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags___rand___cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags___rdiv___cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags___rmatmul___cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags___rmod___cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags___rmul___cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags___ror___cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags___rpow___cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags___rsub___cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags___rxor___cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags__batch_norm_with_update_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__chunk_cat_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__native_batch_norm_legit_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_T_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs__conversions_bfloat16_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs__conversions_bool_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs__conversions_byte_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs__conversions_cdouble_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs__conversions_cfloat_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs__conversions_chalf_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs__conversions_char_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs__conversions_complex_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs__conversions_double_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs__conversions_float_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs__conversions_half_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs__conversions_int_cuda_float32, 
test/test_ops.py::TestTagsCUDA::test_tags__refs__conversions_long_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs__conversions_polar_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs__conversions_short_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_abs_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_acos_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_acosh_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_add_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_addcdiv_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_addcmul_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_addr_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_alias_copy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_all_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_allclose_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_amax_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_amin_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_any_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_arange_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_as_strided_copy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_as_strided_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_as_strided_partial_views_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_as_strided_scatter_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_asin_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_asinh_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_atan2_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_atan_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_atanh_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_atleast_1d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_atleast_2d_cuda_float32, 
test/test_ops.py::TestTagsCUDA::test_tags__refs_atleast_3d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_bitwise_and_cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags__refs_bitwise_left_shift_cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags__refs_bitwise_not_cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags__refs_bitwise_or_cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags__refs_bitwise_right_shift_cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags__refs_bitwise_xor_cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags__refs_block_diag_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_broadcast_shapes_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_broadcast_tensors_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_broadcast_to_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_bucketize_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_cat_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_cauchy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_ceil_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_chunk_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_clamp_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_clamp_max_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_clamp_min_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_clone_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_column_stack_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_conj_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_conj_physical_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_constant_pad_nd_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_contiguous_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_copysign_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_cos_cuda_float32, 
test/test_ops.py::TestTagsCUDA::test_tags__refs_cosh_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_count_nonzero_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_cumprod_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_cumsum_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_deg2rad_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_diag_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_diag_embed_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_diagonal_copy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_diagonal_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_diagonal_scatter_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_digamma_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_div_floor_rounding_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_div_no_rounding_mode_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_div_trunc_rounding_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_dot_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_dsplit_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_dstack_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_empty_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_empty_like_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_empty_strided_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_eq_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_equal_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_erf_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_erfc_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_erfinv_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_exp2_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_exp_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_expand_as_cuda_float32, 
test/test_ops.py::TestTagsCUDA::test_tags__refs_expand_copy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_expand_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_expm1_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_exponential_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_eye_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_fft_fft2_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_fft_fft_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_fft_fftn_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_fft_fftshift_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_fft_hfft2_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_fft_hfft_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_fft_hfftn_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_fft_ifft2_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_fft_ifft_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_fft_ifftn_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_fft_ifftshift_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_fft_ihfft2_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_fft_ihfft_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_fft_ihfftn_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_fft_irfft2_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_fft_irfft_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_fft_irfftn_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_fft_rfft2_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_fft_rfft_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_fft_rfftn_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_fill_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_flatten_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_flip_cuda_float32, 
test/test_ops.py::TestTagsCUDA::test_tags__refs_fliplr_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_flipud_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_float_power_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_floor_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_floor_divide_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_fmax_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_fmin_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_fmod_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_frac_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_frexp_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_gcd_cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags__refs_ge_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_geometric_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_gt_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_heaviside_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_hsplit_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_hstack_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_hypot_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_i0_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_igamma_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_igammac_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_imag_cuda_complex64, test/test_ops.py::TestTagsCUDA::test_tags__refs_index_add_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_index_copy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_index_fill_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_index_select_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_isclose_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_isfinite_cuda_float32, 
test/test_ops.py::TestTagsCUDA::test_tags__refs_isinf_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_isnan_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_isneginf_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_isposinf_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_isreal_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_istft_cuda_complex64, test/test_ops.py::TestTagsCUDA::test_tags__refs_item_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_lcm_cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags__refs_le_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_lerp_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_lgamma_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_linalg_cross_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_linalg_diagonal_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_linalg_matrix_norm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_linalg_norm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_linalg_svd_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_linalg_svdvals_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_linalg_vecdot_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_linalg_vector_norm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_linspace_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_linspace_tensor_overload_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_log10_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_log1p_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_log2_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_log_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_log_normal_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_log_softmax_with_dtype_cuda_float32, 
test/test_ops.py::TestTagsCUDA::test_tags__refs_logaddexp2_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_logaddexp_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_logical_and_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_logical_not_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_logical_or_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_logical_xor_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_logspace_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_logspace_tensor_overload_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_logsumexp_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_lt_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_masked_fill_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_maximum_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_mean_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_meshgrid_list_of_tensors_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_meshgrid_variadic_tensors_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_minimum_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_movedim_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_mul_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nan_to_num_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_narrow_copy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_narrow_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_native_layer_norm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_ne_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_neg_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_new_empty_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_new_empty_strided_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_new_full_cuda_float32, 
test/test_ops.py::TestTagsCUDA::test_tags__refs_new_ones_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_new_zeros_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nextafter_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_alpha_dropout_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_celu_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_channel_shuffle_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_dropout_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_elu_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_gelu_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_glu_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_group_norm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_hardshrink_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_hardtanh_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_hinge_embedding_loss_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_huber_loss_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_l1_loss_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_layer_norm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_leaky_relu_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_margin_ranking_loss_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_mish_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_mse_loss_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_nll_loss_cuda_float32, 
test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_pairwise_distance_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_pdist_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_pixel_shuffle_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_pixel_unshuffle_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_poisson_nll_loss_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_prelu_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_relu6_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_relu_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_selu_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_smooth_l1_loss_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_softmax_with_dtype_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_softmin_with_dtype_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_softplus_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_softshrink_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_tanhshrink_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_threshold_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_triplet_margin_loss_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_norm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_normal__in_place_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_normal_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_normal_number_mean_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_ones_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_permute_copy_cuda_float32, 
test/test_ops.py::TestTagsCUDA::test_tags__refs_permute_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_positive_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_pow_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_prod_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_rad2deg_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_randn_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_ravel_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_real_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_reciprocal_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_remainder_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_renorm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_repeat_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_reshape_as_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_reshape_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_roll_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_rot90_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_round_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_rsqrt_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_rsub_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_select_scatter_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_sgn_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_sigmoid_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_sign_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_signbit_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_sin_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_sinc_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_sinh_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_softmax_with_dtype_cuda_float32, 
test/test_ops.py::TestTagsCUDA::test_tags__refs_special_bessel_j0_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_special_bessel_j1_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_special_entr_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_special_erfcx_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_special_i0e_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_special_i1_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_special_i1e_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_special_log_ndtr_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_special_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_special_logit_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_special_multigammaln_mvlgamma_p_1_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_special_multigammaln_mvlgamma_p_3_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_special_multigammaln_mvlgamma_p_5_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_special_ndtr_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_special_ndtri_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_special_softmax_with_dtype_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_special_spherical_bessel_j0_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_special_xlog1py_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_special_zeta_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_split_with_sizes_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_sqrt_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_square_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_squeeze_copy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_squeeze_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_squeeze_multiple_cuda_float32, 
test/test_ops.py::TestTagsCUDA::test_tags__refs_stack_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_std_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_std_mean_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_stft_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_sub_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_sum_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_sum_to_size_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_t_copy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_t_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_take_along_dim_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_tan_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_tanh_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_tensor_split_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_to_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_trace_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_transpose_copy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_transpose_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_tril_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_tril_indices_cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags__refs_triu_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_triu_indices_cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags__refs_true_divide_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_trunc_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_unbind_copy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_unbind_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_unflatten_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_unfold_copy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_unfold_cuda_float32, 
test/test_ops.py::TestTagsCUDA::test_tags__refs_unsqueeze_copy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_unsqueeze_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_var_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_var_mean_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_vdot_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_view_as_complex_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_view_as_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_view_copy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_view_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_vsplit_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_vstack_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_where_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_xlogy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_zeros_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__segment_reduce_lengths_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__segment_reduce_offsets_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__softmax_backward_data_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__unsafe_masked_index_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__unsafe_masked_index_put_accumulate_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__upsample_bilinear2d_aa_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_abs_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_acos_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_acosh_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_add_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_addbmm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_addcdiv_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_addcmul_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_addmm_cuda_float32, 
test/test_ops.py::TestTagsCUDA::test_tags_addmm_decomposed_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_addmv_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_addr_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_alias_copy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_all_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_allclose_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_amax_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_amin_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_aminmax_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_angle_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_any_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_arange_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_argmax_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_argmin_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_argsort_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_argwhere_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_as_strided_copy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_as_strided_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_as_strided_partial_views_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_as_strided_scatter_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_asin_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_asinh_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_atan2_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_atan_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_atanh_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_atleast_1d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_atleast_2d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_atleast_3d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_baddbmm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_bernoulli_cuda_float32, 
test/test_ops.py::TestTagsCUDA::test_tags_bfloat16_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_bincount_cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags_bitwise_and_cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags_bitwise_left_shift_cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags_bitwise_not_cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags_bitwise_or_cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags_bitwise_right_shift_cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags_bitwise_xor_cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags_block_diag_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_bmm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_bool_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_broadcast_shapes_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_broadcast_tensors_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_broadcast_to_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_bucketize_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_byte_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_cartesian_prod_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_cat_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_cauchy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_cdist_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_cdouble_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_ceil_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_cfloat_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_chalf_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_char_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_cholesky_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_cholesky_inverse_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_cholesky_solve_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_chunk_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_clamp_cuda_float32, 
test/test_ops.py::TestTagsCUDA::test_tags_clamp_max_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_clamp_min_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_clone_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_column_stack_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_combinations_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_complex_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_conj_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_conj_physical_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_constant_pad_nd_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_contiguous_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_copysign_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_corrcoef_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_cos_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_cosh_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_count_nonzero_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_cov_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_cross_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_cummax_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_cummin_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_cumprod_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_cumsum_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_cumulative_trapezoid_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_deg2rad_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_diag_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_diag_embed_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_diagflat_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_diagonal_copy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_diagonal_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_diagonal_scatter_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_diff_cuda_float32, 
test/test_ops.py::TestTagsCUDA::test_tags_digamma_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_dist_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_div_floor_rounding_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_div_no_rounding_mode_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_div_trunc_rounding_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_dot_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_double_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_dsplit_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_dstack_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_einsum_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_empty_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_empty_like_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_empty_permuted_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_empty_strided_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_eq_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_equal_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_erf_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_erfc_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_erfinv_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_exp2_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_exp_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_expand_as_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_expand_copy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_expand_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_expm1_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_exponential_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_eye_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fft_fft2_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fft_fft_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fft_fftn_cuda_float32, 
test/test_ops.py::TestTagsCUDA::test_tags_fft_fftshift_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fft_hfft2_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fft_hfft_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fft_hfftn_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fft_ifft2_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fft_ifft_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fft_ifftn_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fft_ifftshift_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fft_ihfft2_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fft_ihfft_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fft_ihfftn_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fft_irfft2_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fft_irfft_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fft_irfftn_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fft_rfft2_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fft_rfft_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fft_rfftn_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fill_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_flatten_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_flip_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fliplr_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_flipud_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_float_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_float_power_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_floor_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_floor_divide_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fmax_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fmin_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fmod_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_frac_cuda_float32, 
test/test_ops.py::TestTagsCUDA::test_tags_frexp_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_full_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_full_like_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_gather_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_gcd_cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags_ge_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_geometric_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_geqrf_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_gradient_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_grid_sampler_2d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_grid_sampler_3d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_gt_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_half_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_hash_tensor_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_heaviside_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_histc_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_hsplit_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_hstack_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_hypot_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_i0_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_igamma_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_igammac_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_imag_cuda_complex64, test/test_ops.py::TestTagsCUDA::test_tags_index_add_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_index_copy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_index_fill_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_index_put_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_index_reduce_amax_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_index_reduce_amin_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_index_reduce_mean_cuda_float32, 
test/test_ops.py::TestTagsCUDA::test_tags_index_reduce_prod_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_index_select_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_inner_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_int_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_isclose_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_isfinite_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_isin_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_isinf_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_isnan_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_isneginf_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_isposinf_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_isreal_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_istft_cuda_complex64, test/test_ops.py::TestTagsCUDA::test_tags_item_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_jiterator_2inputs_2outputs_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_jiterator_4inputs_with_extra_args_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_jiterator_binary_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_jiterator_binary_return_by_ref_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_jiterator_unary_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_kron_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_kthvalue_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_lcm_cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags_ldexp_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_le_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_lerp_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_lgamma_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_cholesky_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_cholesky_ex_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_cond_cuda_float32, 
test/test_ops.py::TestTagsCUDA::test_tags_linalg_cross_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_det_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_diagonal_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_eig_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_eigh_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_eigvals_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_eigvalsh_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_householder_product_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_inv_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_inv_ex_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_ldl_factor_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_ldl_factor_ex_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_ldl_solve_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_lstsq_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_lstsq_grad_oriented_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_lu_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_lu_factor_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_lu_factor_ex_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_lu_solve_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_matrix_norm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_matrix_power_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_matrix_rank_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_matrix_rank_hermitian_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_multi_dot_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_norm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_norm_subgradients_at_zero_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_pinv_cuda_float32, 
test/test_ops.py::TestTagsCUDA::test_tags_linalg_pinv_hermitian_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_pinv_singular_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_qr_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_slogdet_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_solve_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_solve_ex_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_solve_triangular_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_svd_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_svdvals_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_tensorinv_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_tensorsolve_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_vander_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_vecdot_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_vector_norm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linspace_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linspace_tensor_overload_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_log10_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_log1p_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_log2_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_log_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_log_normal_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_log_softmax_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_logaddexp2_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_logaddexp_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_logcumsumexp_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_logdet_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_logical_and_cuda_float32, 
test/test_ops.py::TestTagsCUDA::test_tags_logical_not_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_logical_or_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_logical_xor_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_logit_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_logspace_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_logspace_tensor_overload_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_logsumexp_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_long_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_lt_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_lu_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_lu_solve_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_lu_unpack_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_mH_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_mT_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_masked_amax_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_masked_amin_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_masked_argmax_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_masked_argmin_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_masked_cumprod_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_masked_cumsum_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_masked_fill_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_masked_log_softmax_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_masked_logaddexp_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_masked_logsumexp_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_masked_mean_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_masked_median_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_masked_norm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_masked_normalize_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_masked_prod_cuda_float32, 
test/test_ops.py::TestTagsCUDA::test_tags_masked_scatter_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_masked_select_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_masked_softmax_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_masked_softmin_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_masked_std_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_masked_sum_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_masked_var_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_matmul_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_matrix_exp_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_max_binary_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_max_pool2d_with_indices_backward_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_max_reduction_no_dim_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_max_reduction_with_dim_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_maximum_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_mean_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_median_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_meshgrid_list_of_tensors_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_meshgrid_variadic_tensors_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_min_binary_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_min_reduction_no_dim_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_min_reduction_with_dim_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_minimum_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_mm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_mode_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_movedim_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_msort_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_mul_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_multinomial_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_mv_cuda_float32, 
test/test_ops.py::TestTagsCUDA::test_tags_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nan_to_num_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nanmean_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nanmedian_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nanquantile_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nansum_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_narrow_copy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_narrow_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_native_batch_norm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_native_dropout_backward_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_native_layer_norm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_ne_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_neg_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_new_empty_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_new_empty_strided_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_new_full_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_new_ones_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_new_zeros_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nextafter_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_adaptive_max_pool3d_cuda_float32, 
test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_alpha_dropout_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_avg_pool1d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_avg_pool2d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_avg_pool3d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_batch_norm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_batch_norm_without_cudnn_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_bilinear_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_binary_cross_entropy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_celu_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_channel_shuffle_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_conv1d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_conv2d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_conv3d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_conv_transpose1d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_conv_transpose2d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_conv_transpose3d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_cosine_embedding_loss_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_cosine_similarity_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_cross_entropy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_ctc_loss_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_dropout2d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_dropout3d_cuda_float32, 
test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_dropout_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_elu_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_embedding_bag_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_embedding_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_fractional_max_pool2d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_fractional_max_pool3d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_gaussian_nll_loss_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_gelu_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_glu_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_grid_sample_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_group_norm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_hardshrink_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_hardsigmoid_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_hardswish_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_hardtanh_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_hinge_embedding_loss_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_huber_loss_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_instance_norm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_interpolate_area_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_interpolate_bicubic_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_interpolate_bilinear_cuda_float32, 
test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_interpolate_linear_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_interpolate_nearest_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_interpolate_trilinear_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_kl_div_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_l1_loss_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_layer_norm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_leaky_relu_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_linear_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_local_response_norm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_logsigmoid_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_margin_ranking_loss_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_max_pool1d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_max_pool2d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_max_pool3d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_max_unpool1d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_max_unpool1d_grad_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_max_unpool2d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_max_unpool2d_grad_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_max_unpool3d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_max_unpool3d_grad_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_mish_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_mse_loss_cuda_float32, 
test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_multi_head_attention_forward_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_multi_margin_loss_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_multilabel_margin_loss_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_nll_loss_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_normalize_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_one_hot_cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_pad_circular_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_pad_constant_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_pad_reflect_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_pad_replicate_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_pad_replicate_negative_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_pairwise_distance_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_pdist_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_pixel_shuffle_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_pixel_unshuffle_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_poisson_nll_loss_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_prelu_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_relu6_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_relu_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_rms_norm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_rrelu_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_scaled_dot_product_attention_cuda_float32, 
test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_selu_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_silu_complex_cuda_complex64, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_silu_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_smooth_l1_loss_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_soft_margin_loss_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_softmin_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_softmin_with_dtype_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_softplus_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_softshrink_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_softsign_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_tanhshrink_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_threshold_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_triplet_margin_loss_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_unfold_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_upsample_bilinear_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_upsample_nearest_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nonzero_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nonzero_static_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_norm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_norm_fro_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_norm_inf_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_norm_nuc_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_normal_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_normal_in_place_cuda_float32, 
test/test_ops.py::TestTagsCUDA::test_tags_normal_number_mean_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_ones_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_ones_like_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_ormqr_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_outer_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_pca_lowrank_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_permute_copy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_permute_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_pinverse_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_polar_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_polygamma_polygamma_n_0_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_polygamma_polygamma_n_1_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_polygamma_polygamma_n_2_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_polygamma_polygamma_n_3_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_polygamma_polygamma_n_4_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_positive_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_pow_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_prod_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_put_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_qr_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_quantile_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_rad2deg_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_rand_like_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_randint_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_randint_like_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_randn_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_randn_like_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_ravel_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_real_cuda_float32, 
test/test_ops.py::TestTagsCUDA::test_tags_reciprocal_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_remainder_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_renorm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_repeat_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_repeat_interleave_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_reshape_as_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_reshape_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_resize__cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_resize_as__cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_resolve_conj_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_resolve_neg_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_roll_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_rot90_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_round_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_round_decimals_0_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_round_decimals_3_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_round_decimals_neg_3_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_rsqrt_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_rsub_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_scalar_tensor_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_scatter_add_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_scatter_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_scatter_reduce_amax_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_scatter_reduce_amin_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_scatter_reduce_mean_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_scatter_reduce_prod_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_scatter_reduce_sum_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_searchsorted_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_select_cuda_float32, 
test/test_ops.py::TestTagsCUDA::test_tags_select_scatter_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_sgn_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_short_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_sigmoid_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_sign_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_signal_windows_bartlett_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_signal_windows_blackman_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_signal_windows_cosine_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_signal_windows_exponential_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_signal_windows_gaussian_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_signal_windows_general_cosine_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_signal_windows_general_hamming_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_signal_windows_hamming_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_signal_windows_hann_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_signal_windows_kaiser_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_signal_windows_nuttall_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_signbit_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_sin_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_sinc_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_sinh_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_slice_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_slice_scatter_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_softmax_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_softmax_with_dtype_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_sort_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_sparse_mm_reduce_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_sparse_sampled_addmm_cuda_float32, 
test/test_ops.py::TestTagsCUDA::test_tags_special_airy_ai_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_bessel_j0_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_bessel_j1_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_bessel_y0_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_bessel_y1_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_chebyshev_polynomial_v_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_entr_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_erfcx_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_hermite_polynomial_h_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_hermite_polynomial_he_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_i0e_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_i1_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_i1e_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_laguerre_polynomial_l_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_legendre_polynomial_p_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_log_ndtr_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_modified_bessel_i0_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_modified_bessel_i1_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_modified_bessel_k0_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_modified_bessel_k1_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_ndtr_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_ndtri_cuda_float32, 
test/test_ops.py::TestTagsCUDA::test_tags_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_scaled_modified_bessel_k0_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_scaled_modified_bessel_k1_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_spherical_bessel_j0_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_xlog1py_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_zeta_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_split_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_split_list_args_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_split_with_sizes_copy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_split_with_sizes_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_sqrt_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_square_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_squeeze_copy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_squeeze_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_squeeze_multiple_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_stack_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_std_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_std_mean_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_std_mean_unbiased_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_std_unbiased_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_stft_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_sub_cuda_float32, 
test/test_ops.py::TestTagsCUDA::test_tags_sum_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_sum_to_size_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_svd_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_svd_lowrank_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_t_copy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_t_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_take_along_dim_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_take_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_tan_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_tanh_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_tensor_split_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_tensordot_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_tile_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_to_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_to_sparse_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_topk_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_torch__scaled_mm_cuda_float8_e4m3fn, test/test_ops.py::TestTagsCUDA::test_tags_torch_ops_aten__efficient_attention_forward_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_torch_ops_aten__flash_attention_forward_cuda_float16, test/test_ops.py::TestTagsCUDA::test_tags_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_trace_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_transpose_copy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_transpose_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_trapezoid_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_trapz_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_triangular_solve_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_tril_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_tril_indices_cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags_triu_cuda_float32, 
test/test_ops.py::TestTagsCUDA::test_tags_triu_indices_cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags_true_divide_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_trunc_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_unbind_copy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_unbind_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_unflatten_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_unfold_copy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_unfold_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_uniform_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_unique_consecutive_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_unique_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_unravel_index_cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags_unsafe_chunk_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_unsafe_split_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_unsqueeze_copy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_unsqueeze_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_var_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_var_mean_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_var_mean_unbiased_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_var_unbiased_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_vdot_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_view_as_complex_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_view_as_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_view_as_real_cuda_complex64, test/test_ops.py::TestTagsCUDA::test_tags_view_copy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_view_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_vsplit_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_vstack_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_where_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_xlogy_cuda_float32, 
test/test_ops.py::TestTagsCUDA::test_tags_zero__cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_zeros_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_zeros_like_cuda_float32 2025-10-10T01:24:30.9210542Z 2025-10-10T01:24:32.8731973Z Running inductor/test_torchinductor_opinfo 2/11 ... [2025-10-10 01:24:32.872606] 2025-10-10T01:24:32.8732782Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:24:32.8735102Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_opinfo.py', '-m', 'not serial', '--shard-id=2', '--num-shards=11', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:24:32.873092] 2025-10-10T01:25:53.3009121Z 2025-10-10T01:25:53.3009939Z inductor/test_aot_inductor 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_aot_inductor_1.1_3445a10b048e315b_.log 2025-10-10T01:25:53.3381873Z Running 882 items in this shard: test/inductor/test_aot_inductor.py::AOTInductorLoggingTest::test_shape_env_reuse, test/inductor/test_aot_inductor.py::AOTInductorLoggingTest::test_shape_env_reuse_zero_consts_use_consts_asm_false, test/inductor/test_aot_inductor.py::TestAOTInductorConfig::test_compile_standalone_explicit_set, test/inductor/test_aot_inductor.py::TestAOTInductorConfig::test_compile_standalone_package_cpp_false_raises, test/inductor/test_aot_inductor.py::TestAOTInductorConfig::test_compile_standalone_sets_package_cpp, test/inductor/test_aot_inductor.py::TestAOTInductorConfig::test_no_compile_standalone, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test__int_mm_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test__weight_int4pack_mm_m_32_n_64_q_group_32_num_groups_1_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test__weight_int4pack_mm_m_32_n_64_q_group_32_num_groups_2_cpu, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test__weight_int4pack_mm_m_32_n_64_q_group_64_num_groups_1_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test__weight_int4pack_mm_m_32_n_64_q_group_64_num_groups_2_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test__weight_int4pack_mm_with_scales_and_zeros_m_32_n_64_q_group_32_num_groups_1_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test__weight_int4pack_mm_with_scales_and_zeros_m_32_n_64_q_group_32_num_groups_2_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test__weight_int4pack_mm_with_scales_and_zeros_m_32_n_64_q_group_64_num_groups_1_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test__weight_int4pack_mm_with_scales_and_zeros_m_32_n_64_q_group_64_num_groups_2_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_add_complex_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_addmm_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_addmm_multiple_dynamic_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_aliased_buffer_reuse_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_amp_fallback_random_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_aot_inductor_consts_cpp_build_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_aoti_constant_tensor_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_aoti_constant_tensor_name_collision_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_aoti_debug_printer_codegen_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_aoti_debug_printer_cpp_kernel_cpu, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_aoti_debug_printer_fp8_dtype_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_aoti_debug_printer_sym_inputs_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_aoti_debug_printer_user_defined_triton_kernel_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_aoti_debug_printing_model_inputs_codegen_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_aoti_profiler_enable_kernel_profile_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_aoti_profiler_enable_kernel_profile_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_aoti_runtime_asserts_backed_symint_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_aoti_runtime_asserts_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_aoti_user_defined_triton_kernel_profiling_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_assert_async_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_assert_tensor_meta_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_autotune_int64_user_defined_triton_kernel_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_autotune_with_constant_folding_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_autotuning_args_reuse_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_backward_no_op_logging_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_bmm_multiple_dynamic_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_bool_input_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_boolean_indexing_cpu, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_buffer_mutation_1_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_buffer_mutation_2_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_buffer_mutation_3_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_buffer_mutation_4_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_buffer_mutation_and_force_mmap_weights_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_buffer_reuse_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_clamp_decomposition_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_composed_dynamic_size_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_cond_mismatched_branch_output_dynamic_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_cond_mismatched_branch_output_dynamic_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_cond_nested_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_cond_non_tensor_predicates_dynamic_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_cond_non_tensor_predicates_dynamic_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_cond_share_predicte_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_cond_simple_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_cond_symint_input_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_cond_unbacked_symint_closure_dynamic_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_cond_unbacked_symint_closure_dynamic_True_cpu, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_cond_use_buffers_from_outer_scope_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_cond_with_multiple_outputs_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_cond_with_outer_code_before_after_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_cond_with_parameters_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_cond_with_reinterpret_view_inputs_outputs_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_consecutive_compiles_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_constant_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_constant_folding_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_constant_folding_with_update_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_constant_original_fqn_and_dtype_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_constant_type_propagation_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_conv3d_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_conv_freezing_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_convolution_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_copy_non_blocking_is_pinned_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_custom_op_in_subgraph_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_d2h_copy_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_deconv_freezing_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_dup_unbacked_sym_decl_cpu, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_dup_unbacked_sym_decl_with_refinement_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_duplicate_constant_folding_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_duplicated_params_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_dynamic_cat_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_dynamic_scalar_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_dynamic_smem_above_default_limit_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_embedding_bag_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_empty_cat_dtype_promotion_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_empty_constant_folding_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_empty_graph_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_extract_constants_map_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_fake_tensor_device_validation_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_fallback_kernel_with_symexpr_output_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_fallback_mem_leak_fix_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_fft_c2c_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_fill__fallback_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_foreach_multiple_dynamic_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_fp8_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_fp8_view_of_param_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_fqn_cpu, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_free_inactive_buffer_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_freezing_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_fx_gm_return_tuple_validation_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_index_put_fallback_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_index_put_with_none_index_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_inf_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_input_codegen_with_sympy_expr_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_int_list_input_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_issue_140766_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_large_dynamic_dim_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_large_grid_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_large_mmaped_weights_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_large_weight_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_libtorch_free_so_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_linear_dynamic_maxautotune_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_linear_freezing_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_load_package_multiple_gpus_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_masked_select_dynamic_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_misaligned_input_1_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_misaligned_input_2_cpu, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_misc_1_max_autotune_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_misc_1_max_autotune_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_missing_cubin_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_missing_output_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_model_modified_weights_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_multi_device_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_multiple_output_alias_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_nan_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_narrow_fallback_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_nested_tensor_from_jagged_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_no_args_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_non_contiguous_output_alias_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_non_default_gpu_device_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_non_tensor_input_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_none_args_aot_codegen_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_normal_functional_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_on_gpu_device1_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_output_misaligned_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_output_path_1_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_output_path_2_cpu, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_pad_fallback_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_pad_non_zero_memory_leak_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_poi_multiple_dynamic_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_profile_benchmark_harness_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_proxy_executor_abs_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_proxy_executor_hann_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_proxy_executor_permute_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_proxy_executor_squeeze_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_pytree_inputs_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_quanatized_int8_linear_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_quantized_linear_bias_none_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_quantized_linear_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_repeat_interleave_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_repeat_output_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_repeated_calling_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_repeated_user_defined_triton_kernel_embed_kernel_binary_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_repeated_user_defined_triton_kernel_embed_kernel_binary_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_replace_unbacked_symbol_with_backed_expr_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_replicate_on_devices_cpu, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_return_constant_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_return_view_constant_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_reuse_kernel_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_reuse_kernel_dynamic_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_rocm_triton_autotuning_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_run_with_grad_enabled_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_runtime_checks_complex_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_runtime_checks_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_runtime_checks_device_type_failed_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_runtime_checks_dtype_failed_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_runtime_checks_fp8_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_runtime_checks_large_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_runtime_checks_shape_failed_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_same_backing_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_scaled_dot_product_efficient_attention_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_scaled_grouped_mm_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_scatter_fallback_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_scatter_reduce_fallback_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_sdpa_2_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_sdpa_cpu, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_seq_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_shifted_constraint_ranges_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_simple_dynamic_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_simple_embed_kernel_binary_False_max_autotune_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_simple_embed_kernel_binary_False_max_autotune_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_simple_embed_kernel_binary_True_max_autotune_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_simple_embed_kernel_binary_True_max_autotune_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_simple_multi_arch_embed_kernel_binary_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_simple_split_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_size_from_multi_output_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_size_with_unbacked_add_and_mul_expr_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_size_with_unbacked_add_expr_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_size_with_unbacked_add_expr_transitive_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_small_constant_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_so_without_weight_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_stft_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_stride_with_unbacked_expr_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_subclasses_cpu, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_sym_expr_indexing_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_sym_i64_input_codegen_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_symbool_item_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_symfloat_item_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_symint_item_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_sympy_cpp_printer_min_max_minmax0_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_sympy_cpp_printer_min_max_minmax1_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_torchvision_transforms_functional_tensor_resize_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_autotuning_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_dynamic_launcher_grid_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_dynamic_launcher_grid_infer_from_tensor_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_bool_param_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_dynamic_grid_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_dynamic_shape_with_div_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_equal_to_1_arg_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_equal_to_1_float_arg_dynamic_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_equal_to_1_float_arg_dynamic_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_extern_kernel_arg_cpu, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_1_num_dims_1_dynamic_False_autotune_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_1_num_dims_1_dynamic_False_autotune_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_1_num_dims_1_dynamic_True_autotune_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_1_num_dims_1_dynamic_True_autotune_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_1_num_dims_2_dynamic_False_autotune_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_1_num_dims_2_dynamic_False_autotune_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_1_num_dims_2_dynamic_True_autotune_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_1_num_dims_2_dynamic_True_autotune_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_2_num_dims_1_dynamic_False_autotune_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_2_num_dims_1_dynamic_False_autotune_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_2_num_dims_1_dynamic_True_autotune_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_2_num_dims_1_dynamic_True_autotune_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_2_num_dims_2_dynamic_False_autotune_False_cpu, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_2_num_dims_2_dynamic_False_autotune_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_2_num_dims_2_dynamic_True_autotune_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_2_num_dims_2_dynamic_True_autotune_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_3_num_dims_1_dynamic_False_autotune_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_3_num_dims_1_dynamic_False_autotune_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_3_num_dims_1_dynamic_True_autotune_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_3_num_dims_1_dynamic_True_autotune_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_3_num_dims_2_dynamic_False_autotune_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_3_num_dims_2_dynamic_False_autotune_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_3_num_dims_2_dynamic_True_autotune_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_3_num_dims_2_dynamic_True_autotune_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_multi_output_arg_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_on_device_tma_dynamic_False_tma_version_new_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_on_device_tma_dynamic_False_tma_version_old_cpu, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_on_device_tma_dynamic_True_tma_version_new_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_on_device_tma_dynamic_True_tma_version_old_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_reinterpret_view_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_reinterpret_view_mem_leak_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_sympy_expr_arg_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_sympy_fn_like_arg_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_tma_descriptor_1d_dynamic_False_tma_version_new_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_tma_descriptor_1d_dynamic_False_tma_version_old_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_tma_descriptor_1d_dynamic_True_tma_version_new_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_tma_descriptor_1d_dynamic_True_tma_version_old_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_tma_descriptor_2d_dynamic_False_tma_version_new_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_tma_descriptor_2d_dynamic_False_tma_version_old_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_tma_descriptor_2d_dynamic_True_tma_version_new_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_tma_descriptor_2d_dynamic_True_tma_version_old_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_unbacked_symint_in_grid_dynamic_False_autotuning_False_cpu, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_unbacked_symint_in_grid_dynamic_False_autotuning_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_unbacked_symint_in_grid_dynamic_True_autotuning_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_unbacked_symint_in_grid_dynamic_True_autotuning_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_weird_param_order_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_with_none_input_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_with_none_inputs_and_equal_to_1_arg_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_mutated_autotuning_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_next_power_of_2_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_unbacked_equals_input_size_runtime_assertion_mark_unbacked_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_unbacked_equals_input_size_runtime_assertion_mark_unbacked_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_unbounded_expr_substitutions_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_update_constant_buffer_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_update_constant_buffer_simple_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_update_inactive_constant_buffer_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_update_user_managed_buffer_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_upper_bound_i64_cpu, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_using_model_name_for_files_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_view_outputs_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_weight_on_disk_legacy_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_while_loop_nested_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_while_loop_simple_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_while_loop_with_conv_dynamic_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_while_loop_with_conv_dynamic_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_while_loop_with_mixed_device_dynamic_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_while_loop_with_mixed_device_dynamic_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_while_loop_with_outer_buffers_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_while_loop_with_outer_code_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_while_loop_with_parameters_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_while_loop_with_pytree_inputs_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_while_loop_with_sym_expr_cond_dynamic_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_while_loop_with_sym_expr_cond_dynamic_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_while_loop_with_unbacked_symint_closure_dynamic_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_while_loop_with_unbacked_symint_closure_dynamic_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_with_cudagraphs_cpu, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_with_no_triton_profiler_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_with_offset_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_with_profiler_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_zero_grid_with_backed_symbols_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_zero_grid_with_unbacked_symbols_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_zero_size_buffer_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_zero_size_weight_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test__int_mm_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test__weight_int4pack_mm_m_32_n_64_q_group_32_num_groups_1_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test__weight_int4pack_mm_m_32_n_64_q_group_32_num_groups_2_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test__weight_int4pack_mm_m_32_n_64_q_group_64_num_groups_1_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test__weight_int4pack_mm_m_32_n_64_q_group_64_num_groups_2_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test__weight_int4pack_mm_with_scales_and_zeros_m_32_n_64_q_group_32_num_groups_1_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test__weight_int4pack_mm_with_scales_and_zeros_m_32_n_64_q_group_32_num_groups_2_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test__weight_int4pack_mm_with_scales_and_zeros_m_32_n_64_q_group_64_num_groups_1_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test__weight_int4pack_mm_with_scales_and_zeros_m_32_n_64_q_group_64_num_groups_2_cuda, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_add_complex_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_addmm_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_addmm_multiple_dynamic_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_aliased_buffer_reuse_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_amp_fallback_random_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_aot_inductor_consts_cpp_build_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_aoti_constant_tensor_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_aoti_constant_tensor_name_collision_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_aoti_debug_printer_codegen_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_aoti_debug_printer_cpp_kernel_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_aoti_debug_printer_fp8_dtype_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_aoti_debug_printer_sym_inputs_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_aoti_debug_printer_user_defined_triton_kernel_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_aoti_debug_printing_model_inputs_codegen_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_aoti_profiler_enable_kernel_profile_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_aoti_profiler_enable_kernel_profile_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_aoti_runtime_asserts_backed_symint_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_aoti_runtime_asserts_cuda, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_aoti_user_defined_triton_kernel_profiling_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_assert_async_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_assert_tensor_meta_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_autotune_int64_user_defined_triton_kernel_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_autotune_with_constant_folding_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_autotuning_args_reuse_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_backward_no_op_logging_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_bmm_multiple_dynamic_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_bool_input_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_boolean_indexing_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_buffer_mutation_1_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_buffer_mutation_2_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_buffer_mutation_3_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_buffer_mutation_4_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_buffer_mutation_and_force_mmap_weights_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_buffer_reuse_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_clamp_decomposition_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_composed_dynamic_size_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_cond_mismatched_branch_output_dynamic_False_cuda, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_cond_mismatched_branch_output_dynamic_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_cond_nested_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_cond_non_tensor_predicates_dynamic_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_cond_non_tensor_predicates_dynamic_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_cond_share_predicte_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_cond_simple_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_cond_symint_input_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_cond_unbacked_symint_closure_dynamic_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_cond_unbacked_symint_closure_dynamic_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_cond_use_buffers_from_outer_scope_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_cond_with_multiple_outputs_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_cond_with_outer_code_before_after_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_cond_with_parameters_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_cond_with_reinterpret_view_inputs_outputs_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_consecutive_compiles_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_constant_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_constant_folding_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_constant_folding_with_update_cuda, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_constant_original_fqn_and_dtype_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_constant_type_propagation_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_conv3d_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_conv_freezing_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_convolution_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_copy_non_blocking_is_pinned_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_custom_op_in_subgraph_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_d2h_copy_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_deconv_freezing_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_dup_unbacked_sym_decl_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_dup_unbacked_sym_decl_with_refinement_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_duplicate_constant_folding_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_duplicated_params_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_dynamic_cat_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_dynamic_scalar_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_dynamic_smem_above_default_limit_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_embedding_bag_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_empty_cat_dtype_promotion_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_empty_constant_folding_cuda, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_empty_graph_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_extract_constants_map_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_fake_tensor_device_validation_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_fallback_kernel_with_symexpr_output_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_fallback_mem_leak_fix_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_fft_c2c_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_fill__fallback_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_foreach_multiple_dynamic_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_fp8_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_fp8_view_of_param_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_fqn_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_free_inactive_buffer_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_freezing_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_fx_gm_return_tuple_validation_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_index_put_fallback_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_index_put_with_none_index_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_inf_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_input_codegen_with_sympy_expr_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_int_list_input_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_issue_140766_cuda, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_large_dynamic_dim_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_large_grid_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_large_mmaped_weights_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_large_weight_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_libtorch_free_so_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_linear_dynamic_maxautotune_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_linear_freezing_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_load_package_multiple_gpus_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_masked_select_dynamic_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_misaligned_input_1_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_misaligned_input_2_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_misc_1_max_autotune_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_misc_1_max_autotune_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_missing_cubin_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_missing_output_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_model_modified_weights_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_multi_device_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_multiple_output_alias_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_nan_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_narrow_fallback_cuda, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_nested_tensor_from_jagged_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_no_args_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_non_contiguous_output_alias_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_non_default_gpu_device_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_non_tensor_input_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_none_args_aot_codegen_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_normal_functional_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_on_gpu_device1_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_output_misaligned_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_output_path_1_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_output_path_2_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_pad_fallback_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_pad_non_zero_memory_leak_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_poi_multiple_dynamic_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_profile_benchmark_harness_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_proxy_executor_abs_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_proxy_executor_hann_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_proxy_executor_permute_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_proxy_executor_squeeze_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_pytree_inputs_cuda, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_quanatized_int8_linear_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_quantized_linear_bias_none_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_quantized_linear_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_repeat_interleave_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_repeat_output_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_repeated_calling_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_repeated_user_defined_triton_kernel_embed_kernel_binary_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_repeated_user_defined_triton_kernel_embed_kernel_binary_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_replace_unbacked_symbol_with_backed_expr_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_replicate_on_devices_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_return_constant_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_return_view_constant_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_reuse_kernel_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_reuse_kernel_dynamic_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_rocm_triton_autotuning_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_run_with_grad_enabled_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_runtime_checks_complex_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_runtime_checks_cuda, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_runtime_checks_device_type_failed_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_runtime_checks_dtype_failed_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_runtime_checks_fp8_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_runtime_checks_large_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_runtime_checks_shape_failed_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_same_backing_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_scaled_dot_product_efficient_attention_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_scaled_grouped_mm_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_scatter_fallback_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_scatter_reduce_fallback_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_sdpa_2_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_sdpa_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_seq_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_shifted_constraint_ranges_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_simple_dynamic_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_simple_embed_kernel_binary_False_max_autotune_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_simple_embed_kernel_binary_False_max_autotune_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_simple_embed_kernel_binary_True_max_autotune_False_cuda, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_simple_embed_kernel_binary_True_max_autotune_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_simple_multi_arch_embed_kernel_binary_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_simple_split_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_size_from_multi_output_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_size_with_unbacked_add_and_mul_expr_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_size_with_unbacked_add_expr_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_size_with_unbacked_add_expr_transitive_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_small_constant_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_so_without_weight_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_stft_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_stride_with_unbacked_expr_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_subclasses_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_sym_expr_indexing_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_sym_i64_input_codegen_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_symbool_item_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_symfloat_item_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_symint_item_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_sympy_cpp_printer_min_max_minmax0_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_sympy_cpp_printer_min_max_minmax1_cuda, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_torchvision_transforms_functional_tensor_resize_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_autotuning_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_dynamic_launcher_grid_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_dynamic_launcher_grid_infer_from_tensor_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_bool_param_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_dynamic_grid_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_dynamic_shape_with_div_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_equal_to_1_arg_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_equal_to_1_float_arg_dynamic_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_equal_to_1_float_arg_dynamic_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_extern_kernel_arg_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_1_num_dims_1_dynamic_False_autotune_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_1_num_dims_1_dynamic_False_autotune_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_1_num_dims_1_dynamic_True_autotune_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_1_num_dims_1_dynamic_True_autotune_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_1_num_dims_2_dynamic_False_autotune_False_cuda, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_1_num_dims_2_dynamic_False_autotune_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_1_num_dims_2_dynamic_True_autotune_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_1_num_dims_2_dynamic_True_autotune_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_2_num_dims_1_dynamic_False_autotune_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_2_num_dims_1_dynamic_False_autotune_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_2_num_dims_1_dynamic_True_autotune_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_2_num_dims_1_dynamic_True_autotune_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_2_num_dims_2_dynamic_False_autotune_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_2_num_dims_2_dynamic_False_autotune_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_2_num_dims_2_dynamic_True_autotune_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_2_num_dims_2_dynamic_True_autotune_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_3_num_dims_1_dynamic_False_autotune_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_3_num_dims_1_dynamic_False_autotune_True_cuda, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_3_num_dims_1_dynamic_True_autotune_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_3_num_dims_1_dynamic_True_autotune_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_3_num_dims_2_dynamic_False_autotune_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_3_num_dims_2_dynamic_False_autotune_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_3_num_dims_2_dynamic_True_autotune_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_3_num_dims_2_dynamic_True_autotune_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_multi_output_arg_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_on_device_tma_dynamic_False_tma_version_new_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_on_device_tma_dynamic_False_tma_version_old_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_on_device_tma_dynamic_True_tma_version_new_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_on_device_tma_dynamic_True_tma_version_old_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_reinterpret_view_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_reinterpret_view_mem_leak_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_sympy_expr_arg_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_sympy_fn_like_arg_cuda, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_tma_descriptor_1d_dynamic_False_tma_version_new_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_tma_descriptor_1d_dynamic_False_tma_version_old_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_tma_descriptor_1d_dynamic_True_tma_version_new_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_tma_descriptor_1d_dynamic_True_tma_version_old_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_tma_descriptor_2d_dynamic_False_tma_version_new_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_tma_descriptor_2d_dynamic_False_tma_version_old_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_tma_descriptor_2d_dynamic_True_tma_version_new_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_tma_descriptor_2d_dynamic_True_tma_version_old_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_unbacked_symint_in_grid_dynamic_False_autotuning_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_unbacked_symint_in_grid_dynamic_False_autotuning_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_unbacked_symint_in_grid_dynamic_True_autotuning_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_unbacked_symint_in_grid_dynamic_True_autotuning_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_weird_param_order_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_with_none_input_cuda, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_with_none_inputs_and_equal_to_1_arg_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_mutated_autotuning_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_next_power_of_2_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_unbacked_equals_input_size_runtime_assertion_mark_unbacked_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_unbacked_equals_input_size_runtime_assertion_mark_unbacked_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_unbounded_expr_substitutions_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_update_constant_buffer_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_update_constant_buffer_simple_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_update_inactive_constant_buffer_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_update_user_managed_buffer_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_upper_bound_i64_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_using_model_name_for_files_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_view_outputs_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_weight_on_disk_legacy_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_while_loop_nested_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_while_loop_simple_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_while_loop_with_conv_dynamic_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_while_loop_with_conv_dynamic_True_cuda, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_while_loop_with_mixed_device_dynamic_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_while_loop_with_mixed_device_dynamic_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_while_loop_with_outer_buffers_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_while_loop_with_outer_code_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_while_loop_with_parameters_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_while_loop_with_pytree_inputs_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_while_loop_with_sym_expr_cond_dynamic_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_while_loop_with_sym_expr_cond_dynamic_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_while_loop_with_unbacked_symint_closure_dynamic_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_while_loop_with_unbacked_symint_closure_dynamic_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_with_cudagraphs_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_with_no_triton_profiler_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_with_offset_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_with_profiler_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_zero_grid_with_backed_symbols_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_zero_grid_with_unbacked_symbols_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_zero_size_buffer_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_zero_size_weight_cuda, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test__int_mm_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test__weight_int4pack_mm_m_32_n_64_q_group_32_num_groups_1_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test__weight_int4pack_mm_m_32_n_64_q_group_32_num_groups_2_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test__weight_int4pack_mm_m_32_n_64_q_group_64_num_groups_1_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test__weight_int4pack_mm_m_32_n_64_q_group_64_num_groups_2_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test__weight_int4pack_mm_with_scales_and_zeros_m_32_n_64_q_group_32_num_groups_1_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test__weight_int4pack_mm_with_scales_and_zeros_m_32_n_64_q_group_32_num_groups_2_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test__weight_int4pack_mm_with_scales_and_zeros_m_32_n_64_q_group_64_num_groups_1_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test__weight_int4pack_mm_with_scales_and_zeros_m_32_n_64_q_group_64_num_groups_2_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_add_complex_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_addmm_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_addmm_multiple_dynamic_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_aliased_buffer_reuse_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_amp_fallback_random_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_aot_inductor_consts_cpp_build_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_aoti_constant_tensor_mps, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_aoti_constant_tensor_name_collision_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_aoti_debug_printer_codegen_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_aoti_debug_printer_cpp_kernel_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_aoti_debug_printer_fp8_dtype_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_aoti_debug_printer_sym_inputs_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_aoti_debug_printer_user_defined_triton_kernel_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_aoti_debug_printing_model_inputs_codegen_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_aoti_profiler_enable_kernel_profile_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_aoti_profiler_enable_kernel_profile_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_aoti_runtime_asserts_backed_symint_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_aoti_runtime_asserts_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_aoti_user_defined_triton_kernel_profiling_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_assert_async_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_assert_tensor_meta_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_autotune_int64_user_defined_triton_kernel_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_autotune_with_constant_folding_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_autotuning_args_reuse_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_backward_no_op_logging_mps, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_bmm_multiple_dynamic_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_bool_input_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_boolean_indexing_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_buffer_mutation_1_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_buffer_mutation_2_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_buffer_mutation_3_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_buffer_mutation_4_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_buffer_mutation_and_force_mmap_weights_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_buffer_reuse_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_clamp_decomposition_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_composed_dynamic_size_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_cond_mismatched_branch_output_dynamic_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_cond_mismatched_branch_output_dynamic_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_cond_nested_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_cond_non_tensor_predicates_dynamic_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_cond_non_tensor_predicates_dynamic_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_cond_share_predicte_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_cond_simple_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_cond_symint_input_mps, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_cond_unbacked_symint_closure_dynamic_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_cond_unbacked_symint_closure_dynamic_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_cond_use_buffers_from_outer_scope_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_cond_with_multiple_outputs_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_cond_with_outer_code_before_after_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_cond_with_parameters_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_cond_with_reinterpret_view_inputs_outputs_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_consecutive_compiles_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_constant_folding_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_constant_folding_with_update_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_constant_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_constant_original_fqn_and_dtype_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_constant_type_propagation_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_conv3d_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_conv_freezing_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_convolution_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_copy_non_blocking_is_pinned_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_custom_op_in_subgraph_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_d2h_copy_mps, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_deconv_freezing_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_dup_unbacked_sym_decl_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_dup_unbacked_sym_decl_with_refinement_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_duplicate_constant_folding_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_duplicated_params_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_dynamic_cat_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_dynamic_scalar_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_dynamic_smem_above_default_limit_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_embedding_bag_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_empty_cat_dtype_promotion_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_empty_constant_folding_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_empty_graph_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_extract_constants_map_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_fake_tensor_device_validation_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_fallback_kernel_with_symexpr_output_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_fallback_mem_leak_fix_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_fft_c2c_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_fill__fallback_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_foreach_multiple_dynamic_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_fp8_mps, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_fp8_view_of_param_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_fqn_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_free_inactive_buffer_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_freezing_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_fx_gm_return_tuple_validation_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_index_put_fallback_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_index_put_with_none_index_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_inf_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_input_codegen_with_sympy_expr_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_int_list_input_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_issue_140766_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_large_dynamic_dim_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_large_grid_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_large_mmaped_weights_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_large_weight_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_libtorch_free_so_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_linear_dynamic_maxautotune_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_linear_freezing_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_load_package_multiple_gpus_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_masked_select_dynamic_mps, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_misaligned_input_1_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_misaligned_input_2_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_misc_1_max_autotune_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_misc_1_max_autotune_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_missing_cubin_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_missing_output_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_model_modified_weights_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_multi_device_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_multiple_output_alias_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_nan_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_narrow_fallback_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_nested_tensor_from_jagged_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_no_args_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_non_contiguous_output_alias_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_non_default_gpu_device_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_non_tensor_input_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_none_args_aot_codegen_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_normal_functional_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_on_gpu_device1_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_output_misaligned_mps, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_output_path_1_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_output_path_2_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_pad_fallback_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_pad_non_zero_memory_leak_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_poi_multiple_dynamic_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_profile_benchmark_harness_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_proxy_executor_abs_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_proxy_executor_hann_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_proxy_executor_permute_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_proxy_executor_squeeze_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_pytree_inputs_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_quanatized_int8_linear_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_quantized_linear_bias_none_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_quantized_linear_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_repeat_interleave_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_repeat_output_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_repeated_calling_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_repeated_user_defined_triton_kernel_embed_kernel_binary_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_repeated_user_defined_triton_kernel_embed_kernel_binary_True_mps, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_replace_unbacked_symbol_with_backed_expr_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_replicate_on_devices_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_return_constant_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_return_view_constant_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_reuse_kernel_dynamic_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_reuse_kernel_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_rocm_triton_autotuning_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_run_with_grad_enabled_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_runtime_checks_complex_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_runtime_checks_device_type_failed_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_runtime_checks_dtype_failed_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_runtime_checks_fp8_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_runtime_checks_large_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_runtime_checks_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_runtime_checks_shape_failed_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_same_backing_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_scaled_dot_product_efficient_attention_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_scaled_grouped_mm_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_scatter_fallback_mps, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_scatter_reduce_fallback_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_sdpa_2_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_sdpa_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_seq_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_shifted_constraint_ranges_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_simple_dynamic_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_simple_embed_kernel_binary_False_max_autotune_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_simple_embed_kernel_binary_False_max_autotune_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_simple_embed_kernel_binary_True_max_autotune_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_simple_embed_kernel_binary_True_max_autotune_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_simple_multi_arch_embed_kernel_binary_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_simple_split_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_size_from_multi_output_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_size_with_unbacked_add_and_mul_expr_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_size_with_unbacked_add_expr_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_size_with_unbacked_add_expr_transitive_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_small_constant_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_so_without_weight_mps, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_stft_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_stride_with_unbacked_expr_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_subclasses_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_sym_expr_indexing_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_sym_i64_input_codegen_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_symbool_item_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_symfloat_item_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_symint_item_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_sympy_cpp_printer_min_max_minmax0_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_sympy_cpp_printer_min_max_minmax1_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_torchvision_transforms_functional_tensor_resize_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_autotuning_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_dynamic_launcher_grid_infer_from_tensor_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_dynamic_launcher_grid_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_bool_param_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_dynamic_grid_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_dynamic_shape_with_div_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_equal_to_1_arg_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_equal_to_1_float_arg_dynamic_False_mps, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_equal_to_1_float_arg_dynamic_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_extern_kernel_arg_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_1_num_dims_1_dynamic_False_autotune_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_1_num_dims_1_dynamic_False_autotune_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_1_num_dims_1_dynamic_True_autotune_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_1_num_dims_1_dynamic_True_autotune_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_1_num_dims_2_dynamic_False_autotune_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_1_num_dims_2_dynamic_False_autotune_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_1_num_dims_2_dynamic_True_autotune_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_1_num_dims_2_dynamic_True_autotune_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_2_num_dims_1_dynamic_False_autotune_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_2_num_dims_1_dynamic_False_autotune_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_2_num_dims_1_dynamic_True_autotune_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_2_num_dims_1_dynamic_True_autotune_True_mps, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_2_num_dims_2_dynamic_False_autotune_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_2_num_dims_2_dynamic_False_autotune_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_2_num_dims_2_dynamic_True_autotune_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_2_num_dims_2_dynamic_True_autotune_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_3_num_dims_1_dynamic_False_autotune_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_3_num_dims_1_dynamic_False_autotune_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_3_num_dims_1_dynamic_True_autotune_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_3_num_dims_1_dynamic_True_autotune_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_3_num_dims_2_dynamic_False_autotune_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_3_num_dims_2_dynamic_False_autotune_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_3_num_dims_2_dynamic_True_autotune_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_3_num_dims_2_dynamic_True_autotune_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_multi_output_arg_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_on_device_tma_dynamic_False_tma_version_new_mps, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_on_device_tma_dynamic_False_tma_version_old_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_on_device_tma_dynamic_True_tma_version_new_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_on_device_tma_dynamic_True_tma_version_old_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_reinterpret_view_mem_leak_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_reinterpret_view_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_sympy_expr_arg_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_sympy_fn_like_arg_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_tma_descriptor_1d_dynamic_False_tma_version_new_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_tma_descriptor_1d_dynamic_False_tma_version_old_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_tma_descriptor_1d_dynamic_True_tma_version_new_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_tma_descriptor_1d_dynamic_True_tma_version_old_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_tma_descriptor_2d_dynamic_False_tma_version_new_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_tma_descriptor_2d_dynamic_False_tma_version_old_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_tma_descriptor_2d_dynamic_True_tma_version_new_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_tma_descriptor_2d_dynamic_True_tma_version_old_mps, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_unbacked_symint_in_grid_dynamic_False_autotuning_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_unbacked_symint_in_grid_dynamic_False_autotuning_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_unbacked_symint_in_grid_dynamic_True_autotuning_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_unbacked_symint_in_grid_dynamic_True_autotuning_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_weird_param_order_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_with_none_input_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_with_none_inputs_and_equal_to_1_arg_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_mutated_autotuning_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_next_power_of_2_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_unbacked_equals_input_size_runtime_assertion_mark_unbacked_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_unbacked_equals_input_size_runtime_assertion_mark_unbacked_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_unbounded_expr_substitutions_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_update_constant_buffer_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_update_constant_buffer_simple_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_update_inactive_constant_buffer_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_update_user_managed_buffer_mps, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_upper_bound_i64_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_using_model_name_for_files_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_view_outputs_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_weight_on_disk_legacy_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_while_loop_nested_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_while_loop_simple_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_while_loop_with_conv_dynamic_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_while_loop_with_conv_dynamic_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_while_loop_with_mixed_device_dynamic_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_while_loop_with_mixed_device_dynamic_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_while_loop_with_outer_buffers_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_while_loop_with_outer_code_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_while_loop_with_parameters_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_while_loop_with_pytree_inputs_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_while_loop_with_sym_expr_cond_dynamic_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_while_loop_with_sym_expr_cond_dynamic_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_while_loop_with_unbacked_symint_closure_dynamic_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_while_loop_with_unbacked_symint_closure_dynamic_True_mps, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_with_cudagraphs_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_with_no_triton_profiler_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_with_offset_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_with_profiler_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_zero_grid_with_backed_symbols_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_zero_grid_with_unbacked_symbols_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_zero_size_buffer_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_zero_size_weight_mps
2025-10-10T01:25:53.3737616Z
2025-10-10T01:25:57.1511914Z Running inductor/test_torchinductor_opinfo 5/11 ... [2025-10-10 01:25:57.150627]
2025-10-10T01:25:57.1512519Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set
2025-10-10T01:25:57.1514797Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_opinfo.py', '-m', 'not serial', '--shard-id=5', '--num-shards=11', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:25:57.151076]
2025-10-10T01:27:20.2614574Z
2025-10-10T01:27:20.2615772Z inductor/test_torchinductor_opinfo 5/11 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_opinfo_5.11_2db6fc2ad4709efb_.log
2025-10-10T01:27:20.2845038Z Running 323 items in this shard: test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___getitem___cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rmul___cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rsub___cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rsub___cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rxor___cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__segment_reduce_offsets_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__softmax_backward_data_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__unsafe_masked_index_put_accumulate_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__upsample_bilinear2d_aa_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_addcmul_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_addcmul_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_addr_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_allclose_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_amax_cuda_int32,
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_amax_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_amax_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_amin_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_argsort_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_as_strided_copy_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_as_strided_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_as_strided_partial_views_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_asin_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_asin_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atan2_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atan_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atanh_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atleast_2d_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bernoulli_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bincount_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bitwise_left_shift_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bitwise_xor_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_block_diag_cuda_int32, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_block_diag_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bmm_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bmm_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cat_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cauchy_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_chalf_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_char_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_chunk_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_combinations_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cos_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cosh_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_count_nonzero_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cov_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cumprod_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cumulative_trapezoid_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diag_embed_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diagflat_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_digamma_cuda_float32, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_div_floor_rounding_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_div_floor_rounding_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_div_no_rounding_mode_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_div_trunc_rounding_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_dsplit_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_dsplit_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_einsum_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_einsum_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_empty_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_empty_like_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_empty_permuted_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_empty_strided_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_erf_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_exp2_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_eye_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_fft_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_fftshift_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_hfft_cuda_uint8, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ifft2_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ifft2_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ifftn_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_irfft2_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_irfft_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_rfftn_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_rfftn_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fill_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_flatten_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_flipud_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_float_power_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_float_power_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_float_power_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fmax_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fmax_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fmin_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fmin_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fmod_cuda_uint8, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_frexp_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_full_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_full_like_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_gather_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_gcd_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ge_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_geometric_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_geometric_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_gradient_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_grid_sampler_2d_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_grid_sampler_3d_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_half_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_heaviside_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_hsplit_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_hstack_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_add_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_reduce_amax_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_reduce_mean_cuda_int32, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_reduce_prod_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_select_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isclose_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isnan_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isposinf_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isreal_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isreal_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_item_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_2inputs_2outputs_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_4inputs_with_extra_args_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_4inputs_with_extra_args_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_4inputs_with_extra_args_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_binary_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_unary_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_unary_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_kthvalue_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_lcm_cuda_int64, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ldexp_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_lgamma_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_cholesky_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_cholesky_ex_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_diagonal_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_lstsq_grad_oriented_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_tensorinv_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_tensorsolve_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linspace_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linspace_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log2_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log_softmax_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log_softmax_with_dtype_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logaddexp_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logcumsumexp_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logical_and_cuda_int32, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logical_and_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logspace_tensor_overload_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logsumexp_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_long_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_lu_solve_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mT_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mT_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_amax_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_amax_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_amin_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_cumprod_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_cumprod_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_cumsum_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_cumsum_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_logaddexp_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_prod_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_std_cuda_float16, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_sum_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_var_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_max_binary_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_max_binary_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_max_pool2d_with_indices_backward_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_max_reduction_no_dim_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_median_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_meshgrid_list_of_tensors_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_min_reduction_no_dim_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_min_reduction_no_dim_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_msort_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_msort_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mul_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_multinomial_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mvlgamma_mvlgamma_p_5_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nanmedian_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nanquantile_cuda_float64, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nansum_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_neg_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_new_empty_strided_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_new_empty_strided_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_adaptive_avg_pool2d_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_adaptive_max_pool1d_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_avg_pool1d_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_avg_pool1d_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_channel_shuffle_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_conv2d_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_conv_transpose3d_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_cosine_embedding_loss_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_dropout_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_feature_alpha_dropout_with_train_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_gaussian_nll_loss_cuda_float64, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_gelu_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_group_norm_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_hardswish_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_interpolate_area_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_interpolate_area_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_interpolate_bilinear_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_interpolate_trilinear_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_interpolate_trilinear_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_kl_div_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_margin_ranking_loss_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_max_unpool1d_grad_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_max_unpool2d_grad_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_max_unpool2d_grad_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_circular_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_circular_cuda_int32, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_constant_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_reflect_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_reflect_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_reflect_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_replicate_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_replicate_negative_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_poisson_nll_loss_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_scaled_dot_product_attention_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_smooth_l1_loss_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_softshrink_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_tanhshrink_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_threshold_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_threshold_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_triplet_margin_loss_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nonzero_cuda_float64, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nonzero_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nonzero_static_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_norm_fro_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_norm_inf_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ones_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ones_like_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_pca_lowrank_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_permute_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_permute_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_pinverse_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polar_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_1_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_1_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_4_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_4_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_rand_like_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_randint_cuda_float64, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_randint_like_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_real_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_reciprocal_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_repeat_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_reshape_as_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_resolve_neg_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_rot90_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_rsub_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scalar_tensor_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_reduce_amax_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_reduce_amin_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_reduce_sum_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_select_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_select_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_short_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_signal_windows_hann_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sinc_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_slice_cuda_uint8, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_slice_scatter_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_softmax_with_dtype_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_softmax_with_dtype_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sparse_sampled_addmm_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_airy_ai_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_airy_ai_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_airy_ai_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_chebyshev_polynomial_u_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_hermite_polynomial_he_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_i0e_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_i0e_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_i1_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_i1e_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_laguerre_polynomial_l_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_log_ndtr_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_modified_bessel_i0_cuda_bool, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_ndtr_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_ndtri_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_polygamma_special_polygamma_n_0_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_v_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_w_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_spherical_bessel_j0_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_xlog1py_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_zeta_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_zeta_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_split_list_args_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_split_with_sizes_copy_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_square_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_squeeze_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_std_unbiased_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_stft_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sum_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sum_cuda_int64, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_svd_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_t_copy_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_t_copy_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_t_copy_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_t_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_take_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tan_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tanh_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tensor_split_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tensordot_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_topk_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_torch_ops_aten__flash_attention_forward_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_torch_ops_aten__safe_softmax_default_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_torch_ops_aten__safe_softmax_default_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_trace_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_transpose_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_trapezoid_cuda_int32, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tril_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tril_indices_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_triu_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_trunc_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unbind_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unbind_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unbind_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unbind_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unflatten_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unfold_copy_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unfold_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unique_consecutive_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unsafe_split_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unsafe_split_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unsqueeze_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_vdot_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_view_as_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_view_copy_cuda_float16, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_where_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_xlogy_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_zero__cuda_float16 2025-10-10T01:27:20.3062670Z 2025-10-10T01:27:24.2792545Z Running inductor/test_torchinductor_opinfo 6/11 ... [2025-10-10 01:27:24.278560] 2025-10-10T01:27:24.2793051Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:27:24.2794119Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_opinfo.py', '-m', 'not serial', '--shard-id=6', '--num-shards=11', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:27:24.278941] 2025-10-10T01:29:07.4716646Z 2025-10-10T01:29:07.4718267Z inductor/test_torchinductor 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_1.1_ccc835528304573b_.log 2025-10-10T01:29:07.5213904Z Running 979 items in this shard: test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_broadcast1_broadcast1, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_broadcast1_broadcast2, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_broadcast1_broadcast3, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_broadcast1_dense, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_broadcast1_double, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_broadcast1_int, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_broadcast1_strided, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_broadcast1_transposed, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_broadcast2_broadcast1, 
test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_broadcast2_broadcast2, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_broadcast2_broadcast3, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_broadcast2_dense, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_broadcast2_double, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_broadcast2_int, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_broadcast2_strided, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_broadcast2_transposed, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_broadcast3_broadcast1, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_broadcast3_broadcast2, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_broadcast3_broadcast3, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_broadcast3_dense, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_broadcast3_double, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_broadcast3_int, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_broadcast3_strided, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_broadcast3_transposed, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_dense_broadcast1, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_dense_broadcast2, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_dense_broadcast3, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_dense_dense, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_dense_double, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_dense_int, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_dense_strided, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_dense_transposed, 
test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_double_broadcast1, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_double_broadcast2, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_double_broadcast3, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_double_dense, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_double_double, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_double_int, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_double_strided, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_double_transposed, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_int_broadcast1, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_int_broadcast2, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_int_broadcast3, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_int_dense, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_int_double, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_int_int, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_int_strided, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_int_transposed, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_strided_broadcast1, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_strided_broadcast2, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_strided_broadcast3, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_strided_dense, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_strided_double, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_strided_int, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_strided_strided, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_strided_transposed, 
test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_transposed_broadcast1, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_transposed_broadcast2, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_transposed_broadcast3, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_transposed_dense, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_transposed_double, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_transposed_int, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_transposed_strided, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_transposed_transposed, test/inductor/test_torchinductor.py::GPUTests::test_AllenaiLongformerBase_repro_cuda, test/inductor/test_torchinductor.py::GPUTests::test__dyn_quant_matmul_4bit_cuda, test/inductor/test_torchinductor.py::GPUTests::test__dyn_quant_pack_4bit_weight_cuda, test/inductor/test_torchinductor.py::GPUTests::test__unsafe_masked_index_cuda, test/inductor/test_torchinductor.py::GPUTests::test__unsafe_masked_index_put_accumulate_cuda, test/inductor/test_torchinductor.py::GPUTests::test_abs_cuda, test/inductor/test_torchinductor.py::GPUTests::test_adaptive_avg_pool1d_argmax_cuda, test/inductor/test_torchinductor.py::GPUTests::test_adaptive_avg_pool2d1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_adaptive_avg_pool2d2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_adaptive_avg_pool2d_low_prec_cuda, test/inductor/test_torchinductor.py::GPUTests::test_adaptive_avg_pool_errors_with_long_cuda, test/inductor/test_torchinductor.py::GPUTests::test_adaptive_avg_pool_with_output_size_0_cuda, test/inductor/test_torchinductor.py::GPUTests::test_adaptive_max_pool2d1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_adaptive_max_pool2d2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_adaptive_max_pool2d3_cuda, 
test/inductor/test_torchinductor.py::GPUTests::test_adaptive_pool_errors_with_long_cuda, test/inductor/test_torchinductor.py::GPUTests::test_add_complex10_cuda, test/inductor/test_torchinductor.py::GPUTests::test_add_complex3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_add_complex4_cuda, test/inductor/test_torchinductor.py::GPUTests::test_add_complex5_cuda, test/inductor/test_torchinductor.py::GPUTests::test_add_complex6_cuda, test/inductor/test_torchinductor.py::GPUTests::test_add_complex7_cuda, test/inductor/test_torchinductor.py::GPUTests::test_add_complex8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_add_complex9_cuda, test/inductor/test_torchinductor.py::GPUTests::test_add_complex_cuda, test/inductor/test_torchinductor.py::GPUTests::test_add_complex_strided_fallback_cuda, test/inductor/test_torchinductor.py::GPUTests::test_add_const_float_cuda, test/inductor/test_torchinductor.py::GPUTests::test_add_const_int_cuda, test/inductor/test_torchinductor.py::GPUTests::test_add_inplace_permuted_cuda, test/inductor/test_torchinductor.py::GPUTests::test_adding_tensor_offsets_cuda, test/inductor/test_torchinductor.py::GPUTests::test_addmm_cuda, test/inductor/test_torchinductor.py::GPUTests::test_addmv_cuda, test/inductor/test_torchinductor.py::GPUTests::test_alexnet_prefix_cuda, test/inductor/test_torchinductor.py::GPUTests::test_aliased_buffer_reuse_cuda, test/inductor/test_torchinductor.py::GPUTests::test_allow_reuse_active_if_under_peak_cuda, test/inductor/test_torchinductor.py::GPUTests::test_allow_reuse_disable_if_exceed_peak_cuda, test/inductor/test_torchinductor.py::GPUTests::test_angle_cuda, test/inductor/test_torchinductor.py::GPUTests::test_any_cuda, test/inductor/test_torchinductor.py::GPUTests::test_aoti_eager_cache_hit_cuda, test/inductor/test_torchinductor.py::GPUTests::test_aoti_eager_dtype_device_layout_cuda, test/inductor/test_torchinductor.py::GPUTests::test_aoti_eager_override_registration_cuda, 
test/inductor/test_torchinductor.py::GPUTests::test_aoti_eager_support_out_cuda, test/inductor/test_torchinductor.py::GPUTests::test_aoti_eager_support_str_cuda, test/inductor/test_torchinductor.py::GPUTests::test_aoti_eager_with_persistent_cache_cuda, test/inductor/test_torchinductor.py::GPUTests::test_aoti_eager_with_scalar_cuda, test/inductor/test_torchinductor.py::GPUTests::test_arange1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_arange2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_arange3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_arange4_cuda, test/inductor/test_torchinductor.py::GPUTests::test_arange5_cuda, test/inductor/test_torchinductor.py::GPUTests::test_arange6_cuda, test/inductor/test_torchinductor.py::GPUTests::test_argmax_argmin1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_argmax_argmin2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_argmax_argmin3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_argmax_argmin_with_duplicates_cuda, test/inductor/test_torchinductor.py::GPUTests::test_argmax_argmin_with_nan_cuda, test/inductor/test_torchinductor.py::GPUTests::test_argmax_min_int32_cuda, test/inductor/test_torchinductor.py::GPUTests::test_argmax_to_float_cuda, test/inductor/test_torchinductor.py::GPUTests::test_as_strided_cuda, test/inductor/test_torchinductor.py::GPUTests::test_as_strided_on_views_cuda, test/inductor/test_torchinductor.py::GPUTests::test_as_strided_scatter_cuda, test/inductor/test_torchinductor.py::GPUTests::test_assert_alignment_op_name_fail_cuda, test/inductor/test_torchinductor.py::GPUTests::test_assert_alignment_op_name_pass_cuda, test/inductor/test_torchinductor.py::GPUTests::test_assert_size_stride_op_name_fail_cuda, test/inductor/test_torchinductor.py::GPUTests::test_assert_size_stride_op_name_pass_cuda, test/inductor/test_torchinductor.py::GPUTests::test_avg_pool2d1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_avg_pool2d2_cuda, 
test/inductor/test_torchinductor.py::GPUTests::test_avg_pool2d3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_avg_pool2d4_cuda, test/inductor/test_torchinductor.py::GPUTests::test_avg_pool2d5_cuda, test/inductor/test_torchinductor.py::GPUTests::test_avg_pool2d6_cuda, test/inductor/test_torchinductor.py::GPUTests::test_avg_pool2d7_cuda, test/inductor/test_torchinductor.py::GPUTests::test_avg_pool2d8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_avg_pool2d_backward2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_avg_pool2d_backward3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_avg_pool2d_backward4_cuda, test/inductor/test_torchinductor.py::GPUTests::test_avg_pool2d_backward_cuda, test/inductor/test_torchinductor.py::GPUTests::test_avg_pool3d_backward2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_avg_pool3d_backward3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_avg_pool3d_backward4_cuda, test/inductor/test_torchinductor.py::GPUTests::test_avg_pool3d_backward_cuda, test/inductor/test_torchinductor.py::GPUTests::test_avg_pool_errors_with_uint_cuda, test/inductor/test_torchinductor.py::GPUTests::test_baddbmm_cuda, test/inductor/test_torchinductor.py::GPUTests::test_batch_norm_2d_2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_batch_norm_2d_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bernoulli1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bernoulli2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bfloat16_to_int16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bitwise2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bitwise3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bitwise_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bmm1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bmm2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bool_cuda, 
test/inductor/test_torchinductor.py::GPUTests::test_both_scalars_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bucketize_add_autotune_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bucketize_broadcast_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bucketize_computed_offsets_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bucketize_default_kwargs_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bucketize_int_int16_int16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bucketize_int_int16_int32_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bucketize_int_int16_int64_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bucketize_int_int16_int8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bucketize_int_int16_uint8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bucketize_int_int32_int16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bucketize_int_int32_int32_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bucketize_int_int32_int64_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bucketize_int_int32_int8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bucketize_int_int32_uint8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bucketize_int_int64_int16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bucketize_int_int64_int32_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bucketize_int_int64_int64_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bucketize_int_int64_int8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bucketize_int_int64_uint8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bucketize_int_int8_int16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bucketize_int_int8_int32_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bucketize_int_int8_int64_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bucketize_int_int8_int8_cuda, 
test/inductor/test_torchinductor.py::GPUTests::test_bucketize_int_int8_uint8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bucketize_int_uint8_int16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bucketize_int_uint8_int32_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bucketize_int_uint8_int64_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bucketize_int_uint8_int8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bucketize_int_uint8_uint8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bucketize_nd_tiling_False_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bucketize_nd_tiling_True_cuda, test/inductor/test_torchinductor.py::GPUTests::test_buffer_batch_norm_cuda, test/inductor/test_torchinductor.py::GPUTests::test_buffer_copied_in_graph_cuda, test/inductor/test_torchinductor.py::GPUTests::test_buffer_copied_in_graph_with_different_shapes_cuda, test/inductor/test_torchinductor.py::GPUTests::test_buffer_use_after_remove_cuda, test/inductor/test_torchinductor.py::GPUTests::test_builtins_round_cuda, test/inductor/test_torchinductor.py::GPUTests::test_builtins_round_float_ndigits_neg_cuda, test/inductor/test_torchinductor.py::GPUTests::test_builtins_round_float_ndigits_pos_cuda, test/inductor/test_torchinductor.py::GPUTests::test_builtins_round_float_ndigits_zero_cuda, test/inductor/test_torchinductor.py::GPUTests::test_builtins_round_int_ndigits_pos_cuda, test/inductor/test_torchinductor.py::GPUTests::test_builtins_round_int_ndigits_zero_cuda, test/inductor/test_torchinductor.py::GPUTests::test_cat_cuda, test/inductor/test_torchinductor.py::GPUTests::test_cat_empty_cuda, test/inductor/test_torchinductor.py::GPUTests::test_cat_empty_index_cuda, test/inductor/test_torchinductor.py::GPUTests::test_cat_extern_kernel_cuda, test/inductor/test_torchinductor.py::GPUTests::test_cat_inplace_cuda, test/inductor/test_torchinductor.py::GPUTests::test_cat_negative_dim_cuda, 
test/inductor/test_torchinductor.py::GPUTests::test_cat_of_loops_and_extern_kernel_cuda, test/inductor/test_torchinductor.py::GPUTests::test_cat_single_empty_cuda, test/inductor/test_torchinductor.py::GPUTests::test_cat_uint8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_cat_unbacked_2d_cuda, test/inductor/test_torchinductor.py::GPUTests::test_cat_unbacked_empty_1d_cuda, test/inductor/test_torchinductor.py::GPUTests::test_cat_unbacked_legacy_empty_cuda, test/inductor/test_torchinductor.py::GPUTests::test_cat_upcasting_cuda, test/inductor/test_torchinductor.py::GPUTests::test_cauchy_cuda, test/inductor/test_torchinductor.py::GPUTests::test_check_stack_no_cycles_cuda, test/inductor/test_torchinductor.py::GPUTests::test_chunk_recompiles_cuda, test/inductor/test_torchinductor.py::GPUTests::test_clamp_cuda, test/inductor/test_torchinductor.py::GPUTests::test_clamp_type_promotion_cuda, test/inductor/test_torchinductor.py::GPUTests::test_clamp_type_promotion_non_tensor_cuda, test/inductor/test_torchinductor.py::GPUTests::test_clone_cuda, test/inductor/test_torchinductor.py::GPUTests::test_compar_cuda, test/inductor/test_torchinductor.py::GPUTests::test_complex_fallback_cuda, test/inductor/test_torchinductor.py::GPUTests::test_complex_from_real_imag_cuda, test/inductor/test_torchinductor.py::GPUTests::test_complex_memory_overlap_cuda, test/inductor/test_torchinductor.py::GPUTests::test_computed_buffer_inlining_cuda, test/inductor/test_torchinductor.py::GPUTests::test_concat_add_inplace_cuda, test/inductor/test_torchinductor.py::GPUTests::test_config_option_dont_assume_alignment_cuda, test/inductor/test_torchinductor.py::GPUTests::test_config_option_dont_assume_alignment_cudagraphs_cuda, test/inductor/test_torchinductor.py::GPUTests::test_config_option_dont_assume_alignment_recompiles_cuda, test/inductor/test_torchinductor.py::GPUTests::test_consecutive_split_cumprod_cuda, test/inductor/test_torchinductor.py::GPUTests::test_consecutive_split_cumsum_cuda, 
test/inductor/test_torchinductor.py::GPUTests::test_const_int32_to_float_cuda, test/inductor/test_torchinductor.py::GPUTests::test_constant_pad_1d_cuda, test/inductor/test_torchinductor.py::GPUTests::test_constant_pad_2d_cuda, test/inductor/test_torchinductor.py::GPUTests::test_constant_pad_2d_strides_nonpositive_cuda, test/inductor/test_torchinductor.py::GPUTests::test_constant_pad_3d_cuda, test/inductor/test_torchinductor.py::GPUTests::test_constant_pad_fill_dtype_cuda, test/inductor/test_torchinductor.py::GPUTests::test_constant_pad_float64_cuda, test/inductor/test_torchinductor.py::GPUTests::test_constant_pad_nd_inplace_cuda, test/inductor/test_torchinductor.py::GPUTests::test_conv1d_depthwise_cuda, test/inductor/test_torchinductor.py::GPUTests::test_conv1d_with_permute_cuda, test/inductor/test_torchinductor.py::GPUTests::test_conv2d_backward_channels_last_cuda, test/inductor/test_torchinductor.py::GPUTests::test_conv2d_channels_last_cuda, test/inductor/test_torchinductor.py::GPUTests::test_conv3d_channels_last_use_block_ptr_False_cuda, test/inductor/test_torchinductor.py::GPUTests::test_conv3d_channels_last_use_block_ptr_True_cuda, test/inductor/test_torchinductor.py::GPUTests::test_conv3d_cuda, test/inductor/test_torchinductor.py::GPUTests::test_conv_backward_cuda, test/inductor/test_torchinductor.py::GPUTests::test_conv_bn_fuse_cuda, test/inductor/test_torchinductor.py::GPUTests::test_conv_functional_bn_fuse_cuda, test/inductor/test_torchinductor.py::GPUTests::test_conv_inference_heuristics_cuda, test/inductor/test_torchinductor.py::GPUTests::test_conv_shape_check_cuda, test/inductor/test_torchinductor.py::GPUTests::test_conv_with_as_strided_cuda, test/inductor/test_torchinductor.py::GPUTests::test_convolution1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_convolution2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_convolution3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_convolution4_cuda, 
test/inductor/test_torchinductor.py::GPUTests::test_convolution5_cuda, test/inductor/test_torchinductor.py::GPUTests::test_copy_non_blocking_is_pinned_use_cat_False_cuda, test/inductor/test_torchinductor.py::GPUTests::test_copy_non_blocking_is_pinned_use_cat_True_cuda, test/inductor/test_torchinductor.py::GPUTests::test_copy_with_scalar_src_cuda, test/inductor/test_torchinductor.py::GPUTests::test_cos_cuda, test/inductor/test_torchinductor.py::GPUTests::test_cpu_scalar_with_cpu_scalar_cuda, test/inductor/test_torchinductor.py::GPUTests::test_cpu_scalar_with_cpu_tensor_cuda, test/inductor/test_torchinductor.py::GPUTests::test_cpu_scalar_with_gpu_tensor_cpp_cuda, test/inductor/test_torchinductor.py::GPUTests::test_cpu_scalar_with_gpu_tensor_cuda, test/inductor/test_torchinductor.py::GPUTests::test_cpu_scalar_with_gpu_tensor_dynamic_cuda, test/inductor/test_torchinductor.py::GPUTests::test_cpu_tensor_with_cpu_tensor_cuda, test/inductor/test_torchinductor.py::GPUTests::test_cpu_tensor_with_gpu_tensor_cuda, test/inductor/test_torchinductor.py::GPUTests::test_cudnn_rnn_cuda, test/inductor/test_torchinductor.py::GPUTests::test_cummin_cuda, test/inductor/test_torchinductor.py::GPUTests::test_cumprod_zero_dim_cuda, test/inductor/test_torchinductor.py::GPUTests::test_cumsum_cuda, test/inductor/test_torchinductor.py::GPUTests::test_cumsum_inf_cuda, test/inductor/test_torchinductor.py::GPUTests::test_cumsum_no_mask_cuda, test/inductor/test_torchinductor.py::GPUTests::test_cumsum_pattern_matcher_issue_cuda, test/inductor/test_torchinductor.py::GPUTests::test_cumsum_zero_dim_cuda, test/inductor/test_torchinductor.py::GPUTests::test_custom_op_1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_custom_op_2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_custom_op_3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_custom_op_default_layout_constraint_cuda, test/inductor/test_torchinductor.py::GPUTests::test_custom_op_fixed_layout_channels_last_cuda, 
test/inductor/test_torchinductor.py::GPUTests::test_custom_op_fixed_layout_sequential_cuda, test/inductor/test_torchinductor.py::GPUTests::test_custom_op_unbacked_symints_cuda, test/inductor/test_torchinductor.py::GPUTests::test_custom_scan_op_compiled_cuda, test/inductor/test_torchinductor.py::GPUTests::test_custom_scan_op_cuda, test/inductor/test_torchinductor.py::GPUTests::test_custom_scan_op_multi_input_cuda, test/inductor/test_torchinductor.py::GPUTests::test_custom_scan_would_split_cuda, test/inductor/test_torchinductor.py::GPUTests::test_data_type_propogation_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dense_mask_index_cuda, test/inductor/test_torchinductor.py::GPUTests::test_deterministic_codegen_cuda, test/inductor/test_torchinductor.py::GPUTests::test_deterministic_codegen_on_graph_break_cuda, test/inductor/test_torchinductor.py::GPUTests::test_deterministic_codegen_with_suffix_cuda, test/inductor/test_torchinductor.py::GPUTests::test_device_assert_cuda, test/inductor/test_torchinductor.py::GPUTests::test_diagonal_copy_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dist_bf16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dist_cuda, test/inductor/test_torchinductor.py::GPUTests::test_div1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_div2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_div3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_div4_cuda, test/inductor/test_torchinductor.py::GPUTests::test_div5_cuda, test/inductor/test_torchinductor.py::GPUTests::test_div6_cuda, test/inductor/test_torchinductor.py::GPUTests::test_div7_cuda, test/inductor/test_torchinductor.py::GPUTests::test_div8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_div9_cuda, test/inductor/test_torchinductor.py::GPUTests::test_div_by_zero_cuda, test/inductor/test_torchinductor.py::GPUTests::test_div_precision_cuda, test/inductor/test_torchinductor.py::GPUTests::test_div_presicion_accuracy_cuda, 
test/inductor/test_torchinductor.py::GPUTests::test_div_prim_cuda, test/inductor/test_torchinductor.py::GPUTests::test_div_softmax_symfloat_cuda, test/inductor/test_torchinductor.py::GPUTests::test_div_zero_dim_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dont_constant_fold_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dropout2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dropout3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dropout_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dropout_deterministic_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dropout_trivial_0_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dropout_trivial_1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtype_mismatch_issue_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtype_sympy_expr_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_bfloat16_bfloat16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_bfloat16_float16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_bfloat16_float32_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_bfloat16_float64_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_bfloat16_int16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_bfloat16_int32_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_bfloat16_int64_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_bfloat16_int8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_bfloat16_uint8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_float16_bfloat16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_float16_float16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_float16_float32_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_float16_float64_cuda, 
test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_float16_int16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_float16_int32_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_float16_int64_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_float16_int8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_float16_uint8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_float32_bfloat16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_float32_float16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_float32_float32_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_float32_float64_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_float32_int16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_float32_int32_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_float32_int64_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_float32_int8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_float32_uint8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_float64_bfloat16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_float64_float16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_float64_float32_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_float64_float64_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_float64_int16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_float64_int32_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_float64_int64_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_float64_int8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_float64_uint8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_fusion_cuda, 
test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int16_bfloat16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int16_float16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int16_float32_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int16_float64_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int16_int16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int16_int32_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int16_int64_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int16_int8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int16_uint8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int32_bfloat16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int32_float16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int32_float32_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int32_float64_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int32_int16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int32_int32_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int32_int64_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int32_int8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int32_uint8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int64_bfloat16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int64_float16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int64_float32_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int64_float64_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int64_int16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int64_int32_cuda, 
test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int64_int64_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int64_int8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int64_uint8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int8_bfloat16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int8_float16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int8_float32_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int8_float64_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int8_int16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int8_int32_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int8_int64_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int8_int8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int8_uint8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_uint8_bfloat16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_uint8_float16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_uint8_float32_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_uint8_float64_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_uint8_int16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_uint8_int32_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_uint8_int64_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_uint8_int8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_uint8_uint8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_elu_cuda, test/inductor/test_torchinductor.py::GPUTests::test_embedding_bag_byte_unpack_cuda, test/inductor/test_torchinductor.py::GPUTests::test_embedding_bag_cuda, test/inductor/test_torchinductor.py::GPUTests::test_embedding_cuda, 
test/inductor/test_torchinductor.py::GPUTests::test_embedding_sparse_cuda, test/inductor/test_torchinductor.py::GPUTests::test_empty1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_empty2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_empty_strided_cuda, test/inductor/test_torchinductor.py::GPUTests::test_emulate_precision_triton_fp_fusion_cuda, test/inductor/test_torchinductor.py::GPUTests::test_erfc_cuda, test/inductor/test_torchinductor.py::GPUTests::test_erfinv_cuda, test/inductor/test_torchinductor.py::GPUTests::test_exact_stride_cuda, test/inductor/test_torchinductor.py::GPUTests::test_exp2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_exp_cuda, test/inductor/test_torchinductor.py::GPUTests::test_expand_as_cuda, test/inductor/test_torchinductor.py::GPUTests::test_expand_cuda, test/inductor/test_torchinductor.py::GPUTests::test_expanded_reduction_cuda, test/inductor/test_torchinductor.py::GPUTests::test_expm1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_fallback_mutable_op_basic_cuda, test/inductor/test_torchinductor.py::GPUTests::test_fallback_mutable_op_list_cuda, test/inductor/test_torchinductor.py::GPUTests::test_fallback_mutable_op_list_tensor_cuda, test/inductor/test_torchinductor.py::GPUTests::test_fallback_mutable_op_no_mutated_tensors_cuda, test/inductor/test_torchinductor.py::GPUTests::test_fallback_mutable_op_with_return_cuda, test/inductor/test_torchinductor.py::GPUTests::test_fft_real_input_cuda, test/inductor/test_torchinductor.py::GPUTests::test_fft_real_input_real_output_cuda, test/inductor/test_torchinductor.py::GPUTests::test_fill1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_fill2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_flip_cat_cuda, test/inductor/test_torchinductor.py::GPUTests::test_flip_cuda, test/inductor/test_torchinductor.py::GPUTests::test_float16_to_int16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_float32_to_int32_cuda, 
test/inductor/test_torchinductor.py::GPUTests::test_float_index_expression_cuda, test/inductor/test_torchinductor.py::GPUTests::test_float_index_expression_type_promotion_cuda, test/inductor/test_torchinductor.py::GPUTests::test_float_repr_dynamic_shapes_cuda, test/inductor/test_torchinductor.py::GPUTests::test_floordiv_cuda, test/inductor/test_torchinductor.py::GPUTests::test_fmin_fmax_cuda, test/inductor/test_torchinductor.py::GPUTests::test_fmod_cuda, test/inductor/test_torchinductor.py::GPUTests::test_fmod_zero_dim_cuda, test/inductor/test_torchinductor.py::GPUTests::test_forced_buffer_realize_cuda, test/inductor/test_torchinductor.py::GPUTests::test_fractional_max_pool2d1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_fractional_max_pool2d2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_fractional_max_pool2d3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_fractional_max_pool2d4_cuda, test/inductor/test_torchinductor.py::GPUTests::test_fractional_max_pool2d5_cuda, test/inductor/test_torchinductor.py::GPUTests::test_full_boolean_cuda, test/inductor/test_torchinductor.py::GPUTests::test_full_like_cuda, test/inductor/test_torchinductor.py::GPUTests::test_full_like_sliced_cuda, test/inductor/test_torchinductor.py::GPUTests::test_full_like_transposed_cuda, test/inductor/test_torchinductor.py::GPUTests::test_full_truncation_cuda, test/inductor/test_torchinductor.py::GPUTests::test_functionalize_rng_wrappers_cuda, test/inductor/test_torchinductor.py::GPUTests::test_fuse_large_params_cuda, test/inductor/test_torchinductor.py::GPUTests::test_fuse_tiled_cuda, test/inductor/test_torchinductor.py::GPUTests::test_fusing_write_into_disjoint_read_cuda, test/inductor/test_torchinductor.py::GPUTests::test_gather1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_gather2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_gather3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_gather_scatter_cuda, 
test/inductor/test_torchinductor.py::GPUTests::test_gelu_cuda, test/inductor/test_torchinductor.py::GPUTests::test_generate_rand_fp8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_generated_code_has_alignment_assert_cuda, test/inductor/test_torchinductor.py::GPUTests::test_generated_code_has_size_stride_assert_cuda, test/inductor/test_torchinductor.py::GPUTests::test_getitem_cuda, test/inductor/test_torchinductor.py::GPUTests::test_glu_cuda, test/inductor/test_torchinductor.py::GPUTests::test_gpu_scalar_with_cpu_tensor_cuda, test/inductor/test_torchinductor.py::GPUTests::test_gpu_scalar_with_gpu_tensor_cuda, test/inductor/test_torchinductor.py::GPUTests::test_graph_partition_arange1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_graph_partition_arange2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_graph_partition_argmax_cuda, test/inductor/test_torchinductor.py::GPUTests::test_graph_partition_both_scalars_cuda, test/inductor/test_torchinductor.py::GPUTests::test_graph_partition_constant_tensor1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_graph_partition_constant_tensor2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_graph_partition_misaligned_input_cuda, test/inductor/test_torchinductor.py::GPUTests::test_graph_partition_mutation_real_name_cuda, test/inductor/test_torchinductor.py::GPUTests::test_graph_partition_no_inputs_cuda, test/inductor/test_torchinductor.py::GPUTests::test_graph_partition_pad_dynamic_cuda, test/inductor/test_torchinductor.py::GPUTests::test_graph_partition_refcount_cuda, test/inductor/test_torchinductor.py::GPUTests::test_graph_partition_scalar_inputs_cuda, test/inductor/test_torchinductor.py::GPUTests::test_graph_partition_unbacked_symint_as_output_cuda, test/inductor/test_torchinductor.py::GPUTests::test_grid_sampler_2d_cuda, test/inductor/test_torchinductor.py::GPUTests::test_grid_sampler_expand_preserves_view_cuda, 
test/inductor/test_torchinductor.py::GPUTests::test_hardsigmoid_cuda, test/inductor/test_torchinductor.py::GPUTests::test_hardswish_cuda, test/inductor/test_torchinductor.py::GPUTests::test_hardtanh_cuda, test/inductor/test_torchinductor.py::GPUTests::test_horizonal_fusion1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_horizonal_fusion2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_index1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_index2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_index3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_index_dynamic_shapes_cuda, test/inductor/test_torchinductor.py::GPUTests::test_index_propagation_abs_cuda, test/inductor/test_torchinductor.py::GPUTests::test_index_propagation_cuda, test/inductor/test_torchinductor.py::GPUTests::test_index_propagation_device_assert_masked_cuda, test/inductor/test_torchinductor.py::GPUTests::test_index_propagation_flip_cuda, test/inductor/test_torchinductor.py::GPUTests::test_index_propagation_floordiv_cuda, test/inductor/test_torchinductor.py::GPUTests::test_index_propagation_nested_indirect_indexing_cuda, test/inductor/test_torchinductor.py::GPUTests::test_index_propagation_remainder_cuda, test/inductor/test_torchinductor.py::GPUTests::test_index_put1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_index_put2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_index_put3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_index_put4_cuda, test/inductor/test_torchinductor.py::GPUTests::test_index_put_as_masked_fill_cuda, test/inductor/test_torchinductor.py::GPUTests::test_index_put_deterministic_fallback_cuda, test/inductor/test_torchinductor.py::GPUTests::test_index_put_failed_reinplace_cuda, test/inductor/test_torchinductor.py::GPUTests::test_index_put_fallback1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_index_put_fallback2_cuda, 
test/inductor/test_torchinductor.py::GPUTests::test_index_put_index_cuda, test/inductor/test_torchinductor.py::GPUTests::test_index_put_reinplace_cuda, test/inductor/test_torchinductor.py::GPUTests::test_index_remainder_cuda, test/inductor/test_torchinductor.py::GPUTests::test_index_select_cuda, test/inductor/test_torchinductor.py::GPUTests::test_index_tensor_cuda, test/inductor/test_torchinductor.py::GPUTests::test_indirect_load_broadcast_cuda, test/inductor/test_torchinductor.py::GPUTests::test_inductor_assert_cuda, test/inductor/test_torchinductor.py::GPUTests::test_inductor_layout_optimization_input_mutations_cuda, test/inductor/test_torchinductor.py::GPUTests::test_inductor_multiple_specializations_cuda, test/inductor/test_torchinductor.py::GPUTests::test_inductor_triton_bucketize_respects_masking_cuda, test/inductor/test_torchinductor.py::GPUTests::test_inf_cuda, test/inductor/test_torchinductor.py::GPUTests::test_inner_fn_str_and_stride_cuda, test/inductor/test_torchinductor.py::GPUTests::test_inplace_activations_cuda, test/inductor/test_torchinductor.py::GPUTests::test_inplace_add_cuda, test/inductor/test_torchinductor.py::GPUTests::test_inplace_flip_cuda, test/inductor/test_torchinductor.py::GPUTests::test_inplace_mixed_dtype_ops_cuda, test/inductor/test_torchinductor.py::GPUTests::test_inplace_resize_as_cuda, test/inductor/test_torchinductor.py::GPUTests::test_inplace_where_pointwise_cuda, test/inductor/test_torchinductor.py::GPUTests::test_input_mutation1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_input_mutation2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_input_mutation3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_input_mutation4_cuda, test/inductor/test_torchinductor.py::GPUTests::test_input_mutation5_cuda, test/inductor/test_torchinductor.py::GPUTests::test_insignificant_strides_cuda, test/inductor/test_torchinductor.py::GPUTests::test_int8_weight_only_quant_cuda, 
test/inductor/test_torchinductor.py::GPUTests::test_int_input_dynamic_shapes_cuda, test/inductor/test_torchinductor.py::GPUTests::test_invalid_operand_issue1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_isin_tensor_scalar_cuda, test/inductor/test_torchinductor.py::GPUTests::test_isinf2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_isinf_cuda, test/inductor/test_torchinductor.py::GPUTests::test_issue102546_cuda, test/inductor/test_torchinductor.py::GPUTests::test_kernel_names_cuda, test/inductor/test_torchinductor.py::GPUTests::test_kwargs_cuda, test/inductor/test_torchinductor.py::GPUTests::test_l1_loss_cuda, test/inductor/test_torchinductor.py::GPUTests::test_large_broadcast_reduction_cuda, test/inductor/test_torchinductor.py::GPUTests::test_large_grid_use_block_ptr_False_cuda, test/inductor/test_torchinductor.py::GPUTests::test_large_grid_use_block_ptr_True_cuda, test/inductor/test_torchinductor.py::GPUTests::test_large_offset_pointwise_cuda, test/inductor/test_torchinductor.py::GPUTests::test_large_pointwise_cuda, test/inductor/test_torchinductor.py::GPUTests::test_large_strided_reduction_cuda, test/inductor/test_torchinductor.py::GPUTests::test_large_tensor_reduction_cuda, test/inductor/test_torchinductor.py::GPUTests::test_layer_norm_cuda, test/inductor/test_torchinductor.py::GPUTests::test_leaky_relu_cuda, test/inductor/test_torchinductor.py::GPUTests::test_lerp_cuda, test/inductor/test_torchinductor.py::GPUTests::test_lgamma_cuda, test/inductor/test_torchinductor.py::GPUTests::test_like_channels_last_cuda, test/inductor/test_torchinductor.py::GPUTests::test_like_rands2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_like_rands3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_like_rands_cuda, test/inductor/test_torchinductor.py::GPUTests::test_like_rands_sliced_cuda, test/inductor/test_torchinductor.py::GPUTests::test_linear1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_linear2_cuda, 
test/inductor/test_torchinductor.py::GPUTests::test_linear_dynamic_maxautotune_cuda, test/inductor/test_torchinductor.py::GPUTests::test_linear_float64_cuda, test/inductor/test_torchinductor.py::GPUTests::test_linear_mixed_dtype_cuda, test/inductor/test_torchinductor.py::GPUTests::test_linspace1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_linspace2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_linspace3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_linspace4_cuda, test/inductor/test_torchinductor.py::GPUTests::test_list_clearing_cuda, test/inductor/test_torchinductor.py::GPUTests::test_log1p_cuda, test/inductor/test_torchinductor.py::GPUTests::test_log2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_log_fp64_cuda, test/inductor/test_torchinductor.py::GPUTests::test_log_softmax_cuda, test/inductor/test_torchinductor.py::GPUTests::test_logaddexp_cuda, test/inductor/test_torchinductor.py::GPUTests::test_logcumsumexp_cuda, test/inductor/test_torchinductor.py::GPUTests::test_logcumsumexp_zero_dim_cuda, test/inductor/test_torchinductor.py::GPUTests::test_logsumexp_cuda, test/inductor/test_torchinductor.py::GPUTests::test_long_tensor_cuda, test/inductor/test_torchinductor.py::GPUTests::test_low_memory_max_pool_dilation_1_dim_2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_low_memory_max_pool_dilation_1_dim_3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_low_memory_max_pool_dilation_2_dim_2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_low_memory_max_pool_dilation_2_dim_3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_mark_dynamic_with_hint_override_cuda, test/inductor/test_torchinductor.py::GPUTests::test_mark_unbacked_with_hint_override_cuda, test/inductor/test_torchinductor.py::GPUTests::test_masked_fill_cuda, test/inductor/test_torchinductor.py::GPUTests::test_masked_fill_promotion_cuda, test/inductor/test_torchinductor.py::GPUTests::test_masked_scatter_cuda, 
test/inductor/test_torchinductor.py::GPUTests::test_matmul_layer_norm_cuda, test/inductor/test_torchinductor.py::GPUTests::test_max_min_cuda, test/inductor/test_torchinductor.py::GPUTests::test_max_pool2d1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_max_pool2d2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_max_pool2d3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_max_pool2d4_cuda, test/inductor/test_torchinductor.py::GPUTests::test_max_pool2d5_cuda, test/inductor/test_torchinductor.py::GPUTests::test_max_pool2d6_dilation_1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_max_pool2d6_dilation_2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_max_pool2d7_cuda, test/inductor/test_torchinductor.py::GPUTests::test_max_pool2d8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_max_pool2d_with_indices_backward2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_max_pool2d_with_indices_backward3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_max_pool2d_with_indices_backward4_cuda, test/inductor/test_torchinductor.py::GPUTests::test_max_pool2d_with_indices_backward5_cuda, test/inductor/test_torchinductor.py::GPUTests::test_max_pool2d_with_indices_backward6_cuda, test/inductor/test_torchinductor.py::GPUTests::test_max_pool2d_with_indices_backward_cuda, test/inductor/test_torchinductor.py::GPUTests::test_mean_cuda, test/inductor/test_torchinductor.py::GPUTests::test_min_max_reduction_cuda, test/inductor/test_torchinductor.py::GPUTests::test_min_max_reduction_nan_cuda, test/inductor/test_torchinductor.py::GPUTests::test_misaligned_address_issue1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_mix_device_index_cuda, test/inductor/test_torchinductor.py::GPUTests::test_mixed_mm2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_mixed_mm3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_mixed_mm_cuda, test/inductor/test_torchinductor.py::GPUTests::test_mm_mixed_dtype_cuda, 
test/inductor/test_torchinductor.py::GPUTests::test_mm_views_cuda, test/inductor/test_torchinductor.py::GPUTests::test_move_arange_cuda, test/inductor/test_torchinductor.py::GPUTests::test_mul_index_expr_cuda, test/inductor/test_torchinductor.py::GPUTests::test_mul_softmax_symfloat_cuda, test/inductor/test_torchinductor.py::GPUTests::test_multi_device_cuda, test/inductor/test_torchinductor.py::GPUTests::test_multi_gpu_device_cuda, test/inductor/test_torchinductor.py::GPUTests::test_multi_gpu_recompile_on_index_cuda, test/inductor/test_torchinductor.py::GPUTests::test_multi_threading_cuda, test/inductor/test_torchinductor.py::GPUTests::test_multilayer_any_cuda, test/inductor/test_torchinductor.py::GPUTests::test_multilayer_prime_size_cuda, test/inductor/test_torchinductor.py::GPUTests::test_multilayer_sum_low_prec_cuda, test/inductor/test_torchinductor.py::GPUTests::test_multilayer_var_cuda, test/inductor/test_torchinductor.py::GPUTests::test_multilayer_var_lowp_cuda, test/inductor/test_torchinductor.py::GPUTests::test_mutable_custom_op_fixed_layout2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_mutable_custom_op_fixed_layout_cuda, test/inductor/test_torchinductor.py::GPUTests::test_mutations_loop_fusion_cuda, test/inductor/test_torchinductor.py::GPUTests::test_nan_sort_stable_False_descending_False_cuda, test/inductor/test_torchinductor.py::GPUTests::test_nan_sort_stable_False_descending_True_cuda, test/inductor/test_torchinductor.py::GPUTests::test_nan_sort_stable_True_descending_False_cuda, test/inductor/test_torchinductor.py::GPUTests::test_nan_sort_stable_True_descending_True_cuda, test/inductor/test_torchinductor.py::GPUTests::test_nan_to_num_cuda, test/inductor/test_torchinductor.py::GPUTests::test_narrow_cuda, test/inductor/test_torchinductor.py::GPUTests::test_needs_contiguous_strides_cuda, test/inductor/test_torchinductor.py::GPUTests::test_neg_index_cuda, test/inductor/test_torchinductor.py::GPUTests::test_neg_max_uint8_cuda, 
test/inductor/test_torchinductor.py::GPUTests::test_new_empty_cuda, test/inductor/test_torchinductor.py::GPUTests::test_new_empty_strided_cuda, test/inductor/test_torchinductor.py::GPUTests::test_new_ones_cuda, test/inductor/test_torchinductor.py::GPUTests::test_nll_loss_backward_cuda, test/inductor/test_torchinductor.py::GPUTests::test_nll_loss_forward_cuda, test/inductor/test_torchinductor.py::GPUTests::test_no_mega_fusion_during_lowering_cuda, test/inductor/test_torchinductor.py::GPUTests::test_no_op_reduction_cuda, test/inductor/test_torchinductor.py::GPUTests::test_no_specization_over_symbolic_value_cuda, test/inductor/test_torchinductor.py::GPUTests::test_nonzero_unbacked_refinement_cuda, test/inductor/test_torchinductor.py::GPUTests::test_norm_constant_overflow_cuda, test/inductor/test_torchinductor.py::GPUTests::test_one_hot_cuda, test/inductor/test_torchinductor.py::GPUTests::test_output_strides_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pad_cast_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pad_single_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pad_view_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pattern_matcher_multi_user_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pattern_matcher_unbacked_cuda, test/inductor/test_torchinductor.py::GPUTests::test_permute1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_permute2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_philox_rand_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pixel_shuffle_channels_last_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_airy_ai_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_bessel_j0_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_bessel_j1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_bessel_y0_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_bessel_y1_cuda, 
test/inductor/test_torchinductor.py::GPUTests::test_pointwise_chebyshev_polynomial_t_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_chebyshev_polynomial_u_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_chebyshev_polynomial_v_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_chebyshev_polynomial_w_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_digamma_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_entr_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_erf_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_erfc_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_erfcx_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_erfinv_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_exp2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_expit_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_expm1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_gammainc_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_gammaincc_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_gammaln_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_hermite_polynomial_h_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_hermite_polynomial_he_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_i0_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_i0e_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_i1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_i1e_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_laguerre_polynomial_l_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_legendre_polynomial_p_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_log1p_cuda, 
test/inductor/test_torchinductor.py::GPUTests::test_pointwise_log_ndtr_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_logit_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_modified_bessel_i0_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_modified_bessel_i1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_modified_bessel_k0_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_modified_bessel_k1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_multigammaln_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_ndtr_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_ndtri_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_polygamma_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_psi_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_round_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_scaled_modified_bessel_k0_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_scaled_modified_bessel_k1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_shifted_chebyshev_polynomial_t_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_shifted_chebyshev_polynomial_u_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_shifted_chebyshev_polynomial_v_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_shifted_chebyshev_polynomial_w_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_sinc_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_spherical_bessel_j0_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_xlog1py_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_xlogy_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_zeta_cuda, test/inductor/test_torchinductor.py::GPUTests::test_polar_cuda, 
test/inductor/test_torchinductor.py::GPUTests::test_pow1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pow2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pow3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pow_by_natural_log2_dynamic_shapes_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pow_int_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pow_symfloat_cuda, test/inductor/test_torchinductor.py::GPUTests::test_prepare_softmax_with_fast_math_cuda, test/inductor/test_torchinductor.py::GPUTests::test_prod_cuda, test/inductor/test_torchinductor.py::GPUTests::test_profiler_mark_wrapper_call_cuda, test/inductor/test_torchinductor.py::GPUTests::test_rand_like_deterministic_cuda, test/inductor/test_torchinductor.py::GPUTests::test_randint_cuda, test/inductor/test_torchinductor.py::GPUTests::test_randint_distribution_cuda, test/inductor/test_torchinductor.py::GPUTests::test_randint_int64_mod_cuda, test/inductor/test_torchinductor.py::GPUTests::test_randint_kernel_count_cuda, test/inductor/test_torchinductor.py::GPUTests::test_randn_generator_cuda, test/inductor/test_torchinductor.py::GPUTests::test_randn_like_empty_cuda, test/inductor/test_torchinductor.py::GPUTests::test_randn_with_dtype_and_device_cuda, test/inductor/test_torchinductor.py::GPUTests::test_reduction1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_reduction2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_reduction3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_reduction4_cuda, test/inductor/test_torchinductor.py::GPUTests::test_reduction5_cuda, test/inductor/test_torchinductor.py::GPUTests::test_reduction_config_limit_cuda, test/inductor/test_torchinductor.py::GPUTests::test_reflection_pad2d_backward_cuda, test/inductor/test_torchinductor.py::GPUTests::test_reflection_pad2d_cuda, test/inductor/test_torchinductor.py::GPUTests::test_reinterpret_dtypeview_cuda, 
test/inductor/test_torchinductor.py::GPUTests::test_relu_cuda, test/inductor/test_torchinductor.py::GPUTests::test_remainder_cuda, test/inductor/test_torchinductor.py::GPUTests::test_remove_no_ops_cuda, test/inductor/test_torchinductor.py::GPUTests::test_remove_noop_clone_cuda, test/inductor/test_torchinductor.py::GPUTests::test_remove_noop_copy_cuda, test/inductor/test_torchinductor.py::GPUTests::test_remove_noop_slice1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_remove_noop_slice_cuda, test/inductor/test_torchinductor.py::GPUTests::test_remove_noop_slice_scatter_cuda, test/inductor/test_torchinductor.py::GPUTests::test_remove_noop_view_default_cuda, test/inductor/test_torchinductor.py::GPUTests::test_remove_noop_view_dtype_cuda, test/inductor/test_torchinductor.py::GPUTests::test_repeat_as_strided_cuda, test/inductor/test_torchinductor.py::GPUTests::test_repeat_cuda, test/inductor/test_torchinductor.py::GPUTests::test_repeat_interleave_2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_repeat_interleave_Tensor_decomp_int32_nd_1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_repeat_interleave_Tensor_decomp_int32_nd_2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_repeat_interleave_Tensor_decomp_int64_nd_1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_repeat_interleave_Tensor_decomp_int64_nd_2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_repeat_interleave_cuda, test/inductor/test_torchinductor.py::GPUTests::test_replication_pad_errors_with_bool_cuda, test/inductor/test_torchinductor.py::GPUTests::test_require_stride_expanded_cuda, test/inductor/test_torchinductor.py::GPUTests::test_resize_as_cuda, test/inductor/test_torchinductor.py::GPUTests::test_resize_cuda, test/inductor/test_torchinductor.py::GPUTests::test_reuse_buffers_with_aliasing_cuda, test/inductor/test_torchinductor.py::GPUTests::test_roi_align_cuda, test/inductor/test_torchinductor.py::GPUTests::test_roll_cuda, 
test/inductor/test_torchinductor.py::GPUTests::test_round_correctness_cuda, test/inductor/test_torchinductor.py::GPUTests::test_round_cuda, test/inductor/test_torchinductor.py::GPUTests::test_rsqrt_cuda, test/inductor/test_torchinductor.py::GPUTests::test_rsqrt_dynamic_shapes_cuda, test/inductor/test_torchinductor.py::GPUTests::test_scalar_cpu_tensor_arg_cuda, test/inductor/test_torchinductor.py::GPUTests::test_scalar_input_cuda, test/inductor/test_torchinductor.py::GPUTests::test_scalar_output_cuda, test/inductor/test_torchinductor.py::GPUTests::test_scaled_dot_product_attention_cuda, test/inductor/test_torchinductor.py::GPUTests::test_scaled_dot_product_efficient_attention_cuda, test/inductor/test_torchinductor.py::GPUTests::test_scatter1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_scatter2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_scatter3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_scatter4_cuda, test/inductor/test_torchinductor.py::GPUTests::test_scatter5_cuda, test/inductor/test_torchinductor.py::GPUTests::test_scatter6_cuda, test/inductor/test_torchinductor.py::GPUTests::test_scatter_add1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_scatter_add2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_scatter_add3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_scatter_bf16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_scatter_reduce1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_scatter_reduce2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_scatter_reduce3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_scheduler_vertical_fusion1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_sdpa_prefer_nd_tiling_False_use_block_ptr_False_cuda, test/inductor/test_torchinductor.py::GPUTests::test_sdpa_prefer_nd_tiling_False_use_block_ptr_True_cuda, test/inductor/test_torchinductor.py::GPUTests::test_sdpa_prefer_nd_tiling_True_use_block_ptr_False_cuda, 
test/inductor/test_torchinductor.py::GPUTests::test_sdpa_prefer_nd_tiling_True_use_block_ptr_True_cuda, test/inductor/test_torchinductor.py::GPUTests::test_sdpa_unaligned_mask_cuda, test/inductor/test_torchinductor.py::GPUTests::test_sdpa_unaligned_mask_freezing_cuda, test/inductor/test_torchinductor.py::GPUTests::test_searchsorted_broadcast_cuda, test/inductor/test_torchinductor.py::GPUTests::test_searchsorted_cuda, test/inductor/test_torchinductor.py::GPUTests::test_select_scatter_cuda, test/inductor/test_torchinductor.py::GPUTests::test_setitem_with_int_parameter_cuda, test/inductor/test_torchinductor.py::GPUTests::test_sgn_cuda, test/inductor/test_torchinductor.py::GPUTests::test_sgn_extremal_cuda, test/inductor/test_torchinductor.py::GPUTests::test_shape_padding_cuda, test/inductor/test_torchinductor.py::GPUTests::test_shape_prop_torch_ones_cuda, test/inductor/test_torchinductor.py::GPUTests::test_should_pad_bench_for_bmm_cuda, test/inductor/test_torchinductor.py::GPUTests::test_sigmoid_cuda, test/inductor/test_torchinductor.py::GPUTests::test_sign_dtype_cuda, test/inductor/test_torchinductor.py::GPUTests::test_signbit_cuda, test/inductor/test_torchinductor.py::GPUTests::test_silu_cuda, test/inductor/test_torchinductor.py::GPUTests::test_simplify_loops_cuda, test/inductor/test_torchinductor.py::GPUTests::test_sin_cuda, test/inductor/test_torchinductor.py::GPUTests::test_single_elem_cuda, test/inductor/test_torchinductor.py::GPUTests::test_single_elem_indirect_cuda, test/inductor/test_torchinductor.py::GPUTests::test_size_asserts_for_multi_output_fallback_cuda, test/inductor/test_torchinductor.py::GPUTests::test_sizehint_issue1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_slice1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_slice2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_slice3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_slice4_cuda, test/inductor/test_torchinductor.py::GPUTests::test_slice_mutation1_cuda, 
test/inductor/test_torchinductor.py::GPUTests::test_slice_mutation2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_slice_mutation3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_slice_scatter2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_slice_scatter3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_slice_scatter4_cuda, test/inductor/test_torchinductor.py::GPUTests::test_slice_scatter5_cuda, test/inductor/test_torchinductor.py::GPUTests::test_slice_scatter_cuda, test/inductor/test_torchinductor.py::GPUTests::test_slice_scatter_dtype_consistency_cuda, test/inductor/test_torchinductor.py::GPUTests::test_slice_scatter_reinplace_cuda, test/inductor/test_torchinductor.py::GPUTests::test_slice_view_with_graph_break_cuda, test/inductor/test_torchinductor.py::GPUTests::test_softmax_backward_data_cuda, test/inductor/test_torchinductor.py::GPUTests::test_softmax_cuda, test/inductor/test_torchinductor.py::GPUTests::test_softmax_one_kernel_loop_cuda, test/inductor/test_torchinductor.py::GPUTests::test_softmax_one_kernel_persist_cuda, test/inductor/test_torchinductor.py::GPUTests::test_sort_bool_cuda, test/inductor/test_torchinductor.py::GPUTests::test_sort_cuda, test/inductor/test_torchinductor.py::GPUTests::test_sort_stable_cuda, test/inductor/test_torchinductor.py::GPUTests::test_sort_transpose_cuda, test/inductor/test_torchinductor.py::GPUTests::test_special_polygamma_cuda, test/inductor/test_torchinductor.py::GPUTests::test_split_cuda, test/inductor/test_torchinductor.py::GPUTests::test_split_cumprod_cuda, test/inductor/test_torchinductor.py::GPUTests::test_split_cumprod_low_prec_cuda, test/inductor/test_torchinductor.py::GPUTests::test_split_cumsum_cuda, test/inductor/test_torchinductor.py::GPUTests::test_split_cumsum_index_cuda, test/inductor/test_torchinductor.py::GPUTests::test_split_cumsum_low_prec_cuda, test/inductor/test_torchinductor.py::GPUTests::test_split_failed_cuda, 
test/inductor/test_torchinductor.py::GPUTests::test_split_reduction_dynamic_shape_cuda, test/inductor/test_torchinductor.py::GPUTests::test_split_reduction_with_int64_size_cuda, test/inductor/test_torchinductor.py::GPUTests::test_split_with_integer_cuda, test/inductor/test_torchinductor.py::GPUTests::test_split_with_list_cuda, test/inductor/test_torchinductor.py::GPUTests::test_split_with_sizes_with_unbacked_symints_cuda, test/inductor/test_torchinductor.py::GPUTests::test_split_with_unbacked_symints_cuda, test/inductor/test_torchinductor.py::GPUTests::test_sqrt_dynamic_shapes_cuda, test/inductor/test_torchinductor.py::GPUTests::test_squeeze1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_squeeze2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_squeeze_varargs_cuda, test/inductor/test_torchinductor.py::GPUTests::test_stack_cuda, test/inductor/test_torchinductor.py::GPUTests::test_std_cuda, test/inductor/test_torchinductor.py::GPUTests::test_stride_preservation_with_stride_modifying_fx_pass_cuda, test/inductor/test_torchinductor.py::GPUTests::test_strided_inputs_cuda, test/inductor/test_torchinductor.py::GPUTests::test_sum1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_sum2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_sum3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_sum4_cuda, test/inductor/test_torchinductor.py::GPUTests::test_sum5_cuda, test/inductor/test_torchinductor.py::GPUTests::test_sum_dtype_cuda, test/inductor/test_torchinductor.py::GPUTests::test_sum_int_cuda, test/inductor/test_torchinductor.py::GPUTests::test_sum_keepdims_cuda, test/inductor/test_torchinductor.py::GPUTests::test_tan_cuda, test/inductor/test_torchinductor.py::GPUTests::test_tanh_cuda, test/inductor/test_torchinductor.py::GPUTests::test_tensor1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_tensor2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_tensor3_cuda, 
test/inductor/test_torchinductor.py::GPUTests::test_tensor_index_put_slice_cuda, test/inductor/test_torchinductor.py::GPUTests::test_tensor_index_slice_cuda, test/inductor/test_torchinductor.py::GPUTests::test_tmp_not_defined_issue1_use_block_ptr_True_cuda, test/inductor/test_torchinductor.py::GPUTests::test_tmp_not_defined_issue2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_tmp_not_defined_issue3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_to_device_constant_cuda, test/inductor/test_torchinductor.py::GPUTests::test_to_device_cuda, test/inductor/test_torchinductor.py::GPUTests::test_to_dtype_cuda, test/inductor/test_torchinductor.py::GPUTests::test_to_memory_format_cuda, test/inductor/test_torchinductor.py::GPUTests::test_topk_cuda, test/inductor/test_torchinductor.py::GPUTests::test_torch_device_split_cuda, test/inductor/test_torchinductor.py::GPUTests::test_transpose_add_cuda, test/inductor/test_torchinductor.py::GPUTests::test_transpose_cuda, test/inductor/test_torchinductor.py::GPUTests::test_transposed_propagates_cuda, test/inductor/test_torchinductor.py::GPUTests::test_triton_kernel_bool_param_cuda, test/inductor/test_torchinductor.py::GPUTests::test_triu_cuda, test/inductor/test_torchinductor.py::GPUTests::test_uint4x2_mixed_mm_cuda, test/inductor/test_torchinductor.py::GPUTests::test_uint_cuda, test/inductor/test_torchinductor.py::GPUTests::test_unbacked_floordiv_simplify_cuda, test/inductor/test_torchinductor.py::GPUTests::test_unbacked_floordiv_simplify_errors_cuda, test/inductor/test_torchinductor.py::GPUTests::test_unbind_cuda, test/inductor/test_torchinductor.py::GPUTests::test_unfold_zero_dimension_tensor_cuda, test/inductor/test_torchinductor.py::GPUTests::test_unroll_small_reduction_cuda, test/inductor/test_torchinductor.py::GPUTests::test_unsigned_constant_tensors_cuda, test/inductor/test_torchinductor.py::GPUTests::test_unspec_inputs_bfloat16_cuda, 
test/inductor/test_torchinductor.py::GPUTests::test_unspec_inputs_float16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_unspec_inputs_float32_cuda, test/inductor/test_torchinductor.py::GPUTests::test_unspec_inputs_float64_cuda, test/inductor/test_torchinductor.py::GPUTests::test_unspec_inputs_int16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_unspec_inputs_int32_cuda, test/inductor/test_torchinductor.py::GPUTests::test_unspec_inputs_int64_cuda, test/inductor/test_torchinductor.py::GPUTests::test_unspec_inputs_int8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_unspec_inputs_uint8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_unsqueeze_cuda, test/inductor/test_torchinductor.py::GPUTests::test_unsqueeze_inplace_cuda, test/inductor/test_torchinductor.py::GPUTests::test_upsample_bicubic2d_cuda, test/inductor/test_torchinductor.py::GPUTests::test_upsample_bilinear2d_a_cuda, test/inductor/test_torchinductor.py::GPUTests::test_upsample_bilinear2d_b_cuda, test/inductor/test_torchinductor.py::GPUTests::test_upsample_cat_conv_cuda, test/inductor/test_torchinductor.py::GPUTests::test_upsample_nearest1d_cuda, test/inductor/test_torchinductor.py::GPUTests::test_upsample_nearest2d_backward_cuda, test/inductor/test_torchinductor.py::GPUTests::test_upsample_nearest2d_cuda, test/inductor/test_torchinductor.py::GPUTests::test_upsample_nearest3d_cuda, test/inductor/test_torchinductor.py::GPUTests::test_var_correction_cuda, test/inductor/test_torchinductor.py::GPUTests::test_var_mean_div_by_cuda, test/inductor/test_torchinductor.py::GPUTests::test_var_mean_tile_reduction_False_cuda, test/inductor/test_torchinductor.py::GPUTests::test_var_mean_tile_reduction_True_cuda, test/inductor/test_torchinductor.py::GPUTests::test_vdd_clamp_cuda, test/inductor/test_torchinductor.py::GPUTests::test_vectorized_ops_masked_cuda, test/inductor/test_torchinductor.py::GPUTests::test_vectorized_ops_masked_var_novec_cuda, 
test/inductor/test_torchinductor.py::GPUTests::test_vertical_fusion1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_view_as_complex_cuda, test/inductor/test_torchinductor.py::GPUTests::test_view_as_real_cuda, test/inductor/test_torchinductor.py::GPUTests::test_view_detach_cuda, test/inductor/test_torchinductor.py::GPUTests::test_view_on_aliased_cuda, test/inductor/test_torchinductor.py::GPUTests::test_view_uint8_through_differing_bitwidths_cuda, test/inductor/test_torchinductor.py::GPUTests::test_views1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_views2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_views3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_views4_cuda, test/inductor/test_torchinductor.py::GPUTests::test_views5_cuda, test/inductor/test_torchinductor.py::GPUTests::test_views6_cuda, test/inductor/test_torchinductor.py::GPUTests::test_views7_cuda, test/inductor/test_torchinductor.py::GPUTests::test_weight_norm_bwd_cuda, test/inductor/test_torchinductor.py::GPUTests::test_where_broadcast_cuda, test/inductor/test_torchinductor.py::GPUTests::test_where_with_logical_op_cuda, test/inductor/test_torchinductor.py::GPUTests::test_xblock_divides_xnumel_cuda, test/inductor/test_torchinductor.py::GPUTests::test_zero_dim_reductions_cuda, test/inductor/test_torchinductor.py::GPUTests::test_zero_element_mutation_cuda, test/inductor/test_torchinductor.py::GPUTests::test_zeros_cuda, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_bandwidth_profiler, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_cant_optimize_compute, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_codegen_config_option_dont_assume_alignment, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_comment_graph_fragment, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_computed_indirect_mask, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_constant_folding_deallocation, 
test/inductor/test_torchinductor.py::TritonCodeGenTests::test_ctr_not_moved_to_cuda_when_used_in_index_put, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_divisible_by_16_covers_numel_args, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_donated_buffer_inplace, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_donated_buffer_inplace_gpt, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_evict_last_non_coalesced_loads, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_evict_last_non_coalesced_loads_block_ptr, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_graph_partition_default_device_context, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_grouped_mm, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_has_constant_mask_block_multiple_False_ynumel_exceed_ygrid_size_False, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_has_constant_mask_block_multiple_True_ynumel_exceed_ygrid_size_False, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_has_constant_mask_block_multiple_True_ynumel_exceed_ygrid_size_True, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_indirect_device_assert, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_inductor_detach_view_backend_aot_eager, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_inductor_detach_view_backend_inductor, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_inductor_sequence_nr, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_kernel_names_descriptive, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_layer_norm_inplaces_after_matmul, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_non_blocking_copy_codegen, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_not_materialize_pointwise_reduction, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_numpy_autograd, 
test/inductor/test_torchinductor.py::TritonCodeGenTests::test_numpy_on_gpu, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_optimize_compute, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_optimize_indexing_assert, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_optimize_indexing_dtype, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_optimize_indexing_dtype_with_constraint, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_red_followed_by_transposed_pointwise, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_respect_scaled_grouped_mm_layout_tag, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_rope_fusion, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_sdpa_inference_mode_aot_compile, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_skip_l1_cache, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_split_op_with_sym, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_triton_attrs_dict_constexpr_signature, test/inductor/test_torchinductor.py::RNNTest::test_rnn_compile_safe, test/inductor/test_torchinductor.py::NanCheckerTest::test_nan_checker_fail, test/inductor/test_torchinductor.py::NanCheckerTest::test_nan_checker_pass 2025-10-10T01:29:07.5483926Z 2025-10-10T01:29:11.4935986Z Running inductor/test_torchinductor_opinfo 8/11 ... [2025-10-10 01:29:11.492960] 2025-10-10T01:29:11.4936510Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:29:11.4937887Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_opinfo.py', '-m', 'not serial', '--shard-id=8', '--num-shards=11', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:29:11.493384] 2025-10-10T01:30:08.0115008Z 2025-10-10T01:30:08.0116581Z inductor/test_torchinductor_opinfo 2/11 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_opinfo_2.11_d749cdb92972a508_.log 2025-10-10T01:30:08.0389884Z Running 354 items in this shard: test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_H_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___getitem___cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___radd___cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rsub___cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rxor___cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__chunk_cat_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__softmax_backward_data_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_abs_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_acosh_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_addbmm_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_addcmul_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_alias_copy_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_amax_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_aminmax_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_any_cuda_int64, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_arange_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_argmin_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_as_strided_copy_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_as_strided_copy_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_as_strided_partial_views_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_as_strided_partial_views_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_asin_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_asinh_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atan2_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atan2_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atan2_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atan_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atan_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atanh_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_baddbmm_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bitwise_and_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bitwise_and_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bitwise_or_cuda_bool, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bitwise_right_shift_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bitwise_right_shift_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bitwise_xor_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_block_diag_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bool_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_broadcast_tensors_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_broadcast_to_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cdist_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cdouble_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ceil_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cfloat_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cfloat_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_char_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cholesky_solve_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_clamp_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_clamp_min_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_column_stack_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_column_stack_cuda_int32, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_complex_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_conj_physical_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_conj_physical_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_constant_pad_nd_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_contiguous_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_contiguous_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_contiguous_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_count_nonzero_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cov_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cross_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cummax_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cummin_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cummin_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cumprod_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cumprod_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cumsum_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cumsum_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diagonal_cuda_int64, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diff_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diff_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diff_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_digamma_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_digamma_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_div_floor_rounding_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_div_no_rounding_mode_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_div_no_rounding_mode_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_div_no_rounding_mode_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_double_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_dsplit_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_dstack_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_empty_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_equal_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_erf_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_erfinv_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_exp2_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_exp2_cuda_float16, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_expand_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_expm1_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_exponential_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_fft2_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_fftshift_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_fftshift_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_hfft_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ifft2_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ifft_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ifftshift_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ihfft2_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ihfft_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ihfft_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ihfftn_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ihfftn_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_irfft2_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_irfft_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_irfftn_cuda_bool, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_rfft2_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fliplr_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_float_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_float_power_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fmin_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fmod_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fmod_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_gather_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_gather_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_gradient_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_grid_sampler_2d_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_gt_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_histc_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_hsplit_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_igamma_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_add_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_put_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_reduce_amin_cuda_float64, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_reduce_mean_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_int_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_int_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isfinite_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isinf_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isneginf_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isneginf_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isposinf_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_4inputs_with_extra_args_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_4inputs_with_extra_args_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_binary_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_unary_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_lgamma_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_cross_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_diagonal_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_eigvals_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_inv_ex_cuda_float32, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_inv_ex_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_lu_solve_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_matrix_norm_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_matrix_rank_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_slogdet_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_vander_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log2_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log2_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log2_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log_normal_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log_softmax_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logcumsumexp_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logical_and_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logical_not_cuda_float32, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logical_xor_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logical_xor_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logspace_tensor_overload_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logsumexp_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logsumexp_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_lt_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_lu_unpack_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mH_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mH_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mT_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_cumprod_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_fill_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_fill_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_norm_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_normalize_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_prod_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_prod_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_scatter_cuda_int64, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_matrix_exp_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_max_reduction_no_dim_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_max_reduction_no_dim_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_max_reduction_with_dim_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_median_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_median_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_min_binary_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_min_binary_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_min_reduction_with_dim_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_minimum_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_minimum_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_msort_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_multinomial_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mvlgamma_mvlgamma_p_3_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mvlgamma_mvlgamma_p_3_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mvlgamma_mvlgamma_p_5_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nanmedian_cuda_int64, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nanmedian_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nansum_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_narrow_copy_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_native_dropout_backward_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_native_layer_norm_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_native_layer_norm_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ne_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_new_empty_strided_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_new_full_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_adaptive_avg_pool2d_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_adaptive_max_pool3d_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_avg_pool1d_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_batch_norm_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_bilinear_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_bilinear_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_conv3d_cuda_float16, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_cosine_embedding_loss_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_dropout_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_embedding_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_feature_alpha_dropout_without_train_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_group_norm_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_hardsigmoid_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_hardswish_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_hardtanh_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_hardtanh_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_hinge_embedding_loss_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_interpolate_area_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_interpolate_nearest_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_interpolate_nearest_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_linear_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_margin_ranking_loss_cuda_float32, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_max_unpool2d_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_max_unpool3d_grad_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_mish_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_mse_loss_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_multi_margin_loss_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_nll_loss_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_replicate_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_replicate_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_replicate_negative_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pairwise_distance_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pairwise_distance_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_poisson_nll_loss_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_softmin_with_dtype_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_softsign_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_softsign_cuda_uint8, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_triplet_margin_loss_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_triplet_margin_with_distance_loss_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_unfold_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_unfold_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_upsample_bilinear_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_upsample_nearest_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nonzero_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nonzero_static_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_norm_fro_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_norm_fro_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_norm_nuc_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_normal_in_place_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_outer_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_permute_copy_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polar_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_1_cuda_int64, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_3_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_3_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_4_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_positive_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_rad2deg_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_rad2deg_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_randint_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_real_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_reciprocal_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_remainder_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_remainder_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_repeat_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_repeat_interleave_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_repeat_interleave_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_repeat_interleave_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_reshape_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_resize__cuda_float32, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_resolve_conj_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_resolve_neg_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_roll_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_rsqrt_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_rsub_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_reduce_amin_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_reduce_amin_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_reduce_prod_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_reduce_sum_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_reduce_sum_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_searchsorted_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sgn_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sigmoid_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sign_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_signal_windows_hamming_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_signal_windows_kaiser_cuda_float32, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_signal_windows_nuttall_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sin_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sinh_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_slice_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_slice_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_slice_scatter_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_bessel_j1_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_chebyshev_polynomial_t_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_chebyshev_polynomial_w_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_chebyshev_polynomial_w_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_entr_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_entr_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_laguerre_polynomial_l_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_log_ndtr_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_modified_bessel_i1_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_modified_bessel_i1_cuda_float64, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_ndtr_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_ndtri_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_ndtri_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_scaled_modified_bessel_k0_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_u_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_w_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_spherical_bessel_j0_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_spherical_bessel_j0_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_split_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_split_with_sizes_copy_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sqrt_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sqrt_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_squeeze_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_squeeze_multiple_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_std_mean_unbiased_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_std_unbiased_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sub_cuda_int64, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sum_to_size_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_svd_lowrank_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_take_along_dim_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tan_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tan_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tanh_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tile_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_to_sparse_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_topk_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_trace_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_transpose_copy_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_trapezoid_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_trapz_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_triu_indices_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_true_divide_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_true_divide_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unbind_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unfold_copy_cuda_int32, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unique_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unsqueeze_copy_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unsqueeze_copy_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unsqueeze_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_var_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_view_as_complex_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_view_copy_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_view_copy_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_view_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_vsplit_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_vstack_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_vstack_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_where_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_zeros_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_zeros_like_cuda_float16 2025-10-10T01:30:08.0584453Z 2025-10-10T01:30:11.9450168Z Running inductor/test_torchinductor_opinfo 11/11 ... 
[2025-10-10 01:30:11.944384] 2025-10-10T01:30:11.9451004Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:30:11.9452800Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_opinfo.py', '-m', 'not serial', '--shard-id=11', '--num-shards=11', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:30:11.944824] 2025-10-10T01:30:50.0783563Z 2025-10-10T01:30:50.0785045Z inductor/test_torchinductor_opinfo 8/11 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_opinfo_8.11_eee6125272166a05_.log 2025-10-10T01:30:50.0924370Z Running 305 items in this shard: test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_H_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_H_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___getitem___cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___getitem___cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___radd___cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rdiv___cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rmul___cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___ror___cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rpow___cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rsub___cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__batch_norm_with_update_cuda_float16, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__chunk_cat_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__unsafe_masked_index_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__unsafe_masked_index_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__unsafe_masked_index_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_abs_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_acos_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_acosh_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_addcdiv_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_addmm_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_allclose_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_aminmax_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_aminmax_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_any_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_any_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_arange_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_argmax_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_argwhere_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_as_strided_partial_views_cuda_float32, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_asinh_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atan_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atanh_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atleast_1d_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atleast_1d_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atleast_2d_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atleast_3d_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bitwise_left_shift_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bitwise_or_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_block_diag_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_block_diag_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bool_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bool_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_broadcast_tensors_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bucketize_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bucketize_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_byte_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cartesian_prod_cuda_float32, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cat_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cdouble_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cdouble_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_chunk_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_chunk_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_clamp_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_clone_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_clone_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_column_stack_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_conj_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_conj_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_constant_pad_nd_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_corrcoef_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cos_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cosh_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cross_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cummax_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cumprod_cuda_int32, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cumsum_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cumsum_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cumsum_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_deg2rad_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_deg2rad_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diag_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diagonal_copy_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diagonal_scatter_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_dist_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_div_floor_rounding_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_div_no_rounding_mode_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_div_no_rounding_mode_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_div_trunc_rounding_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_div_trunc_rounding_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_dot_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_double_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_einsum_cuda_float64, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_empty_permuted_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_empty_strided_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_equal_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_erf_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_erfc_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_erfc_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_erfinv_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_exp_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_exp_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_expand_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_expand_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_eye_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_fft2_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_fftn_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_hfftn_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_hfftn_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ihfft2_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_rfft2_cuda_int64, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_rfftn_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_floor_divide_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fmax_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fmin_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_frac_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_frexp_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_full_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_full_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_full_like_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_gather_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ge_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_half_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_hash_tensor_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_heaviside_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_i0_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_igamma_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_copy_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_put_cuda_uint8, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_reduce_amin_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_reduce_prod_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_select_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_inner_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isfinite_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isinf_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isneginf_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isneginf_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_le_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_cholesky_ex_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_cross_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_eig_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_eigh_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_eigvalsh_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_multi_dot_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_norm_subgradients_at_zero_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_pinv_singular_cuda_float64, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_solve_ex_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_vander_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linspace_tensor_overload_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linspace_tensor_overload_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log10_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log2_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log_normal_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log_softmax_with_dtype_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logical_and_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logical_xor_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mH_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_amin_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_cumsum_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_cumsum_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_log_softmax_cuda_float32, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_mean_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_scatter_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_softmin_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_matmul_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_matmul_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_max_reduction_no_dim_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mean_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_min_reduction_with_dim_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mode_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_movedim_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_msort_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mul_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mvlgamma_mvlgamma_p_3_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nan_to_num_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nanmean_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_narrow_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_narrow_cuda_float64, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_native_batch_norm_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_neg_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_new_empty_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_new_ones_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_new_ones_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_alpha_dropout_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_celu_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_conv1d_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_conv2d_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_conv3d_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_conv_transpose1d_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_conv_transpose1d_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_conv_transpose2d_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_cosine_embedding_loss_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_cosine_similarity_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_cosine_similarity_cuda_float32, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_ctc_loss_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_embedding_bag_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_fractional_max_pool2d_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_gaussian_nll_loss_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_glu_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_grid_sample_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_hinge_embedding_loss_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_huber_loss_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_instance_norm_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_interpolate_linear_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_interpolate_nearest_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_local_response_norm_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_margin_ranking_loss_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_max_unpool1d_grad_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_max_unpool2d_cuda_float32, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_max_unpool3d_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_multi_head_attention_forward_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_multi_head_attention_forward_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_multi_margin_loss_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_normalize_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_constant_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_replicate_negative_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pixel_unshuffle_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_poisson_nll_loss_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_relu_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_scaled_dot_product_attention_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_selu_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_soft_margin_loss_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_triplet_margin_with_distance_loss_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_unfold_cuda_float32, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nonzero_static_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_normal_in_place_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ones_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ones_like_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ormqr_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_permute_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_0_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_0_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_2_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_3_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_positive_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_positive_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_prod_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_put_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_quantile_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_rad2deg_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ravel_cuda_bool, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ravel_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_repeat_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_resize_as__cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_resize_as__cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_resolve_neg_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_rot90_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_round_decimals_0_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_rsqrt_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scalar_tensor_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scalar_tensor_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_reduce_amin_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_reduce_mean_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_reduce_sum_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sgn_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sign_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sign_cuda_float64, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_signal_windows_exponential_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_signal_windows_general_hamming_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_signal_windows_hamming_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_signbit_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sinc_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_slice_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sort_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_bessel_y0_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_bessel_y0_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_chebyshev_polynomial_v_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_entr_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_hermite_polynomial_he_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_legendre_polynomial_p_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_log_ndtr_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_log_ndtr_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_log_ndtr_cuda_uint8, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_modified_bessel_i0_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_ndtri_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_polygamma_special_polygamma_n_0_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_scaled_modified_bessel_k0_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_scaled_modified_bessel_k0_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_t_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_zeta_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_zeta_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_zeta_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_split_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_split_list_args_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_split_with_sizes_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_square_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_squeeze_copy_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_squeeze_multiple_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_stack_cuda_bool, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sub_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_t_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_t_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_take_along_dim_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_to_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_to_sparse_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_topk_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_torch_ops_aten__safe_softmax_default_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_transpose_copy_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tril_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_true_divide_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unbind_copy_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unbind_copy_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unfold_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unique_consecutive_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unique_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unique_cuda_uint32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unsafe_chunk_cuda_float32, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unsafe_split_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unsqueeze_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_var_mean_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_var_mean_unbiased_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_xlogy_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_zeros_cuda_float16 2025-10-10T01:30:50.1055049Z 2025-10-10T01:30:54.0318817Z Running inductor/test_static_cuda_launcher 1/1 ... [2025-10-10 01:30:54.031297] 2025-10-10T01:30:54.0319338Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:30:54.0320819Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_static_cuda_launcher.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:30:54.031687] 2025-10-10T01:31:01.3619128Z 2025-10-10T01:31:01.3620229Z inductor/test_static_cuda_launcher 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_static_cuda_launcher_1.1_fecdd3b2b7645033_.log 2025-10-10T01:31:01.3627055Z Running 17 items in this shard: test/inductor/test_static_cuda_launcher.py::TestStaticCudaLauncher::test_basic, test/inductor/test_static_cuda_launcher.py::TestStaticCudaLauncher::test_basic_1arg, test/inductor/test_static_cuda_launcher.py::TestStaticCudaLauncher::test_constexpr, test/inductor/test_static_cuda_launcher.py::TestStaticCudaLauncher::test_high_shared_mem, test/inductor/test_static_cuda_launcher.py::TestStaticCudaLauncher::test_implied_constant, test/inductor/test_static_cuda_launcher.py::TestStaticCudaLauncher::test_kernel_empty_tensor, test/inductor/test_static_cuda_launcher.py::TestStaticCudaLauncher::test_kernel_many_args, test/inductor/test_static_cuda_launcher.py::TestStaticCudaLauncher::test_kernel_no_args, test/inductor/test_static_cuda_launcher.py::TestStaticCudaLauncher::test_signed_integers, test/inductor/test_static_cuda_launcher.py::TestStaticCudaLauncher::test_too_high_shared_mem, test/inductor/test_static_cuda_launcher.py::TestStaticCudaLauncher::test_unsigned_integers, test/inductor/test_static_cuda_launcher.py::TestStaticTritonCompileResult::test_any, test/inductor/test_static_cuda_launcher.py::TestStaticTritonCompileResult::test_basic_compile, test/inductor/test_static_cuda_launcher.py::TestStaticTritonCompileResult::test_disable_static_cuda_launcher, test/inductor/test_static_cuda_launcher.py::TestStaticTritonCompileResult::test_empty_tensor, test/inductor/test_static_cuda_launcher.py::TestStaticTritonCompileResult::test_incompatible_code, test/inductor/test_static_cuda_launcher.py::TestStaticTritonCompileResult::test_static_launch_user_defined_triton_kernels 2025-10-10T01:31:01.3633137Z 2025-10-10T01:31:05.2375165Z Running 
inductor/test_cooperative_reductions 1/1 ... [2025-10-10 01:31:05.236923] 2025-10-10T01:31:05.2375842Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:31:05.2377592Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_cooperative_reductions.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:31:05.237315] 2025-10-10T01:31:12.7679924Z 2025-10-10T01:31:12.7682014Z inductor/test_cooperative_reductions 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_cooperative_reductions_1.1_a58b03e9f2224605_.log 2025-10-10T01:31:12.7758246Z Running 163 items in this shard: test/inductor/test_cooperative_reductions.py::CooperativeReductionTests::test_bool_reduction_fns, test/inductor/test_cooperative_reductions.py::CooperativeReductionTests::test_chained_reductions, test/inductor/test_cooperative_reductions.py::CooperativeReductionTests::test_non_power_of_2_bs_15_count_1024, test/inductor/test_cooperative_reductions.py::CooperativeReductionTests::test_non_power_of_2_bs_15_count_1048575, test/inductor/test_cooperative_reductions.py::CooperativeReductionTests::test_non_power_of_2_bs_15_count_1048577, test/inductor/test_cooperative_reductions.py::CooperativeReductionTests::test_non_power_of_2_bs_1_count_1024, test/inductor/test_cooperative_reductions.py::CooperativeReductionTests::test_non_power_of_2_bs_1_count_1048575, test/inductor/test_cooperative_reductions.py::CooperativeReductionTests::test_non_power_of_2_bs_1_count_1048577, test/inductor/test_cooperative_reductions.py::CooperativeReductionTests::test_non_power_of_2_bs_2_count_1024, test/inductor/test_cooperative_reductions.py::CooperativeReductionTests::test_non_power_of_2_bs_2_count_1048575, test/inductor/test_cooperative_reductions.py::CooperativeReductionTests::test_non_power_of_2_bs_2_count_1048577, 
test/inductor/test_cooperative_reductions.py::CooperativeReductionTests::test_non_power_of_2_bs_5_count_1024, test/inductor/test_cooperative_reductions.py::CooperativeReductionTests::test_non_power_of_2_bs_5_count_1048575, test/inductor/test_cooperative_reductions.py::CooperativeReductionTests::test_non_power_of_2_bs_5_count_1048577, test/inductor/test_cooperative_reductions.py::CooperativeReductionTests::test_reduce_split, test/inductor/test_cooperative_reductions.py::CooperativeReductionTests::test_reduction_fns_name_amax_float16, test/inductor/test_cooperative_reductions.py::CooperativeReductionTests::test_reduction_fns_name_amax_float32, test/inductor/test_cooperative_reductions.py::CooperativeReductionTests::test_reduction_fns_name_amax_float64, test/inductor/test_cooperative_reductions.py::CooperativeReductionTests::test_reduction_fns_name_amin_float16, test/inductor/test_cooperative_reductions.py::CooperativeReductionTests::test_reduction_fns_name_amin_float32, test/inductor/test_cooperative_reductions.py::CooperativeReductionTests::test_reduction_fns_name_amin_float64, test/inductor/test_cooperative_reductions.py::CooperativeReductionTests::test_reduction_fns_name_max_float16, test/inductor/test_cooperative_reductions.py::CooperativeReductionTests::test_reduction_fns_name_max_float32, test/inductor/test_cooperative_reductions.py::CooperativeReductionTests::test_reduction_fns_name_max_float64, test/inductor/test_cooperative_reductions.py::CooperativeReductionTests::test_reduction_fns_name_mean_float16, test/inductor/test_cooperative_reductions.py::CooperativeReductionTests::test_reduction_fns_name_mean_float32, test/inductor/test_cooperative_reductions.py::CooperativeReductionTests::test_reduction_fns_name_mean_float64, test/inductor/test_cooperative_reductions.py::CooperativeReductionTests::test_reduction_fns_name_min_float16, test/inductor/test_cooperative_reductions.py::CooperativeReductionTests::test_reduction_fns_name_min_float32, 
test/inductor/test_cooperative_reductions.py::CooperativeReductionTests::test_reduction_fns_name_min_float64, test/inductor/test_cooperative_reductions.py::CooperativeReductionTests::test_reduction_fns_name_prod_float16, test/inductor/test_cooperative_reductions.py::CooperativeReductionTests::test_reduction_fns_name_prod_float32, test/inductor/test_cooperative_reductions.py::CooperativeReductionTests::test_reduction_fns_name_prod_float64, test/inductor/test_cooperative_reductions.py::CooperativeReductionTests::test_reduction_fns_name_softmax_float16, test/inductor/test_cooperative_reductions.py::CooperativeReductionTests::test_reduction_fns_name_softmax_float32, test/inductor/test_cooperative_reductions.py::CooperativeReductionTests::test_reduction_fns_name_softmax_float64, test/inductor/test_cooperative_reductions.py::CooperativeReductionTests::test_reduction_fns_name_std_float16, test/inductor/test_cooperative_reductions.py::CooperativeReductionTests::test_reduction_fns_name_std_float32, test/inductor/test_cooperative_reductions.py::CooperativeReductionTests::test_reduction_fns_name_std_float64, test/inductor/test_cooperative_reductions.py::CooperativeReductionTests::test_reduction_fns_name_sum_float16, test/inductor/test_cooperative_reductions.py::CooperativeReductionTests::test_reduction_fns_name_sum_float32, test/inductor/test_cooperative_reductions.py::CooperativeReductionTests::test_reduction_fns_name_sum_float64, test/inductor/test_cooperative_reductions.py::CooperativeReductionTests::test_reduction_fns_name_var_mean_float16, test/inductor/test_cooperative_reductions.py::CooperativeReductionTests::test_reduction_fns_name_var_mean_float32, test/inductor/test_cooperative_reductions.py::CooperativeReductionTests::test_reduction_fns_name_var_mean_float64, test/inductor/test_cooperative_reductions.py::NoPersistCooperativeReductionTests::test_bool_reduction_fns, 
test/inductor/test_cooperative_reductions.py::NoPersistCooperativeReductionTests::test_chained_reductions, test/inductor/test_cooperative_reductions.py::NoPersistCooperativeReductionTests::test_non_power_of_2_bs_15_count_1024, test/inductor/test_cooperative_reductions.py::NoPersistCooperativeReductionTests::test_non_power_of_2_bs_15_count_1048575, test/inductor/test_cooperative_reductions.py::NoPersistCooperativeReductionTests::test_non_power_of_2_bs_15_count_1048577, test/inductor/test_cooperative_reductions.py::NoPersistCooperativeReductionTests::test_non_power_of_2_bs_1_count_1024, test/inductor/test_cooperative_reductions.py::NoPersistCooperativeReductionTests::test_non_power_of_2_bs_1_count_1048575, test/inductor/test_cooperative_reductions.py::NoPersistCooperativeReductionTests::test_non_power_of_2_bs_1_count_1048577, test/inductor/test_cooperative_reductions.py::NoPersistCooperativeReductionTests::test_non_power_of_2_bs_2_count_1024, test/inductor/test_cooperative_reductions.py::NoPersistCooperativeReductionTests::test_non_power_of_2_bs_2_count_1048575, test/inductor/test_cooperative_reductions.py::NoPersistCooperativeReductionTests::test_non_power_of_2_bs_2_count_1048577, test/inductor/test_cooperative_reductions.py::NoPersistCooperativeReductionTests::test_non_power_of_2_bs_5_count_1024, test/inductor/test_cooperative_reductions.py::NoPersistCooperativeReductionTests::test_non_power_of_2_bs_5_count_1048575, test/inductor/test_cooperative_reductions.py::NoPersistCooperativeReductionTests::test_non_power_of_2_bs_5_count_1048577, test/inductor/test_cooperative_reductions.py::NoPersistCooperativeReductionTests::test_reduce_split, test/inductor/test_cooperative_reductions.py::NoPersistCooperativeReductionTests::test_reduction_fns_name_amax_float16, test/inductor/test_cooperative_reductions.py::NoPersistCooperativeReductionTests::test_reduction_fns_name_amax_float32, 
test/inductor/test_cooperative_reductions.py::NoPersistCooperativeReductionTests::test_reduction_fns_name_amax_float64, test/inductor/test_cooperative_reductions.py::NoPersistCooperativeReductionTests::test_reduction_fns_name_amin_float16, test/inductor/test_cooperative_reductions.py::NoPersistCooperativeReductionTests::test_reduction_fns_name_amin_float32, test/inductor/test_cooperative_reductions.py::NoPersistCooperativeReductionTests::test_reduction_fns_name_amin_float64, test/inductor/test_cooperative_reductions.py::NoPersistCooperativeReductionTests::test_reduction_fns_name_max_float16, test/inductor/test_cooperative_reductions.py::NoPersistCooperativeReductionTests::test_reduction_fns_name_max_float32, test/inductor/test_cooperative_reductions.py::NoPersistCooperativeReductionTests::test_reduction_fns_name_max_float64, test/inductor/test_cooperative_reductions.py::NoPersistCooperativeReductionTests::test_reduction_fns_name_mean_float16, test/inductor/test_cooperative_reductions.py::NoPersistCooperativeReductionTests::test_reduction_fns_name_mean_float32, test/inductor/test_cooperative_reductions.py::NoPersistCooperativeReductionTests::test_reduction_fns_name_mean_float64, test/inductor/test_cooperative_reductions.py::NoPersistCooperativeReductionTests::test_reduction_fns_name_min_float16, test/inductor/test_cooperative_reductions.py::NoPersistCooperativeReductionTests::test_reduction_fns_name_min_float32, test/inductor/test_cooperative_reductions.py::NoPersistCooperativeReductionTests::test_reduction_fns_name_min_float64, test/inductor/test_cooperative_reductions.py::NoPersistCooperativeReductionTests::test_reduction_fns_name_prod_float16, test/inductor/test_cooperative_reductions.py::NoPersistCooperativeReductionTests::test_reduction_fns_name_prod_float32, test/inductor/test_cooperative_reductions.py::NoPersistCooperativeReductionTests::test_reduction_fns_name_prod_float64, 
test/inductor/test_cooperative_reductions.py::NoPersistCooperativeReductionTests::test_reduction_fns_name_softmax_float16, test/inductor/test_cooperative_reductions.py::NoPersistCooperativeReductionTests::test_reduction_fns_name_softmax_float32, test/inductor/test_cooperative_reductions.py::NoPersistCooperativeReductionTests::test_reduction_fns_name_softmax_float64, test/inductor/test_cooperative_reductions.py::NoPersistCooperativeReductionTests::test_reduction_fns_name_std_float16, test/inductor/test_cooperative_reductions.py::NoPersistCooperativeReductionTests::test_reduction_fns_name_std_float32, test/inductor/test_cooperative_reductions.py::NoPersistCooperativeReductionTests::test_reduction_fns_name_std_float64, test/inductor/test_cooperative_reductions.py::NoPersistCooperativeReductionTests::test_reduction_fns_name_sum_float16, test/inductor/test_cooperative_reductions.py::NoPersistCooperativeReductionTests::test_reduction_fns_name_sum_float32, test/inductor/test_cooperative_reductions.py::NoPersistCooperativeReductionTests::test_reduction_fns_name_sum_float64, test/inductor/test_cooperative_reductions.py::NoPersistCooperativeReductionTests::test_reduction_fns_name_var_mean_float16, test/inductor/test_cooperative_reductions.py::NoPersistCooperativeReductionTests::test_reduction_fns_name_var_mean_float32, test/inductor/test_cooperative_reductions.py::NoPersistCooperativeReductionTests::test_reduction_fns_name_var_mean_float64, test/inductor/test_cooperative_reductions.py::MultiKernelCooperativeReductionTests::test_bool_reduction_fns, test/inductor/test_cooperative_reductions.py::MultiKernelCooperativeReductionTests::test_chained_reductions, test/inductor/test_cooperative_reductions.py::MultiKernelCooperativeReductionTests::test_non_power_of_2_bs_15_count_1024, test/inductor/test_cooperative_reductions.py::MultiKernelCooperativeReductionTests::test_non_power_of_2_bs_15_count_1048575, 
test/inductor/test_cooperative_reductions.py::MultiKernelCooperativeReductionTests::test_non_power_of_2_bs_15_count_1048577, test/inductor/test_cooperative_reductions.py::MultiKernelCooperativeReductionTests::test_non_power_of_2_bs_1_count_1024, test/inductor/test_cooperative_reductions.py::MultiKernelCooperativeReductionTests::test_non_power_of_2_bs_1_count_1048575, test/inductor/test_cooperative_reductions.py::MultiKernelCooperativeReductionTests::test_non_power_of_2_bs_1_count_1048577, test/inductor/test_cooperative_reductions.py::MultiKernelCooperativeReductionTests::test_non_power_of_2_bs_2_count_1024, test/inductor/test_cooperative_reductions.py::MultiKernelCooperativeReductionTests::test_non_power_of_2_bs_2_count_1048575, test/inductor/test_cooperative_reductions.py::MultiKernelCooperativeReductionTests::test_non_power_of_2_bs_2_count_1048577, test/inductor/test_cooperative_reductions.py::MultiKernelCooperativeReductionTests::test_non_power_of_2_bs_5_count_1024, test/inductor/test_cooperative_reductions.py::MultiKernelCooperativeReductionTests::test_non_power_of_2_bs_5_count_1048575, test/inductor/test_cooperative_reductions.py::MultiKernelCooperativeReductionTests::test_non_power_of_2_bs_5_count_1048577, test/inductor/test_cooperative_reductions.py::MultiKernelCooperativeReductionTests::test_reduce_split, test/inductor/test_cooperative_reductions.py::MultiKernelCooperativeReductionTests::test_reduction_fns_name_amax_float16, test/inductor/test_cooperative_reductions.py::MultiKernelCooperativeReductionTests::test_reduction_fns_name_amax_float32, test/inductor/test_cooperative_reductions.py::MultiKernelCooperativeReductionTests::test_reduction_fns_name_amax_float64, test/inductor/test_cooperative_reductions.py::MultiKernelCooperativeReductionTests::test_reduction_fns_name_amin_float16, test/inductor/test_cooperative_reductions.py::MultiKernelCooperativeReductionTests::test_reduction_fns_name_amin_float32, 
test/inductor/test_cooperative_reductions.py::MultiKernelCooperativeReductionTests::test_reduction_fns_name_amin_float64, test/inductor/test_cooperative_reductions.py::MultiKernelCooperativeReductionTests::test_reduction_fns_name_max_float16, test/inductor/test_cooperative_reductions.py::MultiKernelCooperativeReductionTests::test_reduction_fns_name_max_float32, test/inductor/test_cooperative_reductions.py::MultiKernelCooperativeReductionTests::test_reduction_fns_name_max_float64, test/inductor/test_cooperative_reductions.py::MultiKernelCooperativeReductionTests::test_reduction_fns_name_mean_float16, test/inductor/test_cooperative_reductions.py::MultiKernelCooperativeReductionTests::test_reduction_fns_name_mean_float32, test/inductor/test_cooperative_reductions.py::MultiKernelCooperativeReductionTests::test_reduction_fns_name_mean_float64, test/inductor/test_cooperative_reductions.py::MultiKernelCooperativeReductionTests::test_reduction_fns_name_min_float16, test/inductor/test_cooperative_reductions.py::MultiKernelCooperativeReductionTests::test_reduction_fns_name_min_float32, test/inductor/test_cooperative_reductions.py::MultiKernelCooperativeReductionTests::test_reduction_fns_name_min_float64, test/inductor/test_cooperative_reductions.py::MultiKernelCooperativeReductionTests::test_reduction_fns_name_prod_float16, test/inductor/test_cooperative_reductions.py::MultiKernelCooperativeReductionTests::test_reduction_fns_name_prod_float32, test/inductor/test_cooperative_reductions.py::MultiKernelCooperativeReductionTests::test_reduction_fns_name_prod_float64, test/inductor/test_cooperative_reductions.py::MultiKernelCooperativeReductionTests::test_reduction_fns_name_softmax_float16, test/inductor/test_cooperative_reductions.py::MultiKernelCooperativeReductionTests::test_reduction_fns_name_softmax_float32, test/inductor/test_cooperative_reductions.py::MultiKernelCooperativeReductionTests::test_reduction_fns_name_softmax_float64, 
test/inductor/test_cooperative_reductions.py::MultiKernelCooperativeReductionTests::test_reduction_fns_name_std_float16, test/inductor/test_cooperative_reductions.py::MultiKernelCooperativeReductionTests::test_reduction_fns_name_std_float32, test/inductor/test_cooperative_reductions.py::MultiKernelCooperativeReductionTests::test_reduction_fns_name_std_float64, test/inductor/test_cooperative_reductions.py::MultiKernelCooperativeReductionTests::test_reduction_fns_name_sum_float16, test/inductor/test_cooperative_reductions.py::MultiKernelCooperativeReductionTests::test_reduction_fns_name_sum_float32, test/inductor/test_cooperative_reductions.py::MultiKernelCooperativeReductionTests::test_reduction_fns_name_sum_float64, test/inductor/test_cooperative_reductions.py::MultiKernelCooperativeReductionTests::test_reduction_fns_name_var_mean_float16, test/inductor/test_cooperative_reductions.py::MultiKernelCooperativeReductionTests::test_reduction_fns_name_var_mean_float32, test/inductor/test_cooperative_reductions.py::MultiKernelCooperativeReductionTests::test_reduction_fns_name_var_mean_float64, test/inductor/test_cooperative_reductions.py::TestFixedConfigs::test_fixed_config_with_larger_xblock_than_xnumel_persistent_False_rsplit_32, test/inductor/test_cooperative_reductions.py::TestFixedConfigs::test_fixed_config_with_larger_xblock_than_xnumel_persistent_False_rsplit_33, test/inductor/test_cooperative_reductions.py::TestFixedConfigs::test_fixed_config_with_larger_xblock_than_xnumel_persistent_True_rsplit_32, test/inductor/test_cooperative_reductions.py::TestFixedConfigs::test_fixed_config_with_larger_xblock_than_xnumel_persistent_True_rsplit_33, test/inductor/test_cooperative_reductions.py::TestFixedConfigs::test_fixed_configs_persistent_False_cooperative_False_cfg0, test/inductor/test_cooperative_reductions.py::TestFixedConfigs::test_fixed_configs_persistent_False_cooperative_False_cfg1, 
test/inductor/test_cooperative_reductions.py::TestFixedConfigs::test_fixed_configs_persistent_False_cooperative_True_cfg4, test/inductor/test_cooperative_reductions.py::TestFixedConfigs::test_fixed_configs_persistent_False_cooperative_True_cfg5, test/inductor/test_cooperative_reductions.py::TestFixedConfigs::test_fixed_configs_persistent_False_cooperative_True_cfg8, test/inductor/test_cooperative_reductions.py::TestFixedConfigs::test_fixed_configs_persistent_False_cooperative_True_cfg9, test/inductor/test_cooperative_reductions.py::TestFixedConfigs::test_fixed_configs_persistent_True_cooperative_False_cfg2, test/inductor/test_cooperative_reductions.py::TestFixedConfigs::test_fixed_configs_persistent_True_cooperative_False_cfg3, test/inductor/test_cooperative_reductions.py::TestFixedConfigs::test_fixed_configs_persistent_True_cooperative_True_cfg10, test/inductor/test_cooperative_reductions.py::TestFixedConfigs::test_fixed_configs_persistent_True_cooperative_True_cfg11, test/inductor/test_cooperative_reductions.py::TestFixedConfigs::test_fixed_configs_persistent_True_cooperative_True_cfg6, test/inductor/test_cooperative_reductions.py::TestFixedConfigs::test_fixed_configs_persistent_True_cooperative_True_cfg7, test/inductor/test_cooperative_reductions.py::TestFixedConfigs::test_min_max_non_power_of_2_rsplit_persistent_False, test/inductor/test_cooperative_reductions.py::TestFixedConfigs::test_min_max_non_power_of_2_rsplit_persistent_True, test/inductor/test_cooperative_reductions.py::TestFixedConfigs::test_welford_non_power_of_2_rsplit_persistent_False_x_1_r_8000_rsplit_17, test/inductor/test_cooperative_reductions.py::TestFixedConfigs::test_welford_non_power_of_2_rsplit_persistent_False_x_1_r_8192_rsplit_33, test/inductor/test_cooperative_reductions.py::TestFixedConfigs::test_welford_non_power_of_2_rsplit_persistent_False_x_3_r_8192_rsplit_17, 
test/inductor/test_cooperative_reductions.py::TestFixedConfigs::test_welford_non_power_of_2_rsplit_persistent_False_x_4_r_8123_rsplit_33, test/inductor/test_cooperative_reductions.py::TestFixedConfigs::test_welford_non_power_of_2_rsplit_persistent_False_x_9_r_8000_rsplit_17, test/inductor/test_cooperative_reductions.py::TestFixedConfigs::test_welford_non_power_of_2_rsplit_persistent_True_x_1_r_7567_rsplit_17, test/inductor/test_cooperative_reductions.py::TestFixedConfigs::test_welford_non_power_of_2_rsplit_persistent_True_x_1_r_8192_rsplit_17, test/inductor/test_cooperative_reductions.py::TestFixedConfigs::test_welford_non_power_of_2_rsplit_persistent_True_x_3_r_8192_rsplit_40, test/inductor/test_cooperative_reductions.py::TestFixedConfigs::test_welford_non_power_of_2_rsplit_persistent_True_x_4_r_8000_rsplit_17, test/inductor/test_cooperative_reductions.py::TestFixedConfigs::test_welford_non_power_of_2_rsplit_persistent_True_x_9_r_8000_rsplit_37 2025-10-10T01:31:12.7830581Z 2025-10-10T01:31:16.7108257Z Running inductor/test_async_compile 1/1 ... [2025-10-10 01:31:16.710263] 2025-10-10T01:31:16.7108738Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:31:16.7109781Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_async_compile.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:31:16.710638] 2025-10-10T01:31:23.8953518Z 2025-10-10T01:31:23.8955005Z inductor/test_async_compile 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_async_compile_1.1_9a7f52f541016830_.log 2025-10-10T01:31:23.8958554Z Running 8 items in this shard: test/inductor/test_async_compile.py::TestAsyncCompile::test_autotune_lookup_table_method_fork, test/inductor/test_async_compile.py::TestAsyncCompile::test_autotune_lookup_table_method_spawn, test/inductor/test_async_compile.py::TestAsyncCompile::test_autotune_lookup_table_method_subprocess, test/inductor/test_async_compile.py::TestAsyncCompile::test_bad_kernel, test/inductor/test_async_compile.py::TestAsyncCompile::test_pool_method_fork, test/inductor/test_async_compile.py::TestAsyncCompile::test_pool_method_spawn, test/inductor/test_async_compile.py::TestAsyncCompile::test_pool_method_subprocess, test/inductor/test_async_compile.py::TestAsyncCompile::test_wait_pool_ready 2025-10-10T01:31:23.8961450Z 2025-10-10T01:31:27.8365314Z Running inductor/test_kernel_benchmark 1/1 ... [2025-10-10 01:31:27.835927] 2025-10-10T01:31:27.8365921Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:31:27.8367311Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_kernel_benchmark.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:31:27.836341] 2025-10-10T01:31:35.0159861Z 2025-10-10T01:31:35.0161246Z inductor/test_kernel_benchmark 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_kernel_benchmark_1.1_0019b7cb7ae5a001_.log 2025-10-10T01:31:35.0168570Z Running 18 items in this shard: test/inductor/test_kernel_benchmark.py::TestKernelBenchmark::test_fused_layernorm_bandwidth_computation, test/inductor/test_kernel_benchmark.py::TestKernelBenchmark::test_matmul_bandwidth_computation, test/inductor/test_kernel_benchmark.py::TestKernelBenchmark::test_matmul_triton_kernel_benchmark, test/inductor/test_kernel_benchmark.py::TestKernelBenchmark::test_mm_slice_add_bandwidth_computation, test/inductor/test_kernel_benchmark.py::TestKernelBenchmark::test_mm_slice_add_bandwidth_computation_2, test/inductor/test_kernel_benchmark.py::TestKernelBenchmark::test_mm_triton_kernel_benchmark, test/inductor/test_kernel_benchmark.py::TestKernelBenchmark::test_pw_kernel_benchmark, test/inductor/test_kernel_benchmark.py::TestKernelBenchmark::test_reduction_bandwidth_computation, test/inductor/test_kernel_benchmark.py::TestKernelBenchmark::test_remove_inductor_deps, test/inductor/test_kernel_benchmark.py::TestKernelBenchmark::test_remove_inductor_deps_multiple_kernels, test/inductor/test_kernel_benchmark.py::TestKernelBenchmark::test_remove_inductor_deps_scalar, test/inductor/test_kernel_benchmark.py::TestKernelBenchmark::test_remove_inductor_deps_templates, test/inductor/test_kernel_benchmark.py::TestKernelBenchmark::test_slice_add_bandwidth_computation, test/inductor/test_kernel_benchmark.py::TestKernelBenchmark::test_slice_add_cat_bandwidth_computation, test/inductor/test_kernel_benchmark.py::TestKernelBenchmark::test_slice_mm_bandwidth_computation, test/inductor/test_kernel_benchmark.py::TestKernelBenchmark::test_split_scan, test/inductor/test_kernel_benchmark.py::TestKernelBenchmark::test_star_dep, 
test/inductor/test_kernel_benchmark.py::TestKernelBenchmark::test_unused_input_bandwidth_computation 2025-10-10T01:31:35.0175135Z 2025-10-10T01:31:37.4985106Z 2025-10-10T01:31:37.4986393Z inductor/test_torchinductor_opinfo 11/11 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_opinfo_11.11_5fb5715d736a3c9c_.log 2025-10-10T01:31:37.5143296Z Running 358 items in this shard: test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_T_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_T_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___radd___cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rand___cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rand___cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rdiv___cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rmatmul___cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rmatmul___cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rmod___cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rmul___cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rmul___cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rpow___cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__batch_norm_with_update_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__chunk_cat_cuda_float32, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__native_batch_norm_legit_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__unsafe_masked_index_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__unsafe_masked_index_put_accumulate_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__unsafe_masked_index_put_accumulate_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__upsample_bilinear2d_aa_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_add_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_addmm_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_addmm_decomposed_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_addmv_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_addr_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_alias_copy_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_all_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_all_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_amin_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_aminmax_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_angle_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_any_cuda_int32, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_argmax_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_as_strided_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_as_strided_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_asin_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atan_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atanh_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atleast_1d_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_baddbmm_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bitwise_and_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bitwise_or_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bitwise_xor_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bool_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bool_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bool_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_broadcast_tensors_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_broadcast_to_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_broadcast_to_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_broadcast_to_cuda_int32, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bucketize_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_byte_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_byte_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_byte_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cholesky_inverse_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_chunk_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_clone_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_conj_physical_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_constant_pad_nd_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_constant_pad_nd_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_constant_pad_nd_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cos_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cosh_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cosh_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_count_nonzero_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cov_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cov_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cummax_cuda_float64, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cummin_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_deg2rad_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diag_embed_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diagonal_copy_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diagonal_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diagonal_scatter_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diff_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_dsplit_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_dstack_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_dstack_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_dstack_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_empty_strided_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_eq_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_eq_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_equal_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_erfc_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_erfc_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_erfinv_cuda_uint8, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_exp2_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_exp_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_expand_copy_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_expand_copy_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_expand_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_expm1_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_expm1_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_exponential_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_eye_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_eye_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_eye_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_fft2_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_hfft2_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ifft2_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ifft_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ifftn_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ihfft2_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ihfft2_cuda_int64, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ihfft_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_irfft_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_rfft2_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_rfft_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_rfftn_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_flip_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_flip_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_float_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_float_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_float_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_float_power_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_floor_divide_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_floor_divide_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_floor_divide_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fmax_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fmin_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_frac_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_full_cuda_float64, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ge_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_gt_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_gt_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_hash_tensor_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_histc_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_histc_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_hsplit_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_hypot_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_copy_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_fill_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_reduce_amin_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_reduce_amin_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_reduce_mean_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_reduce_prod_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_int_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isfinite_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isin_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isnan_cuda_float32, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isnan_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isreal_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_2inputs_2outputs_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_2inputs_2outputs_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_binary_return_by_ref_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_unary_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_kron_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_lgamma_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_diagonal_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_eigvals_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_eigvalsh_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_lstsq_grad_oriented_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_lu_factor_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_lu_solve_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_multi_dot_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_norm_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_qr_cuda_float64, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_svd_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_vector_norm_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_vector_norm_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linspace_tensor_overload_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log10_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log1p_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logaddexp_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logical_or_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logical_xor_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logspace_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logspace_tensor_overload_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logspace_tensor_overload_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_long_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_long_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_lt_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_lu_unpack_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mT_cuda_float32, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_amax_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_argmax_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_argmin_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_fill_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_median_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_prod_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_select_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_softmax_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_softmax_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_softmin_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_sum_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_sum_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_matrix_exp_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_max_reduction_no_dim_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_max_reduction_no_dim_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_max_reduction_with_dim_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_max_reduction_with_dim_cuda_float64, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_median_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_meshgrid_list_of_tensors_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_meshgrid_variadic_tensors_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_min_binary_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_min_reduction_with_dim_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_min_reduction_with_dim_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_minimum_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mode_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mode_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_multinomial_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mvlgamma_mvlgamma_p_5_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mvlgamma_mvlgamma_p_5_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nan_to_num_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nanmedian_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nansum_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nansum_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_narrow_copy_cuda_float32, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_narrow_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_narrow_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_native_batch_norm_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_neg_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_new_empty_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_new_ones_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_adaptive_max_pool2d_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_avg_pool3d_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_batch_norm_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_batch_norm_without_cudnn_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_binary_cross_entropy_with_logits_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_cosine_embedding_loss_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_embedding_bag_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_feature_alpha_dropout_without_train_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_fractional_max_pool2d_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_gelu_cuda_float64, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_hardtanh_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_hinge_embedding_loss_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_instance_norm_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_interpolate_bilinear_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_interpolate_bilinear_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_interpolate_linear_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_interpolate_nearest-exact_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_l1_loss_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_logsigmoid_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_max_unpool1d_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_nll_loss_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_constant_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_reflect_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_replicate_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pairwise_distance_cuda_uint8, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pixel_shuffle_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pixel_shuffle_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pixel_unshuffle_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_relu6_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_relu_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_relu_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_relu_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_silu_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_triplet_margin_loss_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_triplet_margin_loss_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_upsample_nearest_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_norm_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_pca_lowrank_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_0_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_1_cuda_float16, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_2_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_positive_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_put_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_randint_like_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_randn_like_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_randn_like_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_real_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_remainder_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_repeat_interleave_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_reshape_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_reshape_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_resize__cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_resize__cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_resize__cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_resize_as__cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_resolve_conj_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_roll_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_rot90_cuda_int32, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_round_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_round_decimals_3_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_rsqrt_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scalar_tensor_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_add_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_reduce_mean_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_reduce_sum_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_searchsorted_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_select_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_select_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_select_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_select_scatter_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sgn_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_short_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sigmoid_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_signal_windows_exponential_cuda_float32, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_signbit_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_signbit_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sin_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sinh_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_softmax_with_dtype_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sparse_mm_reduce_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_airy_ai_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_airy_ai_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_bessel_y0_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_bessel_y1_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_bessel_y1_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_chebyshev_polynomial_v_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_entr_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_erfcx_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_i0e_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_i0e_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_i0e_cuda_uint8, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_modified_bessel_i1_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_modified_bessel_k0_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_modified_bessel_k0_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_modified_bessel_k1_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_ndtr_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_ndtri_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_scaled_modified_bessel_k0_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_t_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_spherical_bessel_j0_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_zeta_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_split_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_split_with_sizes_copy_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_split_with_sizes_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_square_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_squeeze_copy_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_squeeze_copy_cuda_uint8, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_squeeze_multiple_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_stack_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_std_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_std_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_std_mean_unbiased_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_t_copy_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_t_copy_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_take_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_take_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tanh_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tensor_split_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_to_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_to_sparse_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_trace_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_transpose_copy_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_transpose_copy_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_trapezoid_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_trapz_cuda_int64, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tril_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_triu_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_triu_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_true_divide_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unbind_copy_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unfold_copy_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unique_consecutive_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unique_cuda_uint16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unsqueeze_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unsqueeze_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_view_as_complex_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_view_as_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_view_copy_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_vsplit_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_vsplit_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_zero__cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_zeros_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_zeros_like_cuda_int32 2025-10-10T01:31:37.5291987Z 
2025-10-10T01:31:38.9449670Z Running inductor/test_cuda_repro 1/1 ... [2025-10-10 01:31:38.944420] 2025-10-10T01:31:38.9450149Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:31:38.9451902Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_cuda_repro.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:31:38.944798] 2025-10-10T01:31:41.3416987Z Running dynamo/test_callback 1/1 ... [2025-10-10 01:31:41.341089] 2025-10-10T01:31:41.3417572Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:31:41.3419141Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_callback.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:31:41.341490] 2025-10-10T01:31:46.7250598Z 2025-10-10T01:31:46.7251770Z inductor/test_cuda_repro 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_cuda_repro_1.1_52865d9256534d61_.log 2025-10-10T01:31:46.7276867Z Running 86 items in this shard: test/inductor/test_cuda_repro.py::CudaReproTests::test_3d_tiling, test/inductor/test_cuda_repro.py::CudaReproTests::test_accuracy_issue1, test/inductor/test_cuda_repro.py::CudaReproTests::test_adaptive_avg_pool3d_issue_157248, test/inductor/test_cuda_repro.py::CudaReproTests::test_atomic_add_bfloat16, test/inductor/test_cuda_repro.py::CudaReproTests::test_atomic_add_bfloat16_config, test/inductor/test_cuda_repro.py::CudaReproTests::test_autotune_inplace_kernel, test/inductor/test_cuda_repro.py::CudaReproTests::test_backward_context, test/inductor/test_cuda_repro.py::CudaReproTests::test_bool_emulate_low_precision, test/inductor/test_cuda_repro.py::CudaReproTests::test_bucketize_dynamic_dense, 
test/inductor/test_cuda_repro.py::CudaReproTests::test_bucketize_epilogue, test/inductor/test_cuda_repro.py::CudaReproTests::test_cat_int8_one_kernel, test/inductor/test_cuda_repro.py::CudaReproTests::test_cpu_index, test/inductor/test_cuda_repro.py::CudaReproTests::test_deterministic_algorithms, test/inductor/test_cuda_repro.py::CudaReproTests::test_dont_inplace_disjoint_accesses, test/inductor/test_cuda_repro.py::CudaReproTests::test_dtype_factory_issue, test/inductor/test_cuda_repro.py::CudaReproTests::test_dynamic_persistent_reductions, test/inductor/test_cuda_repro.py::CudaReproTests::test_dynamic_shapes, test/inductor/test_cuda_repro.py::CudaReproTests::test_dynamic_to_static_cudagraphs, test/inductor/test_cuda_repro.py::CudaReproTests::test_effn_attn_bias_padding, test/inductor/test_cuda_repro.py::CudaReproTests::test_effn_attn_bias_padding_misaligned, test/inductor/test_cuda_repro.py::CudaReproTests::test_embedding_var_mean, test/inductor/test_cuda_repro.py::CudaReproTests::test_emulate_low_precision, test/inductor/test_cuda_repro.py::CudaReproTests::test_emulate_precision_casts_mean_ratio_chain, test/inductor/test_cuda_repro.py::CudaReproTests::test_emulate_precision_casts_min_pow_chain, test/inductor/test_cuda_repro.py::CudaReproTests::test_emulate_precision_casts_norm_rounding, test/inductor/test_cuda_repro.py::CudaReproTests::test_epilogue_fusion_with_view, test/inductor/test_cuda_repro.py::CudaReproTests::test_expanded_inputs_cudagraphs, test/inductor/test_cuda_repro.py::CudaReproTests::test_expanded_inputs_cudagraphs_no_size_asserts, test/inductor/test_cuda_repro.py::CudaReproTests::test_flash_attention_dynamic, test/inductor/test_cuda_repro.py::CudaReproTests::test_float64_constants, test/inductor/test_cuda_repro.py::CudaReproTests::test_float8_e8m0fnu, test/inductor/test_cuda_repro.py::CudaReproTests::test_full_copy, test/inductor/test_cuda_repro.py::CudaReproTests::test_index_add_fallback, 
test/inductor/test_cuda_repro.py::CudaReproTests::test_index_put_cudagraph, test/inductor/test_cuda_repro.py::CudaReproTests::test_index_put_inplace_cudagraph, test/inductor/test_cuda_repro.py::CudaReproTests::test_index_put_issue, test/inductor/test_cuda_repro.py::CudaReproTests::test_index_put_no_fallback_cudagraph, test/inductor/test_cuda_repro.py::CudaReproTests::test_indirect_indexing_dense_mask, test/inductor/test_cuda_repro.py::CudaReproTests::test_inductor_output_aliases_intermediate, test/inductor/test_cuda_repro.py::CudaReproTests::test_inplace_add_alpha_autotune, test/inductor/test_cuda_repro.py::CudaReproTests::test_inplace_buffer_autotune, test/inductor/test_cuda_repro.py::CudaReproTests::test_inplace_updates_cudagraphs, test/inductor/test_cuda_repro.py::CudaReproTests::test_input_channels_last, test/inductor/test_cuda_repro.py::CudaReproTests::test_int64_index_intermediate, test/inductor/test_cuda_repro.py::CudaReproTests::test_issue100806, test/inductor/test_cuda_repro.py::CudaReproTests::test_issue103461, test/inductor/test_cuda_repro.py::CudaReproTests::test_issue103481, test/inductor/test_cuda_repro.py::CudaReproTests::test_issue104759, test/inductor/test_cuda_repro.py::CudaReproTests::test_issue97695_1input, test/inductor/test_cuda_repro.py::CudaReproTests::test_issue97695_2input, test/inductor/test_cuda_repro.py::CudaReproTests::test_issue_103924, test/inductor/test_cuda_repro.py::CudaReproTests::test_libdevice_routing, test/inductor/test_cuda_repro.py::CudaReproTests::test_linear_cpu_input, test/inductor/test_cuda_repro.py::CudaReproTests::test_linear_with_zero_infeature_size, test/inductor/test_cuda_repro.py::CudaReproTests::test_lookup_seed_backward, test/inductor/test_cuda_repro.py::CudaReproTests::test_max_autotune_nograd, test/inductor/test_cuda_repro.py::CudaReproTests::test_memory_history_inductor, test/inductor/test_cuda_repro.py::CudaReproTests::test_mm_out_dtype_compile, 
test/inductor/test_cuda_repro.py::CudaReproTests::test_multi_output_layout_fallback, test/inductor/test_cuda_repro.py::CudaReproTests::test_mutated_aligned_tensor, test/inductor/test_cuda_repro.py::CudaReproTests::test_negative_arange_dynamic_shapes, test/inductor/test_cuda_repro.py::CudaReproTests::test_no_device_idx_repro_cudagraphs, test/inductor/test_cuda_repro.py::CudaReproTests::test_non_commutative_scan_op, test/inductor/test_cuda_repro.py::CudaReproTests::test_non_contiguous_unaligned_input_indices, test/inductor/test_cuda_repro.py::CudaReproTests::test_normalize_norm_leq_one, test/inductor/test_cuda_repro.py::CudaReproTests::test_not_initializing_wrong_device, test/inductor/test_cuda_repro.py::CudaReproTests::test_permute_fusion, test/inductor/test_cuda_repro.py::CudaReproTests::test_qwen2_7b_sdpa_input_alignment_requires_recompile, test/inductor/test_cuda_repro.py::CudaReproTests::test_red_dtype_mismatch, test/inductor/test_cuda_repro.py::CudaReproTests::test_reflection_pad_loop_order, test/inductor/test_cuda_repro.py::CudaReproTests::test_repeated_masked_load, test/inductor/test_cuda_repro.py::CudaReproTests::test_scalar_triton_index, test/inductor/test_cuda_repro.py::CudaReproTests::test_scaled_dot_product_efficient_attention_backward, test/inductor/test_cuda_repro.py::CudaReproTests::test_scatter_index_not_wrapped, test/inductor/test_cuda_repro.py::CudaReproTests::test_selecsls42b_misaligned_address, test/inductor/test_cuda_repro.py::CudaReproTests::test_simplify_dims, test/inductor/test_cuda_repro.py::CudaReproTests::test_sort_stride_issue, test/inductor/test_cuda_repro.py::CudaReproTests::test_sorted_masks, test/inductor/test_cuda_repro.py::CudaReproTests::test_split_reduction_channels_last, test/inductor/test_cuda_repro.py::CudaReproTests::test_split_reduction_transposed, test/inductor/test_cuda_repro.py::CudaReproTests::test_triton_interpret, test/inductor/test_cuda_repro.py::CudaReproTests::test_uint_view_copy, 
test/inductor/test_cuda_repro.py::CudaReproTests::test_unspec_inputs_interop, test/inductor/test_cuda_repro.py::CudaReproTests::test_unused_cpu_input_cudagraphs, test/inductor/test_cuda_repro.py::CudaReproTests::test_view_replay_padding_issue_163328, test/inductor/test_cuda_repro.py::CudaReproTests::test_xlnet_lm_stride_repro 2025-10-10T01:31:46.7301188Z 2025-10-10T01:31:48.6207920Z 2025-10-10T01:31:48.6209007Z dynamo/test_callback 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_callback_1.1_a27e80d13cd2d8fa_.log 2025-10-10T01:31:48.6210684Z Running 4 items in this shard: test/dynamo/test_callback.py::CallbackTests::test_callbacks_with_duplicate_prevention, test/dynamo/test_callback.py::CallbackTests::test_counter, test/dynamo/test_callback.py::CallbackTests::test_counter_assertion, test/dynamo/test_callback.py::CallbackTests::test_triggers 2025-10-10T01:31:48.6211833Z 2025-10-10T01:31:50.6079171Z Running inductor/test_fp8 1/1 ... [2025-10-10 01:31:50.607314] 2025-10-10T01:31:50.6080006Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:31:50.6082272Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_fp8.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:31:50.607701] 2025-10-10T01:31:52.4456700Z Running inductor/test_torchinductor_dynamic_shapes 1/2 ... [2025-10-10 01:31:52.445015] 2025-10-10T01:31:52.4457767Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:31:52.4459183Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_dynamic_shapes.py', '-m', 'not serial', '--shard-id=1', '--num-shards=2', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:31:52.445380] 2025-10-10T01:31:58.3379129Z 2025-10-10T01:31:58.3380067Z inductor/test_fp8 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_fp8_1.1_c75c2f35def64f67_.log 2025-10-10T01:31:58.3481132Z Running 241 items in this shard: test/inductor/test_fp8.py::TestFP8Types::test_amax_along_with_fp8_quant_float8_e4m3fn_shape_1,1,15_device_cpu, test/inductor/test_fp8.py::TestFP8Types::test_amax_along_with_fp8_quant_float8_e4m3fn_shape_1,1,15_device_cuda, test/inductor/test_fp8.py::TestFP8Types::test_amax_along_with_fp8_quant_float8_e4m3fn_shape_1,10,15_device_cpu, test/inductor/test_fp8.py::TestFP8Types::test_amax_along_with_fp8_quant_float8_e4m3fn_shape_1,10,15_device_cuda, test/inductor/test_fp8.py::TestFP8Types::test_amax_along_with_fp8_quant_float8_e4m3fn_shape_1,10,4096_device_cpu, test/inductor/test_fp8.py::TestFP8Types::test_amax_along_with_fp8_quant_float8_e4m3fn_shape_1,10,4096_device_cuda, test/inductor/test_fp8.py::TestFP8Types::test_amax_along_with_fp8_quant_float8_e4m3fn_shape_1,10,512_device_cpu, test/inductor/test_fp8.py::TestFP8Types::test_amax_along_with_fp8_quant_float8_e4m3fn_shape_1,10,512_device_cuda, test/inductor/test_fp8.py::TestFP8Types::test_amax_along_with_fp8_quant_float8_e4m3fn_shape_4,2048,4096_device_cpu, test/inductor/test_fp8.py::TestFP8Types::test_amax_along_with_fp8_quant_float8_e4m3fn_shape_4,2048,4096_device_cuda, test/inductor/test_fp8.py::TestFP8Types::test_amax_along_with_fp8_quant_float8_e5m2_shape_1,1,15_device_cpu, test/inductor/test_fp8.py::TestFP8Types::test_amax_along_with_fp8_quant_float8_e5m2_shape_1,1,15_device_cuda, test/inductor/test_fp8.py::TestFP8Types::test_amax_along_with_fp8_quant_float8_e5m2_shape_1,10,15_device_cpu, test/inductor/test_fp8.py::TestFP8Types::test_amax_along_with_fp8_quant_float8_e5m2_shape_1,10,15_device_cuda, test/inductor/test_fp8.py::TestFP8Types::test_amax_along_with_fp8_quant_float8_e5m2_shape_1,10,4096_device_cpu, 
test/inductor/test_fp8.py::TestFP8Types::test_amax_along_with_fp8_quant_float8_e5m2_shape_1,10,4096_device_cuda, test/inductor/test_fp8.py::TestFP8Types::test_amax_along_with_fp8_quant_float8_e5m2_shape_1,10,512_device_cpu, test/inductor/test_fp8.py::TestFP8Types::test_amax_along_with_fp8_quant_float8_e5m2_shape_1,10,512_device_cuda, test/inductor/test_fp8.py::TestFP8Types::test_amax_along_with_fp8_quant_float8_e5m2_shape_4,2048,4096_device_cpu, test/inductor/test_fp8.py::TestFP8Types::test_amax_along_with_fp8_quant_float8_e5m2_shape_4,2048,4096_device_cuda, test/inductor/test_fp8.py::TestFP8Types::test_amax_fp8_quant_float8_e4m3fn_shape_1,1,15_device_cpu, test/inductor/test_fp8.py::TestFP8Types::test_amax_fp8_quant_float8_e4m3fn_shape_1,1,15_device_cuda, test/inductor/test_fp8.py::TestFP8Types::test_amax_fp8_quant_float8_e4m3fn_shape_1,10,15_device_cpu, test/inductor/test_fp8.py::TestFP8Types::test_amax_fp8_quant_float8_e4m3fn_shape_1,10,15_device_cuda, test/inductor/test_fp8.py::TestFP8Types::test_amax_fp8_quant_float8_e4m3fn_shape_1,10,4096_device_cpu, test/inductor/test_fp8.py::TestFP8Types::test_amax_fp8_quant_float8_e4m3fn_shape_1,10,4096_device_cuda, test/inductor/test_fp8.py::TestFP8Types::test_amax_fp8_quant_float8_e4m3fn_shape_1,10,512_device_cpu, test/inductor/test_fp8.py::TestFP8Types::test_amax_fp8_quant_float8_e4m3fn_shape_1,10,512_device_cuda, test/inductor/test_fp8.py::TestFP8Types::test_amax_fp8_quant_float8_e4m3fn_shape_4,2048,4096_device_cpu, test/inductor/test_fp8.py::TestFP8Types::test_amax_fp8_quant_float8_e4m3fn_shape_4,2048,4096_device_cuda, test/inductor/test_fp8.py::TestFP8Types::test_amax_fp8_quant_float8_e5m2_shape_1,1,15_device_cpu, test/inductor/test_fp8.py::TestFP8Types::test_amax_fp8_quant_float8_e5m2_shape_1,1,15_device_cuda, test/inductor/test_fp8.py::TestFP8Types::test_amax_fp8_quant_float8_e5m2_shape_1,10,15_device_cpu, test/inductor/test_fp8.py::TestFP8Types::test_amax_fp8_quant_float8_e5m2_shape_1,10,15_device_cuda, 
test/inductor/test_fp8.py::TestFP8Types::test_amax_fp8_quant_float8_e5m2_shape_1,10,4096_device_cpu, test/inductor/test_fp8.py::TestFP8Types::test_amax_fp8_quant_float8_e5m2_shape_1,10,4096_device_cuda, test/inductor/test_fp8.py::TestFP8Types::test_amax_fp8_quant_float8_e5m2_shape_1,10,512_device_cpu, test/inductor/test_fp8.py::TestFP8Types::test_amax_fp8_quant_float8_e5m2_shape_1,10,512_device_cuda, test/inductor/test_fp8.py::TestFP8Types::test_amax_fp8_quant_float8_e5m2_shape_4,2048,4096_device_cpu, test/inductor/test_fp8.py::TestFP8Types::test_amax_fp8_quant_float8_e5m2_shape_4,2048,4096_device_cuda, test/inductor/test_fp8.py::TestFP8Types::test_bad_cast, test/inductor/test_fp8.py::TestFP8Types::test_eager_fallback_bfloat16_device_cpu, test/inductor/test_fp8.py::TestFP8Types::test_eager_fallback_bfloat16_device_cuda, test/inductor/test_fp8.py::TestFP8Types::test_eager_fallback_float16_device_cpu, test/inductor/test_fp8.py::TestFP8Types::test_eager_fallback_float16_device_cuda, test/inductor/test_fp8.py::TestFP8Types::test_layernorm_fp8_quant_benchmark_float8_e4m3fn_shape_4,2048,4096_keepdim_False, test/inductor/test_fp8.py::TestFP8Types::test_layernorm_fp8_quant_benchmark_float8_e4m3fn_shape_4,2048,4096_keepdim_True, test/inductor/test_fp8.py::TestFP8Types::test_layernorm_fp8_quant_benchmark_float8_e5m2_shape_4,2048,4096_keepdim_False, test/inductor/test_fp8.py::TestFP8Types::test_layernorm_fp8_quant_benchmark_float8_e5m2_shape_4,2048,4096_keepdim_True, test/inductor/test_fp8.py::TestFP8Types::test_layernorm_fp8_quant_float8_e4m3fn_amax_keep_dim_False_shape_1,1,15_device_cpu, test/inductor/test_fp8.py::TestFP8Types::test_layernorm_fp8_quant_float8_e4m3fn_amax_keep_dim_False_shape_1,1,15_device_cuda, test/inductor/test_fp8.py::TestFP8Types::test_layernorm_fp8_quant_float8_e4m3fn_amax_keep_dim_False_shape_1,10,15_device_cpu, test/inductor/test_fp8.py::TestFP8Types::test_layernorm_fp8_quant_float8_e4m3fn_amax_keep_dim_False_shape_1,10,15_device_cuda, 
test/inductor/test_fp8.py::TestFP8Types::test_layernorm_fp8_quant_float8_e4m3fn_amax_keep_dim_False_shape_1,10,4096_device_cpu, test/inductor/test_fp8.py::TestFP8Types::test_layernorm_fp8_quant_float8_e4m3fn_amax_keep_dim_False_shape_1,10,4096_device_cuda, test/inductor/test_fp8.py::TestFP8Types::test_layernorm_fp8_quant_float8_e4m3fn_amax_keep_dim_False_shape_1,10,512_device_cpu, test/inductor/test_fp8.py::TestFP8Types::test_layernorm_fp8_quant_float8_e4m3fn_amax_keep_dim_False_shape_1,10,512_device_cuda, test/inductor/test_fp8.py::TestFP8Types::test_layernorm_fp8_quant_float8_e4m3fn_amax_keep_dim_False_shape_4,2048,4096_device_cpu, test/inductor/test_fp8.py::TestFP8Types::test_layernorm_fp8_quant_float8_e4m3fn_amax_keep_dim_False_shape_4,2048,4096_device_cuda, test/inductor/test_fp8.py::TestFP8Types::test_layernorm_fp8_quant_float8_e4m3fn_amax_keep_dim_True_shape_1,1,15_device_cpu, test/inductor/test_fp8.py::TestFP8Types::test_layernorm_fp8_quant_float8_e4m3fn_amax_keep_dim_True_shape_1,1,15_device_cuda, test/inductor/test_fp8.py::TestFP8Types::test_layernorm_fp8_quant_float8_e4m3fn_amax_keep_dim_True_shape_1,10,15_device_cpu, test/inductor/test_fp8.py::TestFP8Types::test_layernorm_fp8_quant_float8_e4m3fn_amax_keep_dim_True_shape_1,10,15_device_cuda, test/inductor/test_fp8.py::TestFP8Types::test_layernorm_fp8_quant_float8_e4m3fn_amax_keep_dim_True_shape_1,10,4096_device_cpu, test/inductor/test_fp8.py::TestFP8Types::test_layernorm_fp8_quant_float8_e4m3fn_amax_keep_dim_True_shape_1,10,4096_device_cuda, test/inductor/test_fp8.py::TestFP8Types::test_layernorm_fp8_quant_float8_e4m3fn_amax_keep_dim_True_shape_1,10,512_device_cpu, test/inductor/test_fp8.py::TestFP8Types::test_layernorm_fp8_quant_float8_e4m3fn_amax_keep_dim_True_shape_1,10,512_device_cuda, test/inductor/test_fp8.py::TestFP8Types::test_layernorm_fp8_quant_float8_e4m3fn_amax_keep_dim_True_shape_4,2048,4096_device_cpu, 
test/inductor/test_fp8.py::TestFP8Types::test_layernorm_fp8_quant_float8_e4m3fn_amax_keep_dim_True_shape_4,2048,4096_device_cuda, test/inductor/test_fp8.py::TestFP8Types::test_layernorm_fp8_quant_float8_e5m2_amax_keep_dim_False_shape_1,1,15_device_cpu, test/inductor/test_fp8.py::TestFP8Types::test_layernorm_fp8_quant_float8_e5m2_amax_keep_dim_False_shape_1,1,15_device_cuda, test/inductor/test_fp8.py::TestFP8Types::test_layernorm_fp8_quant_float8_e5m2_amax_keep_dim_False_shape_1,10,15_device_cpu, test/inductor/test_fp8.py::TestFP8Types::test_layernorm_fp8_quant_float8_e5m2_amax_keep_dim_False_shape_1,10,15_device_cuda, test/inductor/test_fp8.py::TestFP8Types::test_layernorm_fp8_quant_float8_e5m2_amax_keep_dim_False_shape_1,10,4096_device_cpu, test/inductor/test_fp8.py::TestFP8Types::test_layernorm_fp8_quant_float8_e5m2_amax_keep_dim_False_shape_1,10,4096_device_cuda, test/inductor/test_fp8.py::TestFP8Types::test_layernorm_fp8_quant_float8_e5m2_amax_keep_dim_False_shape_1,10,512_device_cpu, test/inductor/test_fp8.py::TestFP8Types::test_layernorm_fp8_quant_float8_e5m2_amax_keep_dim_False_shape_1,10,512_device_cuda, test/inductor/test_fp8.py::TestFP8Types::test_layernorm_fp8_quant_float8_e5m2_amax_keep_dim_False_shape_4,2048,4096_device_cpu, test/inductor/test_fp8.py::TestFP8Types::test_layernorm_fp8_quant_float8_e5m2_amax_keep_dim_False_shape_4,2048,4096_device_cuda, test/inductor/test_fp8.py::TestFP8Types::test_layernorm_fp8_quant_float8_e5m2_amax_keep_dim_True_shape_1,1,15_device_cpu, test/inductor/test_fp8.py::TestFP8Types::test_layernorm_fp8_quant_float8_e5m2_amax_keep_dim_True_shape_1,1,15_device_cuda, test/inductor/test_fp8.py::TestFP8Types::test_layernorm_fp8_quant_float8_e5m2_amax_keep_dim_True_shape_1,10,15_device_cpu, test/inductor/test_fp8.py::TestFP8Types::test_layernorm_fp8_quant_float8_e5m2_amax_keep_dim_True_shape_1,10,15_device_cuda, 
test/inductor/test_fp8.py::TestFP8Types::test_layernorm_fp8_quant_float8_e5m2_amax_keep_dim_True_shape_1,10,4096_device_cpu, test/inductor/test_fp8.py::TestFP8Types::test_layernorm_fp8_quant_float8_e5m2_amax_keep_dim_True_shape_1,10,4096_device_cuda, test/inductor/test_fp8.py::TestFP8Types::test_layernorm_fp8_quant_float8_e5m2_amax_keep_dim_True_shape_1,10,512_device_cpu, test/inductor/test_fp8.py::TestFP8Types::test_layernorm_fp8_quant_float8_e5m2_amax_keep_dim_True_shape_1,10,512_device_cuda, test/inductor/test_fp8.py::TestFP8Types::test_layernorm_fp8_quant_float8_e5m2_amax_keep_dim_True_shape_4,2048,4096_device_cpu, test/inductor/test_fp8.py::TestFP8Types::test_layernorm_fp8_quant_float8_e5m2_amax_keep_dim_True_shape_4,2048,4096_device_cuda, test/inductor/test_fp8.py::TestFP8Types::test_to_fp8_saturated_bfloat16_float8_e4m3fn_shape_16,16,16_device_cpu, test/inductor/test_fp8.py::TestFP8Types::test_to_fp8_saturated_bfloat16_float8_e4m3fn_shape_16,16,16_device_cuda, test/inductor/test_fp8.py::TestFP8Types::test_to_fp8_saturated_bfloat16_float8_e4m3fn_shape_4,2048,4096_device_cpu, test/inductor/test_fp8.py::TestFP8Types::test_to_fp8_saturated_bfloat16_float8_e4m3fn_shape_4,2048,4096_device_cuda, test/inductor/test_fp8.py::TestFP8Types::test_to_fp8_saturated_bfloat16_float8_e5m2_shape_16,16,16_device_cpu, test/inductor/test_fp8.py::TestFP8Types::test_to_fp8_saturated_bfloat16_float8_e5m2_shape_16,16,16_device_cuda, test/inductor/test_fp8.py::TestFP8Types::test_to_fp8_saturated_bfloat16_float8_e5m2_shape_4,2048,4096_device_cpu, test/inductor/test_fp8.py::TestFP8Types::test_to_fp8_saturated_bfloat16_float8_e5m2_shape_4,2048,4096_device_cuda, test/inductor/test_fp8.py::TestFP8Types::test_to_fp8_saturated_float16_float8_e4m3fn_shape_16,16,16_device_cpu, test/inductor/test_fp8.py::TestFP8Types::test_to_fp8_saturated_float16_float8_e4m3fn_shape_16,16,16_device_cuda, 
test/inductor/test_fp8.py::TestFP8Types::test_to_fp8_saturated_float16_float8_e4m3fn_shape_4,2048,4096_device_cpu, test/inductor/test_fp8.py::TestFP8Types::test_to_fp8_saturated_float16_float8_e4m3fn_shape_4,2048,4096_device_cuda, test/inductor/test_fp8.py::TestFP8Types::test_to_fp8_saturated_float16_float8_e5m2_shape_16,16,16_device_cpu, test/inductor/test_fp8.py::TestFP8Types::test_to_fp8_saturated_float16_float8_e5m2_shape_16,16,16_device_cuda, test/inductor/test_fp8.py::TestFP8Types::test_to_fp8_saturated_float16_float8_e5m2_shape_4,2048,4096_device_cpu, test/inductor/test_fp8.py::TestFP8Types::test_to_fp8_saturated_float16_float8_e5m2_shape_4,2048,4096_device_cuda, test/inductor/test_fp8.py::TestFP8Types::test_to_fp8_saturated_float32_float8_e4m3fn_shape_16,16,16_device_cpu, test/inductor/test_fp8.py::TestFP8Types::test_to_fp8_saturated_float32_float8_e4m3fn_shape_16,16,16_device_cuda, test/inductor/test_fp8.py::TestFP8Types::test_to_fp8_saturated_float32_float8_e4m3fn_shape_4,2048,4096_device_cpu, test/inductor/test_fp8.py::TestFP8Types::test_to_fp8_saturated_float32_float8_e4m3fn_shape_4,2048,4096_device_cuda, test/inductor/test_fp8.py::TestFP8Types::test_to_fp8_saturated_float32_float8_e5m2_shape_16,16,16_device_cpu, test/inductor/test_fp8.py::TestFP8Types::test_to_fp8_saturated_float32_float8_e5m2_shape_16,16,16_device_cuda, test/inductor/test_fp8.py::TestFP8Types::test_to_fp8_saturated_float32_float8_e5m2_shape_4,2048,4096_device_cpu, test/inductor/test_fp8.py::TestFP8Types::test_to_fp8_saturated_float32_float8_e5m2_shape_4,2048,4096_device_cuda, test/inductor/test_fp8.py::TestFP8Types::test_valid_cast_bfloat16_shape_15,3,13_dst_types0_device_cpu, test/inductor/test_fp8.py::TestFP8Types::test_valid_cast_bfloat16_shape_15,3,13_dst_types0_device_cuda, test/inductor/test_fp8.py::TestFP8Types::test_valid_cast_bfloat16_shape_4,2048,4096_dst_types0_device_cpu, 
test/inductor/test_fp8.py::TestFP8Types::test_valid_cast_bfloat16_shape_4,2048,4096_dst_types0_device_cuda, test/inductor/test_fp8.py::TestFP8Types::test_valid_cast_float16_shape_15,3,13_dst_types0_device_cpu, test/inductor/test_fp8.py::TestFP8Types::test_valid_cast_float16_shape_15,3,13_dst_types0_device_cuda, test/inductor/test_fp8.py::TestFP8Types::test_valid_cast_float16_shape_4,2048,4096_dst_types0_device_cpu, test/inductor/test_fp8.py::TestFP8Types::test_valid_cast_float16_shape_4,2048,4096_dst_types0_device_cuda, test/inductor/test_fp8.py::TestFP8Types::test_valid_cast_float32_shape_15,3,13_dst_types0_device_cpu, test/inductor/test_fp8.py::TestFP8Types::test_valid_cast_float32_shape_15,3,13_dst_types0_device_cuda, test/inductor/test_fp8.py::TestFP8Types::test_valid_cast_float32_shape_4,2048,4096_dst_types0_device_cpu, test/inductor/test_fp8.py::TestFP8Types::test_valid_cast_float32_shape_4,2048,4096_dst_types0_device_cuda, test/inductor/test_fp8.py::TestFP8Types::test_xblock_for_small_numel_float8_e4m3fn_device_cpu, test/inductor/test_fp8.py::TestFP8Types::test_xblock_for_small_numel_float8_e4m3fn_device_cuda, test/inductor/test_fp8.py::TestFP8Types::test_xblock_for_small_numel_float8_e5m2_device_cpu, test/inductor/test_fp8.py::TestFP8Types::test_xblock_for_small_numel_float8_e5m2_device_cuda, test/inductor/test_fp8.py::TestFP8Lowering::test_mx_fp8_max_autotune, test/inductor/test_fp8.py::TestFP8Lowering::test_rowwise_scaling_acceptable_input_dims_M_1024_K_1024_N_16_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_rowwise_scaling_acceptable_input_dims_M_1024_K_1024_N_2048_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_rowwise_scaling_acceptable_input_dims_M_1024_K_16_N_16_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_rowwise_scaling_acceptable_input_dims_M_1024_K_16_N_2048_persistent_matmul_False, 
test/inductor/test_fp8.py::TestFP8Lowering::test_rowwise_scaling_acceptable_input_dims_M_1024_K_32_N_16_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_rowwise_scaling_acceptable_input_dims_M_1024_K_32_N_2048_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_rowwise_scaling_acceptable_input_dims_M_1_K_1024_N_16_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_rowwise_scaling_acceptable_input_dims_M_1_K_1024_N_2048_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_rowwise_scaling_acceptable_input_dims_M_1_K_16_N_16_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_rowwise_scaling_acceptable_input_dims_M_1_K_16_N_2048_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_rowwise_scaling_acceptable_input_dims_M_1_K_32_N_16_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_rowwise_scaling_acceptable_input_dims_M_1_K_32_N_2048_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_rowwise_scaling_acceptable_input_dims_M_257_K_1024_N_16_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_rowwise_scaling_acceptable_input_dims_M_257_K_1024_N_2048_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_rowwise_scaling_acceptable_input_dims_M_257_K_16_N_16_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_rowwise_scaling_acceptable_input_dims_M_257_K_16_N_2048_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_rowwise_scaling_acceptable_input_dims_M_257_K_32_N_16_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_rowwise_scaling_acceptable_input_dims_M_257_K_32_N_2048_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_rowwise_scaling_acceptable_input_dims_M_33_K_1024_N_16_persistent_matmul_False, 
test/inductor/test_fp8.py::TestFP8Lowering::test_rowwise_scaling_acceptable_input_dims_M_33_K_1024_N_2048_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_rowwise_scaling_acceptable_input_dims_M_33_K_16_N_16_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_rowwise_scaling_acceptable_input_dims_M_33_K_16_N_2048_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_rowwise_scaling_acceptable_input_dims_M_33_K_32_N_16_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_rowwise_scaling_acceptable_input_dims_M_33_K_32_N_2048_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_rowwise_scaling_acceptable_input_dims_M_3_K_1024_N_16_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_rowwise_scaling_acceptable_input_dims_M_3_K_1024_N_2048_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_rowwise_scaling_acceptable_input_dims_M_3_K_16_N_16_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_rowwise_scaling_acceptable_input_dims_M_3_K_16_N_2048_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_rowwise_scaling_acceptable_input_dims_M_3_K_32_N_16_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_rowwise_scaling_acceptable_input_dims_M_3_K_32_N_2048_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_rowwise_scaling_shape_1024,1024,512_has_bias_False_use_fast_accum_False_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_rowwise_scaling_shape_1024,1024,512_has_bias_False_use_fast_accum_True_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_rowwise_scaling_shape_1024,1024,512_has_bias_True_use_fast_accum_False_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_rowwise_scaling_shape_1024,1024,512_has_bias_True_use_fast_accum_True_persistent_matmul_False, 
test/inductor/test_fp8.py::TestFP8Lowering::test_rowwise_scaling_shape_16,16,32_has_bias_False_use_fast_accum_False_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_rowwise_scaling_shape_16,16,32_has_bias_False_use_fast_accum_True_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_rowwise_scaling_shape_16,16,32_has_bias_True_use_fast_accum_False_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_rowwise_scaling_shape_16,16,32_has_bias_True_use_fast_accum_True_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_rowwise_scaling_shape_16,32,32_has_bias_False_use_fast_accum_False_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_rowwise_scaling_shape_16,32,32_has_bias_False_use_fast_accum_True_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_rowwise_scaling_shape_16,32,32_has_bias_True_use_fast_accum_False_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_rowwise_scaling_shape_16,32,32_has_bias_True_use_fast_accum_True_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_rowwise_scaling_tma_template_shape_1024,1024,512_use_fast_accum_False, test/inductor/test_fp8.py::TestFP8Lowering::test_rowwise_scaling_tma_template_shape_1024,1024,512_use_fast_accum_True, test/inductor/test_fp8.py::TestFP8Lowering::test_rowwise_scaling_tma_template_shape_16,32,32_use_fast_accum_False, test/inductor/test_fp8.py::TestFP8Lowering::test_rowwise_scaling_tma_template_shape_16,32,32_use_fast_accum_True, test/inductor/test_fp8.py::TestFP8Lowering::test_scaled_mm_preserves_strides, test/inductor/test_fp8.py::TestFP8Lowering::test_tensorwise_scaling_acceptable_input_dims_M_1024_K_1024_N_16_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_tensorwise_scaling_acceptable_input_dims_M_1024_K_1024_N_2048_persistent_matmul_False, 
test/inductor/test_fp8.py::TestFP8Lowering::test_tensorwise_scaling_acceptable_input_dims_M_1024_K_16_N_16_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_tensorwise_scaling_acceptable_input_dims_M_1024_K_16_N_2048_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_tensorwise_scaling_acceptable_input_dims_M_1024_K_32_N_16_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_tensorwise_scaling_acceptable_input_dims_M_1024_K_32_N_2048_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_tensorwise_scaling_acceptable_input_dims_M_1_K_1024_N_16_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_tensorwise_scaling_acceptable_input_dims_M_1_K_1024_N_2048_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_tensorwise_scaling_acceptable_input_dims_M_1_K_16_N_16_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_tensorwise_scaling_acceptable_input_dims_M_1_K_16_N_2048_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_tensorwise_scaling_acceptable_input_dims_M_1_K_32_N_16_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_tensorwise_scaling_acceptable_input_dims_M_1_K_32_N_2048_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_tensorwise_scaling_acceptable_input_dims_M_257_K_1024_N_16_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_tensorwise_scaling_acceptable_input_dims_M_257_K_1024_N_2048_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_tensorwise_scaling_acceptable_input_dims_M_257_K_16_N_16_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_tensorwise_scaling_acceptable_input_dims_M_257_K_16_N_2048_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_tensorwise_scaling_acceptable_input_dims_M_257_K_32_N_16_persistent_matmul_False, 
test/inductor/test_fp8.py::TestFP8Lowering::test_tensorwise_scaling_acceptable_input_dims_M_257_K_32_N_2048_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_tensorwise_scaling_acceptable_input_dims_M_33_K_1024_N_16_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_tensorwise_scaling_acceptable_input_dims_M_33_K_1024_N_2048_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_tensorwise_scaling_acceptable_input_dims_M_33_K_16_N_16_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_tensorwise_scaling_acceptable_input_dims_M_33_K_16_N_2048_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_tensorwise_scaling_acceptable_input_dims_M_33_K_32_N_16_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_tensorwise_scaling_acceptable_input_dims_M_33_K_32_N_2048_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_tensorwise_scaling_acceptable_input_dims_M_3_K_1024_N_16_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_tensorwise_scaling_acceptable_input_dims_M_3_K_1024_N_2048_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_tensorwise_scaling_acceptable_input_dims_M_3_K_16_N_16_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_tensorwise_scaling_acceptable_input_dims_M_3_K_16_N_2048_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_tensorwise_scaling_acceptable_input_dims_M_3_K_32_N_16_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_tensorwise_scaling_acceptable_input_dims_M_3_K_32_N_2048_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_tensorwise_scaling_bfloat16_shape_1024,1024,512_has_bias_False_use_fast_accum_False_persistent_matmul_False, 
test/inductor/test_fp8.py::TestFP8Lowering::test_tensorwise_scaling_bfloat16_shape_1024,1024,512_has_bias_False_use_fast_accum_True_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_tensorwise_scaling_bfloat16_shape_1024,1024,512_has_bias_True_use_fast_accum_False_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_tensorwise_scaling_bfloat16_shape_1024,1024,512_has_bias_True_use_fast_accum_True_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_tensorwise_scaling_bfloat16_shape_16,16,32_has_bias_False_use_fast_accum_False_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_tensorwise_scaling_bfloat16_shape_16,16,32_has_bias_False_use_fast_accum_True_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_tensorwise_scaling_bfloat16_shape_16,16,32_has_bias_True_use_fast_accum_False_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_tensorwise_scaling_bfloat16_shape_16,16,32_has_bias_True_use_fast_accum_True_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_tensorwise_scaling_bfloat16_shape_16,32,32_has_bias_False_use_fast_accum_False_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_tensorwise_scaling_bfloat16_shape_16,32,32_has_bias_False_use_fast_accum_True_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_tensorwise_scaling_bfloat16_shape_16,32,32_has_bias_True_use_fast_accum_False_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_tensorwise_scaling_bfloat16_shape_16,32,32_has_bias_True_use_fast_accum_True_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_tensorwise_scaling_float32_shape_1024,1024,512_has_bias_False_use_fast_accum_False_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_tensorwise_scaling_float32_shape_1024,1024,512_has_bias_False_use_fast_accum_True_persistent_matmul_False, 
test/inductor/test_fp8.py::TestFP8Lowering::test_tensorwise_scaling_float32_shape_1024,1024,512_has_bias_True_use_fast_accum_False_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_tensorwise_scaling_float32_shape_1024,1024,512_has_bias_True_use_fast_accum_True_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_tensorwise_scaling_float32_shape_16,16,32_has_bias_False_use_fast_accum_False_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_tensorwise_scaling_float32_shape_16,16,32_has_bias_False_use_fast_accum_True_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_tensorwise_scaling_float32_shape_16,16,32_has_bias_True_use_fast_accum_False_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_tensorwise_scaling_float32_shape_16,16,32_has_bias_True_use_fast_accum_True_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_tensorwise_scaling_float32_shape_16,32,32_has_bias_False_use_fast_accum_False_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_tensorwise_scaling_float32_shape_16,32,32_has_bias_False_use_fast_accum_True_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_tensorwise_scaling_float32_shape_16,32,32_has_bias_True_use_fast_accum_False_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_tensorwise_scaling_float32_shape_16,32,32_has_bias_True_use_fast_accum_True_persistent_matmul_False, test/inductor/test_fp8.py::TestFP8Lowering::test_tensorwise_scaling_tma_template_bfloat16_shape_1024,1024,512_use_fast_accum_False, test/inductor/test_fp8.py::TestFP8Lowering::test_tensorwise_scaling_tma_template_bfloat16_shape_1024,1024,512_use_fast_accum_True, test/inductor/test_fp8.py::TestFP8Lowering::test_tensorwise_scaling_tma_template_bfloat16_shape_16,32,32_use_fast_accum_False, 
test/inductor/test_fp8.py::TestFP8Lowering::test_tensorwise_scaling_tma_template_bfloat16_shape_16,32,32_use_fast_accum_True, test/inductor/test_fp8.py::TestFP8Lowering::test_tensorwise_scaling_tma_template_float32_shape_1024,1024,512_use_fast_accum_False, test/inductor/test_fp8.py::TestFP8Lowering::test_tensorwise_scaling_tma_template_float32_shape_1024,1024,512_use_fast_accum_True, test/inductor/test_fp8.py::TestFP8Lowering::test_tensorwise_scaling_tma_template_float32_shape_16,32,32_use_fast_accum_False, test/inductor/test_fp8.py::TestFP8Lowering::test_tensorwise_scaling_tma_template_float32_shape_16,32,32_use_fast_accum_True, test/inductor/test_fp8.py::TestFP8Lowering::test_unacceptable_input_dims, test/inductor/test_fp8.py::TestFP8Lowering::test_unacceptable_scale_dims_rowwise_scaling 2025-10-10T01:31:58.3578035Z 2025-10-10T01:32:02.1705851Z Running inductor/test_analysis 1/1 ... [2025-10-10 01:32:02.169860] 2025-10-10T01:32:02.1706350Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:32:02.1717421Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_analysis.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:32:02.170240] 2025-10-10T01:32:09.5014787Z 2025-10-10T01:32:09.5015969Z inductor/test_analysis 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_analysis_1.1_105daf17aeb27a7b_.log 2025-10-10T01:32:09.5026548Z Running 28 items in this shard: test/inductor/test_analysis.py::TestUtils::test_tabulate2d, test/inductor/test_analysis.py::TestUtils::test_zip_dicts, test/inductor/test_analysis.py::TestAnalysisCUDA::test_augment_trace_against_flop_counter_maxat0_cuda_float16, test/inductor/test_analysis.py::TestAnalysisCUDA::test_augment_trace_against_flop_counter_maxat0_cuda_float32, test/inductor/test_analysis.py::TestAnalysisCUDA::test_augment_trace_against_flop_counter_maxat1_cuda_float16, test/inductor/test_analysis.py::TestAnalysisCUDA::test_augment_trace_against_flop_counter_maxat1_cuda_float32, test/inductor/test_analysis.py::TestAnalysisCUDA::test_augment_trace_against_flop_counter_maxat2_cuda_float16, test/inductor/test_analysis.py::TestAnalysisCUDA::test_augment_trace_against_flop_counter_maxat2_cuda_float32, test/inductor/test_analysis.py::TestAnalysisCUDA::test_augment_trace_against_flop_counter_maxat3_cuda_float16, test/inductor/test_analysis.py::TestAnalysisCUDA::test_augment_trace_against_flop_counter_maxat3_cuda_float32, test/inductor/test_analysis.py::TestAnalysisCUDA::test_augment_trace_helper_unit_cuda, test/inductor/test_analysis.py::TestAnalysisCUDA::test_combine_profiles_cuda_float16, test/inductor/test_analysis.py::TestAnalysisCUDA::test_combine_profiles_cuda_float32, test/inductor/test_analysis.py::TestAnalysisCUDA::test_diff_cuda_float16, test/inductor/test_analysis.py::TestAnalysisCUDA::test_diff_cuda_float32, test/inductor/test_analysis.py::TestAnalysisCUDA::test_diff_cuda_float64, test/inductor/test_analysis.py::TestAnalysisCUDA::test_noop_cuda, test/inductor/test_analysis.py::TestAnalysisCUDA::test_pointwise_bandwidth_maxat0_cuda_float16, 
test/inductor/test_analysis.py::TestAnalysisCUDA::test_pointwise_bandwidth_maxat0_cuda_float32, test/inductor/test_analysis.py::TestAnalysisCUDA::test_pointwise_bandwidth_maxat1_cuda_float16, test/inductor/test_analysis.py::TestAnalysisCUDA::test_pointwise_bandwidth_maxat1_cuda_float32, test/inductor/test_analysis.py::TestAnalysisCUDA::test_pointwise_bandwidth_maxat2_cuda_float16, test/inductor/test_analysis.py::TestAnalysisCUDA::test_pointwise_bandwidth_maxat2_cuda_float32, test/inductor/test_analysis.py::TestAnalysisCUDA::test_pointwise_bandwidth_maxat3_cuda_float16, test/inductor/test_analysis.py::TestAnalysisCUDA::test_pointwise_bandwidth_maxat3_cuda_float32, test/inductor/test_analysis.py::TestAnalysisCUDA::test_triton_has_metadata_maxat0_cuda_float16, test/inductor/test_analysis.py::TestAnalysisCUDA::test_triton_has_metadata_maxat0_cuda_float32, test/inductor/test_analysis.py::TestAnalysisCUDA::test_triton_has_metadata_maxat0_cuda_float64 2025-10-10T01:32:09.5036859Z 2025-10-10T01:32:13.2555897Z Running inductor/test_triton_syntax 1/1 ... [2025-10-10 01:32:13.254954] 2025-10-10T01:32:13.2556650Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:32:13.2558745Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_triton_syntax.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:32:13.255412] 2025-10-10T01:32:20.4861231Z 2025-10-10T01:32:20.4862249Z inductor/test_triton_syntax 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_triton_syntax_1.1_96dd9f0f5728fc7a_.log 2025-10-10T01:32:20.4863616Z Running 1 items in this shard: test/inductor/test_triton_syntax.py::TestTritonSyntacticallyValid::test_triton_sqrt 2025-10-10T01:32:20.4864083Z 2025-10-10T01:32:24.3150862Z Running inductor/test_triton_extension_backend 1/1 ... 
[2025-10-10 01:32:24.314521] 2025-10-10T01:32:24.3151699Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:32:24.3154006Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_triton_extension_backend.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:32:24.314901] 2025-10-10T01:32:31.8941522Z 2025-10-10T01:32:31.8942882Z inductor/test_triton_extension_backend 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_triton_extension_backend_1.1_3354f39e07ca9e6b_.log 2025-10-10T01:32:31.8943733Z Running 0 items in this shard: 2025-10-10T01:32:31.8943929Z 2025-10-10T01:32:35.7580054Z Running inductor/test_utils 1/1 ... [2025-10-10 01:32:35.757429] 2025-10-10T01:32:35.7580782Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:32:35.7582553Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_utils.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:32:35.757855] 2025-10-10T01:32:39.6307340Z 2025-10-10T01:32:39.6308645Z inductor/test_utils 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_utils_1.1_4cc7b136308b4869_.log 2025-10-10T01:32:39.6312864Z Running 7 items in this shard: test/inductor/test_utils.py::TestUtilsCUDA::testSympySubs_cuda, test/inductor/test_utils.py::TestUtilsCUDA::test_flops_fx_cuda, test/inductor/test_utils.py::TestUtilsCUDA::test_get_device_tflops_cuda_bfloat16, test/inductor/test_utils.py::TestUtilsCUDA::test_get_device_tflops_cuda_float16, test/inductor/test_utils.py::TestUtilsCUDA::test_get_device_tflops_cuda_float32, test/inductor/test_utils.py::TestUtilsCUDA::test_sympy_str_cuda, test/inductor/test_utils.py::TestUtilsCUDA::test_zip_schema_cuda 2025-10-10T01:32:39.6316274Z 2025-10-10T01:32:43.4513279Z Running inductor/test_coordinate_descent_tuner 1/1 ... [2025-10-10 01:32:43.450757] 2025-10-10T01:32:43.4514030Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:32:43.4515152Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_coordinate_descent_tuner.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:32:43.451130] 2025-10-10T01:32:50.7312456Z 2025-10-10T01:32:50.7313458Z inductor/test_coordinate_descent_tuner 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_coordinate_descent_tuner_1.1_6a2934ec14281cf7_.log 2025-10-10T01:32:50.7316471Z Running 5 items in this shard: test/inductor/test_coordinate_descent_tuner.py::TestCoordinateDescentTuner::test_abs_function, test/inductor/test_coordinate_descent_tuner.py::TestCoordinateDescentTuner::test_get_neighbour_values, test/inductor/test_coordinate_descent_tuner.py::TestCoordinateDescentTuner::test_no_neighbors, test/inductor/test_coordinate_descent_tuner.py::TestCoordinateDescentTuner::test_persistent_reduction, test/inductor/test_coordinate_descent_tuner.py::TestCoordinateDescentTuner::test_value_too_large 2025-10-10T01:32:50.7318568Z 2025-10-10T01:32:54.6029549Z Running inductor/test_inplace_padding 1/1 ... [2025-10-10 01:32:54.602426] 2025-10-10T01:32:54.6030015Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:32:54.6032588Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_inplace_padding.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:32:54.602829] 2025-10-10T01:33:02.2335795Z 2025-10-10T01:33:02.2336706Z inductor/test_inplace_padding 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_inplace_padding_1.1_55b2e80ffac4002c_.log 2025-10-10T01:33:02.2339883Z Running 8 items in this shard: test/inductor/test_inplace_padding.py::InplacePaddingTest::test_linear_and_cel_max_autotune, test/inductor/test_inplace_padding.py::InplacePaddingTest::test_mutating_padding_input, test/inductor/test_inplace_padding.py::InplacePaddingTest::test_mutating_padding_output, test/inductor/test_inplace_padding.py::InplacePaddingTest::test_pad_non_zero, test/inductor/test_inplace_padding.py::InplacePaddingTest::test_pad_non_zero_cpp_wrapper, test/inductor/test_inplace_padding.py::InplacePaddingTest::test_pad_too_large, test/inductor/test_inplace_padding.py::InplacePaddingTest::test_skip_pad_due_to_fusion, test/inductor/test_inplace_padding.py::InplacePaddingTest::test_skip_pad_input 2025-10-10T01:33:02.2342463Z 2025-10-10T01:33:06.0843452Z Running inductor/test_template_heuristics_registry 1/1 ... [2025-10-10 01:33:06.083597] 2025-10-10T01:33:06.0844788Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:33:06.0846668Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_template_heuristics_registry.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:33:06.083972] 2025-10-10T01:33:11.3590957Z 2025-10-10T01:33:11.3592131Z inductor/test_template_heuristics_registry 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_template_heuristics_registry_1.1_61d7bb43414b2b23_.log 2025-10-10T01:33:11.3595104Z Running 5 items in this shard: test/inductor/test_template_heuristics_registry.py::TestTemplateHeuristicsRegistry::test_assertion_existing_class, test/inductor/test_template_heuristics_registry.py::TestTemplateHeuristicsRegistry::test_fallback_behavior, test/inductor/test_template_heuristics_registry.py::TestTemplateHeuristicsRegistry::test_hierarchy_lookup, test/inductor/test_template_heuristics_registry.py::TestTemplateHeuristicsRegistry::test_partial_hierarchy_scenarios, test/inductor/test_template_heuristics_registry.py::TestTemplateHeuristicsRegistry::test_register_class 2025-10-10T01:33:11.3597987Z 2025-10-10T01:33:15.2853096Z Running inductor/test_select_algorithm 1/1 ... [2025-10-10 01:33:15.284648] 2025-10-10T01:33:15.2853572Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:33:15.2854601Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_select_algorithm.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:33:15.285018] 2025-10-10T01:33:22.5142487Z 2025-10-10T01:33:22.5143523Z inductor/test_select_algorithm 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_select_algorithm_1.1_c9f1f2b80a4ec2ff_.log 2025-10-10T01:33:22.5151190Z Running 23 items in this shard: test/inductor/test_select_algorithm.py::TestSelectAlgorithm::test_TritonTemplateCaller_str, test/inductor/test_select_algorithm.py::TestSelectAlgorithm::test__int_mm, test/inductor/test_select_algorithm.py::TestSelectAlgorithm::test_addmm, test/inductor/test_select_algorithm.py::TestSelectAlgorithm::test_addmm_fp16, test/inductor/test_select_algorithm.py::TestSelectAlgorithm::test_baddbmm, test/inductor/test_select_algorithm.py::TestSelectAlgorithm::test_bmm, test/inductor/test_select_algorithm.py::TestSelectAlgorithm::test_convolution1, test/inductor/test_select_algorithm.py::TestSelectAlgorithm::test_convolution2, test/inductor/test_select_algorithm.py::TestSelectAlgorithm::test_convolution2_group, test/inductor/test_select_algorithm.py::TestSelectAlgorithm::test_convolution_as_mm, test/inductor/test_select_algorithm.py::TestSelectAlgorithm::test_convolution_as_mm_triton_only, test/inductor/test_select_algorithm.py::TestSelectAlgorithm::test_linear_relu, test/inductor/test_select_algorithm.py::TestSelectAlgorithm::test_mm, test/inductor/test_select_algorithm.py::TestSelectAlgorithm::test_mm_dropout, test/inductor/test_select_algorithm.py::TestSelectAlgorithm::test_mm_dup_args, test/inductor/test_select_algorithm.py::TestSelectAlgorithm::test_mm_dup_args_view, test/inductor/test_select_algorithm.py::TestSelectAlgorithm::test_mm_not_even_k, test/inductor/test_select_algorithm.py::TestSelectAlgorithm::test_mm_plus_mm, test/inductor/test_select_algorithm.py::TestSelectAlgorithm::test_mm_plus_mm2, test/inductor/test_select_algorithm.py::TestSelectAlgorithm::test_mm_plus_mm3, test/inductor/test_select_algorithm.py::TestSelectAlgorithm::test_mm_skip, 
test/inductor/test_select_algorithm.py::TestSelectAlgorithm::test_preprocessing_single_choice, test/inductor/test_select_algorithm.py::TestTemplateRender::test_finalized_subclass_hooks 2025-10-10T01:33:22.5158217Z 2025-10-10T01:33:26.4219254Z Running inductor/test_extension_backend 1/1 ... [2025-10-10 01:33:26.421317] 2025-10-10T01:33:26.4220047Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:33:26.4222139Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_extension_backend.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:33:26.421714] 2025-10-10T01:33:36.4577033Z 2025-10-10T01:33:36.4578573Z inductor/test_extension_backend 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_extension_backend_1.1_0691e68051c267cd_.log 2025-10-10T01:33:36.4580302Z Running 1 items in this shard: test/inductor/test_extension_backend.py::ExtensionBackendTests::test_open_device_registration 2025-10-10T01:33:36.4581063Z 2025-10-10T01:33:40.3553079Z Running inductor/test_inductor_scheduler 1/1 ... [2025-10-10 01:33:40.354707] 2025-10-10T01:33:40.3553712Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:33:40.3555249Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_inductor_scheduler.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:33:40.355116] 2025-10-10T01:33:47.6849869Z 2025-10-10T01:33:47.6851000Z inductor/test_inductor_scheduler 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_inductor_scheduler_1.1_5969c9d2b44781db_.log 2025-10-10T01:33:47.6854091Z Running 6 items in this shard: test/inductor/test_inductor_scheduler.py::TestSchedulerCUDA::test_disable_get_estimated_runtime_logging_cuda_float16, test/inductor/test_inductor_scheduler.py::TestSchedulerCUDA::test_disable_get_estimated_runtime_logging_cuda_float32, test/inductor/test_inductor_scheduler.py::TestSchedulerCUDA::test_flop_counter_op_options0_cuda_float16, test/inductor/test_inductor_scheduler.py::TestSchedulerCUDA::test_flop_counter_op_options0_cuda_float32, test/inductor/test_inductor_scheduler.py::TestSchedulerCUDA::test_flop_counter_op_options1_cuda_float16, test/inductor/test_inductor_scheduler.py::TestSchedulerCUDA::test_flop_counter_op_options1_cuda_float32 2025-10-10T01:33:47.6856577Z 2025-10-10T01:33:51.5582267Z Running inductor/test_padding 1/1 ... [2025-10-10 01:33:51.557617] 2025-10-10T01:33:51.5583371Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:33:51.5584386Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_padding.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:33:51.558027] 2025-10-10T01:33:58.8377191Z 2025-10-10T01:33:58.8378458Z inductor/test_padding 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_padding_1.1_540e9d8848716689_.log 2025-10-10T01:33:58.8397898Z Running 54 items in this shard: test/inductor/test_padding.py::PerfTestBetweenGoodAndBadShape::test_BertForMaskedLM, test/inductor/test_padding.py::PerfTestBetweenGoodAndBadShape::test_LinearAndSoftmax_both_shapes, test/inductor/test_padding.py::PerfTestBetweenGoodAndBadShape::test_nobias_LinearAndSoftmax_both_shapes, test/inductor/test_padding.py::PerfTestWithAndWithoutPadding::test_longformer, test/inductor/test_padding.py::PerfTestWithAndWithoutPadding::test_longformer_small_bs, test/inductor/test_padding.py::PerfTestWithAndWithoutPadding::test_nvidia_deeprecommender, test/inductor/test_padding.py::PaddingTest::test_LinearAndSoftmax_codegen, test/inductor/test_padding.py::PaddingTest::test_attention, test/inductor/test_padding.py::PaddingTest::test_cat, test/inductor/test_padding.py::PaddingTest::test_conv, test/inductor/test_padding.py::PaddingTest::test_dynamic_shape_padding_shape0_alignment_bytes_32_enable_pad_False, test/inductor/test_padding.py::PaddingTest::test_dynamic_shape_padding_shape1_alignment_bytes_32_enable_pad_True, test/inductor/test_padding.py::PaddingTest::test_dynamic_shape_padding_shape2_alignment_bytes_64_enable_pad_False, test/inductor/test_padding.py::PaddingTest::test_dynamic_shape_padding_shape3_alignment_bytes_64_enable_pad_True, test/inductor/test_padding.py::PaddingTest::test_dynamic_shape_padding_shape4_alignment_bytes_32_enable_pad_False, test/inductor/test_padding.py::PaddingTest::test_dynamic_shape_padding_shape5_alignment_bytes_32_enable_pad_True, test/inductor/test_padding.py::PaddingTest::test_dynamic_shape_padding_shape6_alignment_bytes_64_enable_pad_False, 
test/inductor/test_padding.py::PaddingTest::test_dynamic_shape_padding_shape7_alignment_bytes_64_enable_pad_True, test/inductor/test_padding.py::PaddingTest::test_matmul, test/inductor/test_padding.py::PaddingTest::test_mm_padding_perf, test/inductor/test_padding.py::PaddingTest::test_noop_concat_output_padding_shape0_alignment_bytes_32_pad_output_False, test/inductor/test_padding.py::PaddingTest::test_noop_concat_output_padding_shape1_alignment_bytes_32_pad_output_True, test/inductor/test_padding.py::PaddingTest::test_noop_concat_output_padding_shape2_alignment_bytes_64_pad_output_False, test/inductor/test_padding.py::PaddingTest::test_noop_concat_output_padding_shape3_alignment_bytes_64_pad_output_True, test/inductor/test_padding.py::PaddingTest::test_outer_dynamic_shape_padding_shape0_alignment_bytes_32_enable_pad_False, test/inductor/test_padding.py::PaddingTest::test_outer_dynamic_shape_padding_shape1_alignment_bytes_32_enable_pad_True, test/inductor/test_padding.py::PaddingTest::test_outer_dynamic_shape_padding_shape2_alignment_bytes_64_enable_pad_False, test/inductor/test_padding.py::PaddingTest::test_outer_dynamic_shape_padding_shape3_alignment_bytes_64_enable_pad_True, test/inductor/test_padding.py::PaddingTest::test_outer_dynamic_shape_padding_shape4_alignment_bytes_32_enable_pad_False, test/inductor/test_padding.py::PaddingTest::test_outer_dynamic_shape_padding_shape5_alignment_bytes_32_enable_pad_True, test/inductor/test_padding.py::PaddingTest::test_outer_dynamic_shape_padding_shape6_alignment_bytes_64_enable_pad_False, test/inductor/test_padding.py::PaddingTest::test_outer_dynamic_shape_padding_shape7_alignment_bytes_64_enable_pad_True, test/inductor/test_padding.py::PaddingTest::test_pad_3d_tensor, test/inductor/test_padding.py::PaddingTest::test_pad_channels_last, test/inductor/test_padding.py::PaddingTest::test_pad_outputs_alignment_bytes_128_shape0_float16, 
test/inductor/test_padding.py::PaddingTest::test_pad_outputs_alignment_bytes_128_shape0_float32, test/inductor/test_padding.py::PaddingTest::test_pad_outputs_alignment_bytes_128_shape1_float16, test/inductor/test_padding.py::PaddingTest::test_pad_outputs_alignment_bytes_128_shape1_float32, test/inductor/test_padding.py::PaddingTest::test_pad_outputs_alignment_bytes_32_shape0_float16, test/inductor/test_padding.py::PaddingTest::test_pad_outputs_alignment_bytes_32_shape0_float32, test/inductor/test_padding.py::PaddingTest::test_pad_outputs_alignment_bytes_32_shape1_float16, test/inductor/test_padding.py::PaddingTest::test_pad_outputs_alignment_bytes_32_shape1_float32, test/inductor/test_padding.py::PaddingTest::test_pad_strides, test/inductor/test_padding.py::PaddingTest::test_pad_strides_skip, test/inductor/test_padding.py::PaddingTest::test_padmm, test/inductor/test_padding.py::PaddingTest::test_perm_outer_dynamic_shape_padding_shape0_perm0_alignment_bytes_32_enable_pad_False, test/inductor/test_padding.py::PaddingTest::test_perm_outer_dynamic_shape_padding_shape1_perm1_alignment_bytes_32_enable_pad_True, test/inductor/test_padding.py::PaddingTest::test_perm_outer_dynamic_shape_padding_shape2_perm2_alignment_bytes_64_enable_pad_True, test/inductor/test_padding.py::PaddingTest::test_perm_outer_dynamic_shape_padding_shape3_perm3_alignment_bytes_64_enable_pad_False, test/inductor/test_padding.py::PaddingTest::test_perm_outer_dynamic_shape_padding_shape4_perm4_alignment_bytes_32_enable_pad_False, test/inductor/test_padding.py::PaddingTest::test_perm_outer_dynamic_shape_padding_shape5_perm5_alignment_bytes_32_enable_pad_True, test/inductor/test_padding.py::PaddingTest::test_perm_outer_dynamic_shape_padding_shape6_perm6_alignment_bytes_64_enable_pad_True, test/inductor/test_padding.py::PaddingTest::test_perm_outer_dynamic_shape_padding_shape7_perm7_alignment_bytes_64_enable_pad_False, test/inductor/test_padding.py::PaddingTest::test_view 2025-10-10T01:33:58.8417231Z 
2025-10-10T01:34:02.7789733Z Running inductor/test_codegen_triton 1/1 ... [2025-10-10 01:34:02.778430] 2025-10-10T01:34:02.7790518Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:34:02.7792476Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_codegen_triton.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:34:02.778834] 2025-10-10T01:34:09.8578500Z 2025-10-10T01:34:09.8579834Z inductor/test_codegen_triton 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_codegen_triton_1.1_785ab94d7e55a2f2_.log 2025-10-10T01:34:09.8581391Z Running 1 items in this shard: test/inductor/test_codegen_triton.py::TestCodegenTriton::test_config_of_sizearg 2025-10-10T01:34:09.8582369Z 2025-10-10T01:34:13.6566452Z Running inductor/test_torchinductor_codegen_dynamic_shapes 1/2 ... [2025-10-10 01:34:13.656086] 2025-10-10T01:34:13.6567194Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:34:13.6570226Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_codegen_dynamic_shapes.py', '-m', 'not serial', '--shard-id=1', '--num-shards=2', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:34:13.656454] 2025-10-10T01:44:02.4826210Z 2025-10-10T01:44:02.4828301Z inductor/test_torchinductor_codegen_dynamic_shapes 1/2 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_codegen_dynamic_shapes_1.2_7641e235d1e3f55b_.log 2025-10-10T01:44:02.5271056Z Running 874 items in this shard: test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_abs_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_adaptive_avg_pool1d_argmax_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_adaptive_avg_pool_with_output_size_0_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_adaptive_max_pool2d2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_adaptive_pool_errors_with_long_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_add_complex3_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_add_complex6_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_add_complex7_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_add_complex9_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_add_const_float_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_addmv_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_aliased_buffer_reuse_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_allow_reuse_active_if_under_peak_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_aoti_eager_support_out_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_aoti_eager_with_scalar_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_arange2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_arange3_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_arange4_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_arange5_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_arange6_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_argmax_argmin3_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_argmax_argmin_with_duplicates_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_argmax_min_int32_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_as_strided_on_views_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_as_strided_scatter_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_assert_alignment_op_name_fail_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_assert_alignment_op_name_pass_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_assert_size_stride_op_name_fail_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_assert_size_stride_op_name_pass_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_avg_pool2d1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_avg_pool2d2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_avg_pool2d3_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_avg_pool2d4_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_avg_pool2d5_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_avg_pool2d8_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_avg_pool2d_backward2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_avg_pool2d_backward3_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_avg_pool2d_backward_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_avg_pool3d_backward2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_avg_pool3d_backward3_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_avg_pool3d_backward4_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_avg_pool3d_backward_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_avg_pool_errors_with_uint_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_baddbmm_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_batch_norm_2d_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bernoulli1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bernoulli2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bfloat16_to_int16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bmm2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_both_scalars_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bucketize_computed_offsets_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bucketize_default_kwargs_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bucketize_int_int16_int16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bucketize_int_int16_uint8_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bucketize_int_int32_int16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bucketize_int_int32_int32_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bucketize_int_int64_int16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bucketize_int_int64_int32_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bucketize_int_int64_int8_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bucketize_int_int8_int32_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bucketize_int_int8_int64_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bucketize_int_int8_int8_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bucketize_int_int8_uint8_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bucketize_int_uint8_int16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bucketize_int_uint8_int32_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bucketize_int_uint8_int8_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bucketize_nd_tiling_True_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_buffer_copied_in_graph_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_builtins_round_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_builtins_round_float_ndigits_zero_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_builtins_round_int_ndigits_pos_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_builtins_round_int_ndigits_zero_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_cat_empty_index_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_cat_inplace_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_cat_negative_dim_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_cat_of_loops_and_extern_kernel_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_cat_unbacked_2d_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_cat_unbacked_empty_1d_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_cat_upcasting_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_chunk_recompiles_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_clamp_type_promotion_non_tensor_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_compar_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_complex_memory_overlap_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_config_option_dont_assume_alignment_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_config_option_dont_assume_alignment_recompiles_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_consecutive_split_cumprod_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_const_int32_to_float_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_constant_pad_2d_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_constant_pad_fill_dtype_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_constant_pad_nd_inplace_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_conv2d_channels_last_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_conv3d_channels_last_use_block_ptr_False_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_conv3d_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_conv_bn_fuse_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_conv_functional_bn_fuse_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_convolution4_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_copy_non_blocking_is_pinned_use_cat_False_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_cos_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_cpu_scalar_with_gpu_tensor_cpp_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_cpu_scalar_with_gpu_tensor_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_cpu_tensor_with_cpu_tensor_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_cpu_tensor_with_gpu_tensor_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_cudnn_rnn_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_cumsum_no_mask_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_cumsum_pattern_matcher_issue_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_custom_op_default_layout_constraint_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_custom_op_fixed_layout_sequential_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_custom_scan_would_split_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_data_type_propogation_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_deterministic_codegen_with_suffix_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_device_assert_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_diagonal_copy_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dist_bf16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_div1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_div3_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_div4_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_div6_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_div7_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_div9_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_div_by_zero_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_div_zero_dim_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dropout_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dropout_trivial_1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtype_mismatch_issue_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_bfloat16_bfloat16_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_bfloat16_float32_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_bfloat16_int64_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_bfloat16_int8_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_float16_float16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_float16_int16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_float16_int32_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_float32_float16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_float32_float64_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_float32_int16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_float32_int32_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_float32_int64_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_float32_int8_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_float32_uint8_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_float64_float16_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_float64_float32_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_float64_float64_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_float64_int8_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int16_float32_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int16_float64_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int16_int16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int16_int32_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int32_bfloat16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int32_float16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int32_float32_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int32_float64_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int32_int16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int32_int64_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int64_bfloat16_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int64_int32_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int64_int64_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int8_bfloat16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int8_float16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int8_float32_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int8_float64_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int8_int64_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int8_int8_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int8_uint8_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_uint8_bfloat16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_uint8_float64_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_embedding_bag_byte_unpack_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_embedding_bag_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_empty_strided_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_exact_stride_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_exp2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_exp_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_expand_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_fallback_mutable_op_basic_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_fallback_mutable_op_list_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_fallback_mutable_op_list_tensor_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_fft_real_input_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_flip_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_float32_to_int32_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_fmin_fmax_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_fractional_max_pool2d1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_fractional_max_pool2d2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_full_like_sliced_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_full_like_transposed_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_functionalize_rng_wrappers_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_fuse_tiled_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_gather1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_gather2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_gather_scatter_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_gelu_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_generated_code_has_alignment_assert_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_gpu_scalar_with_cpu_tensor_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_gpu_scalar_with_gpu_tensor_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_graph_partition_constant_tensor2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_graph_partition_mutation_real_name_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_graph_partition_no_inputs_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_graph_partition_unbacked_symint_as_output_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_grid_sampler_2d_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_grid_sampler_expand_preserves_view_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_hardtanh_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_horizonal_fusion1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_horizonal_fusion2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_index2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_index3_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_index_dynamic_shapes_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_index_propagation_abs_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_index_propagation_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_index_propagation_flip_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_index_put2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_index_put4_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_index_put_as_masked_fill_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_index_put_failed_reinplace_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_index_put_index_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_index_select_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_indirect_load_broadcast_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_inductor_multiple_specializations_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_inner_fn_str_and_stride_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_inplace_flip_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_input_mutation2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_input_mutation5_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_int8_weight_only_quant_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_invalid_operand_issue1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_isin_tensor_scalar_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_isinf2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_l1_loss_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_large_grid_use_block_ptr_True_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_large_pointwise_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_lerp_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_lgamma_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_like_channels_last_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_like_rands_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_like_rands_sliced_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_linear1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_linear_dynamic_maxautotune_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_linspace1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_linspace3_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_linspace4_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_logaddexp_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_logcumsumexp_zero_dim_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_logsumexp_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_long_tensor_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_low_memory_max_pool_dilation_1_dim_3_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_mark_unbacked_with_hint_override_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_masked_fill_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_masked_fill_promotion_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_masked_scatter_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_matmul_layer_norm_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_max_min_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_max_pool2d4_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_max_pool2d5_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_max_pool2d7_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_max_pool2d_with_indices_backward2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_max_pool2d_with_indices_backward4_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_max_pool2d_with_indices_backward_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_mean_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_misaligned_address_issue1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_mix_device_index_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_mixed_mm_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_move_arange_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_mul_index_expr_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_multi_gpu_device_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_multi_gpu_recompile_on_index_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_multi_threading_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_multilayer_prime_size_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_multilayer_var_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_multilayer_var_lowp_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_mutable_custom_op_fixed_layout_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_nan_sort_stable_False_descending_False_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_nan_sort_stable_False_descending_True_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_nan_to_num_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_narrow_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_new_empty_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_no_mega_fusion_during_lowering_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_no_op_reduction_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_no_specization_over_symbolic_value_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_nonzero_unbacked_refinement_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_output_strides_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pattern_matcher_multi_user_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_permute1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_permute2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_philox_rand_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pixel_shuffle_channels_last_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_airy_ai_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_bessel_j1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_bessel_y1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_chebyshev_polynomial_t_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_chebyshev_polynomial_v_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_digamma_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_entr_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_erfcx_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_erfinv_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_expit_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_gammainc_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_gammaln_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_hermite_polynomial_he_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_i1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_laguerre_polynomial_l_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_legendre_polynomial_p_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_log1p_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_log_ndtr_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_modified_bessel_i1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_modified_bessel_k0_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_modified_bessel_k1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_ndtri_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_psi_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_round_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_scaled_modified_bessel_k0_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_shifted_chebyshev_polynomial_t_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_shifted_chebyshev_polynomial_u_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_shifted_chebyshev_polynomial_w_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_xlog1py_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_polar_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pow1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pow2_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pow_by_natural_log2_dynamic_shapes_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pow_symfloat_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_prod_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_profiler_mark_wrapper_call_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_randint_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_randint_int64_mod_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_randint_kernel_count_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_randn_generator_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_randn_like_empty_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_reflection_pad2d_backward_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_reflection_pad2d_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_reinterpret_dtypeview_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_remove_noop_clone_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_remove_noop_copy_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_remove_noop_slice_scatter_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_remove_noop_view_dtype_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_repeat_interleave_2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_repeat_interleave_Tensor_decomp_int32_nd_1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_repeat_interleave_Tensor_decomp_int64_nd_1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_repeat_interleave_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_replication_pad_errors_with_bool_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_require_stride_expanded_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_reuse_buffers_with_aliasing_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_roll_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_round_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_rsqrt_dynamic_shapes_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_scalar_cpu_tensor_arg_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_scalar_output_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_scaled_dot_product_attention_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_scatter1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_scatter4_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_scatter5_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_scatter_add1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_scatter_add2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_scatter_reduce1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_scatter_reduce2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_sdpa_prefer_nd_tiling_False_use_block_ptr_False_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_sdpa_prefer_nd_tiling_False_use_block_ptr_True_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_sdpa_unaligned_mask_freezing_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_searchsorted_broadcast_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_searchsorted_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_select_scatter_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_sgn_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_shape_padding_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_shape_prop_torch_ones_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_signbit_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_silu_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_simplify_loops_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_single_elem_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_sizehint_issue1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_slice1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_slice2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_slice3_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_slice4_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_slice_mutation3_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_slice_scatter2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_slice_scatter3_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_slice_scatter4_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_slice_scatter5_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_slice_scatter_dtype_consistency_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_slice_scatter_reinplace_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_slice_view_with_graph_break_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_sort_bool_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_sort_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_sort_transpose_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_special_polygamma_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_split_cumprod_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_split_cumprod_low_prec_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_split_cumsum_index_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_split_cumsum_low_prec_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_split_failed_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_split_reduction_dynamic_shape_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_split_reduction_with_int64_size_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_split_with_integer_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_split_with_list_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_split_with_sizes_with_unbacked_symints_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_split_with_unbacked_symints_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_sqrt_dynamic_shapes_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_stack_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_std_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_stride_preservation_with_stride_modifying_fx_pass_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_strided_inputs_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_sum1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_sum2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_sum5_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_sum_dtype_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_sum_keepdims_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_tensor3_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_tensor_index_slice_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_tmp_not_defined_issue1_use_block_ptr_True_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_tmp_not_defined_issue2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_to_device_constant_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_to_memory_format_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_topk_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_transpose_add_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_triton_kernel_bool_param_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_uint4x2_mixed_mm_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_unfold_zero_dimension_tensor_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_unroll_small_reduction_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_unsigned_constant_tensors_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_unspec_inputs_float16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_unspec_inputs_float64_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_unspec_inputs_int16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_unspec_inputs_int32_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_unspec_inputs_int8_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_upsample_bilinear2d_a_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_upsample_bilinear2d_b_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_upsample_cat_conv_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_upsample_nearest2d_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_upsample_nearest3d_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_var_mean_div_by_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_vdd_clamp_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_vectorized_ops_masked_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_vertical_fusion1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_view_as_real_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_view_on_aliased_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_view_uint8_through_differing_bitwidths_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_views1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_views3_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_views4_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_zero_dim_reductions_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_zero_element_mutation_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_zeros_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test__dyn_quant_matmul_4bit_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_abs_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_adaptive_avg_pool1d_argmax_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_adaptive_avg_pool2d1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_adaptive_avg_pool2d2_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_adaptive_avg_pool2d_low_prec_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_adaptive_max_pool2d2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_add_complex6_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_add_complex8_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_add_const_float_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_add_const_int_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_adding_tensor_offsets_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_addmm_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_addmv_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_aliased_buffer_reuse_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_allow_reuse_active_if_under_peak_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_allow_reuse_disable_if_exceed_peak_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_angle_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_aoti_eager_override_registration_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_aoti_eager_support_str_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_aoti_eager_with_persistent_cache_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_aoti_eager_with_scalar_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_arange5_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_argmax_argmin1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_argmax_argmin2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_argmax_argmin_with_duplicates_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_argmax_argmin_with_nan_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_argmax_min_int32_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_argmax_to_float_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_as_strided_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_as_strided_scatter_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_assert_alignment_op_name_fail_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_assert_alignment_op_name_pass_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_assert_size_stride_op_name_fail_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_assert_size_stride_op_name_pass_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_avg_pool2d3_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_avg_pool2d5_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_avg_pool2d6_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_avg_pool2d7_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_avg_pool2d8_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_avg_pool2d_backward2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_avg_pool2d_backward3_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_avg_pool3d_backward4_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_avg_pool3d_backward_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_avg_pool_errors_with_uint_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_baddbmm_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_batch_norm_2d_2_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_batch_norm_2d_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bernoulli1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bernoulli2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bitwise2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bitwise3_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bool_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_both_scalars_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_add_autotune_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_broadcast_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_computed_offsets_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_int_int16_int16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_int_int16_int64_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_int_int16_uint8_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_int_int32_int16_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_int_int32_int64_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_int_int32_int8_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_int_int32_uint8_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_int_int64_int16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_int_int64_uint8_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_int_int8_int16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_int_int8_int64_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_int_uint8_int16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_int_uint8_int32_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_int_uint8_int64_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_nd_tiling_True_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_buffer_copied_in_graph_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_buffer_copied_in_graph_with_different_shapes_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_builtins_round_float_ndigits_neg_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_builtins_round_int_ndigits_pos_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_cat_empty_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_cat_empty_index_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_cat_inplace_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_cat_unbacked_empty_1d_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_cat_upcasting_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_clamp_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_clamp_type_promotion_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_clone_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_compar_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_complex_from_real_imag_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_complex_memory_overlap_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_config_option_dont_assume_alignment_cudagraphs_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_consecutive_split_cumprod_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_const_int32_to_float_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_constant_pad_2d_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_constant_pad_2d_strides_nonpositive_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_constant_pad_3d_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_constant_pad_fill_dtype_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_conv1d_depthwise_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_conv2d_backward_channels_last_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_conv3d_channels_last_use_block_ptr_True_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_conv_backward_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_conv_functional_bn_fuse_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_convolution3_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_convolution5_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_copy_with_scalar_src_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_cos_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_cpu_scalar_with_gpu_tensor_cpp_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_cpu_scalar_with_gpu_tensor_dynamic_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_cpu_scalar_with_gpu_tensor_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_cpu_tensor_with_cpu_tensor_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_cpu_tensor_with_gpu_tensor_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_cudnn_rnn_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_cumsum_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_cumsum_inf_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_cumsum_no_mask_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_cumsum_pattern_matcher_issue_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_cumsum_zero_dim_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_custom_op_1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_custom_op_3_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_custom_op_default_layout_constraint_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_custom_op_fixed_layout_channels_last_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_custom_op_unbacked_symints_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_custom_scan_op_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_data_type_propogation_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_device_assert_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_div4_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_div5_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_div9_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_div_precision_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_div_softmax_symfloat_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_div_zero_dim_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dont_constant_fold_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dropout2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dropout3_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dropout_deterministic_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dropout_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtype_sympy_expr_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_bfloat16_float16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_bfloat16_int32_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_bfloat16_int64_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_float16_bfloat16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_float16_float64_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_float32_bfloat16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_float32_float64_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_float32_int32_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_float32_int8_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_float32_uint8_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_float64_bfloat16_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_float64_float16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_float64_float64_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_float64_int16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_float64_int32_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_float64_int64_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_float64_int8_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int16_float16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int16_float32_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int16_int16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int16_int32_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int32_int16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int32_int64_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int32_uint8_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int64_bfloat16_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int64_float16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int64_float32_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int64_int64_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int64_uint8_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int8_float16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int8_float32_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int8_int16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int8_int32_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_uint8_float64_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_uint8_uint8_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_embedding_bag_byte_unpack_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_embedding_bag_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_erfc_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_exact_stride_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_exp_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_expand_as_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_expand_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_expanded_reduction_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_expm1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_fallback_mutable_op_basic_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_fallback_mutable_op_list_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_fill2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_flip_cat_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_flip_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_float16_to_int16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_float32_to_int32_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_float_repr_dynamic_shapes_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_fmod_zero_dim_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_forced_buffer_realize_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_fractional_max_pool2d1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_fractional_max_pool2d3_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_fractional_max_pool2d5_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_full_boolean_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_full_like_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_full_like_sliced_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_full_like_transposed_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_full_truncation_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_functionalize_rng_wrappers_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_fuse_tiled_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_gather1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_gather_scatter_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_generated_code_has_size_stride_assert_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_glu_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_graph_partition_arange1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_graph_partition_arange2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_graph_partition_both_scalars_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_graph_partition_constant_tensor1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_graph_partition_constant_tensor2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_graph_partition_mutation_real_name_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_graph_partition_pad_dynamic_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_graph_partition_scalar_inputs_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_grid_sampler_expand_preserves_view_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_hardtanh_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_horizonal_fusion1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_index1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_index_dynamic_shapes_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_index_propagation_abs_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_index_propagation_nested_indirect_indexing_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_index_put2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_index_put3_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_index_put4_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_index_put_as_masked_fill_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_index_put_failed_reinplace_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_index_put_index_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_index_put_reinplace_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_index_remainder_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_indirect_load_broadcast_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_inductor_assert_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_inductor_layout_optimization_input_mutations_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_inplace_activations_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_inplace_flip_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_inplace_resize_as_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_inplace_where_pointwise_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_input_mutation1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_input_mutation2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_input_mutation3_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_input_mutation4_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_input_mutation5_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_insignificant_strides_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_int8_weight_only_quant_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_isin_tensor_scalar_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_isinf_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_kwargs_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_large_offset_pointwise_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_layer_norm_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_leaky_relu_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_like_channels_last_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_like_rands3_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_like_rands_sliced_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_linear_dynamic_maxautotune_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_linspace1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_linspace3_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_linspace4_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_log1p_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_log_fp64_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_log_softmax_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_logcumsumexp_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_logcumsumexp_zero_dim_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_logsumexp_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_long_tensor_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_mark_unbacked_with_hint_override_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_masked_fill_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_masked_fill_promotion_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_max_pool2d1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_max_pool2d2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_max_pool2d5_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_max_pool2d7_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_max_pool2d8_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_max_pool2d_with_indices_backward2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_max_pool2d_with_indices_backward4_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_max_pool2d_with_indices_backward5_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_min_max_reduction_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_min_max_reduction_nan_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_misaligned_address_issue1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_mix_device_index_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_mixed_mm2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_mixed_mm_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_mm_views_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_move_arange_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_mul_softmax_symfloat_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_multi_device_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_multi_gpu_recompile_on_index_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_multilayer_any_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_multilayer_prime_size_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_multilayer_sum_low_prec_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_multilayer_var_lowp_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_mutable_custom_op_fixed_layout_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_mutations_loop_fusion_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_nan_sort_stable_False_descending_False_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_nan_sort_stable_True_descending_False_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_nan_to_num_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_narrow_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_neg_index_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_neg_max_uint8_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_new_ones_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_no_mega_fusion_during_lowering_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_no_op_reduction_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_nonzero_unbacked_refinement_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_norm_constant_overflow_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_one_hot_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_output_strides_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pad_cast_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pad_single_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pad_view_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pattern_matcher_unbacked_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_permute2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_philox_rand_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_airy_ai_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_bessel_y1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_chebyshev_polynomial_u_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_chebyshev_polynomial_v_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_digamma_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_entr_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_erf_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_erfcx_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_exp2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_gammainc_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_hermite_polynomial_he_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_i0e_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_laguerre_polynomial_l_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_legendre_polynomial_p_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_modified_bessel_i0_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_modified_bessel_i1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_modified_bessel_k1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_ndtr_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_ndtri_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_polygamma_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_scaled_modified_bessel_k1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_shifted_chebyshev_polynomial_u_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_sinc_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_spherical_bessel_j0_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_xlogy_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_prod_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_profiler_mark_wrapper_call_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_randint_distribution_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_randint_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_randint_int64_mod_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_randint_kernel_count_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_reduction3_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_reduction4_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_reduction5_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_reduction_config_limit_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_reflection_pad2d_backward_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_reinterpret_dtypeview_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_remainder_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_remove_noop_clone_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_remove_noop_copy_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_remove_noop_slice1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_remove_noop_slice_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_remove_noop_slice_scatter_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_remove_noop_view_default_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_repeat_interleave_Tensor_decomp_int64_nd_1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_repeat_interleave_Tensor_decomp_int64_nd_2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_repeat_interleave_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_replication_pad_errors_with_bool_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_roi_align_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_round_correctness_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_round_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_rsqrt_dynamic_shapes_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_scaled_dot_product_attention_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_scatter1_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_scatter4_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_scatter6_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_scatter_add1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_scatter_bf16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_scatter_reduce2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_scatter_reduce3_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_sdpa_prefer_nd_tiling_False_use_block_ptr_False_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_sdpa_prefer_nd_tiling_False_use_block_ptr_True_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_sdpa_prefer_nd_tiling_True_use_block_ptr_True_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_sdpa_unaligned_mask_freezing_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_searchsorted_broadcast_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_setitem_with_int_parameter_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_shape_padding_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_should_pad_bench_for_bmm_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_sigmoid_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_sign_dtype_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_signbit_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_simplify_loops_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_slice1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_slice4_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_slice_mutation1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_slice_scatter3_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_slice_scatter4_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_slice_scatter5_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_slice_scatter_dtype_consistency_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_slice_view_with_graph_break_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_softmax_one_kernel_loop_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_sort_transpose_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_special_polygamma_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_split_cumsum_index_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_split_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_split_reduction_dynamic_shape_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_split_reduction_with_int64_size_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_split_with_list_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_split_with_sizes_with_unbacked_symints_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_split_with_unbacked_symints_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_squeeze_varargs_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_stack_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_std_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_stride_preservation_with_stride_modifying_fx_pass_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_strided_inputs_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_sum2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_sum_dtype_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_sum_int_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_sum_keepdims_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_tan_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_tensor1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_tensor2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_tensor3_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_tensor_index_put_slice_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_to_device_constant_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_to_device_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_to_memory_format_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_topk_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_transposed_propagates_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_uint_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_unbacked_floordiv_simplify_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_unbind_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_unsigned_constant_tensors_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_unspec_inputs_int32_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_unspec_inputs_int64_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_unspec_inputs_int8_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_upsample_bicubic2d_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_upsample_bilinear2d_b_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_upsample_cat_conv_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_upsample_nearest1d_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_upsample_nearest2d_backward_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_upsample_nearest3d_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_var_mean_div_by_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_var_mean_tile_reduction_True_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_vdd_clamp_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_vectorized_ops_masked_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_vertical_fusion1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_views1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_views5_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_views6_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_where_broadcast_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_xblock_divides_xnumel_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_zero_element_mutation_dynamic_shapes_cuda 2025-10-10T01:44:02.5704789Z 2025-10-10T01:44:03.8429113Z Uploading artifacts took 1.36 seconds 2025-10-10T01:44:06.3471884Z Running export/test_export_training_ir_to_run_decomp 1/1 ... [2025-10-10 01:44:06.346555] 2025-10-10T01:44:06.3472494Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:44:06.3473705Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_export_training_ir_to_run_decomp.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:44:06.346924] 2025-10-10T01:44:15.2299510Z 2025-10-10T01:44:15.2300950Z export/test_export_training_ir_to_run_decomp 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_export_training_ir_to_run_decomp_1.1_a47bfa21cc01caa5_.log 2025-10-10T01:44:15.3036379Z Running 866 items in this shard: test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestDynamismExpression::test_export_assume_static_by_default_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestDynamismExpression::test_export_constraints_error_not_in_range_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestDynamismExpression::test_export_constraints_error_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestDynamismExpression::test_export_inline_constraints_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestDynamismExpression::test_export_slice_maxsize_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestDynamismExpression::test_export_slice_unbacked_dim1_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestDynamismExpression::test_export_strict_narrow_unbacked_expr_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestDynamismExpression::test_no_grad_param_inplace_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestDynamismExpression::test_reshape_view_backed_size_oblivious_training_ir_to_decomp_strict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestDynamismExpression::test_export_assume_static_by_default_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestDynamismExpression::test_export_constraints_error_not_in_range_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestDynamismExpression::test_export_constraints_error_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestDynamismExpression::test_export_inline_constraints_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestDynamismExpression::test_export_slice_maxsize_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestDynamismExpression::test_export_slice_unbacked_dim1_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestDynamismExpression::test_export_strict_narrow_unbacked_expr_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestDynamismExpression::test_no_grad_param_inplace_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestDynamismExpression::test_reshape_view_backed_size_oblivious_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test__scaled_dot_product_flash_attention_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_additional_inputs_constants_training_ir_to_decomp_strict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_allow_explicit_guards_as_runtime_asserts_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_args_type_checked_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_aten_lift_fresh_copy_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_attention_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_attr_assignment_extra_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_automatic_constrain_size_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_automatic_dynamic_shapes_constant_relation_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_automatic_dynamic_shapes_linear_relation_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_automatic_dynamic_shapes_simple_equality_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_baddbmm_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_basic_non_strict_fake_tensor_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_basic_non_strict_real_tensor_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_basic_training_ir_to_decomp_strict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_bincount_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_buffer_util_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_capture_subclass_constructor_torch_ir_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_capture_subclass_constructor_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_capture_subclass_wrong_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_ccode_python_mod_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_cdist_forward_compute_mode_zero_export_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_check_specialized_int_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_checks_to_constrain_range_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_cleanup_dynamic_markers_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_colin_unbacked_backed_vr_sub_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_colon_parameter_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_compiling_state_training_ir_to_decomp_strict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_cond_access_identical_symint_closure_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_cond_branches_return_constant_int_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_cond_branches_return_same_int_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_cond_buffers_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_cond_contains_unbacked_no_escape_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_cond_int_closure_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_cond_unflatten_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_cond_with_module_stack_export_with_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_cond_with_module_stack_export_with_unflatten_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_constant_aliasing_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_constant_input_naming_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_constant_no_user_inp_training_ir_to_decomp_strict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_constant_output_dup_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_constant_output_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_constant_requires_grad_const_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_constant_return_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_constant_tensor_mutation_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_constant_tensor_with_non_functional_nested_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_constant_tensor_with_non_functional_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_constrain_decomp_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_constrain_size_in_eager_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_constrain_size_with_constrain_value_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_constrain_size_with_various_cases_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_conv_dynamic_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_crop_like_training_ir_to_decomp_strict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_cse_for_symint_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_custom_op_auto_functionalize_pre_dispatch_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_custom_op_auto_functionalize_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_custom_op_auto_warn_pre_dispatch_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_custom_op_preserve_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_custom_pytree_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_custom_tag_metadata_re_export_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_decomp_batch_norm_functional_predispatch_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_decomp_item_in_prim_after_decomposition_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_decomp_item_in_prim_before_decomposition_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_default_decomposition_core_cia_ops_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_derived_dim_1_2_training_ir_to_decomp_strict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_derived_dim_basic_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_derived_dim_integer_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_derived_dim_nested_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_derived_dim_out_of_order_repeat_derived_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_derived_dim_out_of_order_simplified_repeat_non_derived_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_derived_dim_out_of_order_simplified_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_derived_dim_out_of_order_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_derived_dim_repeat_derived_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_detect_leak_nonstrict_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_detect_leak_nonstrict_with_stacktrace_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_detect_leak_strict_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_device_to_dynamic_training_ir_to_decomp_strict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_device_to_gpu_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_device_to_mutation_float_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_device_to_mutation_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_device_to_static_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_dim_1_2_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_dim_auto_and_dim_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_dim_dynamic_divisibility_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_dim_dynamic_specialization_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_dim_dynamic_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_dim_hint_range_violations_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_dim_hint_ranges_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_disable_forced_specializations_errors_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_disable_forced_specializations_ok_training_ir_to_decomp_strict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_distributed_all_gather_into_tensor_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_distributed_all_gather_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_distributed_all_reduce_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_distributed_all_to_all_single_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_distributed_reduce_scatter_tensor_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_dont_duck_size_for_auto_dynamic_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_double_lifted_constants_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_draft_export_checks_aliasing_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_draft_export_checks_mutation_list_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_draft_export_checks_mutation_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_draft_export_checks_mutation_with_nan_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_draft_export_fake_kernel_inference_errors_training_ir_to_decomp_strict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_draft_export_infers_fake_kernel_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_duplicate_modules_with_non_persistent_buffers_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_dynamic_lr_shift_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_dynamic_shapes_bounds_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_dynamic_shapes_builder_basic_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_dynamic_shapes_builder_kwargs_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_dynamic_shapes_builder_pytree_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_dynamic_shapes_dataclass_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_dynamic_shapes_inferred_basic_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_dynamic_shapes_serdes_generic_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_dynamic_shapes_serdes_user_errors_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_dynamic_shapes_serdes_various_training_ir_to_decomp_strict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_dynamic_shapes_spec_with_pytree_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_dynamic_shapes_wrapped_with_shape_guards_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_dynamic_sym_round_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_ends_of_bounds_oblivious_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_error_does_not_reference_eager_fallback_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_error_when_passing_mutating_primitive_op_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_exception_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_expand_copy_export_handles_implicit_true_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_export_api_with_dynamic_shapes_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_export_as_backend_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_export_associative_scan_lifted_buffers_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_export_associative_scan_symbol_dim_training_ir_to_decomp_strict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_export_associative_scan_symbol_scandim_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_export_aten_to_unflatten_subclass_pre_dispatch_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_export_aten_to_unflatten_subclass_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_export_aten_to_unflatten_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_export_cond_preserve_torch_fn_for_subgraphs_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_export_cond_symbool_pred_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_export_cond_warns_constant_pred_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_export_custom_decomp_table_basic_pop_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_export_custom_decomp_table_container_methods_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_export_custom_op_lib_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_export_custom_triton_kernel_mutable_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_export_custom_triton_kernel_training_ir_to_decomp_strict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_export_cyclic_reference_leak_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_export_decomp_torture_case_1_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_export_decomp_torture_case_2_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_export_decomps_dynamic_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_export_decomps_simple_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_export_dynamo_config_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_export_for_training_run_decomp_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_export_for_training_with_container_type_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_export_for_training_with_dynamic_shapes_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_export_for_training_with_mutation_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_export_for_training_with_state_dict_hooks_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_export_func_with_default_kwargs_training_ir_to_decomp_strict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_export_func_with_keyword_only_args_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_export_func_with_kwargs_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_export_func_with_pytree_kwargs_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_export_func_with_var_keyword_args_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_export_func_with_var_keyword_pytree_args_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_export_func_with_var_postional_args_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_export_function_schema_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_export_graph_with_no_inputs_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_export_input_mutation_bug_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_export_input_mutation_dynamic_shape_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_export_input_mutation_static_shape_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_export_leak_compile_training_ir_to_decomp_strict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_export_linear_preserve_dynamic_shape_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_export_max_nonstrict_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_export_max_onnx_reported_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_export_method_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_export_mod_constraints_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_export_module_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_export_preserve_linear_at_aot_level_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_export_preserve_linear_but_not_custom_op_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_export_rnn_variants_with_warning_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_export_scan_pytree_output_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_export_script_module_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_export_statically_known_true_training_ir_to_decomp_strict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_export_then_compile_tensor_ctor_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_export_with_autocast_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_export_with_fake_tensor_inputs_on_cuda_devices_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_export_with_fake_tensor_inputs_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_export_with_inline_constraints_complex_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_export_with_inline_constraints_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_export_with_set_grad_enabled_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_export_with_wrong_inputs_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_external_call_non_strict_real_tensor_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_fake_inputs_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_fake_weights_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_filter_traceback_frames_training_ir_to_decomp_strict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_flex_attention_export_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_float_conversion_from_int_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_float_conversion_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_fqn_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_from_node_metadata_export_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_full_on_scalar_tensor_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_function_holding_tensor_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_hints_wrapper_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_hoo_inline_users_issue_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_if_functional_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_if_post_autograd_op_preserved_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_inductor_backend_inside_nonstrict_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_inline_script_class_method_recursive_training_ir_to_decomp_strict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_inline_script_class_method_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_inline_script_function_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_inline_script_method_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_int_shape_specialization_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_intermediate_shape_comp_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_is_exporting_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_is_nonzero_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_isnonzero_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_issue_113041_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_issue_157289_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_issue_161902_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_istft_op_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_keep_composite_ops_invalid_training_ir_to_decomp_strict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_keep_composite_ops_linear_convd_for_training_ir_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_keep_composite_ops_linear_convd_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_kwarg_dynamic_shapes_diff_order_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_kwargs_reorder_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_layer_norm_unbacked_normalized_shape_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_layer_sharing_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_lazy_module_kwargs_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_lifted_constants_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_linear_conv_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_malformed_fqn_from_source_name_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_map_buffers_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_map_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_mask_nonzero_static_training_ir_to_decomp_strict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_masked_select_dynamic_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_math_pow_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_mismatched_dynamic_shapes_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_mixed_input_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_module_dict_key_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_module_input_subclasses_parameterization_nested_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_module_input_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_module_list_slice_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_module_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_module_with_dict_container_inp_out_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_modules_access_for_deleted_submodule_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_more_multidimensional_slicing_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_multidimensional_slicing_training_ir_to_decomp_strict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_multinomial_dynamic_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_multiple_definitions_same_name_dim_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_namedtuple_input_export_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_native_multi_attention_head_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_nested_dynamic_shapes_spec_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_nested_module_fake_tensor_leak_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_nested_module_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_nested_module_with_constant_buffer_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_nested_module_with_init_buffer_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_nested_module_with_parameter_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_nn_module_stack_shared_submodule_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_nn_module_stack_training_ir_to_decomp_strict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_no_check_is_size_error_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_no_suggested_fixes_for_data_dependent_errors_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_no_tensor_computation_2_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_no_tensor_computation_3_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_no_tensor_computation_4_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_no_tensor_computation_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_non_arg_name_dynamic_shapes_api_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_non_arg_name_dynamic_shapes_api_with_container_type_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_non_arg_name_dynamic_shapes_api_with_kwarg_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_non_persistent_buffer_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_non_strict_dynamic_shapes_suggested_fixes_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_non_strict_dynamic_shapes_training_ir_to_decomp_strict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_none_buffers_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_nonstrict_retrace_preserves_metadata_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_nonzero_2_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_nonzero_dynamic_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_not_registered_parameter_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_operator_aten_tensor_mode_variant_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_output_node_name_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_pad_sequence_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_param_util_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_partial_patched_forward_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_placeholder_naming_collisions_hoo_subgraphs_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_placeholder_naming_collisions_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_placeholder_naming_order_training_ir_to_decomp_strict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_placeholder_naming_order_variadic_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_placeholder_update_preserving_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_predispatch_cond_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_predispatch_grad_wrappers_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_preserve_annotation_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_preserve_module_call_signature_unflatten_specialization_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_preserve_requires_grad_placeholders_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_preserve_shape_dynamism_for_unused_inputs_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_profiling_code_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_python_asserts_with_sym_int_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_pytree_register_data_class_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_pytree_register_nested_data_class_training_ir_to_decomp_strict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_raise_user_error_when_guard_on_data_dependent_operation_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_range_constraints_with_replacement_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_real_tensor_alias_dtype_mismatch_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_real_tensor_bool_cast_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_real_tensor_errors_on_aliasing_custom_op_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_real_tensor_for_max_op_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_real_tensor_size_mismatch_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_redundant_assert_max_upper_bound_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_redundant_asserts_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_refine_dynamic_shapes_from_suggested_fixes_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_register_constant_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_repeat_interleave_training_ir_to_decomp_strict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_replace_unbacked_with_very_large_upperbound_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_replaced_unbacked_bindings_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_reshape_view_helper_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_retracable_ep_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_retrace_pre_autograd_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_run_decomposition_supports_user_input_mutation_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_run_decompositions_keep_metadata_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_run_decompositions_keep_tensor_constant_metadata_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_runtime_assert_for_prim_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_runtime_assert_for_prm_str_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_runtime_assert_with_size_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_sdpa_gqa_training_ir_to_decomp_strict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_sequential_slicing_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_set_example_inputs_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_set_grad_as_side_effect_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_set_grad_empty_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_set_grad_unflatten_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_setgrad_lifted_tensor_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_shared_submodule_nn_module_stack_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_simple_export_for_training_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_simple_unbacked_view_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_size_input_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_slice_nn_module_stack_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_solver_unsupported_sympy_function_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_specialize_derived_dim_roots_training_ir_to_decomp_strict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_split_const_gm_with_lifted_constants_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_stack_trace_make_fx_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_stack_trace_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_state_primitives_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_state_shape_attribute_assignment_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_state_tensors_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_static_dim_constraints_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_subclass_context_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_subclass_nested_attr_access_complicated_metadata_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_subclass_nested_attr_access_const_metadata_not_top_level_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_subclass_nested_attr_access_const_metadata_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_subclass_nested_attr_access_submodule_training_ir_to_decomp_strict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_subclass_nested_attr_access_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_subclasses_parameterization_nested_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_subclasses_parameterization_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_suggest_torch_checks_with_non_negative_check_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_suggest_torch_checks_with_regular_check_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_suggested_fixes_for_data_dependent_errors_basic_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_suggested_fixes_for_data_dependent_errors_puzzlers_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_suggested_fixes_new_roots_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_sym_float_operators_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_sym_or_sym_and_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_sym_sqrt_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_symbool_item_training_ir_to_decomp_strict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_symfloat_item_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_symint_input_additional_inputs_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_symint_input_basic_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_symint_input_ranges_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_symint_input_shapes_collection_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_symint_input_specialization_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_symint_item_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_symint_output_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_symint_tensor_return_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_tag_ac_export_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_tensor_attribute_zero_args_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_tensor_constant_aten_to_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_tensor_constant_with_wrapped_method_training_ir_to_decomp_strict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_to_module_with_mutated_buffer_multiple_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_to_module_with_mutated_buffer_multiple_update_sub_later_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_to_module_with_mutated_buffer_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_tolist_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_torch_check_eq_commutativity_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_torch_fn_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_trace_under_fake_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_train_eval_on_exported_preautograd_module_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_unbacked_3d_matmul_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_unbacked_bincount_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_unbacked_bindings_for_divisible_u_symint_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_unbacked_deferred_runtime_retrace_training_ir_to_decomp_strict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_unbacked_expand_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_unbacked_infer_size_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_unbacked_kth_value_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_unbacked_linear_layer_norm_input_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_unbacked_noncontig_lin_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_unbacked_pad_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_unbacked_scalar_constructor_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_unbacked_slice_forward_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_unbacked_slice_simple_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_unbacked_stack_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_unbacked_to_cond_passthrough_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_unbacked_to_cond_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_unbacked_unsqueeze_training_ir_to_decomp_strict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_unflatten_asserts_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_unflatten_buffer_update_child2parent_swap_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_unflatten_closure_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_unflatten_isinstance_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_unflatten_multiple_graphs_dispatch_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_unflatten_multiple_graphs_preserve_signature_no_error_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_unflatten_multiple_graphs_shared_submodule_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_unflatten_multiple_graphs_state_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_unflatten_no_unroll_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_unflatten_placeholder_update_child2parent_swap_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_unflatten_placeholder_update_grandchild2cousin_swap_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_unflatten_random_dag_5_training_ir_to_decomp_strict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_unflatten_random_dag_6_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_unflatten_random_dag_buf_8_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_unflatten_random_dag_const_preserving_3_1_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_unflatten_random_dag_const_preserving_3_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_unflatten_random_dag_mutating_buf_4_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_unflatten_random_dag_mutating_buf_6_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_unflatten_random_dag_mutating_buf_9_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_unflatten_random_dag_mutating_buf_preserving_10_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_unflatten_random_dag_mutating_buf_preserving_4_1_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_unflatten_random_dag_mutating_buf_preserving_4_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_unflatten_random_dag_mutating_buf_preserving_5_training_ir_to_decomp_strict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_unflatten_random_dag_mutating_buf_preserving_7_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_unflatten_random_dag_preserving_4_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_unused_aliases_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_unused_constant_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_use_embedding_twice_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_user_input_and_buffer_mutation_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_vmap_custom_autograd_function_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_vmap_to_assert_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_vmap_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_where_decomp_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_while_loop_assert_separation_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_while_loop_index_assertions_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_while_loop_simple_training_ir_to_decomp_strict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_while_loop_tensor_constant_idx_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportTestExport::test_wrapper_module_training_ir_to_decomp_strict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test__scaled_dot_product_flash_attention_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_additional_inputs_constants_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_allow_explicit_guards_as_runtime_asserts_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_args_type_checked_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_aten_lift_fresh_copy_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_attention_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_attr_assignment_extra_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_automatic_constrain_size_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_automatic_dynamic_shapes_constant_relation_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_automatic_dynamic_shapes_linear_relation_training_ir_to_decomp_nonstrict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_automatic_dynamic_shapes_simple_equality_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_baddbmm_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_basic_non_strict_fake_tensor_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_basic_non_strict_real_tensor_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_basic_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_bincount_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_buffer_util_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_capture_subclass_constructor_torch_ir_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_capture_subclass_constructor_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_capture_subclass_wrong_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_ccode_python_mod_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_cdist_forward_compute_mode_zero_export_training_ir_to_decomp_nonstrict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_check_specialized_int_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_checks_to_constrain_range_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_cleanup_dynamic_markers_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_colin_unbacked_backed_vr_sub_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_colon_parameter_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_compiling_state_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_cond_access_identical_symint_closure_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_cond_branches_return_constant_int_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_cond_branches_return_same_int_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_cond_buffers_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_cond_contains_unbacked_no_escape_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_cond_int_closure_training_ir_to_decomp_nonstrict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_cond_unflatten_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_cond_with_module_stack_export_with_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_cond_with_module_stack_export_with_unflatten_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_constant_aliasing_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_constant_input_naming_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_constant_no_user_inp_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_constant_output_dup_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_constant_output_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_constant_requires_grad_const_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_constant_return_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_constant_tensor_mutation_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_constant_tensor_with_non_functional_nested_training_ir_to_decomp_nonstrict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_constant_tensor_with_non_functional_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_constrain_decomp_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_constrain_size_in_eager_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_constrain_size_with_constrain_value_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_constrain_size_with_various_cases_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_conv_dynamic_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_crop_like_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_cse_for_symint_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_custom_op_auto_functionalize_pre_dispatch_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_custom_op_auto_functionalize_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_custom_op_auto_warn_pre_dispatch_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_custom_op_preserve_training_ir_to_decomp_nonstrict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_custom_pytree_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_custom_tag_metadata_re_export_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_decomp_batch_norm_functional_predispatch_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_decomp_item_in_prim_after_decomposition_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_decomp_item_in_prim_before_decomposition_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_default_decomposition_core_cia_ops_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_derived_dim_1_2_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_derived_dim_basic_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_derived_dim_integer_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_derived_dim_nested_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_derived_dim_out_of_order_repeat_derived_training_ir_to_decomp_nonstrict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_derived_dim_out_of_order_simplified_repeat_non_derived_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_derived_dim_out_of_order_simplified_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_derived_dim_out_of_order_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_derived_dim_repeat_derived_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_detect_leak_nonstrict_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_detect_leak_nonstrict_with_stacktrace_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_detect_leak_strict_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_device_to_dynamic_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_device_to_gpu_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_device_to_mutation_float_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_device_to_mutation_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_device_to_static_training_ir_to_decomp_nonstrict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_dim_1_2_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_dim_auto_and_dim_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_dim_dynamic_divisibility_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_dim_dynamic_specialization_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_dim_dynamic_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_dim_hint_range_violations_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_dim_hint_ranges_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_disable_forced_specializations_errors_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_disable_forced_specializations_ok_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_distributed_all_gather_into_tensor_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_distributed_all_gather_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_distributed_all_reduce_training_ir_to_decomp_nonstrict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_distributed_all_to_all_single_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_distributed_reduce_scatter_tensor_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_dont_duck_size_for_auto_dynamic_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_double_lifted_constants_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_draft_export_checks_aliasing_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_draft_export_checks_mutation_list_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_draft_export_checks_mutation_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_draft_export_checks_mutation_with_nan_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_draft_export_fake_kernel_inference_errors_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_draft_export_infers_fake_kernel_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_duplicate_modules_with_non_persistent_buffers_training_ir_to_decomp_nonstrict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_dynamic_lr_shift_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_dynamic_shapes_bounds_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_dynamic_shapes_builder_basic_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_dynamic_shapes_builder_kwargs_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_dynamic_shapes_builder_pytree_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_dynamic_shapes_dataclass_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_dynamic_shapes_inferred_basic_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_dynamic_shapes_serdes_generic_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_dynamic_shapes_serdes_user_errors_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_dynamic_shapes_serdes_various_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_dynamic_shapes_spec_with_pytree_training_ir_to_decomp_nonstrict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_dynamic_shapes_wrapped_with_shape_guards_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_dynamic_sym_round_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_ends_of_bounds_oblivious_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_error_does_not_reference_eager_fallback_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_error_when_passing_mutating_primitive_op_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_exception_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_expand_copy_export_handles_implicit_true_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_export_api_with_dynamic_shapes_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_export_as_backend_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_export_associative_scan_lifted_buffers_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_export_associative_scan_symbol_dim_training_ir_to_decomp_nonstrict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_export_associative_scan_symbol_scandim_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_export_aten_to_unflatten_subclass_pre_dispatch_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_export_aten_to_unflatten_subclass_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_export_aten_to_unflatten_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_export_cond_preserve_torch_fn_for_subgraphs_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_export_cond_symbool_pred_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_export_cond_warns_constant_pred_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_export_custom_decomp_table_basic_pop_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_export_custom_decomp_table_container_methods_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_export_custom_op_lib_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_export_custom_triton_kernel_mutable_training_ir_to_decomp_nonstrict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_export_custom_triton_kernel_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_export_cyclic_reference_leak_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_export_decomp_torture_case_1_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_export_decomp_torture_case_2_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_export_decomps_dynamic_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_export_decomps_simple_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_export_dynamo_config_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_export_for_training_run_decomp_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_export_for_training_with_container_type_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_export_for_training_with_dynamic_shapes_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_export_for_training_with_mutation_training_ir_to_decomp_nonstrict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_export_for_training_with_state_dict_hooks_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_export_func_with_default_kwargs_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_export_func_with_keyword_only_args_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_export_func_with_kwargs_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_export_func_with_pytree_kwargs_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_export_func_with_var_keyword_args_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_export_func_with_var_keyword_pytree_args_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_export_func_with_var_postional_args_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_export_function_schema_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_export_graph_with_no_inputs_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_export_input_mutation_bug_training_ir_to_decomp_nonstrict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_export_input_mutation_dynamic_shape_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_export_input_mutation_static_shape_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_export_leak_compile_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_export_linear_preserve_dynamic_shape_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_export_max_nonstrict_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_export_max_onnx_reported_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_export_method_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_export_mod_constraints_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_export_module_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_export_preserve_linear_at_aot_level_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_export_preserve_linear_but_not_custom_op_training_ir_to_decomp_nonstrict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_export_rnn_variants_with_warning_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_export_scan_pytree_output_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_export_script_module_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_export_statically_known_true_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_export_then_compile_tensor_ctor_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_export_with_autocast_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_export_with_fake_tensor_inputs_on_cuda_devices_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_export_with_fake_tensor_inputs_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_export_with_inline_constraints_complex_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_export_with_inline_constraints_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_export_with_set_grad_enabled_training_ir_to_decomp_nonstrict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_export_with_wrong_inputs_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_external_call_non_strict_real_tensor_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_fake_inputs_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_fake_weights_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_filter_traceback_frames_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_flex_attention_export_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_float_conversion_from_int_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_float_conversion_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_fqn_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_from_node_metadata_export_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_full_on_scalar_tensor_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_function_holding_tensor_training_ir_to_decomp_nonstrict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_hints_wrapper_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_hoo_inline_users_issue_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_if_functional_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_if_post_autograd_op_preserved_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_inductor_backend_inside_nonstrict_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_inline_script_class_method_recursive_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_inline_script_class_method_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_inline_script_function_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_inline_script_method_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_int_shape_specialization_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_intermediate_shape_comp_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_is_exporting_training_ir_to_decomp_nonstrict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_is_nonzero_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_isnonzero_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_issue_113041_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_issue_157289_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_issue_161902_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_istft_op_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_keep_composite_ops_invalid_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_keep_composite_ops_linear_convd_for_training_ir_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_keep_composite_ops_linear_convd_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_kwarg_dynamic_shapes_diff_order_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_kwargs_reorder_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_layer_norm_unbacked_normalized_shape_training_ir_to_decomp_nonstrict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_layer_sharing_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_lazy_module_kwargs_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_lifted_constants_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_linear_conv_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_malformed_fqn_from_source_name_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_map_buffers_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_map_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_mask_nonzero_static_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_masked_select_dynamic_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_math_pow_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_mismatched_dynamic_shapes_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_mixed_input_training_ir_to_decomp_nonstrict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_module_dict_key_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_module_input_subclasses_parameterization_nested_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_module_input_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_module_list_slice_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_module_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_module_with_dict_container_inp_out_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_modules_access_for_deleted_submodule_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_more_multidimensional_slicing_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_multidimensional_slicing_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_multinomial_dynamic_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_multiple_definitions_same_name_dim_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_namedtuple_input_export_training_ir_to_decomp_nonstrict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_native_multi_attention_head_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_nested_dynamic_shapes_spec_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_nested_module_fake_tensor_leak_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_nested_module_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_nested_module_with_constant_buffer_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_nested_module_with_init_buffer_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_nested_module_with_parameter_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_nn_module_stack_shared_submodule_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_nn_module_stack_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_no_check_is_size_error_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_no_suggested_fixes_for_data_dependent_errors_training_ir_to_decomp_nonstrict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_no_tensor_computation_2_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_no_tensor_computation_3_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_no_tensor_computation_4_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_no_tensor_computation_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_non_arg_name_dynamic_shapes_api_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_non_arg_name_dynamic_shapes_api_with_container_type_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_non_arg_name_dynamic_shapes_api_with_kwarg_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_non_persistent_buffer_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_non_strict_dynamic_shapes_suggested_fixes_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_non_strict_dynamic_shapes_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_none_buffers_training_ir_to_decomp_nonstrict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_nonstrict_retrace_preserves_metadata_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_nonzero_2_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_nonzero_dynamic_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_not_registered_parameter_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_operator_aten_tensor_mode_variant_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_output_node_name_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_pad_sequence_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_param_util_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_partial_patched_forward_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_placeholder_naming_collisions_hoo_subgraphs_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_placeholder_naming_collisions_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_placeholder_naming_order_training_ir_to_decomp_nonstrict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_placeholder_naming_order_variadic_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_placeholder_update_preserving_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_predispatch_cond_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_predispatch_grad_wrappers_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_preserve_annotation_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_preserve_module_call_signature_unflatten_specialization_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_preserve_requires_grad_placeholders_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_preserve_shape_dynamism_for_unused_inputs_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_profiling_code_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_python_asserts_with_sym_int_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_pytree_register_data_class_training_ir_to_decomp_nonstrict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_pytree_register_nested_data_class_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_raise_user_error_when_guard_on_data_dependent_operation_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_range_constraints_with_replacement_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_real_tensor_alias_dtype_mismatch_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_real_tensor_bool_cast_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_real_tensor_errors_on_aliasing_custom_op_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_real_tensor_for_max_op_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_real_tensor_size_mismatch_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_redundant_assert_max_upper_bound_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_redundant_asserts_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_refine_dynamic_shapes_from_suggested_fixes_training_ir_to_decomp_nonstrict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_register_constant_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_repeat_interleave_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_replace_unbacked_with_very_large_upperbound_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_replaced_unbacked_bindings_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_reshape_view_helper_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_retracable_ep_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_retrace_pre_autograd_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_run_decomposition_supports_user_input_mutation_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_run_decompositions_keep_metadata_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_run_decompositions_keep_tensor_constant_metadata_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_runtime_assert_for_prim_training_ir_to_decomp_nonstrict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_runtime_assert_for_prm_str_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_runtime_assert_with_size_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_sdpa_gqa_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_sequential_slicing_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_set_example_inputs_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_set_grad_as_side_effect_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_set_grad_empty_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_set_grad_unflatten_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_setgrad_lifted_tensor_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_shared_submodule_nn_module_stack_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_simple_export_for_training_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_simple_unbacked_view_training_ir_to_decomp_nonstrict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_size_input_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_slice_nn_module_stack_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_solver_unsupported_sympy_function_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_specialize_derived_dim_roots_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_split_const_gm_with_lifted_constants_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_stack_trace_make_fx_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_stack_trace_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_state_primitives_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_state_shape_attribute_assignment_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_state_tensors_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_static_dim_constraints_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_subclass_context_training_ir_to_decomp_nonstrict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_subclass_nested_attr_access_complicated_metadata_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_subclass_nested_attr_access_const_metadata_not_top_level_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_subclass_nested_attr_access_const_metadata_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_subclass_nested_attr_access_submodule_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_subclass_nested_attr_access_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_subclasses_parameterization_nested_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_subclasses_parameterization_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_suggest_torch_checks_with_non_negative_check_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_suggest_torch_checks_with_regular_check_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_suggested_fixes_for_data_dependent_errors_basic_training_ir_to_decomp_nonstrict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_suggested_fixes_for_data_dependent_errors_puzzlers_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_suggested_fixes_new_roots_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_sym_float_operators_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_sym_or_sym_and_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_sym_sqrt_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_symbool_item_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_symfloat_item_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_symint_input_additional_inputs_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_symint_input_basic_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_symint_input_ranges_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_symint_input_shapes_collection_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_symint_input_specialization_training_ir_to_decomp_nonstrict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_symint_item_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_symint_output_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_symint_tensor_return_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_tag_ac_export_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_tensor_attribute_zero_args_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_tensor_constant_aten_to_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_tensor_constant_with_wrapped_method_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_to_module_with_mutated_buffer_multiple_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_to_module_with_mutated_buffer_multiple_update_sub_later_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_to_module_with_mutated_buffer_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_tolist_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_torch_check_eq_commutativity_training_ir_to_decomp_nonstrict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_torch_fn_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_trace_under_fake_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_train_eval_on_exported_preautograd_module_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_unbacked_3d_matmul_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_unbacked_bincount_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_unbacked_bindings_for_divisible_u_symint_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_unbacked_deferred_runtime_retrace_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_unbacked_expand_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_unbacked_infer_size_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_unbacked_kth_value_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_unbacked_linear_layer_norm_input_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_unbacked_noncontig_lin_training_ir_to_decomp_nonstrict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_unbacked_pad_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_unbacked_scalar_constructor_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_unbacked_slice_forward_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_unbacked_slice_simple_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_unbacked_stack_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_unbacked_to_cond_passthrough_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_unbacked_to_cond_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_unbacked_unsqueeze_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_unflatten_asserts_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_unflatten_buffer_update_child2parent_swap_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_unflatten_closure_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_unflatten_isinstance_training_ir_to_decomp_nonstrict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_unflatten_multiple_graphs_dispatch_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_unflatten_multiple_graphs_preserve_signature_no_error_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_unflatten_multiple_graphs_shared_submodule_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_unflatten_multiple_graphs_state_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_unflatten_no_unroll_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_unflatten_placeholder_update_child2parent_swap_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_unflatten_placeholder_update_grandchild2cousin_swap_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_unflatten_random_dag_5_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_unflatten_random_dag_6_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_unflatten_random_dag_buf_8_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_unflatten_random_dag_const_preserving_3_1_training_ir_to_decomp_nonstrict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_unflatten_random_dag_const_preserving_3_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_unflatten_random_dag_mutating_buf_4_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_unflatten_random_dag_mutating_buf_6_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_unflatten_random_dag_mutating_buf_9_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_unflatten_random_dag_mutating_buf_preserving_10_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_unflatten_random_dag_mutating_buf_preserving_4_1_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_unflatten_random_dag_mutating_buf_preserving_4_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_unflatten_random_dag_mutating_buf_preserving_5_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_unflatten_random_dag_mutating_buf_preserving_7_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_unflatten_random_dag_preserving_4_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_unused_aliases_training_ir_to_decomp_nonstrict, 
test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_unused_constant_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_use_embedding_twice_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_user_input_and_buffer_mutation_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_vmap_custom_autograd_function_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_vmap_to_assert_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_vmap_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_where_decomp_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_while_loop_assert_separation_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_while_loop_index_assertions_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_while_loop_simple_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_while_loop_tensor_constant_idx_training_ir_to_decomp_nonstrict, test/export/test_export_training_ir_to_run_decomp.py::TrainingIRToRunDecompExportNonStrictTestExport::test_wrapper_module_training_ir_to_decomp_nonstrict 2025-10-10T01:44:15.3547356Z 2025-10-10T01:44:19.0616458Z Running 
inductor/test_indexing 1/1 ... [2025-10-10 01:44:19.061057] 2025-10-10T01:44:19.0616994Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:44:19.0618127Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_indexing.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:44:19.061440] 2025-10-10T01:44:26.3408196Z 2025-10-10T01:44:26.3409492Z inductor/test_indexing 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_indexing_1.1_4a79f373842475ab_.log 2025-10-10T01:44:26.3417088Z Running 22 items in this shard: test/inductor/test_indexing.py::TestIndexingSimplification::test_expand_floor_div_applied, test/inductor/test_indexing.py::TestIndexingSimplification::test_expand_floor_div_skipped, test/inductor/test_indexing.py::TestIndexingSimplification::test_floordiv_div_sympy_is_integer_bug, test/inductor/test_indexing.py::TestIndexingSimplification::test_indexing_join, test/inductor/test_indexing.py::TestIndexingSimplification::test_indexing_simplification, test/inductor/test_indexing.py::TestIndexingSimplification::test_int8_unpack, test/inductor/test_indexing.py::TestIndexingSimplification::test_modular_indexing_pairs_merged, test/inductor/test_indexing.py::TestIndexingSimplification::test_modular_indexing_pairs_not_merged, test/inductor/test_indexing.py::TestIndexingSimplification::test_modular_indexing_positive, test/inductor/test_indexing.py::ExprPrinterTests::test_print_Min_Max, test/inductor/test_indexing.py::ExprPrinterTests::test_print_ceil, test/inductor/test_indexing.py::ExprPrinterTests::test_print_floor, test/inductor/test_indexing.py::ExprPrinterTests::test_print_floor_div, test/inductor/test_indexing.py::ExprPrinterTests::test_print_integer, test/inductor/test_indexing.py::ExprPrinterTests::test_print_mod, 
test/inductor/test_indexing.py::ExprPrinterTests::test_print_mod_index, test/inductor/test_indexing.py::ExprPrinterTests::test_print_pow, test/inductor/test_indexing.py::ExprPrinterTests::test_print_python_mod, test/inductor/test_indexing.py::ExprPrinterTests::test_print_round, test/inductor/test_indexing.py::ExprPrinterTests::test_print_round_decimal_ndigits_-1, test/inductor/test_indexing.py::ExprPrinterTests::test_print_round_decimal_ndigits_0, test/inductor/test_indexing.py::ExprPrinterTests::test_print_round_decimal_ndigits_1 2025-10-10T01:44:26.3424012Z 2025-10-10T01:44:30.2379875Z Running inductor/test_minifier 1/1 ... [2025-10-10 01:44:30.237409] 2025-10-10T01:44:30.2380483Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:44:30.2381590Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_minifier.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:44:30.237781] 2025-10-10T01:44:37.6178078Z 2025-10-10T01:44:37.6179363Z inductor/test_minifier 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_minifier_1.1_4ac24adc1396a082_.log 2025-10-10T01:44:37.6184120Z Running 14 items in this shard: test/inductor/test_minifier.py::MinifierTests::test_accuracy_vs_strict_accuracy, test/inductor/test_minifier.py::MinifierTests::test_after_aot_cpu_accuracy_error, test/inductor/test_minifier.py::MinifierTests::test_after_aot_cpu_compile_error, test/inductor/test_minifier.py::MinifierTests::test_after_aot_gpu_accuracy_error, test/inductor/test_minifier.py::MinifierTests::test_after_aot_gpu_compile_error, test/inductor/test_minifier.py::MinifierTests::test_aoti_cpu_accuracy_error, test/inductor/test_minifier.py::MinifierTests::test_aoti_cpu_compile_error, test/inductor/test_minifier.py::MinifierTests::test_aoti_cpu_compile_error_unflatten, test/inductor/test_minifier.py::MinifierTests::test_aoti_gpu_accuracy_error, test/inductor/test_minifier.py::MinifierTests::test_aoti_gpu_compile_error, test/inductor/test_minifier.py::MinifierTests::test_aoti_gpu_compile_error_unflatten, test/inductor/test_minifier.py::MinifierTests::test_constant_in_graph, test/inductor/test_minifier.py::MinifierTests::test_offload_to_disk, test/inductor/test_minifier.py::MinifierTests::test_rmse_improves_over_atol 2025-10-10T01:44:37.6188315Z 2025-10-10T01:44:41.4635089Z Running inductor/test_perf 1/1 ... [2025-10-10 01:44:41.462755] 2025-10-10T01:44:41.4635670Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:44:41.4636666Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_perf.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:44:41.463143] 2025-10-10T01:44:48.7932705Z 2025-10-10T01:44:48.7933818Z inductor/test_perf 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_perf_1.1_a44bacdf904975be_.log 2025-10-10T01:44:48.7951617Z Running 66 items in this shard: test/inductor/test_perf.py::NumBytesMetricTests::test_cat, test/inductor/test_perf.py::NumBytesMetricTests::test_cat_pointwise, test/inductor/test_perf.py::NumBytesMetricTests::test_cat_pointwise_config_option, test/inductor/test_perf.py::NumBytesMetricTests::test_cat_pointwise_many_complex_inputs, test/inductor/test_perf.py::NumBytesMetricTests::test_cat_pointwise_many_simple_inputs, test/inductor/test_perf.py::NumBytesMetricTests::test_extern, test/inductor/test_perf.py::NumBytesMetricTests::test_index, test/inductor/test_perf.py::NumBytesMetricTests::test_pointwise, test/inductor/test_perf.py::NumBytesMetricTests::test_reduction, test/inductor/test_perf.py::FusionTests::test_create_block_mask, test/inductor/test_perf.py::FusionTests::test_double_softmax, test/inductor/test_perf.py::FusionTests::test_factory_reduction, test/inductor/test_perf.py::FusionTests::test_horizontal_reduction_outer_pointwise, test/inductor/test_perf.py::FusionTests::test_horizontal_reduction_pointwise, test/inductor/test_perf.py::FusionTests::test_horizontal_reduction_pointwise2, test/inductor/test_perf.py::FusionTests::test_horizontal_reduction_reduction, test/inductor/test_perf.py::FusionTests::test_horizontal_sum_pw_broadcast, test/inductor/test_perf.py::FusionTests::test_index_pointwise, test/inductor/test_perf.py::FusionTests::test_index_reduction, test/inductor/test_perf.py::FusionTests::test_layer_norm, test/inductor/test_perf.py::FusionTests::test_mutation_fusion, test/inductor/test_perf.py::FusionTests::test_neighbor, test/inductor/test_perf.py::FusionTests::test_norm_chain, test/inductor/test_perf.py::FusionTests::test_pointwise_multi_level_reduction, 
test/inductor/test_perf.py::FusionTests::test_reduction_pointwise_multi_level_reduction, test/inductor/test_perf.py::FusionTests::test_softmax_backward, test/inductor/test_perf.py::FusionTests::test_softmax_inner, test/inductor/test_perf.py::FusionTests::test_vertical_sum_pw, test/inductor/test_perf.py::SchedulerFusionTests::test_fusion_choice1, test/inductor/test_perf.py::SchedulerFusionTests::test_fusion_choice2, test/inductor/test_perf.py::SchedulerFusionTests::test_fusion_choice3, test/inductor/test_perf.py::SchedulerFusionTests::test_fusion_choice4_cpu, test/inductor/test_perf.py::TilingTests::test_tiling_simple, test/inductor/test_perf.py::TilingTests::test_tiling_three, test/inductor/test_perf.py::MinCutPartitioningTests::test_partitioning_cat, test/inductor/test_perf.py::MinCutPartitioningTests::test_partitioning_dtype, test/inductor/test_perf.py::MinCutPartitioningTests::test_partitioning_full_remat, test/inductor/test_perf.py::MinCutPartitioningTests::test_partitioning_keops, test/inductor/test_perf.py::MinCutPartitioningTests::test_partitioning_long_chain_add, test/inductor/test_perf.py::MinCutPartitioningTests::test_partitioning_partial_remat, test/inductor/test_perf.py::MinCutPartitioningTests::test_partitioning_relu, test/inductor/test_perf.py::MinCutPartitioningTests::test_partitioning_unremat_bw, test/inductor/test_perf.py::MinCutPartitioningTests::test_partitioning_unremat_bw2, test/inductor/test_perf.py::MinCutPartitioningTests::test_partitioning_with_view, test/inductor/test_perf.py::NoopTests::test_noop_cat, test/inductor/test_perf.py::NoopTests::test_noop_clones, test/inductor/test_perf.py::NoopTests::test_noop_device_conversion, test/inductor/test_perf.py::NoopTests::test_noop_dtype_conversion, test/inductor/test_perf.py::NoopTests::test_noop_int_ops, test/inductor/test_perf.py::NoopTests::test_noop_slice_scatter, test/inductor/test_perf.py::InplacingTests::test_inplace_custom_op, 
test/inductor/test_perf.py::InplacingTests::test_inplace_custom_op_intermediate, test/inductor/test_perf.py::InplacingTests::test_inplace_custom_op_training, test/inductor/test_perf.py::InplacingTests::test_inplace_custom_op_training_two_mutated_inputs, test/inductor/test_perf.py::InplacingTests::test_inplace_custom_op_two_mutated_inputs, test/inductor/test_perf.py::InplacingTests::test_inplace_randperm_scatter, test/inductor/test_perf.py::InplacingTests::test_inplace_scatter, test/inductor/test_perf.py::InplacingTests::test_inplace_scatter_noop_view, test/inductor/test_perf.py::InplacingTests::test_inplace_triton_kernel_training, test/inductor/test_perf.py::InplacingTests::test_inplace_triton_kernel_v1, test/inductor/test_perf.py::InplacingTests::test_inplace_triton_kernel_v2, test/inductor/test_perf.py::InplacingTests::test_inplace_triton_kernel_v3, test/inductor/test_perf.py::InplacingTests::test_inplace_triton_kernel_v4, test/inductor/test_perf.py::InplacingTests::test_inplace_triton_kernel_v5, test/inductor/test_perf.py::InplacingTests::test_inplace_triton_kernel_v6, test/inductor/test_perf.py::InplacingTests::test_triton_kernel_not_fusable_with_users 2025-10-10T01:44:48.7968700Z 2025-10-10T01:44:52.6127634Z Running inductor/test_pad_mm 1/1 ... [2025-10-10 01:44:52.612178] 2025-10-10T01:44:52.6128083Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:44:52.6129385Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_pad_mm.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:44:52.612578] 2025-10-10T01:44:59.9439065Z 2025-10-10T01:44:59.9440201Z inductor/test_pad_mm 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_pad_mm_1.1_681b4dfd3c846ba9_.log 2025-10-10T01:44:59.9444802Z Running 19 items in this shard: test/inductor/test_pad_mm.py::PadMMTest::test_cat_pad_mm_dyn_m, test/inductor/test_pad_mm.py::PadMMTest::test_exclude_cat_padding, test/inductor/test_pad_mm.py::PadMMTest::test_exclude_padding, test/inductor/test_pad_mm.py::PadMMTest::test_no_autocast_in_pad_bmm_joint_graph_pass, test/inductor/test_pad_mm.py::PadMMTest::test_original_aten_preserved_pad_mm, test/inductor/test_pad_mm.py::PadMMTest::test_pad_addmm_2d_bias, test/inductor/test_pad_mm.py::PadMMTest::test_pad_addmm_dyn_m, test/inductor/test_pad_mm.py::PadMMTest::test_pad_addmm_dyn_mn, test/inductor/test_pad_mm.py::PadMMTest::test_pad_batch, test/inductor/test_pad_mm.py::PadMMTest::test_pad_bmm_dyn_b, test/inductor/test_pad_mm.py::PadMMTest::test_pad_bmm_dyn_bm, test/inductor/test_pad_mm.py::PadMMTest::test_pad_bmm_dyn_k, test/inductor/test_pad_mm.py::PadMMTest::test_pad_mm_bf16, test/inductor/test_pad_mm.py::PadMMTest::test_pad_mm_dyn_k, test/inductor/test_pad_mm.py::PadMMTest::test_pad_mm_dyn_m, test/inductor/test_pad_mm.py::PadMMTest::test_pad_mm_dyn_mnk, test/inductor/test_pad_mm.py::PadMMTest::test_pad_mm_dyn_n, test/inductor/test_pad_mm.py::PadMMTest::test_pad_single_cat, test/inductor/test_pad_mm.py::PadMMTest::test_zero_dim 2025-10-10T01:44:59.9449184Z 2025-10-10T01:45:03.7772888Z Running inductor/test_inductor_annotations 1/1 ... 
[2025-10-10 01:45:03.776709] 2025-10-10T01:45:03.7773460Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:45:03.7776075Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_inductor_annotations.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:45:03.777092] 2025-10-10T01:45:11.0077475Z 2025-10-10T01:45:11.0078673Z inductor/test_inductor_annotations 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_inductor_annotations_1.1_fc532dfee78e7c3b_.log 2025-10-10T01:45:11.0080242Z Running 2 items in this shard: test/inductor/test_inductor_annotations.py::InductorAnnotationTestCase::test_no_annotations, test/inductor/test_inductor_annotations.py::InductorAnnotationTestCase::test_training_annotation 2025-10-10T01:45:11.0081357Z 2025-10-10T01:45:14.8315030Z Running inductor/test_ck_backend 1/1 ... [2025-10-10 01:45:14.830955] 2025-10-10T01:45:14.8316114Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:45:14.8328308Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_ck_backend.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:45:14.831327] 2025-10-10T01:45:22.1602383Z 2025-10-10T01:45:22.1603509Z inductor/test_ck_backend 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_ck_backend_1.1_12aa5978a80d628f_.log 2025-10-10T01:45:22.1634082Z Running 34 items in this shard: test/inductor/test_ck_backend.py::TestCKBackend::test_max_autotune_addmm_max_autotune_gemm_backends_ATen,Triton,CK_x_shape0, test/inductor/test_ck_backend.py::TestCKBackend::test_max_autotune_addmm_max_autotune_gemm_backends_ATen,Triton,CK_x_shape1, test/inductor/test_ck_backend.py::TestCKBackend::test_max_autotune_addmm_max_autotune_gemm_backends_ATen,Triton,CK_x_shape2, test/inductor/test_ck_backend.py::TestCKBackend::test_max_autotune_addmm_max_autotune_gemm_backends_CK_x_shape0, test/inductor/test_ck_backend.py::TestCKBackend::test_max_autotune_addmm_max_autotune_gemm_backends_CK_x_shape1, test/inductor/test_ck_backend.py::TestCKBackend::test_max_autotune_addmm_max_autotune_gemm_backends_CK_x_shape2, test/inductor/test_ck_backend.py::TestCKBackend::test_max_autotune_conv2d_max_autotune_conv_backends_ATEN,CK,TRITON, test/inductor/test_ck_backend.py::TestCKBackend::test_max_autotune_conv2d_max_autotune_conv_backends_CK, test/inductor/test_ck_backend.py::TestCKBackend::test_max_autotune_precompile_bmm_max_autotune_gemm_backends_ATen,Triton,CK, test/inductor/test_ck_backend.py::TestCKBackend::test_max_autotune_precompile_bmm_max_autotune_gemm_backends_CK, test/inductor/test_ck_backend.py::TestCKBackend::test_max_autotune_precompile_matmul_dynamic_max_autotune_gemm_backends_CK_autotune_in_subproc_True, test/inductor/test_ck_backend.py::TestCKBackend::test_max_autotune_precompile_matmul_max_autotune_gemm_backends_ATen,Triton,CK_autotune_in_subproc_False_use_aoti_False, test/inductor/test_ck_backend.py::TestCKBackend::test_max_autotune_precompile_matmul_max_autotune_gemm_backends_ATen,Triton,CK_autotune_in_subproc_False_use_aoti_True, 
test/inductor/test_ck_backend.py::TestCKBackend::test_max_autotune_precompile_matmul_max_autotune_gemm_backends_ATen,Triton,CK_autotune_in_subproc_True_use_aoti_False, test/inductor/test_ck_backend.py::TestCKBackend::test_max_autotune_precompile_matmul_max_autotune_gemm_backends_ATen,Triton,CK_autotune_in_subproc_True_use_aoti_True, test/inductor/test_ck_backend.py::TestCKBackend::test_max_autotune_precompile_matmul_max_autotune_gemm_backends_CKTILE_autotune_in_subproc_False_use_aoti_False, test/inductor/test_ck_backend.py::TestCKBackend::test_max_autotune_precompile_matmul_max_autotune_gemm_backends_CKTILE_autotune_in_subproc_False_use_aoti_True, test/inductor/test_ck_backend.py::TestCKBackend::test_max_autotune_precompile_matmul_max_autotune_gemm_backends_CKTILE_autotune_in_subproc_True_use_aoti_False, test/inductor/test_ck_backend.py::TestCKBackend::test_max_autotune_precompile_matmul_max_autotune_gemm_backends_CKTILE_autotune_in_subproc_True_use_aoti_True, test/inductor/test_ck_backend.py::TestCKBackend::test_max_autotune_precompile_matmul_max_autotune_gemm_backends_CK_autotune_in_subproc_False_use_aoti_False, test/inductor/test_ck_backend.py::TestCKBackend::test_max_autotune_precompile_matmul_max_autotune_gemm_backends_CK_autotune_in_subproc_False_use_aoti_True, test/inductor/test_ck_backend.py::TestCKBackend::test_max_autotune_precompile_matmul_max_autotune_gemm_backends_CK_autotune_in_subproc_True_use_aoti_False, test/inductor/test_ck_backend.py::TestCKBackend::test_max_autotune_precompile_matmul_max_autotune_gemm_backends_CK_autotune_in_subproc_True_use_aoti_True, test/inductor/test_ck_backend.py::TestCKBackend::test_max_autotune_precompile_non_contiguous_max_autotune_gemm_backends_Aten,CK, test/inductor/test_ck_backend.py::TestCKBackend::test_max_autotune_precompile_preselected_max_autotune_gemm_backends_ATen,Triton,CK, test/inductor/test_ck_backend.py::TestCKBackend::test_max_autotune_precompile_preselected_max_autotune_gemm_backends_CK, 
test/inductor/test_ck_backend.py::TestCKBackend::test_max_autotune_scaled_mm_max_autotune_gemm_backends_ATen,Triton,CK_quantize_type_rowwise_has_bias_False, test/inductor/test_ck_backend.py::TestCKBackend::test_max_autotune_scaled_mm_max_autotune_gemm_backends_ATen,Triton,CK_quantize_type_rowwise_has_bias_True, test/inductor/test_ck_backend.py::TestCKBackend::test_max_autotune_scaled_mm_max_autotune_gemm_backends_ATen,Triton,CK_quantize_type_tensorwise_has_bias_False, test/inductor/test_ck_backend.py::TestCKBackend::test_max_autotune_scaled_mm_max_autotune_gemm_backends_ATen,Triton,CK_quantize_type_tensorwise_has_bias_True, test/inductor/test_ck_backend.py::TestCKBackend::test_max_autotune_scaled_mm_max_autotune_gemm_backends_CK_quantize_type_rowwise_has_bias_False, test/inductor/test_ck_backend.py::TestCKBackend::test_max_autotune_scaled_mm_max_autotune_gemm_backends_CK_quantize_type_rowwise_has_bias_True, test/inductor/test_ck_backend.py::TestCKBackend::test_max_autotune_scaled_mm_max_autotune_gemm_backends_CK_quantize_type_tensorwise_has_bias_False, test/inductor/test_ck_backend.py::TestCKBackend::test_max_autotune_scaled_mm_max_autotune_gemm_backends_CK_quantize_type_tensorwise_has_bias_True 2025-10-10T01:45:22.1663624Z 2025-10-10T01:45:25.9807567Z Running inductor/test_inductor_utils 1/1 ... [2025-10-10 01:45:25.980020] 2025-10-10T01:45:25.9809361Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:45:25.9811190Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_inductor_utils.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:45:25.980386] 2025-10-10T01:45:29.8539105Z 2025-10-10T01:45:29.8540169Z inductor/test_inductor_utils 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_inductor_utils_1.1_767b338c1b49423d_.log 2025-10-10T01:45:29.8541647Z Running 2 items in this shard: test/inductor/test_inductor_utils.py::TestBench::test_benchmarker, test/inductor/test_inductor_utils.py::TestBench::test_do_bench_using_profiling 2025-10-10T01:45:29.8542408Z 2025-10-10T01:45:33.7202870Z Running inductor/test_op_completeness 1/1 ... [2025-10-10 01:45:33.719552] 2025-10-10T01:45:33.7203668Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:45:33.7205633Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_op_completeness.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:45:33.720026] 2025-10-10T01:45:37.9423974Z 2025-10-10T01:45:37.9425085Z inductor/test_op_completeness 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_op_completeness_1.1_f2ce10d138763311_.log 2025-10-10T01:45:37.9427879Z Running 5 items in this shard: test/inductor/test_op_completeness.py::TestOpCompleteness::test_cpp_overrides, test/inductor/test_op_completeness.py::TestOpCompleteness::test_cpp_vec_overrides, test/inductor/test_op_completeness.py::TestOpCompleteness::test_halide_overrides, test/inductor/test_op_completeness.py::TestOpCompleteness::test_metal_overrides, test/inductor/test_op_completeness.py::TestOpCompleteness::test_triton_overrides 2025-10-10T01:45:37.9429843Z 2025-10-10T01:45:41.7452154Z Running inductor/test_multi_kernel 1/1 ... 
[2025-10-10 01:45:41.744625] 2025-10-10T01:45:41.7452655Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:45:41.7453796Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_multi_kernel.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:45:41.744999] 2025-10-10T01:45:48.8739952Z 2025-10-10T01:45:48.8741061Z inductor/test_multi_kernel 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_multi_kernel_1.1_bd204fef91aabefa_.log 2025-10-10T01:45:48.8747749Z Running 19 items in this shard: test/inductor/test_multi_kernel.py::MultiKernelTest::test_batchnorm_training, test/inductor/test_multi_kernel.py::MultiKernelTest::test_inplace_update, test/inductor/test_multi_kernel.py::MultiKernelTest::test_layernorm, test/inductor/test_multi_kernel.py::MultiKernelTest::test_pass_same_arg_multi_times, test/inductor/test_multi_kernel.py::MultiKernelTest::test_reduction_scratch_buffer, test/inductor/test_multi_kernel.py::MultiKernelTest::test_reduction_scratch_buffer_cpp_wrapper, test/inductor/test_multi_kernel.py::MultiKernelTest::test_reduction_scratch_buffer_cpp_wrapper_non_persistent_reduction, test/inductor/test_multi_kernel.py::MultiKernelTest::test_reduction_scratch_buffer_cpp_wrapper_persistent_reduction, test/inductor/test_multi_kernel.py::MultiKernelTest::test_softmax, test/inductor/test_multi_kernel.py::MultiKernelTest::test_softmax_cpp_wrapper, test/inductor/test_multi_kernel.py::MultiKernelTest::test_softmax_force_non_persistent_reduction_force_kernel_0, test/inductor/test_multi_kernel.py::MultiKernelTest::test_softmax_force_non_persistent_reduction_force_kernel_1, test/inductor/test_multi_kernel.py::MultiKernelTest::test_softmax_warn_mixed_layout, test/inductor/test_multi_kernel.py::MultiKernelTest::test_sort_disables_multi_kernel, 
test/inductor/test_multi_kernel.py::MultiKernelTest::test_split_scan, test/inductor/test_multi_kernel.py::MultiKernelTest::test_transformer_snippet, test/inductor/test_multi_kernel.py::MultiKernelTest::test_transformer_snippet_with_fallback_random, test/inductor/test_multi_kernel.py::MultiKernelTest::test_triton_gemm, test/inductor/test_multi_kernel.py::MultiKernelTest::test_triton_relu_fused_gemm 2025-10-10T01:45:48.8754200Z 2025-10-10T01:45:52.6730066Z Running inductor/test_autoheuristic 1/1 ... [2025-10-10 01:45:52.672307] 2025-10-10T01:45:52.6730722Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:45:52.6732209Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_autoheuristic.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:45:52.672673] 2025-10-10T01:45:59.8519920Z 2025-10-10T01:45:59.8522003Z inductor/test_autoheuristic 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_autoheuristic_1.1_5f05e505b6ae1521_.log 2025-10-10T01:45:59.8522871Z Running 0 items in this shard: 2025-10-10T01:45:59.8523066Z 2025-10-10T01:46:03.7072456Z Running export/test_serdes 1/1 ... [2025-10-10 01:46:03.706399] 2025-10-10T01:46:03.7072898Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:46:03.7074244Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_serdes.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:46:03.706797] 2025-10-10T01:46:12.7903062Z 2025-10-10T01:46:12.7903824Z export/test_serdes 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_serdes_1.1_2fd2a67c3686ad62_.log 2025-10-10T01:46:12.8254320Z Running 866 items in this shard: test/export/test_serdes.py::SerDesExportTestDynamismExpression::test_export_assume_static_by_default_serdes_strict, test/export/test_serdes.py::SerDesExportTestDynamismExpression::test_export_constraints_error_not_in_range_serdes_strict, test/export/test_serdes.py::SerDesExportTestDynamismExpression::test_export_constraints_error_serdes_strict, test/export/test_serdes.py::SerDesExportTestDynamismExpression::test_export_inline_constraints_serdes_strict, test/export/test_serdes.py::SerDesExportTestDynamismExpression::test_export_slice_maxsize_serdes_strict, test/export/test_serdes.py::SerDesExportTestDynamismExpression::test_export_slice_unbacked_dim1_serdes_strict, test/export/test_serdes.py::SerDesExportTestDynamismExpression::test_export_strict_narrow_unbacked_expr_serdes_strict, test/export/test_serdes.py::SerDesExportTestDynamismExpression::test_no_grad_param_inplace_serdes_strict, test/export/test_serdes.py::SerDesExportTestDynamismExpression::test_reshape_view_backed_size_oblivious_serdes_strict, test/export/test_serdes.py::SerDesExportNonStrictTestDynamismExpression::test_export_assume_static_by_default_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestDynamismExpression::test_export_constraints_error_not_in_range_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestDynamismExpression::test_export_constraints_error_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestDynamismExpression::test_export_inline_constraints_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestDynamismExpression::test_export_slice_maxsize_serdes_nonstrict, 
test/export/test_serdes.py::SerDesExportNonStrictTestDynamismExpression::test_export_slice_unbacked_dim1_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestDynamismExpression::test_export_strict_narrow_unbacked_expr_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestDynamismExpression::test_no_grad_param_inplace_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestDynamismExpression::test_reshape_view_backed_size_oblivious_serdes_nonstrict, test/export/test_serdes.py::SerDesExportTestExport::test__scaled_dot_product_flash_attention_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_additional_inputs_constants_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_allow_explicit_guards_as_runtime_asserts_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_args_type_checked_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_aten_lift_fresh_copy_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_attention_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_attr_assignment_extra_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_automatic_constrain_size_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_automatic_dynamic_shapes_constant_relation_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_automatic_dynamic_shapes_linear_relation_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_automatic_dynamic_shapes_simple_equality_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_baddbmm_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_basic_non_strict_fake_tensor_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_basic_non_strict_real_tensor_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_basic_serdes_strict, 
test/export/test_serdes.py::SerDesExportTestExport::test_bincount_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_buffer_util_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_capture_subclass_constructor_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_capture_subclass_constructor_torch_ir_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_capture_subclass_wrong_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_ccode_python_mod_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_cdist_forward_compute_mode_zero_export_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_check_specialized_int_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_checks_to_constrain_range_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_cleanup_dynamic_markers_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_colin_unbacked_backed_vr_sub_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_colon_parameter_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_compiling_state_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_cond_access_identical_symint_closure_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_cond_branches_return_constant_int_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_cond_branches_return_same_int_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_cond_buffers_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_cond_contains_unbacked_no_escape_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_cond_int_closure_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_cond_unflatten_serdes_strict, 
test/export/test_serdes.py::SerDesExportTestExport::test_cond_with_module_stack_export_with_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_cond_with_module_stack_export_with_unflatten_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_constant_aliasing_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_constant_input_naming_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_constant_no_user_inp_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_constant_output_dup_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_constant_output_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_constant_requires_grad_const_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_constant_return_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_constant_tensor_mutation_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_constant_tensor_with_non_functional_nested_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_constant_tensor_with_non_functional_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_constrain_decomp_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_constrain_size_in_eager_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_constrain_size_with_constrain_value_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_constrain_size_with_various_cases_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_conv_dynamic_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_crop_like_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_cse_for_symint_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_custom_op_auto_functionalize_pre_dispatch_serdes_strict, 
test/export/test_serdes.py::SerDesExportTestExport::test_custom_op_auto_functionalize_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_custom_op_auto_warn_pre_dispatch_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_custom_op_preserve_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_custom_pytree_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_custom_tag_metadata_re_export_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_decomp_batch_norm_functional_predispatch_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_decomp_item_in_prim_after_decomposition_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_decomp_item_in_prim_before_decomposition_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_default_decomposition_core_cia_ops_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_derived_dim_1_2_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_derived_dim_basic_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_derived_dim_integer_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_derived_dim_nested_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_derived_dim_out_of_order_repeat_derived_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_derived_dim_out_of_order_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_derived_dim_out_of_order_simplified_repeat_non_derived_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_derived_dim_out_of_order_simplified_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_derived_dim_repeat_derived_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_detect_leak_nonstrict_serdes_strict, 
test/export/test_serdes.py::SerDesExportTestExport::test_detect_leak_nonstrict_with_stacktrace_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_detect_leak_strict_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_device_to_dynamic_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_device_to_gpu_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_device_to_mutation_float_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_device_to_mutation_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_device_to_static_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_dim_1_2_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_dim_auto_and_dim_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_dim_dynamic_divisibility_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_dim_dynamic_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_dim_dynamic_specialization_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_dim_hint_range_violations_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_dim_hint_ranges_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_disable_forced_specializations_errors_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_disable_forced_specializations_ok_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_distributed_all_gather_into_tensor_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_distributed_all_gather_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_distributed_all_reduce_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_distributed_all_to_all_single_serdes_strict, 
test/export/test_serdes.py::SerDesExportTestExport::test_distributed_reduce_scatter_tensor_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_dont_duck_size_for_auto_dynamic_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_double_lifted_constants_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_draft_export_checks_aliasing_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_draft_export_checks_mutation_list_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_draft_export_checks_mutation_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_draft_export_checks_mutation_with_nan_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_draft_export_fake_kernel_inference_errors_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_draft_export_infers_fake_kernel_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_duplicate_modules_with_non_persistent_buffers_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_dynamic_lr_shift_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_dynamic_shapes_bounds_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_dynamic_shapes_builder_basic_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_dynamic_shapes_builder_kwargs_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_dynamic_shapes_builder_pytree_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_dynamic_shapes_dataclass_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_dynamic_shapes_inferred_basic_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_dynamic_shapes_serdes_generic_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_dynamic_shapes_serdes_user_errors_serdes_strict, 
test/export/test_serdes.py::SerDesExportTestExport::test_dynamic_shapes_serdes_various_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_dynamic_shapes_spec_with_pytree_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_dynamic_shapes_wrapped_with_shape_guards_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_dynamic_sym_round_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_ends_of_bounds_oblivious_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_error_does_not_reference_eager_fallback_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_error_when_passing_mutating_primitive_op_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_exception_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_expand_copy_export_handles_implicit_true_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_export_api_with_dynamic_shapes_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_export_as_backend_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_export_associative_scan_lifted_buffers_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_export_associative_scan_symbol_dim_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_export_associative_scan_symbol_scandim_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_export_aten_to_unflatten_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_export_aten_to_unflatten_subclass_pre_dispatch_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_export_aten_to_unflatten_subclass_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_export_cond_preserve_torch_fn_for_subgraphs_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_export_cond_symbool_pred_serdes_strict, 
test/export/test_serdes.py::SerDesExportTestExport::test_export_cond_warns_constant_pred_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_export_custom_decomp_table_basic_pop_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_export_custom_decomp_table_container_methods_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_export_custom_op_lib_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_export_custom_triton_kernel_mutable_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_export_custom_triton_kernel_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_export_cyclic_reference_leak_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_export_decomp_torture_case_1_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_export_decomp_torture_case_2_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_export_decomps_dynamic_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_export_decomps_simple_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_export_dynamo_config_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_export_for_training_run_decomp_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_export_for_training_with_container_type_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_export_for_training_with_dynamic_shapes_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_export_for_training_with_mutation_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_export_for_training_with_state_dict_hooks_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_export_func_with_default_kwargs_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_export_func_with_keyword_only_args_serdes_strict, 
test/export/test_serdes.py::SerDesExportTestExport::test_export_func_with_kwargs_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_export_func_with_pytree_kwargs_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_export_func_with_var_keyword_args_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_export_func_with_var_keyword_pytree_args_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_export_func_with_var_postional_args_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_export_function_schema_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_export_graph_with_no_inputs_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_export_input_mutation_bug_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_export_input_mutation_dynamic_shape_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_export_input_mutation_static_shape_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_export_leak_compile_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_export_linear_preserve_dynamic_shape_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_export_max_nonstrict_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_export_max_onnx_reported_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_export_method_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_export_mod_constraints_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_export_module_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_export_preserve_linear_at_aot_level_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_export_preserve_linear_but_not_custom_op_serdes_strict, 
test/export/test_serdes.py::SerDesExportTestExport::test_export_rnn_variants_with_warning_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_export_scan_pytree_output_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_export_script_module_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_export_statically_known_true_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_export_then_compile_tensor_ctor_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_export_with_autocast_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_export_with_fake_tensor_inputs_on_cuda_devices_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_export_with_fake_tensor_inputs_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_export_with_inline_constraints_complex_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_export_with_inline_constraints_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_export_with_set_grad_enabled_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_export_with_wrong_inputs_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_external_call_non_strict_real_tensor_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_fake_inputs_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_fake_weights_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_filter_traceback_frames_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_flex_attention_export_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_float_conversion_from_int_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_float_conversion_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_fqn_serdes_strict, 
test/export/test_serdes.py::SerDesExportTestExport::test_from_node_metadata_export_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_full_on_scalar_tensor_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_function_holding_tensor_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_hints_wrapper_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_hoo_inline_users_issue_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_if_functional_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_if_post_autograd_op_preserved_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_inductor_backend_inside_nonstrict_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_inline_script_class_method_recursive_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_inline_script_class_method_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_inline_script_function_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_inline_script_method_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_int_shape_specialization_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_intermediate_shape_comp_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_is_exporting_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_is_nonzero_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_isnonzero_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_issue_113041_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_issue_157289_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_issue_161902_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_istft_op_serdes_strict, 
test/export/test_serdes.py::SerDesExportTestExport::test_keep_composite_ops_invalid_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_keep_composite_ops_linear_convd_for_training_ir_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_keep_composite_ops_linear_convd_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_kwarg_dynamic_shapes_diff_order_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_kwargs_reorder_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_layer_norm_unbacked_normalized_shape_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_layer_sharing_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_lazy_module_kwargs_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_lifted_constants_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_linear_conv_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_malformed_fqn_from_source_name_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_map_buffers_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_map_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_mask_nonzero_static_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_masked_select_dynamic_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_math_pow_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_mismatched_dynamic_shapes_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_mixed_input_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_module_dict_key_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_module_input_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_module_input_subclasses_parameterization_nested_serdes_strict, 
test/export/test_serdes.py::SerDesExportTestExport::test_module_list_slice_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_module_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_module_with_dict_container_inp_out_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_modules_access_for_deleted_submodule_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_more_multidimensional_slicing_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_multidimensional_slicing_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_multinomial_dynamic_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_multiple_definitions_same_name_dim_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_namedtuple_input_export_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_native_multi_attention_head_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_nested_dynamic_shapes_spec_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_nested_module_fake_tensor_leak_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_nested_module_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_nested_module_with_constant_buffer_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_nested_module_with_init_buffer_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_nested_module_with_parameter_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_nn_module_stack_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_nn_module_stack_shared_submodule_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_no_check_is_size_error_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_no_suggested_fixes_for_data_dependent_errors_serdes_strict, 
test/export/test_serdes.py::SerDesExportTestExport::test_no_tensor_computation_2_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_no_tensor_computation_3_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_no_tensor_computation_4_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_no_tensor_computation_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_non_arg_name_dynamic_shapes_api_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_non_arg_name_dynamic_shapes_api_with_container_type_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_non_arg_name_dynamic_shapes_api_with_kwarg_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_non_persistent_buffer_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_non_strict_dynamic_shapes_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_non_strict_dynamic_shapes_suggested_fixes_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_none_buffers_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_nonstrict_retrace_preserves_metadata_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_nonzero_2_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_nonzero_dynamic_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_not_registered_parameter_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_operator_aten_tensor_mode_variant_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_output_node_name_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_pad_sequence_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_param_util_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_partial_patched_forward_serdes_strict, 
test/export/test_serdes.py::SerDesExportTestExport::test_placeholder_naming_collisions_hoo_subgraphs_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_placeholder_naming_collisions_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_placeholder_naming_order_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_placeholder_naming_order_variadic_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_placeholder_update_preserving_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_predispatch_cond_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_predispatch_grad_wrappers_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_preserve_annotation_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_preserve_module_call_signature_unflatten_specialization_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_preserve_requires_grad_placeholders_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_preserve_shape_dynamism_for_unused_inputs_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_profiling_code_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_python_asserts_with_sym_int_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_pytree_register_data_class_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_pytree_register_nested_data_class_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_raise_user_error_when_guard_on_data_dependent_operation_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_range_constraints_with_replacement_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_real_tensor_alias_dtype_mismatch_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_real_tensor_bool_cast_serdes_strict, 
test/export/test_serdes.py::SerDesExportTestExport::test_real_tensor_errors_on_aliasing_custom_op_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_real_tensor_for_max_op_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_real_tensor_size_mismatch_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_redundant_assert_max_upper_bound_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_redundant_asserts_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_refine_dynamic_shapes_from_suggested_fixes_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_register_constant_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_repeat_interleave_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_replace_unbacked_with_very_large_upperbound_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_replaced_unbacked_bindings_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_reshape_view_helper_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_retracable_ep_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_retrace_pre_autograd_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_run_decomposition_supports_user_input_mutation_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_run_decompositions_keep_metadata_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_run_decompositions_keep_tensor_constant_metadata_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_runtime_assert_for_prim_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_runtime_assert_for_prm_str_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_runtime_assert_with_size_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_sdpa_gqa_serdes_strict, 
test/export/test_serdes.py::SerDesExportTestExport::test_sequential_slicing_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_set_example_inputs_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_set_grad_as_side_effect_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_set_grad_empty_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_set_grad_unflatten_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_setgrad_lifted_tensor_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_shared_submodule_nn_module_stack_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_simple_export_for_training_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_simple_unbacked_view_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_size_input_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_slice_nn_module_stack_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_solver_unsupported_sympy_function_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_specialize_derived_dim_roots_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_split_const_gm_with_lifted_constants_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_stack_trace_make_fx_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_stack_trace_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_state_primitives_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_state_shape_attribute_assignment_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_state_tensors_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_static_dim_constraints_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_subclass_context_serdes_strict, 
test/export/test_serdes.py::SerDesExportTestExport::test_subclass_nested_attr_access_complicated_metadata_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_subclass_nested_attr_access_const_metadata_not_top_level_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_subclass_nested_attr_access_const_metadata_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_subclass_nested_attr_access_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_subclass_nested_attr_access_submodule_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_subclasses_parameterization_nested_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_subclasses_parameterization_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_suggest_torch_checks_with_non_negative_check_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_suggest_torch_checks_with_regular_check_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_suggested_fixes_for_data_dependent_errors_basic_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_suggested_fixes_for_data_dependent_errors_puzzlers_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_suggested_fixes_new_roots_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_sym_float_operators_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_sym_or_sym_and_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_sym_sqrt_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_symbool_item_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_symfloat_item_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_symint_input_additional_inputs_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_symint_input_basic_serdes_strict, 
test/export/test_serdes.py::SerDesExportTestExport::test_symint_input_ranges_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_symint_input_shapes_collection_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_symint_input_specialization_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_symint_item_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_symint_output_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_symint_tensor_return_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_tag_ac_export_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_tensor_attribute_zero_args_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_tensor_constant_aten_to_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_tensor_constant_with_wrapped_method_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_to_module_with_mutated_buffer_multiple_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_to_module_with_mutated_buffer_multiple_update_sub_later_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_to_module_with_mutated_buffer_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_tolist_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_torch_check_eq_commutativity_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_torch_fn_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_trace_under_fake_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_train_eval_on_exported_preautograd_module_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_unbacked_3d_matmul_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_unbacked_bincount_serdes_strict, 
test/export/test_serdes.py::SerDesExportTestExport::test_unbacked_bindings_for_divisible_u_symint_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_unbacked_deferred_runtime_retrace_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_unbacked_expand_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_unbacked_infer_size_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_unbacked_kth_value_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_unbacked_linear_layer_norm_input_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_unbacked_noncontig_lin_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_unbacked_pad_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_unbacked_scalar_constructor_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_unbacked_slice_forward_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_unbacked_slice_simple_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_unbacked_stack_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_unbacked_to_cond_passthrough_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_unbacked_to_cond_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_unbacked_unsqueeze_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_unflatten_asserts_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_unflatten_buffer_update_child2parent_swap_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_unflatten_closure_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_unflatten_isinstance_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_unflatten_multiple_graphs_dispatch_serdes_strict, 
test/export/test_serdes.py::SerDesExportTestExport::test_unflatten_multiple_graphs_preserve_signature_no_error_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_unflatten_multiple_graphs_shared_submodule_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_unflatten_multiple_graphs_state_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_unflatten_no_unroll_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_unflatten_placeholder_update_child2parent_swap_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_unflatten_placeholder_update_grandchild2cousin_swap_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_unflatten_random_dag_5_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_unflatten_random_dag_6_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_unflatten_random_dag_buf_8_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_unflatten_random_dag_const_preserving_3_1_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_unflatten_random_dag_const_preserving_3_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_unflatten_random_dag_mutating_buf_4_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_unflatten_random_dag_mutating_buf_6_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_unflatten_random_dag_mutating_buf_9_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_unflatten_random_dag_mutating_buf_preserving_10_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_unflatten_random_dag_mutating_buf_preserving_4_1_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_unflatten_random_dag_mutating_buf_preserving_4_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_unflatten_random_dag_mutating_buf_preserving_5_serdes_strict, 
test/export/test_serdes.py::SerDesExportTestExport::test_unflatten_random_dag_mutating_buf_preserving_7_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_unflatten_random_dag_preserving_4_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_unused_aliases_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_unused_constant_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_use_embedding_twice_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_user_input_and_buffer_mutation_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_vmap_custom_autograd_function_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_vmap_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_vmap_to_assert_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_where_decomp_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_while_loop_assert_separation_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_while_loop_index_assertions_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_while_loop_simple_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_while_loop_tensor_constant_idx_serdes_strict, test/export/test_serdes.py::SerDesExportTestExport::test_wrapper_module_serdes_strict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test__scaled_dot_product_flash_attention_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_additional_inputs_constants_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_allow_explicit_guards_as_runtime_asserts_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_args_type_checked_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_aten_lift_fresh_copy_serdes_nonstrict, 
test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_attention_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_attr_assignment_extra_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_automatic_constrain_size_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_automatic_dynamic_shapes_constant_relation_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_automatic_dynamic_shapes_linear_relation_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_automatic_dynamic_shapes_simple_equality_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_baddbmm_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_basic_non_strict_fake_tensor_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_basic_non_strict_real_tensor_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_basic_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_bincount_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_buffer_util_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_capture_subclass_constructor_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_capture_subclass_constructor_torch_ir_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_capture_subclass_wrong_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_ccode_python_mod_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_cdist_forward_compute_mode_zero_export_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_check_specialized_int_serdes_nonstrict, 
test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_checks_to_constrain_range_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_cleanup_dynamic_markers_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_colin_unbacked_backed_vr_sub_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_colon_parameter_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_compiling_state_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_cond_access_identical_symint_closure_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_cond_branches_return_constant_int_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_cond_branches_return_same_int_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_cond_buffers_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_cond_contains_unbacked_no_escape_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_cond_int_closure_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_cond_unflatten_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_cond_with_module_stack_export_with_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_cond_with_module_stack_export_with_unflatten_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_constant_aliasing_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_constant_input_naming_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_constant_no_user_inp_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_constant_output_dup_serdes_nonstrict, 
test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_constant_output_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_constant_requires_grad_const_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_constant_return_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_constant_tensor_mutation_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_constant_tensor_with_non_functional_nested_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_constant_tensor_with_non_functional_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_constrain_decomp_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_constrain_size_in_eager_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_constrain_size_with_constrain_value_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_constrain_size_with_various_cases_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_conv_dynamic_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_crop_like_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_cse_for_symint_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_custom_op_auto_functionalize_pre_dispatch_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_custom_op_auto_functionalize_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_custom_op_auto_warn_pre_dispatch_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_custom_op_preserve_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_custom_pytree_serdes_nonstrict, 
test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_custom_tag_metadata_re_export_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_decomp_batch_norm_functional_predispatch_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_decomp_item_in_prim_after_decomposition_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_decomp_item_in_prim_before_decomposition_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_default_decomposition_core_cia_ops_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_derived_dim_1_2_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_derived_dim_basic_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_derived_dim_integer_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_derived_dim_nested_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_derived_dim_out_of_order_repeat_derived_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_derived_dim_out_of_order_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_derived_dim_out_of_order_simplified_repeat_non_derived_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_derived_dim_out_of_order_simplified_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_derived_dim_repeat_derived_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_detect_leak_nonstrict_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_detect_leak_nonstrict_with_stacktrace_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_detect_leak_strict_serdes_nonstrict, 
test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_device_to_dynamic_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_device_to_gpu_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_device_to_mutation_float_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_device_to_mutation_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_device_to_static_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_dim_1_2_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_dim_auto_and_dim_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_dim_dynamic_divisibility_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_dim_dynamic_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_dim_dynamic_specialization_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_dim_hint_range_violations_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_dim_hint_ranges_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_disable_forced_specializations_errors_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_disable_forced_specializations_ok_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_distributed_all_gather_into_tensor_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_distributed_all_gather_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_distributed_all_reduce_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_distributed_all_to_all_single_serdes_nonstrict, 
test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_distributed_reduce_scatter_tensor_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_dont_duck_size_for_auto_dynamic_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_double_lifted_constants_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_draft_export_checks_aliasing_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_draft_export_checks_mutation_list_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_draft_export_checks_mutation_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_draft_export_checks_mutation_with_nan_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_draft_export_fake_kernel_inference_errors_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_draft_export_infers_fake_kernel_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_duplicate_modules_with_non_persistent_buffers_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_dynamic_lr_shift_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_dynamic_shapes_bounds_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_dynamic_shapes_builder_basic_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_dynamic_shapes_builder_kwargs_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_dynamic_shapes_builder_pytree_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_dynamic_shapes_dataclass_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_dynamic_shapes_inferred_basic_serdes_nonstrict, 
test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_dynamic_shapes_serdes_generic_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_dynamic_shapes_serdes_user_errors_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_dynamic_shapes_serdes_various_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_dynamic_shapes_spec_with_pytree_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_dynamic_shapes_wrapped_with_shape_guards_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_dynamic_sym_round_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_ends_of_bounds_oblivious_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_error_does_not_reference_eager_fallback_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_error_when_passing_mutating_primitive_op_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_exception_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_expand_copy_export_handles_implicit_true_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_export_api_with_dynamic_shapes_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_export_as_backend_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_export_associative_scan_lifted_buffers_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_export_associative_scan_symbol_dim_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_export_associative_scan_symbol_scandim_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_export_aten_to_unflatten_serdes_nonstrict, 
test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_export_aten_to_unflatten_subclass_pre_dispatch_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_export_aten_to_unflatten_subclass_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_export_cond_preserve_torch_fn_for_subgraphs_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_export_cond_symbool_pred_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_export_cond_warns_constant_pred_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_export_custom_decomp_table_basic_pop_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_export_custom_decomp_table_container_methods_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_export_custom_op_lib_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_export_custom_triton_kernel_mutable_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_export_custom_triton_kernel_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_export_cyclic_reference_leak_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_export_decomp_torture_case_1_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_export_decomp_torture_case_2_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_export_decomps_dynamic_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_export_decomps_simple_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_export_dynamo_config_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_export_for_training_run_decomp_serdes_nonstrict, 
test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_export_for_training_with_container_type_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_export_for_training_with_dynamic_shapes_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_export_for_training_with_mutation_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_export_for_training_with_state_dict_hooks_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_export_func_with_default_kwargs_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_export_func_with_keyword_only_args_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_export_func_with_kwargs_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_export_func_with_pytree_kwargs_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_export_func_with_var_keyword_args_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_export_func_with_var_keyword_pytree_args_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_export_func_with_var_postional_args_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_export_function_schema_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_export_graph_with_no_inputs_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_export_input_mutation_bug_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_export_input_mutation_dynamic_shape_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_export_input_mutation_static_shape_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_export_leak_compile_serdes_nonstrict, 
test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_export_linear_preserve_dynamic_shape_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_export_max_nonstrict_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_export_max_onnx_reported_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_export_method_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_export_mod_constraints_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_export_module_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_export_preserve_linear_at_aot_level_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_export_preserve_linear_but_not_custom_op_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_export_rnn_variants_with_warning_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_export_scan_pytree_output_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_export_script_module_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_export_statically_known_true_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_export_then_compile_tensor_ctor_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_export_with_autocast_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_export_with_fake_tensor_inputs_on_cuda_devices_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_export_with_fake_tensor_inputs_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_export_with_inline_constraints_complex_serdes_nonstrict, 
test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_export_with_inline_constraints_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_export_with_set_grad_enabled_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_export_with_wrong_inputs_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_external_call_non_strict_real_tensor_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_fake_inputs_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_fake_weights_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_filter_traceback_frames_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_flex_attention_export_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_float_conversion_from_int_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_float_conversion_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_fqn_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_from_node_metadata_export_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_full_on_scalar_tensor_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_function_holding_tensor_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_hints_wrapper_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_hoo_inline_users_issue_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_if_functional_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_if_post_autograd_op_preserved_serdes_nonstrict, 
test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_inductor_backend_inside_nonstrict_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_inline_script_class_method_recursive_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_inline_script_class_method_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_inline_script_function_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_inline_script_method_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_int_shape_specialization_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_intermediate_shape_comp_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_is_exporting_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_is_nonzero_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_isnonzero_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_issue_113041_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_issue_157289_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_issue_161902_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_istft_op_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_keep_composite_ops_invalid_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_keep_composite_ops_linear_convd_for_training_ir_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_keep_composite_ops_linear_convd_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_kwarg_dynamic_shapes_diff_order_serdes_nonstrict, 
test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_kwargs_reorder_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_layer_norm_unbacked_normalized_shape_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_layer_sharing_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_lazy_module_kwargs_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_lifted_constants_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_linear_conv_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_malformed_fqn_from_source_name_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_map_buffers_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_map_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_mask_nonzero_static_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_masked_select_dynamic_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_math_pow_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_mismatched_dynamic_shapes_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_mixed_input_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_module_dict_key_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_module_input_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_module_input_subclasses_parameterization_nested_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_module_list_slice_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_module_serdes_nonstrict, 
test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_module_with_dict_container_inp_out_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_modules_access_for_deleted_submodule_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_more_multidimensional_slicing_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_multidimensional_slicing_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_multinomial_dynamic_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_multiple_definitions_same_name_dim_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_namedtuple_input_export_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_native_multi_attention_head_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_nested_dynamic_shapes_spec_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_nested_module_fake_tensor_leak_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_nested_module_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_nested_module_with_constant_buffer_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_nested_module_with_init_buffer_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_nested_module_with_parameter_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_nn_module_stack_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_nn_module_stack_shared_submodule_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_no_check_is_size_error_serdes_nonstrict, 
test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_no_suggested_fixes_for_data_dependent_errors_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_no_tensor_computation_2_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_no_tensor_computation_3_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_no_tensor_computation_4_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_no_tensor_computation_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_non_arg_name_dynamic_shapes_api_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_non_arg_name_dynamic_shapes_api_with_container_type_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_non_arg_name_dynamic_shapes_api_with_kwarg_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_non_persistent_buffer_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_non_strict_dynamic_shapes_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_non_strict_dynamic_shapes_suggested_fixes_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_none_buffers_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_nonstrict_retrace_preserves_metadata_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_nonzero_2_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_nonzero_dynamic_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_not_registered_parameter_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_operator_aten_tensor_mode_variant_serdes_nonstrict, 
test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_output_node_name_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_pad_sequence_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_param_util_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_partial_patched_forward_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_placeholder_naming_collisions_hoo_subgraphs_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_placeholder_naming_collisions_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_placeholder_naming_order_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_placeholder_naming_order_variadic_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_placeholder_update_preserving_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_predispatch_cond_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_predispatch_grad_wrappers_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_preserve_annotation_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_preserve_module_call_signature_unflatten_specialization_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_preserve_requires_grad_placeholders_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_preserve_shape_dynamism_for_unused_inputs_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_profiling_code_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_python_asserts_with_sym_int_serdes_nonstrict, 
test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_pytree_register_data_class_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_pytree_register_nested_data_class_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_raise_user_error_when_guard_on_data_dependent_operation_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_range_constraints_with_replacement_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_real_tensor_alias_dtype_mismatch_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_real_tensor_bool_cast_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_real_tensor_errors_on_aliasing_custom_op_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_real_tensor_for_max_op_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_real_tensor_size_mismatch_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_redundant_assert_max_upper_bound_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_redundant_asserts_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_refine_dynamic_shapes_from_suggested_fixes_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_register_constant_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_repeat_interleave_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_replace_unbacked_with_very_large_upperbound_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_replaced_unbacked_bindings_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_reshape_view_helper_serdes_nonstrict, 
test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_retracable_ep_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_retrace_pre_autograd_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_run_decomposition_supports_user_input_mutation_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_run_decompositions_keep_metadata_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_run_decompositions_keep_tensor_constant_metadata_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_runtime_assert_for_prim_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_runtime_assert_for_prm_str_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_runtime_assert_with_size_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_sdpa_gqa_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_sequential_slicing_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_set_example_inputs_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_set_grad_as_side_effect_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_set_grad_empty_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_set_grad_unflatten_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_setgrad_lifted_tensor_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_shared_submodule_nn_module_stack_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_simple_export_for_training_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_simple_unbacked_view_serdes_nonstrict, 
test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_size_input_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_slice_nn_module_stack_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_solver_unsupported_sympy_function_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_specialize_derived_dim_roots_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_split_const_gm_with_lifted_constants_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_stack_trace_make_fx_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_stack_trace_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_state_primitives_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_state_shape_attribute_assignment_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_state_tensors_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_static_dim_constraints_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_subclass_context_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_subclass_nested_attr_access_complicated_metadata_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_subclass_nested_attr_access_const_metadata_not_top_level_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_subclass_nested_attr_access_const_metadata_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_subclass_nested_attr_access_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_subclass_nested_attr_access_submodule_serdes_nonstrict, 
test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_subclasses_parameterization_nested_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_subclasses_parameterization_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_suggest_torch_checks_with_non_negative_check_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_suggest_torch_checks_with_regular_check_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_suggested_fixes_for_data_dependent_errors_basic_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_suggested_fixes_for_data_dependent_errors_puzzlers_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_suggested_fixes_new_roots_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_sym_float_operators_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_sym_or_sym_and_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_sym_sqrt_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_symbool_item_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_symfloat_item_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_symint_input_additional_inputs_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_symint_input_basic_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_symint_input_ranges_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_symint_input_shapes_collection_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_symint_input_specialization_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_symint_item_serdes_nonstrict, 
test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_symint_output_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_symint_tensor_return_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_tag_ac_export_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_tensor_attribute_zero_args_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_tensor_constant_aten_to_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_tensor_constant_with_wrapped_method_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_to_module_with_mutated_buffer_multiple_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_to_module_with_mutated_buffer_multiple_update_sub_later_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_to_module_with_mutated_buffer_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_tolist_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_torch_check_eq_commutativity_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_torch_fn_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_trace_under_fake_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_train_eval_on_exported_preautograd_module_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_unbacked_3d_matmul_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_unbacked_bincount_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_unbacked_bindings_for_divisible_u_symint_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_unbacked_deferred_runtime_retrace_serdes_nonstrict, 
test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_unbacked_expand_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_unbacked_infer_size_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_unbacked_kth_value_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_unbacked_linear_layer_norm_input_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_unbacked_noncontig_lin_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_unbacked_pad_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_unbacked_scalar_constructor_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_unbacked_slice_forward_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_unbacked_slice_simple_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_unbacked_stack_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_unbacked_to_cond_passthrough_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_unbacked_to_cond_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_unbacked_unsqueeze_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_unflatten_asserts_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_unflatten_buffer_update_child2parent_swap_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_unflatten_closure_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_unflatten_isinstance_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_unflatten_multiple_graphs_dispatch_serdes_nonstrict, 
test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_unflatten_multiple_graphs_preserve_signature_no_error_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_unflatten_multiple_graphs_shared_submodule_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_unflatten_multiple_graphs_state_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_unflatten_no_unroll_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_unflatten_placeholder_update_child2parent_swap_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_unflatten_placeholder_update_grandchild2cousin_swap_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_unflatten_random_dag_5_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_unflatten_random_dag_6_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_unflatten_random_dag_buf_8_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_unflatten_random_dag_const_preserving_3_1_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_unflatten_random_dag_const_preserving_3_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_unflatten_random_dag_mutating_buf_4_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_unflatten_random_dag_mutating_buf_6_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_unflatten_random_dag_mutating_buf_9_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_unflatten_random_dag_mutating_buf_preserving_10_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_unflatten_random_dag_mutating_buf_preserving_4_1_serdes_nonstrict, 
test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_unflatten_random_dag_mutating_buf_preserving_4_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_unflatten_random_dag_mutating_buf_preserving_5_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_unflatten_random_dag_mutating_buf_preserving_7_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_unflatten_random_dag_preserving_4_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_unused_aliases_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_unused_constant_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_use_embedding_twice_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_user_input_and_buffer_mutation_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_vmap_custom_autograd_function_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_vmap_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_vmap_to_assert_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_where_decomp_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_while_loop_assert_separation_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_while_loop_index_assertions_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_while_loop_simple_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_while_loop_tensor_constant_idx_serdes_nonstrict, test/export/test_serdes.py::SerDesExportNonStrictTestExport::test_wrapper_module_serdes_nonstrict 2025-10-10T01:46:12.8591610Z 2025-10-10T01:46:16.6467817Z Running dynamo/test_deque_reconstruct 1/1 ... 
[2025-10-10 01:46:16.646211] 2025-10-10T01:46:16.6468279Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:46:16.6471514Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_deque_reconstruct.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:46:16.646590] 2025-10-10T01:46:20.5711300Z 2025-10-10T01:46:20.5712447Z dynamo/test_deque_reconstruct 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_deque_reconstruct_1.1_f31cab565b18dfe5_.log 2025-10-10T01:46:20.5714288Z Running 3 items in this shard: test/dynamo/test_deque_reconstruct.py::TestDequeReconstruct::test_deque_reconstruct_in_globals, test/dynamo/test_deque_reconstruct.py::TestDequeReconstruct::test_deque_reconstruct_not_in_globals, test/dynamo/test_deque_reconstruct.py::TestDequeReconstruct::test_deque_reconstruct_shallows_globals 2025-10-10T01:46:20.5715867Z 2025-10-10T01:46:24.4400843Z Running inductor/test_cuda_select_algorithm 1/1 ... [2025-10-10 01:46:24.439417] 2025-10-10T01:46:24.4401326Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:46:24.4403749Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_cuda_select_algorithm.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:46:24.439823] 2025-10-10T01:46:32.2208594Z 2025-10-10T01:46:32.2209468Z inductor/test_cuda_select_algorithm 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_cuda_select_algorithm_1.1_8d0470f03f49756e_.log 2025-10-10T01:46:32.2243010Z Running 58 items in this shard: test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16, 
test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16, 
test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16, 
test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16, 
test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16, 
test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 2025-10-10T01:46:32.2275602Z 2025-10-10T01:46:36.1605100Z Running export/test_strict_export_v2 1/1 ... [2025-10-10 01:46:36.160049] 2025-10-10T01:46:36.1605556Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:46:36.1608193Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_strict_export_v2.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:46:36.160442] 2025-10-10T01:46:44.2916478Z 2025-10-10T01:46:44.2917615Z export/test_strict_export_v2 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_strict_export_v2_1.1_31b0726b944d3506_.log 2025-10-10T01:46:44.3207303Z Running 433 items in this shard: test/export/test_strict_export_v2.py::StrictExportV2TestDynamismExpression::test_export_assume_static_by_default_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestDynamismExpression::test_export_constraints_error_not_in_range_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestDynamismExpression::test_export_constraints_error_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestDynamismExpression::test_export_inline_constraints_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestDynamismExpression::test_export_slice_maxsize_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestDynamismExpression::test_export_slice_unbacked_dim1_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestDynamismExpression::test_export_strict_narrow_unbacked_expr_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestDynamismExpression::test_no_grad_param_inplace_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestDynamismExpression::test_reshape_view_backed_size_oblivious_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test__scaled_dot_product_flash_attention_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_additional_inputs_constants_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_allow_explicit_guards_as_runtime_asserts_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_args_type_checked_strict_export_v2, 
test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_aten_lift_fresh_copy_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_attention_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_attr_assignment_extra_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_automatic_constrain_size_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_automatic_dynamic_shapes_constant_relation_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_automatic_dynamic_shapes_linear_relation_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_automatic_dynamic_shapes_simple_equality_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_baddbmm_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_basic_non_strict_fake_tensor_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_basic_non_strict_real_tensor_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_basic_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_bincount_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_buffer_util_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_capture_subclass_constructor_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_capture_subclass_constructor_torch_ir_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_capture_subclass_wrong_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_ccode_python_mod_strict_export_v2, 
test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_cdist_forward_compute_mode_zero_export_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_check_specialized_int_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_checks_to_constrain_range_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_cleanup_dynamic_markers_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_colin_unbacked_backed_vr_sub_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_colon_parameter_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_compiling_state_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_cond_access_identical_symint_closure_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_cond_branches_return_constant_int_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_cond_branches_return_same_int_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_cond_buffers_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_cond_contains_unbacked_no_escape_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_cond_int_closure_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_cond_unflatten_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_cond_with_module_stack_export_with_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_cond_with_module_stack_export_with_unflatten_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_constant_aliasing_strict_export_v2, 
test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_constant_input_naming_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_constant_no_user_inp_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_constant_output_dup_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_constant_output_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_constant_requires_grad_const_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_constant_return_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_constant_tensor_mutation_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_constant_tensor_with_non_functional_nested_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_constant_tensor_with_non_functional_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_constrain_decomp_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_constrain_size_in_eager_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_constrain_size_with_constrain_value_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_constrain_size_with_various_cases_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_conv_dynamic_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_crop_like_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_cse_for_symint_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_custom_op_auto_functionalize_pre_dispatch_strict_export_v2, 
test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_custom_op_auto_functionalize_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_custom_op_auto_warn_pre_dispatch_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_custom_op_preserve_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_custom_pytree_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_custom_tag_metadata_re_export_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_decomp_batch_norm_functional_predispatch_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_decomp_item_in_prim_after_decomposition_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_decomp_item_in_prim_before_decomposition_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_default_decomposition_core_cia_ops_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_derived_dim_1_2_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_derived_dim_basic_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_derived_dim_integer_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_derived_dim_nested_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_derived_dim_out_of_order_repeat_derived_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_derived_dim_out_of_order_simplified_repeat_non_derived_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_derived_dim_out_of_order_simplified_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_derived_dim_out_of_order_strict_export_v2, 
test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_derived_dim_repeat_derived_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_detect_leak_nonstrict_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_detect_leak_nonstrict_with_stacktrace_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_detect_leak_strict_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_device_to_dynamic_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_device_to_gpu_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_device_to_mutation_float_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_device_to_mutation_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_device_to_static_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_dim_1_2_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_dim_auto_and_dim_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_dim_dynamic_divisibility_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_dim_dynamic_specialization_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_dim_dynamic_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_dim_hint_range_violations_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_dim_hint_ranges_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_disable_forced_specializations_errors_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_disable_forced_specializations_ok_strict_export_v2, 
test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_distributed_all_gather_into_tensor_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_distributed_all_gather_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_distributed_all_reduce_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_distributed_all_to_all_single_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_distributed_reduce_scatter_tensor_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_dont_duck_size_for_auto_dynamic_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_double_lifted_constants_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_draft_export_checks_aliasing_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_draft_export_checks_mutation_list_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_draft_export_checks_mutation_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_draft_export_checks_mutation_with_nan_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_draft_export_fake_kernel_inference_errors_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_draft_export_infers_fake_kernel_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_duplicate_modules_with_non_persistent_buffers_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_dynamic_lr_shift_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_dynamic_shapes_bounds_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_dynamic_shapes_builder_basic_strict_export_v2, 
test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_dynamic_shapes_builder_kwargs_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_dynamic_shapes_builder_pytree_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_dynamic_shapes_dataclass_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_dynamic_shapes_inferred_basic_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_dynamic_shapes_serdes_generic_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_dynamic_shapes_serdes_user_errors_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_dynamic_shapes_serdes_various_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_dynamic_shapes_spec_with_pytree_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_dynamic_shapes_wrapped_with_shape_guards_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_dynamic_sym_round_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_ends_of_bounds_oblivious_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_error_does_not_reference_eager_fallback_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_error_when_passing_mutating_primitive_op_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_exception_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_expand_copy_export_handles_implicit_true_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_export_api_with_dynamic_shapes_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_export_as_backend_strict_export_v2, 
test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_export_associative_scan_lifted_buffers_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_export_associative_scan_symbol_dim_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_export_associative_scan_symbol_scandim_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_export_aten_to_unflatten_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_export_aten_to_unflatten_subclass_pre_dispatch_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_export_aten_to_unflatten_subclass_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_export_cond_preserve_torch_fn_for_subgraphs_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_export_cond_symbool_pred_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_export_cond_warns_constant_pred_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_export_custom_decomp_table_basic_pop_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_export_custom_decomp_table_container_methods_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_export_custom_op_lib_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_export_custom_triton_kernel_mutable_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_export_custom_triton_kernel_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_export_cyclic_reference_leak_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_export_decomp_torture_case_1_strict_export_v2, 
test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_export_decomp_torture_case_2_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_export_decomps_dynamic_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_export_decomps_simple_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_export_dynamo_config_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_export_for_training_run_decomp_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_export_for_training_with_container_type_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_export_for_training_with_dynamic_shapes_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_export_for_training_with_mutation_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_export_for_training_with_state_dict_hooks_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_export_func_with_default_kwargs_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_export_func_with_keyword_only_args_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_export_func_with_kwargs_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_export_func_with_pytree_kwargs_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_export_func_with_var_keyword_args_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_export_func_with_var_keyword_pytree_args_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_export_func_with_var_postional_args_strict_export_v2, 
test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_export_function_schema_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_export_graph_with_no_inputs_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_export_input_mutation_bug_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_export_input_mutation_dynamic_shape_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_export_input_mutation_static_shape_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_export_leak_compile_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_export_linear_preserve_dynamic_shape_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_export_max_nonstrict_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_export_max_onnx_reported_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_export_method_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_export_mod_constraints_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_export_module_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_export_preserve_linear_at_aot_level_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_export_preserve_linear_but_not_custom_op_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_export_rnn_variants_with_warning_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_export_scan_pytree_output_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_export_script_module_strict_export_v2, 
test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_export_statically_known_true_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_export_then_compile_tensor_ctor_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_export_with_autocast_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_export_with_fake_tensor_inputs_on_cuda_devices_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_export_with_fake_tensor_inputs_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_export_with_inline_constraints_complex_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_export_with_inline_constraints_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_export_with_set_grad_enabled_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_export_with_wrong_inputs_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_external_call_non_strict_real_tensor_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_fake_inputs_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_fake_weights_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_filter_traceback_frames_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_flex_attention_export_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_float_conversion_from_int_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_float_conversion_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_fqn_strict_export_v2, 
test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_from_node_metadata_export_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_full_on_scalar_tensor_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_function_holding_tensor_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_hints_wrapper_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_hoo_inline_users_issue_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_if_functional_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_if_post_autograd_op_preserved_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_inductor_backend_inside_nonstrict_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_inline_script_class_method_recursive_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_inline_script_class_method_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_inline_script_function_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_inline_script_method_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_int_shape_specialization_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_intermediate_shape_comp_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_is_exporting_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_is_nonzero_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_isnonzero_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_issue_113041_strict_export_v2, 
test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_issue_157289_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_issue_161902_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_istft_op_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_keep_composite_ops_invalid_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_keep_composite_ops_linear_convd_for_training_ir_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_keep_composite_ops_linear_convd_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_kwarg_dynamic_shapes_diff_order_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_kwargs_reorder_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_layer_norm_unbacked_normalized_shape_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_layer_sharing_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_lazy_module_kwargs_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_lifted_constants_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_linear_conv_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_malformed_fqn_from_source_name_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_map_buffers_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_map_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_mask_nonzero_static_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_masked_select_dynamic_strict_export_v2, 
test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_math_pow_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_mismatched_dynamic_shapes_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_mixed_input_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_module_dict_key_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_module_input_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_module_input_subclasses_parameterization_nested_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_module_list_slice_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_module_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_module_with_dict_container_inp_out_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_modules_access_for_deleted_submodule_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_more_multidimensional_slicing_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_multidimensional_slicing_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_multinomial_dynamic_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_multiple_definitions_same_name_dim_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_namedtuple_input_export_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_native_multi_attention_head_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_nested_dynamic_shapes_spec_strict_export_v2, 
test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_nested_module_fake_tensor_leak_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_nested_module_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_nested_module_with_constant_buffer_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_nested_module_with_init_buffer_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_nested_module_with_parameter_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_nn_module_stack_shared_submodule_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_nn_module_stack_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_no_check_is_size_error_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_no_suggested_fixes_for_data_dependent_errors_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_no_tensor_computation_2_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_no_tensor_computation_3_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_no_tensor_computation_4_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_no_tensor_computation_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_non_arg_name_dynamic_shapes_api_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_non_arg_name_dynamic_shapes_api_with_container_type_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_non_arg_name_dynamic_shapes_api_with_kwarg_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_non_persistent_buffer_strict_export_v2, 
test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_non_strict_dynamic_shapes_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_non_strict_dynamic_shapes_suggested_fixes_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_none_buffers_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_nonstrict_retrace_preserves_metadata_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_nonzero_2_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_nonzero_dynamic_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_not_registered_parameter_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_operator_aten_tensor_mode_variant_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_output_node_name_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_pad_sequence_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_param_util_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_partial_patched_forward_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_placeholder_naming_collisions_hoo_subgraphs_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_placeholder_naming_collisions_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_placeholder_naming_order_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_placeholder_naming_order_variadic_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_placeholder_update_preserving_strict_export_v2, 
test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_predispatch_cond_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_predispatch_grad_wrappers_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_preserve_annotation_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_preserve_module_call_signature_unflatten_specialization_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_preserve_requires_grad_placeholders_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_preserve_shape_dynamism_for_unused_inputs_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_profiling_code_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_python_asserts_with_sym_int_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_pytree_register_data_class_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_pytree_register_nested_data_class_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_raise_user_error_when_guard_on_data_dependent_operation_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_range_constraints_with_replacement_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_real_tensor_alias_dtype_mismatch_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_real_tensor_bool_cast_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_real_tensor_errors_on_aliasing_custom_op_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_real_tensor_for_max_op_strict_export_v2, 
test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_real_tensor_size_mismatch_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_redundant_assert_max_upper_bound_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_redundant_asserts_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_refine_dynamic_shapes_from_suggested_fixes_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_register_constant_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_repeat_interleave_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_replace_unbacked_with_very_large_upperbound_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_replaced_unbacked_bindings_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_reshape_view_helper_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_retracable_ep_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_retrace_pre_autograd_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_run_decomposition_supports_user_input_mutation_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_run_decompositions_keep_metadata_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_run_decompositions_keep_tensor_constant_metadata_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_runtime_assert_for_prim_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_runtime_assert_for_prm_str_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_runtime_assert_with_size_strict_export_v2, 
test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_sdpa_gqa_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_sequential_slicing_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_set_example_inputs_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_set_grad_as_side_effect_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_set_grad_empty_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_set_grad_unflatten_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_setgrad_lifted_tensor_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_shared_submodule_nn_module_stack_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_simple_export_for_training_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_simple_unbacked_view_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_size_input_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_slice_nn_module_stack_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_solver_unsupported_sympy_function_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_specialize_derived_dim_roots_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_split_const_gm_with_lifted_constants_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_stack_trace_make_fx_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_stack_trace_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_state_primitives_strict_export_v2, 
test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_state_shape_attribute_assignment_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_state_tensors_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_static_dim_constraints_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_subclass_context_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_subclass_nested_attr_access_complicated_metadata_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_subclass_nested_attr_access_const_metadata_not_top_level_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_subclass_nested_attr_access_const_metadata_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_subclass_nested_attr_access_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_subclass_nested_attr_access_submodule_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_subclasses_parameterization_nested_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_subclasses_parameterization_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_suggest_torch_checks_with_non_negative_check_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_suggest_torch_checks_with_regular_check_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_suggested_fixes_for_data_dependent_errors_basic_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_suggested_fixes_for_data_dependent_errors_puzzlers_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_suggested_fixes_new_roots_strict_export_v2, 
test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_sym_float_operators_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_sym_or_sym_and_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_sym_sqrt_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_symbool_item_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_symfloat_item_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_symint_input_additional_inputs_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_symint_input_basic_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_symint_input_ranges_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_symint_input_shapes_collection_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_symint_input_specialization_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_symint_item_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_symint_output_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_symint_tensor_return_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_tag_ac_export_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_tensor_attribute_zero_args_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_tensor_constant_aten_to_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_tensor_constant_with_wrapped_method_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_to_module_with_mutated_buffer_multiple_strict_export_v2, 
test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_to_module_with_mutated_buffer_multiple_update_sub_later_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_to_module_with_mutated_buffer_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_tolist_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_torch_check_eq_commutativity_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_torch_fn_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_trace_under_fake_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_train_eval_on_exported_preautograd_module_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_unbacked_3d_matmul_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_unbacked_bincount_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_unbacked_bindings_for_divisible_u_symint_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_unbacked_deferred_runtime_retrace_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_unbacked_expand_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_unbacked_infer_size_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_unbacked_kth_value_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_unbacked_linear_layer_norm_input_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_unbacked_noncontig_lin_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_unbacked_pad_strict_export_v2, 
test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_unbacked_scalar_constructor_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_unbacked_slice_forward_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_unbacked_slice_simple_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_unbacked_stack_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_unbacked_to_cond_passthrough_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_unbacked_to_cond_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_unbacked_unsqueeze_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_unflatten_asserts_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_unflatten_buffer_update_child2parent_swap_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_unflatten_closure_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_unflatten_isinstance_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_unflatten_multiple_graphs_dispatch_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_unflatten_multiple_graphs_preserve_signature_no_error_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_unflatten_multiple_graphs_shared_submodule_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_unflatten_multiple_graphs_state_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_unflatten_no_unroll_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_unflatten_placeholder_update_child2parent_swap_strict_export_v2, 
test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_unflatten_placeholder_update_grandchild2cousin_swap_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_unflatten_random_dag_5_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_unflatten_random_dag_6_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_unflatten_random_dag_buf_8_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_unflatten_random_dag_const_preserving_3_1_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_unflatten_random_dag_const_preserving_3_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_unflatten_random_dag_mutating_buf_4_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_unflatten_random_dag_mutating_buf_6_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_unflatten_random_dag_mutating_buf_9_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_unflatten_random_dag_mutating_buf_preserving_10_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_unflatten_random_dag_mutating_buf_preserving_4_1_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_unflatten_random_dag_mutating_buf_preserving_4_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_unflatten_random_dag_mutating_buf_preserving_5_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_unflatten_random_dag_mutating_buf_preserving_7_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_unflatten_random_dag_preserving_4_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_unused_aliases_strict_export_v2, 
test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_unused_constant_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_use_embedding_twice_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_user_input_and_buffer_mutation_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_vmap_custom_autograd_function_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_vmap_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_vmap_to_assert_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_where_decomp_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_while_loop_assert_separation_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_while_loop_index_assertions_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_while_loop_simple_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_while_loop_tensor_constant_idx_strict_export_v2, test/export/test_strict_export_v2.py::StrictExportV2TestExport::test_wrapper_module_strict_export_v2 2025-10-10T01:46:44.3414020Z 2025-10-10T01:46:48.1617098Z Running inductor/test_deterministic 1/1 ... [2025-10-10 01:46:48.161166] 2025-10-10T01:46:48.1617581Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:46:48.1618943Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_deterministic.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:46:48.161541] 2025-10-10T01:46:55.4914119Z 2025-10-10T01:46:55.4915583Z inductor/test_deterministic 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_deterministic_1.1_d504b2f8497fdcf1_.log 2025-10-10T01:46:55.4920921Z Running 7 items in this shard: test/inductor/test_deterministic.py::DeterministicTest::test_max_autotune_deterministic_False, test/inductor/test_deterministic.py::DeterministicTest::test_max_autotune_deterministic_True, test/inductor/test_deterministic.py::DeterministicTest::test_mm_padding_deterministic_False, test/inductor/test_deterministic.py::DeterministicTest::test_mm_padding_deterministic_True, test/inductor/test_deterministic.py::DeterministicTest::test_pointwise_coordesc_tuning, test/inductor/test_deterministic.py::DeterministicTest::test_reduction_coordesc_tuning_deterministic_False, test/inductor/test_deterministic.py::DeterministicTest::test_reduction_coordesc_tuning_deterministic_True 2025-10-10T01:46:55.4925350Z 2025-10-10T01:46:59.3872234Z Running inductor/test_flex_decoding 1/1 ... [2025-10-10 01:46:59.386625] 2025-10-10T01:46:59.3872870Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:46:59.3874524Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_flex_decoding.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:46:59.387017] 2025-10-10T01:47:07.7190916Z 2025-10-10T01:47:07.7191861Z inductor/test_flex_decoding 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_flex_decoding_1.1_014e239ef74bb15f_.log 2025-10-10T01:47:07.7466506Z Running 572 items in this shard: test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_bfloat16_score_mod0_head_dims0_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_bfloat16_score_mod0_head_dims1_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_bfloat16_score_mod0_head_dims2_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_bfloat16_score_mod1_head_dims0_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_bfloat16_score_mod1_head_dims1_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_bfloat16_score_mod1_head_dims2_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_bfloat16_score_mod2_head_dims0_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_bfloat16_score_mod2_head_dims1_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_bfloat16_score_mod2_head_dims2_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_bfloat16_score_mod3_head_dims0_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_bfloat16_score_mod3_head_dims1_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_bfloat16_score_mod3_head_dims2_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_bfloat16_score_mod4_head_dims0_cuda_bfloat16, 
test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_bfloat16_score_mod4_head_dims1_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_bfloat16_score_mod4_head_dims2_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_bfloat16_score_mod5_head_dims0_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_bfloat16_score_mod5_head_dims1_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_bfloat16_score_mod5_head_dims2_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_bfloat16_score_mod6_head_dims0_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_bfloat16_score_mod6_head_dims1_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_bfloat16_score_mod6_head_dims2_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_bfloat16_score_mod7_head_dims0_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_bfloat16_score_mod7_head_dims1_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_bfloat16_score_mod7_head_dims2_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_bfloat16_score_mod8_head_dims0_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_bfloat16_score_mod8_head_dims1_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_bfloat16_score_mod8_head_dims2_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_bfloat16_score_mod0_BLOCK_SIZE2_cuda_bfloat16, 
test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_bfloat16_score_mod0_BLOCK_SIZE3_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_bfloat16_score_mod0_BLOCK_SIZE_128_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_bfloat16_score_mod0_BLOCK_SIZE_64_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_bfloat16_score_mod1_BLOCK_SIZE2_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_bfloat16_score_mod1_BLOCK_SIZE3_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_bfloat16_score_mod1_BLOCK_SIZE_128_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_bfloat16_score_mod1_BLOCK_SIZE_64_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_bfloat16_score_mod2_BLOCK_SIZE2_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_bfloat16_score_mod2_BLOCK_SIZE3_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_bfloat16_score_mod2_BLOCK_SIZE_128_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_bfloat16_score_mod2_BLOCK_SIZE_64_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_bfloat16_score_mod3_BLOCK_SIZE2_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_bfloat16_score_mod3_BLOCK_SIZE3_cuda_bfloat16, 
test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_bfloat16_score_mod3_BLOCK_SIZE_128_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_bfloat16_score_mod3_BLOCK_SIZE_64_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_bfloat16_score_mod4_BLOCK_SIZE2_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_bfloat16_score_mod4_BLOCK_SIZE3_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_bfloat16_score_mod4_BLOCK_SIZE_128_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_bfloat16_score_mod4_BLOCK_SIZE_64_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_bfloat16_score_mod5_BLOCK_SIZE2_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_bfloat16_score_mod5_BLOCK_SIZE3_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_bfloat16_score_mod5_BLOCK_SIZE_128_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_bfloat16_score_mod5_BLOCK_SIZE_64_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_bfloat16_score_mod6_BLOCK_SIZE2_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_bfloat16_score_mod6_BLOCK_SIZE3_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_bfloat16_score_mod6_BLOCK_SIZE_128_cuda_bfloat16, 
test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_bfloat16_score_mod6_BLOCK_SIZE_64_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_bfloat16_score_mod7_BLOCK_SIZE2_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_bfloat16_score_mod7_BLOCK_SIZE3_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_bfloat16_score_mod7_BLOCK_SIZE_128_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_bfloat16_score_mod7_BLOCK_SIZE_64_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_bfloat16_score_mod8_BLOCK_SIZE2_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_bfloat16_score_mod8_BLOCK_SIZE3_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_bfloat16_score_mod8_BLOCK_SIZE_128_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_bfloat16_score_mod8_BLOCK_SIZE_64_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float16_score_mod0_BLOCK_SIZE2_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float16_score_mod0_BLOCK_SIZE3_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float16_score_mod0_BLOCK_SIZE_128_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float16_score_mod0_BLOCK_SIZE_64_cuda_float16, 
test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float16_score_mod1_BLOCK_SIZE2_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float16_score_mod1_BLOCK_SIZE3_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float16_score_mod1_BLOCK_SIZE_128_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float16_score_mod1_BLOCK_SIZE_64_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float16_score_mod2_BLOCK_SIZE2_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float16_score_mod2_BLOCK_SIZE3_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float16_score_mod2_BLOCK_SIZE_128_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float16_score_mod2_BLOCK_SIZE_64_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float16_score_mod3_BLOCK_SIZE2_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float16_score_mod3_BLOCK_SIZE3_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float16_score_mod3_BLOCK_SIZE_128_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float16_score_mod3_BLOCK_SIZE_64_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float16_score_mod4_BLOCK_SIZE2_cuda_float16, 
test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float16_score_mod4_BLOCK_SIZE3_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float16_score_mod4_BLOCK_SIZE_128_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float16_score_mod4_BLOCK_SIZE_64_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float16_score_mod5_BLOCK_SIZE2_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float16_score_mod5_BLOCK_SIZE3_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float16_score_mod5_BLOCK_SIZE_128_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float16_score_mod5_BLOCK_SIZE_64_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float16_score_mod6_BLOCK_SIZE2_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float16_score_mod6_BLOCK_SIZE3_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float16_score_mod6_BLOCK_SIZE_128_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float16_score_mod6_BLOCK_SIZE_64_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float16_score_mod7_BLOCK_SIZE2_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float16_score_mod7_BLOCK_SIZE3_cuda_float16, 
test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float16_score_mod7_BLOCK_SIZE_128_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float16_score_mod7_BLOCK_SIZE_64_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float16_score_mod8_BLOCK_SIZE2_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float16_score_mod8_BLOCK_SIZE3_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float16_score_mod8_BLOCK_SIZE_128_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float16_score_mod8_BLOCK_SIZE_64_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float32_score_mod0_BLOCK_SIZE2_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float32_score_mod0_BLOCK_SIZE3_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float32_score_mod0_BLOCK_SIZE_128_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float32_score_mod0_BLOCK_SIZE_64_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float32_score_mod1_BLOCK_SIZE2_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float32_score_mod1_BLOCK_SIZE3_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float32_score_mod1_BLOCK_SIZE_128_cuda_float32, 
test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float32_score_mod1_BLOCK_SIZE_64_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float32_score_mod2_BLOCK_SIZE2_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float32_score_mod2_BLOCK_SIZE3_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float32_score_mod2_BLOCK_SIZE_128_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float32_score_mod2_BLOCK_SIZE_64_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float32_score_mod3_BLOCK_SIZE2_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float32_score_mod3_BLOCK_SIZE3_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float32_score_mod3_BLOCK_SIZE_128_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float32_score_mod3_BLOCK_SIZE_64_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float32_score_mod4_BLOCK_SIZE2_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float32_score_mod4_BLOCK_SIZE3_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float32_score_mod4_BLOCK_SIZE_128_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float32_score_mod4_BLOCK_SIZE_64_cuda_float32, 
test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float32_score_mod5_BLOCK_SIZE2_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float32_score_mod5_BLOCK_SIZE3_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float32_score_mod5_BLOCK_SIZE_128_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float32_score_mod5_BLOCK_SIZE_64_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float32_score_mod6_BLOCK_SIZE2_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float32_score_mod6_BLOCK_SIZE3_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float32_score_mod6_BLOCK_SIZE_128_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float32_score_mod6_BLOCK_SIZE_64_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float32_score_mod7_BLOCK_SIZE2_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float32_score_mod7_BLOCK_SIZE3_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float32_score_mod7_BLOCK_SIZE_128_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float32_score_mod7_BLOCK_SIZE_64_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float32_score_mod8_BLOCK_SIZE2_cuda_float32, 
test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float32_score_mod8_BLOCK_SIZE3_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float32_score_mod8_BLOCK_SIZE_128_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_different_block_size_float32_score_mod8_BLOCK_SIZE_64_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_float16_score_mod0_head_dims0_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_float16_score_mod0_head_dims1_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_float16_score_mod0_head_dims2_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_float16_score_mod1_head_dims0_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_float16_score_mod1_head_dims1_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_float16_score_mod1_head_dims2_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_float16_score_mod2_head_dims0_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_float16_score_mod2_head_dims1_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_float16_score_mod2_head_dims2_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_float16_score_mod3_head_dims0_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_float16_score_mod3_head_dims1_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_float16_score_mod3_head_dims2_cuda_float16, 
test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_float16_score_mod4_head_dims0_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_float16_score_mod4_head_dims1_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_float16_score_mod4_head_dims2_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_float16_score_mod5_head_dims0_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_float16_score_mod5_head_dims1_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_float16_score_mod5_head_dims2_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_float16_score_mod6_head_dims0_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_float16_score_mod6_head_dims1_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_float16_score_mod6_head_dims2_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_float16_score_mod7_head_dims0_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_float16_score_mod7_head_dims1_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_float16_score_mod7_head_dims2_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_float16_score_mod8_head_dims0_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_float16_score_mod8_head_dims1_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_float16_score_mod8_head_dims2_cuda_float16, 
test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_float32_score_mod0_head_dims0_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_float32_score_mod0_head_dims1_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_float32_score_mod0_head_dims2_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_float32_score_mod1_head_dims0_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_float32_score_mod1_head_dims1_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_float32_score_mod1_head_dims2_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_float32_score_mod2_head_dims0_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_float32_score_mod2_head_dims1_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_float32_score_mod2_head_dims2_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_float32_score_mod3_head_dims0_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_float32_score_mod3_head_dims1_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_float32_score_mod3_head_dims2_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_float32_score_mod4_head_dims0_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_float32_score_mod4_head_dims1_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_float32_score_mod4_head_dims2_cuda_float32, 
test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_float32_score_mod5_head_dims0_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_float32_score_mod5_head_dims1_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_float32_score_mod5_head_dims2_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_float32_score_mod6_head_dims0_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_float32_score_mod6_head_dims1_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_float32_score_mod6_head_dims2_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_float32_score_mod7_head_dims0_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_float32_score_mod7_head_dims1_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_float32_score_mod7_head_dims2_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_float32_score_mod8_head_dims0_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_float32_score_mod8_head_dims1_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_builtin_score_mods_float32_score_mod8_head_dims2_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_bw_decoding_fails_float16_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_captured_buffers_all_dims_bfloat16_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_captured_buffers_all_dims_float16_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_captured_buffers_all_dims_float32_cuda_float32, 
test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_captured_buffers_bfloat16_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_captured_buffers_float16_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_captured_buffers_float32_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_captured_reduction_float16_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_captured_scale_float16_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_decode_at_different_input_position_float16_score_mod0_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_decode_at_different_input_position_float16_score_mod1_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_decode_at_different_input_position_float16_score_mod2_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_decode_at_different_input_position_float16_score_mod3_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_decode_at_different_input_position_float16_score_mod4_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_decode_at_different_input_position_float16_score_mod5_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_decode_at_different_input_position_float16_score_mod6_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_decode_at_different_input_position_float16_score_mod7_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_decode_at_different_input_position_float16_score_mod8_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_do_not_trigger_dynamic_shapes_on_empty_block_mask_cuda, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_fully_masked_out_rows_0_check_gqa_cuda, 
test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_function_composition_bfloat16_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_function_composition_float16_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_function_composition_float32_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_head_dependent_mask_mod_float16_score_mod0_head_dims0_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_head_dependent_mask_mod_float16_score_mod0_head_dims1_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_head_dependent_mask_mod_float16_score_mod0_head_dims2_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_head_dependent_mask_mod_float16_score_mod1_head_dims0_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_head_dependent_mask_mod_float16_score_mod1_head_dims1_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_head_dependent_mask_mod_float16_score_mod1_head_dims2_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_head_dependent_mask_mod_float16_score_mod2_head_dims0_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_head_dependent_mask_mod_float16_score_mod2_head_dims1_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_head_dependent_mask_mod_float16_score_mod2_head_dims2_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_head_dependent_mask_mod_float16_score_mod3_head_dims0_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_head_dependent_mask_mod_float16_score_mod3_head_dims1_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_head_dependent_mask_mod_float16_score_mod3_head_dims2_cuda_float16, 
test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_head_dependent_mask_mod_float16_score_mod4_head_dims0_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_head_dependent_mask_mod_float16_score_mod4_head_dims1_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_head_dependent_mask_mod_float16_score_mod4_head_dims2_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_head_dependent_mask_mod_float16_score_mod5_head_dims0_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_head_dependent_mask_mod_float16_score_mod5_head_dims1_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_head_dependent_mask_mod_float16_score_mod5_head_dims2_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_head_dependent_mask_mod_float16_score_mod6_head_dims0_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_head_dependent_mask_mod_float16_score_mod6_head_dims1_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_head_dependent_mask_mod_float16_score_mod6_head_dims2_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_head_dependent_mask_mod_float16_score_mod7_head_dims0_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_head_dependent_mask_mod_float16_score_mod7_head_dims1_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_head_dependent_mask_mod_float16_score_mod7_head_dims2_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_head_dependent_mask_mod_float16_score_mod8_head_dims0_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_head_dependent_mask_mod_float16_score_mod8_head_dims1_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_head_dependent_mask_mod_float16_score_mod8_head_dims2_cuda_float16, 
test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims0_batch_dims0_score_mod0_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims0_batch_dims0_score_mod1_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims0_batch_dims0_score_mod2_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims0_batch_dims0_score_mod3_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims0_batch_dims0_score_mod4_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims0_batch_dims0_score_mod5_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims0_batch_dims0_score_mod6_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims0_batch_dims0_score_mod7_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims0_batch_dims0_score_mod8_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims0_batch_dims1_score_mod0_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims0_batch_dims1_score_mod1_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims0_batch_dims1_score_mod2_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims0_batch_dims1_score_mod3_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims0_batch_dims1_score_mod4_cuda_float16, 
test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims0_batch_dims1_score_mod5_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims0_batch_dims1_score_mod6_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims0_batch_dims1_score_mod7_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims0_batch_dims1_score_mod8_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims0_batch_dims2_score_mod0_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims0_batch_dims2_score_mod1_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims0_batch_dims2_score_mod2_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims0_batch_dims2_score_mod3_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims0_batch_dims2_score_mod4_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims0_batch_dims2_score_mod5_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims0_batch_dims2_score_mod6_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims0_batch_dims2_score_mod7_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims0_batch_dims2_score_mod8_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims0_batch_dims3_score_mod0_cuda_float16, 
test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims0_batch_dims3_score_mod1_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims0_batch_dims3_score_mod2_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims0_batch_dims3_score_mod3_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims0_batch_dims3_score_mod4_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims0_batch_dims3_score_mod5_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims0_batch_dims3_score_mod6_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims0_batch_dims3_score_mod7_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims0_batch_dims3_score_mod8_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims1_batch_dims0_score_mod0_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims1_batch_dims0_score_mod1_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims1_batch_dims0_score_mod2_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims1_batch_dims0_score_mod3_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims1_batch_dims0_score_mod4_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims1_batch_dims0_score_mod5_cuda_float16, 
test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims1_batch_dims0_score_mod6_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims1_batch_dims0_score_mod7_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims1_batch_dims0_score_mod8_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims1_batch_dims1_score_mod0_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims1_batch_dims1_score_mod1_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims1_batch_dims1_score_mod2_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims1_batch_dims1_score_mod3_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims1_batch_dims1_score_mod4_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims1_batch_dims1_score_mod5_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims1_batch_dims1_score_mod6_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims1_batch_dims1_score_mod7_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims1_batch_dims1_score_mod8_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims1_batch_dims2_score_mod0_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims1_batch_dims2_score_mod1_cuda_float16, 
test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims1_batch_dims2_score_mod2_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims1_batch_dims2_score_mod3_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims1_batch_dims2_score_mod4_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims1_batch_dims2_score_mod5_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims1_batch_dims2_score_mod6_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims1_batch_dims2_score_mod7_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims1_batch_dims2_score_mod8_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims1_batch_dims3_score_mod0_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims1_batch_dims3_score_mod1_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims1_batch_dims3_score_mod2_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims1_batch_dims3_score_mod3_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims1_batch_dims3_score_mod4_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims1_batch_dims3_score_mod5_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims1_batch_dims3_score_mod6_cuda_float16, 
test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims1_batch_dims3_score_mod7_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims1_batch_dims3_score_mod8_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims2_batch_dims0_score_mod0_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims2_batch_dims0_score_mod1_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims2_batch_dims0_score_mod2_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims2_batch_dims0_score_mod3_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims2_batch_dims0_score_mod4_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims2_batch_dims0_score_mod5_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims2_batch_dims0_score_mod6_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims2_batch_dims0_score_mod7_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims2_batch_dims0_score_mod8_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims2_batch_dims1_score_mod0_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims2_batch_dims1_score_mod1_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims2_batch_dims1_score_mod2_cuda_float16, 
test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims2_batch_dims1_score_mod3_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims2_batch_dims1_score_mod4_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims2_batch_dims1_score_mod5_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims2_batch_dims1_score_mod6_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims2_batch_dims1_score_mod7_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims2_batch_dims1_score_mod8_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims2_batch_dims2_score_mod0_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims2_batch_dims2_score_mod1_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims2_batch_dims2_score_mod2_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims2_batch_dims2_score_mod3_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims2_batch_dims2_score_mod4_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims2_batch_dims2_score_mod5_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims2_batch_dims2_score_mod6_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims2_batch_dims2_score_mod7_cuda_float16, 
test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims2_batch_dims2_score_mod8_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims2_batch_dims3_score_mod0_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims2_batch_dims3_score_mod1_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims2_batch_dims3_score_mod2_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims2_batch_dims3_score_mod3_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims2_batch_dims3_score_mod4_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims2_batch_dims3_score_mod5_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims2_batch_dims3_score_mod6_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims2_batch_dims3_score_mod7_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_kv_batch_broadcast_float16_head_dims2_batch_dims3_score_mod8_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_larger_block_mask_bug_float16_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_load_from_bias_head_seq_batch_float16_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_load_from_bias_seq_batch_float16_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_load_from_bias_seq_only_float16_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_logsumexp_correctness_bfloat16_score_mod0_cuda_bfloat16, 
test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_logsumexp_correctness_bfloat16_score_mod1_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_logsumexp_correctness_float16_score_mod0_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_logsumexp_correctness_float16_score_mod1_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_logsumexp_correctness_float32_score_mod0_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_logsumexp_correctness_float32_score_mod1_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_logsumexp_only_return_cuda, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_max_autotune_cuda, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_max_autotune_with_captured_cuda, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_mixed_dtypes_fails_cuda, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_multiple_score_mod_calls2_cuda, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_multiple_score_mod_calls_cuda, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_multiple_score_mod_calls_paged_attention2_cuda, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_multiple_score_mod_calls_paged_attention_cuda, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_njt_causal_bfloat16_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_njt_causal_float16_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_njt_causal_float32_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_non_divisible_multi_token_offset_mask_cuda, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_non_divisible_multi_token_offset_mask_with_captured_buffer_cuda, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_non_divisible_offset_mask_cuda, 
test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_non_divisible_offset_mask_with_captured_buffer_cuda, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_non_equal_head_dims_score_mod0_bfloat16_head_dims0_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_non_equal_head_dims_score_mod0_bfloat16_head_dims1_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_non_equal_head_dims_score_mod0_float16_head_dims0_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_non_equal_head_dims_score_mod0_float16_head_dims1_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_non_equal_head_dims_score_mod0_float32_head_dims0_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_non_equal_head_dims_score_mod0_float32_head_dims1_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_non_equal_head_dims_score_mod1_bfloat16_head_dims0_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_non_equal_head_dims_score_mod1_bfloat16_head_dims1_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_non_equal_head_dims_score_mod1_float16_head_dims0_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_non_equal_head_dims_score_mod1_float16_head_dims1_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_non_equal_head_dims_score_mod1_float32_head_dims0_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_non_equal_head_dims_score_mod1_float32_head_dims1_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_non_equal_head_dims_score_mod2_bfloat16_head_dims0_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_non_equal_head_dims_score_mod2_bfloat16_head_dims1_cuda_bfloat16, 
test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_non_equal_head_dims_score_mod2_float16_head_dims0_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_non_equal_head_dims_score_mod2_float16_head_dims1_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_non_equal_head_dims_score_mod2_float32_head_dims0_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_non_equal_head_dims_score_mod2_float32_head_dims1_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_non_equal_head_dims_score_mod3_bfloat16_head_dims0_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_non_equal_head_dims_score_mod3_bfloat16_head_dims1_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_non_equal_head_dims_score_mod3_float16_head_dims0_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_non_equal_head_dims_score_mod3_float16_head_dims1_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_non_equal_head_dims_score_mod3_float32_head_dims0_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_non_equal_head_dims_score_mod3_float32_head_dims1_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_non_equal_head_dims_score_mod4_bfloat16_head_dims0_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_non_equal_head_dims_score_mod4_bfloat16_head_dims1_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_non_equal_head_dims_score_mod4_float16_head_dims0_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_non_equal_head_dims_score_mod4_float16_head_dims1_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_non_equal_head_dims_score_mod4_float32_head_dims0_cuda_float32, 
test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_non_equal_head_dims_score_mod4_float32_head_dims1_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_non_equal_head_dims_score_mod5_bfloat16_head_dims0_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_non_equal_head_dims_score_mod5_bfloat16_head_dims1_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_non_equal_head_dims_score_mod5_float16_head_dims0_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_non_equal_head_dims_score_mod5_float16_head_dims1_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_non_equal_head_dims_score_mod5_float32_head_dims0_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_non_equal_head_dims_score_mod5_float32_head_dims1_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_non_equal_head_dims_score_mod6_bfloat16_head_dims0_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_non_equal_head_dims_score_mod6_bfloat16_head_dims1_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_non_equal_head_dims_score_mod6_float16_head_dims0_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_non_equal_head_dims_score_mod6_float16_head_dims1_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_non_equal_head_dims_score_mod6_float32_head_dims0_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_non_equal_head_dims_score_mod6_float32_head_dims1_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_non_equal_head_dims_score_mod7_bfloat16_head_dims0_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_non_equal_head_dims_score_mod7_bfloat16_head_dims1_cuda_bfloat16, 
test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_non_equal_head_dims_score_mod7_float16_head_dims0_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_non_equal_head_dims_score_mod7_float16_head_dims1_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_non_equal_head_dims_score_mod7_float32_head_dims0_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_non_equal_head_dims_score_mod7_float32_head_dims1_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_non_equal_head_dims_score_mod8_bfloat16_head_dims0_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_non_equal_head_dims_score_mod8_bfloat16_head_dims1_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_non_equal_head_dims_score_mod8_float16_head_dims0_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_non_equal_head_dims_score_mod8_float16_head_dims1_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_non_equal_head_dims_score_mod8_float32_head_dims0_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_non_equal_head_dims_score_mod8_float32_head_dims1_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_non_sparse_mulitple_block_size_cuda, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_not_pw_of_two_cuda, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_padded_dense_causal_float16_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod0_head_dims0_page_size_128_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod0_head_dims0_page_size_256_cuda_float16, 
test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod0_head_dims0_page_size_64_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod0_head_dims1_page_size_128_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod0_head_dims1_page_size_256_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod0_head_dims1_page_size_64_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod0_head_dims2_page_size_128_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod0_head_dims2_page_size_256_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod0_head_dims2_page_size_64_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod1_head_dims0_page_size_128_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod1_head_dims0_page_size_256_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod1_head_dims0_page_size_64_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod1_head_dims1_page_size_128_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod1_head_dims1_page_size_256_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod1_head_dims1_page_size_64_cuda_float16, 
test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod1_head_dims2_page_size_128_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod1_head_dims2_page_size_256_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod1_head_dims2_page_size_64_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod2_head_dims0_page_size_128_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod2_head_dims0_page_size_256_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod2_head_dims0_page_size_64_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod2_head_dims1_page_size_128_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod2_head_dims1_page_size_256_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod2_head_dims1_page_size_64_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod2_head_dims2_page_size_128_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod2_head_dims2_page_size_256_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod2_head_dims2_page_size_64_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod3_head_dims0_page_size_128_cuda_float16, 
test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod3_head_dims0_page_size_256_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod3_head_dims0_page_size_64_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod3_head_dims1_page_size_128_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod3_head_dims1_page_size_256_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod3_head_dims1_page_size_64_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod3_head_dims2_page_size_128_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod3_head_dims2_page_size_256_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod3_head_dims2_page_size_64_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod4_head_dims0_page_size_128_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod4_head_dims0_page_size_256_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod4_head_dims0_page_size_64_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod4_head_dims1_page_size_128_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod4_head_dims1_page_size_256_cuda_float16, 
test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod4_head_dims1_page_size_64_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod4_head_dims2_page_size_128_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod4_head_dims2_page_size_256_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod4_head_dims2_page_size_64_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod5_head_dims0_page_size_128_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod5_head_dims0_page_size_256_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod5_head_dims0_page_size_64_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod5_head_dims1_page_size_128_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod5_head_dims1_page_size_256_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod5_head_dims1_page_size_64_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod5_head_dims2_page_size_128_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod5_head_dims2_page_size_256_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod5_head_dims2_page_size_64_cuda_float16, 
test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod6_head_dims0_page_size_128_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod6_head_dims0_page_size_256_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod6_head_dims0_page_size_64_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod6_head_dims1_page_size_128_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod6_head_dims1_page_size_256_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod6_head_dims1_page_size_64_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod6_head_dims2_page_size_128_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod6_head_dims2_page_size_256_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod6_head_dims2_page_size_64_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod7_head_dims0_page_size_128_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod7_head_dims0_page_size_256_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod7_head_dims0_page_size_64_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod7_head_dims1_page_size_128_cuda_float16, 
test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod7_head_dims1_page_size_256_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod7_head_dims1_page_size_64_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod7_head_dims2_page_size_128_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod7_head_dims2_page_size_256_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod7_head_dims2_page_size_64_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod8_head_dims0_page_size_128_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod8_head_dims0_page_size_256_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod8_head_dims0_page_size_64_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod8_head_dims1_page_size_128_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod8_head_dims1_page_size_256_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod8_head_dims1_page_size_64_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod8_head_dims2_page_size_128_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod8_head_dims2_page_size_256_cuda_float16, 
test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_paged_attention_page_size_float16_score_mod8_head_dims2_page_size_64_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_recompile_changed_score_mod_float16_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_seq_masking_float16_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_silu_on_score_float16_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_skip_odd_keys_bfloat16_cuda_bfloat16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_skip_odd_keys_float16_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_skip_odd_keys_float32_cuda_float32, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_strided_inputs_float16_k_s0_v_s0_head_dims0_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_strided_inputs_float16_k_s0_v_s0_head_dims1_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_strided_inputs_float16_k_s0_v_s0_head_dims2_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_strided_inputs_float16_k_s0_v_s1_head_dims0_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_strided_inputs_float16_k_s0_v_s1_head_dims1_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_strided_inputs_float16_k_s0_v_s1_head_dims2_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_strided_inputs_float16_k_s0_v_s2_head_dims0_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_strided_inputs_float16_k_s0_v_s2_head_dims1_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_strided_inputs_float16_k_s0_v_s2_head_dims2_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_strided_inputs_float16_k_s0_v_s3_head_dims0_cuda_float16, 
test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_strided_inputs_float16_k_s0_v_s3_head_dims1_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_strided_inputs_float16_k_s0_v_s3_head_dims2_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_strided_inputs_float16_k_s1_v_s0_head_dims0_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_strided_inputs_float16_k_s1_v_s0_head_dims1_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_strided_inputs_float16_k_s1_v_s0_head_dims2_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_strided_inputs_float16_k_s1_v_s1_head_dims0_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_strided_inputs_float16_k_s1_v_s1_head_dims1_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_strided_inputs_float16_k_s1_v_s1_head_dims2_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_strided_inputs_float16_k_s1_v_s2_head_dims0_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_strided_inputs_float16_k_s1_v_s2_head_dims1_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_strided_inputs_float16_k_s1_v_s2_head_dims2_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_strided_inputs_float16_k_s1_v_s3_head_dims0_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_strided_inputs_float16_k_s1_v_s3_head_dims1_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_strided_inputs_float16_k_s1_v_s3_head_dims2_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_strided_inputs_float16_k_s2_v_s0_head_dims0_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_strided_inputs_float16_k_s2_v_s0_head_dims1_cuda_float16, 
test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_strided_inputs_float16_k_s2_v_s0_head_dims2_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_strided_inputs_float16_k_s2_v_s1_head_dims0_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_strided_inputs_float16_k_s2_v_s1_head_dims1_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_strided_inputs_float16_k_s2_v_s1_head_dims2_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_strided_inputs_float16_k_s2_v_s2_head_dims0_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_strided_inputs_float16_k_s2_v_s2_head_dims1_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_strided_inputs_float16_k_s2_v_s2_head_dims2_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_strided_inputs_float16_k_s2_v_s3_head_dims0_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_strided_inputs_float16_k_s2_v_s3_head_dims1_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_strided_inputs_float16_k_s2_v_s3_head_dims2_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_strided_inputs_float16_k_s3_v_s0_head_dims0_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_strided_inputs_float16_k_s3_v_s0_head_dims1_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_strided_inputs_float16_k_s3_v_s0_head_dims2_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_strided_inputs_float16_k_s3_v_s1_head_dims0_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_strided_inputs_float16_k_s3_v_s1_head_dims1_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_strided_inputs_float16_k_s3_v_s1_head_dims2_cuda_float16, 
test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_strided_inputs_float16_k_s3_v_s2_head_dims0_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_strided_inputs_float16_k_s3_v_s2_head_dims1_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_strided_inputs_float16_k_s3_v_s2_head_dims2_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_strided_inputs_float16_k_s3_v_s3_head_dims0_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_strided_inputs_float16_k_s3_v_s3_head_dims1_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_strided_inputs_float16_k_s3_v_s3_head_dims2_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_subgraph_respect_decompostion_float16_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_tma_decoding_float16_cuda_float16, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_windowed_full_mask_vs_sdpa_cuda, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_windowed_full_mask_vs_sdpa_paged_attention_cuda, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_windowed_no_mask_vs_sdpa_cuda, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_windowed_no_mask_vs_sdpa_paged_attention_cuda, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_windowed_partial_block_vs_sdpa_cuda, test/inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_windowed_partial_block_vs_sdpa_paged_attention_cuda 2025-10-10T01:47:07.7730140Z 2025-10-10T01:47:11.6021175Z Running export/test_unflatten_training_ir 1/1 ... 
[2025-10-10 01:47:11.601556] 2025-10-10T01:47:11.6021767Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:47:11.6025123Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_unflatten_training_ir.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:47:11.602073] 2025-10-10T01:47:15.5756317Z 2025-10-10T01:47:15.5757601Z export/test_unflatten_training_ir 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_unflatten_training_ir_1.1_db15d3e124fd5785_.log 2025-10-10T01:47:15.5779217Z Running 28 items in this shard: test/export/test_unflatten_training_ir.py::TrainingIRUnflattenTestUnflatten::test_assert_tensor_metadata_stack_training_ir, test/export/test_unflatten_training_ir.py::TrainingIRUnflattenTestUnflatten::test_attr_as_submod_input_training_ir, test/export/test_unflatten_training_ir.py::TrainingIRUnflattenTestUnflatten::test_dedup_sym_size_training_ir, test/export/test_unflatten_training_ir.py::TrainingIRUnflattenTestUnflatten::test_double_nested_submodule_training_ir, test/export/test_unflatten_training_ir.py::TrainingIRUnflattenTestUnflatten::test_duplicate_placeholder_training_ir, test/export/test_unflatten_training_ir.py::TrainingIRUnflattenTestUnflatten::test_fx_trace_training_ir, test/export/test_unflatten_training_ir.py::TrainingIRUnflattenTestUnflatten::test_nested_leaf_non_strict_training_ir, test/export/test_unflatten_training_ir.py::TrainingIRUnflattenTestUnflatten::test_placeholder_and_get_attr_ordering_after_unflattened_training_ir, test/export/test_unflatten_training_ir.py::TrainingIRUnflattenTestUnflatten::test_simple_alias_training_ir, test/export/test_unflatten_training_ir.py::TrainingIRUnflattenTestUnflatten::test_unflatten_buffer_mutation_training_ir, 
test/export/test_unflatten_training_ir.py::TrainingIRUnflattenTestUnflatten::test_unflatten_constant_obj_training_ir, test/export/test_unflatten_training_ir.py::TrainingIRUnflattenTestUnflatten::test_unflatten_constant_tensor_training_ir, test/export/test_unflatten_training_ir.py::TrainingIRUnflattenTestUnflatten::test_unflatten_container_type_training_ir, test/export/test_unflatten_training_ir.py::TrainingIRUnflattenTestUnflatten::test_unflatten_eager_training_ir, test/export/test_unflatten_training_ir.py::TrainingIRUnflattenTestUnflatten::test_unflatten_empty_branch_training_ir, test/export/test_unflatten_training_ir.py::TrainingIRUnflattenTestUnflatten::test_unflatten_nested_access_training_ir, test/export/test_unflatten_training_ir.py::TrainingIRUnflattenTestUnflatten::test_unflatten_nested_training_ir, test/export/test_unflatten_training_ir.py::TrainingIRUnflattenTestUnflatten::test_unflatten_none_training_ir, test/export/test_unflatten_training_ir.py::TrainingIRUnflattenTestUnflatten::test_unflatten_param_list_dict_training_ir, test/export/test_unflatten_training_ir.py::TrainingIRUnflattenTestUnflatten::test_unflatten_preserve_signature_training_ir, test/export/test_unflatten_training_ir.py::TrainingIRUnflattenTestUnflatten::test_unflatten_preserve_with_unused_input_training_ir, test/export/test_unflatten_training_ir.py::TrainingIRUnflattenTestUnflatten::test_unflatten_requires_grad_param_training_ir, test/export/test_unflatten_training_ir.py::TrainingIRUnflattenTestUnflatten::test_unflatten_shared_submodule_training_ir, test/export/test_unflatten_training_ir.py::TrainingIRUnflattenTestUnflatten::test_unflatten_skipped_call_module_training_ir, test/export/test_unflatten_training_ir.py::TrainingIRUnflattenTestUnflatten::test_unflatten_submodule_ordering_training_ir, test/export/test_unflatten_training_ir.py::TrainingIRUnflattenTestUnflatten::test_unflatten_with_inplace_compile_training_ir, 
test/export/test_unflatten_training_ir.py::TrainingIRUnflattenTestUnflatten::test_unflatten_wrong_input_training_ir, test/export/test_unflatten_training_ir.py::TrainingIRUnflattenTestUnflatten::test_unflattened_module_nodes_has_meta_val_training_ir 2025-10-10T01:47:15.5801884Z 2025-10-10T01:47:19.4984165Z Running inductor/test_aot_inductor_arrayref 1/1 ... [2025-10-10 01:47:19.497827] 2025-10-10T01:47:19.4984775Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:47:19.4987164Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_aot_inductor_arrayref.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:47:19.498257] 2025-10-10T01:47:29.0719287Z 2025-10-10T01:47:29.0720668Z inductor/test_torchinductor_dynamic_shapes 1/2 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_dynamic_shapes_1.2_57e050d15a0ef298_.log 2025-10-10T01:47:29.1118931Z Running 869 items in this shard: test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_AllenaiLongformerBase_repro_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test__dyn_quant_matmul_4bit_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test__unsafe_masked_index_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_adaptive_avg_pool1d_argmax_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_adaptive_avg_pool2d1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_adaptive_avg_pool2d_low_prec_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_adaptive_avg_pool_with_output_size_0_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_adaptive_max_pool2d1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_adaptive_max_pool2d2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_adaptive_pool_errors_with_long_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_add_complex3_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_add_complex4_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_add_complex7_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_add_complex8_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_add_complex9_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_add_complex_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_add_complex_strided_fallback_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_add_const_float_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_add_const_int_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_add_inplace_permuted_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_addmm_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_addmv_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_alexnet_prefix_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_aliased_buffer_reuse_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_allow_reuse_disable_if_exceed_peak_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_angle_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_any_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_aoti_eager_cache_hit_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_aoti_eager_dtype_device_layout_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_aoti_eager_with_persistent_cache_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_arange2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_arange5_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_arange6_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_argmax_argmin1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_argmax_argmin2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_argmax_argmin_with_duplicates_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_argmax_argmin_with_nan_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_argmax_min_int32_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_argmax_to_float_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_as_strided_on_views_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_avg_pool2d1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_avg_pool2d4_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_avg_pool2d8_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_avg_pool2d_backward2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_avg_pool2d_backward_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_avg_pool3d_backward2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_avg_pool3d_backward3_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_avg_pool3d_backward4_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_avg_pool_errors_with_uint_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_batch_norm_2d_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_bernoulli1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_bitwise2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_bitwise_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_bmm2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_bucketize_broadcast_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_bucketize_int_int16_int16_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_bucketize_int_int16_int64_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_bucketize_int_int16_int8_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_bucketize_int_int16_uint8_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_bucketize_int_int32_uint8_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_bucketize_int_int64_int16_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_bucketize_int_int64_int32_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_bucketize_int_int64_int64_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_bucketize_int_int64_int8_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_bucketize_int_int8_int16_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_bucketize_int_int8_int32_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_bucketize_int_int8_int64_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_bucketize_int_int8_int8_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_bucketize_int_int8_uint8_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_bucketize_int_uint8_int64_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_bucketize_int_uint8_int8_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_buffer_copied_in_graph_with_different_shapes_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_buffer_use_after_remove_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_builtins_round_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_builtins_round_float_ndigits_zero_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_builtins_round_int_ndigits_pos_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_builtins_round_int_ndigits_zero_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_cat_empty_index_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_cat_inplace_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_cat_negative_dim_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_cat_of_loops_and_extern_kernel_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_cat_uint8_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_cat_unbacked_2d_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_cat_upcasting_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_chunk_recompiles_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_clamp_type_promotion_non_tensor_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_clone_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_compar_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_complex_memory_overlap_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_config_option_dont_assume_alignment_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_consecutive_split_cumprod_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_consecutive_split_cumsum_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_constant_pad_1d_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_constant_pad_2d_strides_nonpositive_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_constant_pad_fill_dtype_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_constant_pad_float64_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_conv1d_with_permute_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_conv2d_backward_channels_last_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_conv3d_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_conv_bn_fuse_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_conv_shape_check_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_conv_with_as_strided_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_convolution5_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_copy_with_scalar_src_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_cpu_scalar_with_cpu_scalar_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_cpu_scalar_with_cpu_tensor_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_cpu_scalar_with_gpu_tensor_cpp_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_cpu_tensor_with_gpu_tensor_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_cudnn_rnn_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_cummin_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_custom_op_1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_custom_op_2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_custom_op_3_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_custom_op_default_layout_constraint_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_custom_scan_op_compiled_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_custom_scan_op_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_custom_scan_would_split_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_data_type_propogation_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dense_mask_index_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_deterministic_codegen_on_graph_break_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_deterministic_codegen_with_suffix_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_device_assert_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_diagonal_copy_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dist_bf16_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dist_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_div1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_div2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_div4_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_div7_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_div8_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_div9_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_div_precision_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_div_softmax_symfloat_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_div_zero_dim_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dont_constant_fold_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dropout_deterministic_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dropout_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_bfloat16_float16_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_bfloat16_int64_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_bfloat16_int8_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_float16_bfloat16_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_float16_float16_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_float16_float32_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_float16_int16_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_float16_int32_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_float16_int64_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_float16_int8_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_float16_uint8_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_float32_bfloat16_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_float32_float16_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_float32_float64_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_float32_int16_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_float64_float64_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_float64_int32_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_float64_int64_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_float64_uint8_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_int16_bfloat16_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_int16_float16_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_int16_float32_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_int16_int16_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_int16_int8_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_int16_uint8_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_int32_float16_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_int32_float64_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_int32_int16_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_int32_int8_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_int64_bfloat16_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_int64_int16_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_int64_int32_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_int64_uint8_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_int8_bfloat16_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_int8_float16_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_int8_float32_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_int8_float64_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_int8_int32_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_int8_int8_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_uint8_float16_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_uint8_float32_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_uint8_int32_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_uint8_uint8_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_embedding_bag_byte_unpack_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_embedding_bag_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_embedding_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_empty_strided_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_erfinv_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_exp_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_expanded_reduction_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_fallback_mutable_op_basic_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_fallback_mutable_op_list_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_fill2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_float32_to_int32_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_float_index_expression_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_float_index_expression_type_promotion_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_floordiv_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_fmin_fmax_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_fmod_zero_dim_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_fractional_max_pool2d1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_fractional_max_pool2d2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_fractional_max_pool2d4_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_full_like_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_full_truncation_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_functionalize_rng_wrappers_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_fusing_write_into_disjoint_read_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_gather1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_gather2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_gelu_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_generated_code_has_alignment_assert_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_gpu_scalar_with_cpu_tensor_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_graph_partition_argmax_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_graph_partition_constant_tensor1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_graph_partition_misaligned_input_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_graph_partition_mutation_real_name_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_graph_partition_no_inputs_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_graph_partition_pad_dynamic_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_graph_partition_scalar_inputs_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_graph_partition_unbacked_symint_as_output_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_grid_sampler_expand_preserves_view_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_hardsigmoid_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_hardswish_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_horizonal_fusion2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_index1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_index2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_index_dynamic_shapes_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_index_propagation_device_assert_masked_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_index_propagation_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_index_propagation_flip_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_index_propagation_nested_indirect_indexing_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_index_put1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_index_put3_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_index_put_deterministic_fallback_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_index_put_index_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_index_remainder_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_index_select_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_index_tensor_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_inductor_multiple_specializations_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_inductor_triton_bucketize_respects_masking_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_inplace_activations_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_inplace_resize_as_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_input_mutation2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_input_mutation4_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_int_input_dynamic_shapes_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_invalid_operand_issue1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_issue102546_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_large_grid_use_block_ptr_True_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_large_offset_pointwise_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_layer_norm_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_leaky_relu_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_lerp_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_lgamma_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_like_channels_last_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_like_rands_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_like_rands_sliced_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_linear_float64_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_linear_mixed_dtype_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_linspace1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_linspace2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_log1p_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_log_softmax_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_logaddexp_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_logcumsumexp_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_logcumsumexp_zero_dim_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_logsumexp_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_low_memory_max_pool_dilation_1_dim_2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_low_memory_max_pool_dilation_2_dim_2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_low_memory_max_pool_dilation_2_dim_3_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_masked_fill_promotion_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_masked_scatter_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_max_min_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_max_pool2d1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_max_pool2d3_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_max_pool2d4_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_max_pool2d6_dilation_2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_max_pool2d7_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_max_pool2d_with_indices_backward2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_max_pool2d_with_indices_backward3_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_max_pool2d_with_indices_backward4_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_max_pool2d_with_indices_backward6_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_min_max_reduction_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_misaligned_address_issue1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_mixed_mm2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_mixed_mm3_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_mm_mixed_dtype_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_mm_views_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_mul_softmax_symfloat_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_multi_gpu_device_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_multi_threading_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_multilayer_var_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_multilayer_var_lowp_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_mutable_custom_op_fixed_layout_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_mutations_loop_fusion_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_nan_sort_stable_False_descending_False_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_nan_sort_stable_False_descending_True_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_nan_sort_stable_True_descending_False_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_nan_sort_stable_True_descending_True_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_needs_contiguous_strides_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_neg_index_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_nll_loss_forward_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pad_single_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pattern_matcher_multi_user_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pattern_matcher_unbacked_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_permute2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_philox_rand_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_bessel_j0_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_bessel_j1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_bessel_y0_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_chebyshev_polynomial_v_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_erfc_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_erfcx_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_erfinv_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_gammainc_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_hermite_polynomial_h_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_hermite_polynomial_he_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_laguerre_polynomial_l_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_legendre_polynomial_p_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_log1p_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_log_ndtr_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_logit_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_modified_bessel_i0_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_modified_bessel_i1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_modified_bessel_k1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_multigammaln_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_ndtri_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_polygamma_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_round_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_scaled_modified_bessel_k1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_shifted_chebyshev_polynomial_u_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_shifted_chebyshev_polynomial_w_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_sinc_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_xlog1py_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_zeta_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_polar_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pow3_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pow_int_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pow_symfloat_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_prepare_softmax_with_fast_math_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_profiler_mark_wrapper_call_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_rand_like_deterministic_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_randn_like_empty_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_randn_with_dtype_and_device_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_reduction1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_reduction2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_reduction4_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_reinterpret_dtypeview_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_remainder_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_remove_no_ops_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_remove_noop_clone_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_remove_noop_copy_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_remove_noop_view_default_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_repeat_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_repeat_interleave_2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_repeat_interleave_Tensor_decomp_int32_nd_1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_repeat_interleave_Tensor_decomp_int32_nd_2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_repeat_interleave_Tensor_decomp_int64_nd_2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_repeat_interleave_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_replication_pad_errors_with_bool_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_require_stride_expanded_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_resize_as_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_reuse_buffers_with_aliasing_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_round_correctness_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_round_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_scalar_input_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_scalar_output_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_scaled_dot_product_attention_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_scaled_dot_product_efficient_attention_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_scatter4_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_scatter5_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_scatter_add1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_scatter_add2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_scatter_reduce1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_scatter_reduce2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_scatter_reduce3_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_sdpa_prefer_nd_tiling_True_use_block_ptr_False_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_sdpa_prefer_nd_tiling_True_use_block_ptr_True_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_sdpa_unaligned_mask_freezing_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_searchsorted_broadcast_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_sgn_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_sgn_extremal_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_shape_prop_torch_ones_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_should_pad_bench_for_bmm_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_sign_dtype_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_sin_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_single_elem_indirect_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_size_asserts_for_multi_output_fallback_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_slice4_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_slice_mutation1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_slice_mutation2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_slice_scatter3_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_slice_scatter5_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_slice_scatter_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_softmax_backward_data_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_sort_bool_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_sort_stable_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_special_polygamma_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_split_cumsum_index_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_split_cumsum_low_prec_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_split_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_split_failed_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_split_with_integer_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_split_with_unbacked_symints_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_sqrt_dynamic_shapes_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_squeeze1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_squeeze2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_squeeze_varargs_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_stack_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_std_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_stride_preservation_with_stride_modifying_fx_pass_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_strided_inputs_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_sum1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_sum4_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_sum_dtype_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_sum_keepdims_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_tanh_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_tensor2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_tmp_not_defined_issue3_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_to_device_constant_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_topk_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_transpose_add_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_transposed_propagates_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_uint_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_unbacked_floordiv_simplify_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_unspec_inputs_float32_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_unspec_inputs_int64_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_unspec_inputs_int8_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_unspec_inputs_uint8_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_unsqueeze_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_unsqueeze_inplace_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_upsample_bilinear2d_a_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_upsample_cat_conv_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_upsample_nearest2d_backward_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_var_mean_tile_reduction_False_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_vdd_clamp_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_vectorized_ops_masked_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_vectorized_ops_masked_var_novec_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_vertical_fusion1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_view_on_aliased_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_view_uint8_through_differing_bitwidths_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_views2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_views5_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_views7_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_where_broadcast_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_xblock_divides_xnumel_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_zero_dim_reductions_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test__dyn_quant_matmul_4bit_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test__dyn_quant_pack_4bit_weight_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test__unsafe_masked_index_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_adaptive_avg_pool1d_argmax_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_adaptive_avg_pool2d1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_adaptive_avg_pool2d2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_adaptive_avg_pool_errors_with_long_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_adaptive_pool_errors_with_long_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_add_complex4_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_add_complex5_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_add_complex8_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_add_complex_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_add_const_float_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_add_const_int_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_addmm_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_allow_reuse_active_if_under_peak_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_allow_reuse_disable_if_exceed_peak_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_angle_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_aoti_eager_support_out_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_aoti_eager_with_persistent_cache_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_arange3_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_arange4_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_arange5_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_argmax_argmin1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_argmax_argmin_with_duplicates_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_argmax_min_int32_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_as_strided_on_views_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_assert_alignment_op_name_pass_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_assert_size_stride_op_name_fail_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_avg_pool2d6_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_avg_pool2d7_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_avg_pool2d_backward4_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_avg_pool3d_backward2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_avg_pool3d_backward4_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_avg_pool3d_backward_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_bfloat16_to_int16_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_bitwise2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_bitwise_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_bmm1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_bool_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_both_scalars_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_bucketize_add_autotune_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_bucketize_broadcast_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_bucketize_int_int16_int32_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_bucketize_int_int16_int64_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_bucketize_int_int16_uint8_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_bucketize_int_int32_int16_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_bucketize_int_int32_int32_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_bucketize_int_int64_int32_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_bucketize_int_int64_int8_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_bucketize_int_int64_uint8_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_bucketize_int_int8_int16_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_bucketize_int_int8_int8_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_bucketize_int_uint8_int16_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_bucketize_int_uint8_uint8_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_buffer_copied_in_graph_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_buffer_copied_in_graph_with_different_shapes_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_cat_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_cat_empty_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_cat_extern_kernel_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_cat_negative_dim_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_cat_of_loops_and_extern_kernel_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_cat_uint8_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_cat_unbacked_2d_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_cat_unbacked_empty_1d_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_cat_unbacked_legacy_empty_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_cauchy_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_chunk_recompiles_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_clamp_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_clamp_type_promotion_non_tensor_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_complex_memory_overlap_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_concat_add_inplace_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_constant_pad_1d_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_constant_pad_2d_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_constant_pad_2d_strides_nonpositive_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_constant_pad_3d_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_constant_pad_float64_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_constant_pad_nd_inplace_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_conv2d_channels_last_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_conv3d_channels_last_use_block_ptr_True_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_conv3d_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_conv_backward_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_conv_with_as_strided_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_convolution2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_convolution3_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_convolution4_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_copy_non_blocking_is_pinned_use_cat_True_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_copy_with_scalar_src_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_cos_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_cpu_scalar_with_cpu_scalar_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_cpu_scalar_with_gpu_tensor_cpp_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_cpu_scalar_with_gpu_tensor_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_cummin_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_cumprod_zero_dim_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_cumsum_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_cumsum_no_mask_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_cumsum_pattern_matcher_issue_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_custom_op_3_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_custom_op_fixed_layout_sequential_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_custom_op_unbacked_symints_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_custom_scan_op_compiled_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_custom_scan_op_multi_input_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_custom_scan_would_split_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_data_type_propogation_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dense_mask_index_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_deterministic_codegen_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_deterministic_codegen_with_suffix_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_device_assert_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_diagonal_copy_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dist_bf16_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_div1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_div3_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_div5_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_div_prim_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_div_zero_dim_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dropout_deterministic_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dropout_trivial_1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_bfloat16_bfloat16_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_bfloat16_float32_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_bfloat16_float64_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_bfloat16_int16_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_bfloat16_int32_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_float16_float64_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_float16_int32_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_float16_int64_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_float32_bfloat16_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_float32_float16_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_float32_float64_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_float32_int8_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_float64_bfloat16_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_float64_float16_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_float64_float32_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_float64_float64_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_float64_int8_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_float64_uint8_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_fusion_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_int16_float16_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_int16_float32_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_int16_float64_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_int16_int16_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_int16_int64_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_int32_float16_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_int32_int32_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_int32_int64_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_int32_int8_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_int32_uint8_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_int64_float16_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_int64_float64_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_int64_int32_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_int64_int64_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_int8_float16_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_int8_uint8_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_uint8_float32_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_uint8_int16_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_uint8_int32_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_uint8_uint8_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_embedding_bag_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_embedding_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_empty_strided_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_erfc_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_exp_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_fallback_mutable_op_list_tensor_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_fallback_mutable_op_no_mutated_tensors_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_fallback_mutable_op_with_return_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_fft_real_input_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_float16_to_int16_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_float32_to_int32_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_fmod_zero_dim_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_fractional_max_pool2d1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_fractional_max_pool2d4_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_full_like_sliced_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_functionalize_rng_wrappers_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_fuse_large_params_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_fuse_tiled_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_fusing_write_into_disjoint_read_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_gather3_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_gather_scatter_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_gelu_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_generated_code_has_alignment_assert_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_generated_code_has_size_stride_assert_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_gpu_scalar_with_cpu_tensor_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_graph_partition_argmax_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_graph_partition_both_scalars_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_graph_partition_constant_tensor2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_graph_partition_misaligned_input_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_graph_partition_mutation_real_name_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_graph_partition_pad_dynamic_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_hardsigmoid_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_hardtanh_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_horizonal_fusion1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_index1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_index3_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_index_dynamic_shapes_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_index_propagation_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_index_propagation_flip_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_index_propagation_floordiv_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_index_put3_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_index_put4_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_index_put_as_masked_fill_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_index_put_deterministic_fallback_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_index_put_reinplace_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_inductor_assert_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_inductor_multiple_specializations_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_inductor_triton_bucketize_respects_masking_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_inplace_flip_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_inplace_mixed_dtype_ops_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_inplace_resize_as_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_inplace_where_pointwise_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_input_mutation1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_input_mutation3_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_insignificant_strides_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_int8_weight_only_quant_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_int_input_dynamic_shapes_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_invalid_operand_issue1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_isin_tensor_scalar_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_isinf2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_kernel_names_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_large_grid_use_block_ptr_True_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_large_offset_pointwise_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_large_tensor_reduction_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_layer_norm_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_lgamma_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_like_rands2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_like_rands3_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_like_rands_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_linear2_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_linear_dynamic_maxautotune_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_linspace1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_linspace2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_linspace3_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_log2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_log_softmax_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_logcumsumexp_zero_dim_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_low_memory_max_pool_dilation_1_dim_2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_low_memory_max_pool_dilation_2_dim_2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_mark_dynamic_with_hint_override_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_masked_fill_promotion_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_masked_scatter_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_matmul_layer_norm_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_max_pool2d1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_max_pool2d3_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_max_pool2d4_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_max_pool2d5_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_max_pool2d6_dilation_1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_max_pool2d7_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_max_pool2d_with_indices_backward2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_max_pool2d_with_indices_backward4_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_max_pool2d_with_indices_backward_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_min_max_reduction_nan_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_mixed_mm3_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_mixed_mm_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_mm_views_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_mul_softmax_symfloat_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_multi_device_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_multi_gpu_recompile_on_index_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_multi_threading_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_multilayer_any_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_multilayer_sum_low_prec_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_multilayer_var_lowp_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_mutable_custom_op_fixed_layout_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_mutations_loop_fusion_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_nan_sort_stable_False_descending_False_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_nan_to_num_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_neg_max_uint8_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_new_empty_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_new_empty_strided_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pad_cast_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_philox_rand_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_airy_ai_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_bessel_j1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_bessel_y0_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_chebyshev_polynomial_u_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_expit_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_expm1_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_gammainc_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_hermite_polynomial_he_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_i0_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_i1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_i1e_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_legendre_polynomial_p_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_logit_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_modified_bessel_k1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_ndtr_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_ndtri_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_polygamma_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_psi_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_scaled_modified_bessel_k1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_shifted_chebyshev_polynomial_v_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_sinc_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_spherical_bessel_j0_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_xlogy_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_polar_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pow1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pow3_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pow_symfloat_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_prepare_softmax_with_fast_math_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_randint_distribution_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_randint_kernel_count_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_randn_generator_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_randn_like_empty_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_randn_with_dtype_and_device_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_reduction1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_reduction2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_remainder_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_remove_noop_copy_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_remove_noop_slice_scatter_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_remove_noop_view_dtype_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_repeat_interleave_Tensor_decomp_int32_nd_1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_repeat_interleave_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_replication_pad_errors_with_bool_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_resize_as_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_resize_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_reuse_buffers_with_aliasing_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_roi_align_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_round_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_rsqrt_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_scalar_input_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_scatter4_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_scatter6_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_scatter_add1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_scatter_reduce3_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_scheduler_vertical_fusion1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_sdpa_prefer_nd_tiling_True_use_block_ptr_False_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_sdpa_prefer_nd_tiling_True_use_block_ptr_True_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_sdpa_unaligned_mask_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_searchsorted_broadcast_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_setitem_with_int_parameter_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_sgn_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_sgn_extremal_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_should_pad_bench_for_bmm_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_sigmoid_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_sign_dtype_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_silu_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_simplify_loops_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_sin_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_single_elem_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_size_asserts_for_multi_output_fallback_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_sizehint_issue1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_slice2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_slice3_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_slice_mutation1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_slice_scatter2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_slice_scatter_dtype_consistency_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_slice_scatter_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_slice_view_with_graph_break_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_softmax_backward_data_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_softmax_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_sort_bool_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_sort_transpose_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_special_polygamma_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_split_cumprod_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_split_cumsum_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_split_cumsum_index_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_split_cumsum_low_prec_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_split_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_split_reduction_dynamic_shape_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_split_reduction_with_int64_size_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_squeeze1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_squeeze2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_squeeze_varargs_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_std_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_strided_inputs_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_sum3_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_sum4_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_sum5_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_sum_keepdims_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_tensor2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_tensor_index_put_slice_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_tmp_not_defined_issue2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_to_memory_format_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_triu_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_uint4x2_mixed_mm_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_uint_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_unfold_zero_dimension_tensor_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_unspec_inputs_float16_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_unspec_inputs_float32_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_unspec_inputs_float64_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_unspec_inputs_int16_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_unspec_inputs_int32_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_unspec_inputs_int8_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_unspec_inputs_uint8_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_unsqueeze_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_upsample_bicubic2d_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_upsample_nearest2d_backward_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_var_mean_div_by_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_var_mean_tile_reduction_False_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_var_mean_tile_reduction_True_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_vdd_clamp_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_vectorized_ops_masked_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_view_as_real_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_view_detach_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_views2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_views5_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_adaptive_max_pool3d_with_indices_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_arithmetic_constant_folding_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_bool_mask_nobreak_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_dynamic_rblock_bounds_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_dynamic_stride_nobreak_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_float_is_integer_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_float_item_neginf_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_floor_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_full_symbolic_value_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_interpolate_ceil_eq_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_item_bool_nobreak_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_item_materialize_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_item_to_inputs_kernel_nobreak_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_math_ops_op1_cuda, 
test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_math_ops_op2_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_math_ops_op3_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_math_ops_op7_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_math_ops_op8_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_nonzero_no_realloc_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_nonzero_size_factory_nobreak_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_pad_dynamic_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_shape_as_constant_reciprocal_float_exp_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_sort_dynamic_shape_with_check_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_sub_constant_folding_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_sym_sum_unbacked_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_unbacked_cat_backwards_save_data_dependent_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_unbacked_matmul_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_unbacked_reduction_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_unbacked_save_for_backwards_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_unspecialized_float_dynamic_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_unspecialized_float_fallback_specialization_cuda, 
test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_unspecialized_float_fallback_symint_specialization_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_unspecialized_float_operations_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_unspecialized_float_softshrink_cuda 2025-10-10T01:47:29.1500237Z 2025-10-10T01:47:32.9538186Z Running dynamo/test_fx_passes_pre_grad 1/1 ... [2025-10-10 01:47:32.953236] 2025-10-10T01:47:32.9539008Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:47:32.9540475Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_fx_passes_pre_grad.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:47:32.953612] 2025-10-10T01:47:36.8763394Z 2025-10-10T01:47:36.8764705Z dynamo/test_fx_passes_pre_grad 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_fx_passes_pre_grad_1.1_04f0e6e8e98d46bf_.log 2025-10-10T01:47:36.8766292Z Running 1 items in this shard: test/dynamo/test_fx_passes_pre_grad.py::FxPassesPreGradTests::test_pass_execution_and_save 2025-10-10T01:47:36.8767019Z 2025-10-10T01:47:40.7418377Z Running inductor/test_aot_inductor_windows 1/1 ... [2025-10-10 01:47:40.741242] 2025-10-10T01:47:40.7419069Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:47:40.7420748Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_aot_inductor_windows.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:47:40.741614] 2025-10-10T01:47:47.9217295Z 2025-10-10T01:47:47.9218310Z inductor/test_aot_inductor_windows 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_aot_inductor_windows_1.1_42a69d4772d58dd2_.log 2025-10-10T01:47:47.9219773Z Running 1 items in this shard: test/inductor/test_aot_inductor_windows.py::TestAOTInductorWindowsCrossCompilation::test_simple_so 2025-10-10T01:47:47.9220318Z 2025-10-10T01:47:51.7574786Z Running inductor/test_compiled_autograd 1/2 ... [2025-10-10 01:47:51.756949] 2025-10-10T01:47:51.7575790Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:47:51.7577563Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_compiled_autograd.py', '-m', 'not serial', '--shard-id=1', '--num-shards=2', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:47:51.757347] 2025-10-10T01:48:37.5561409Z 2025-10-10T01:48:37.5563612Z inductor/test_aot_inductor_arrayref 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_aot_inductor_arrayref_1.1_9e8fa47aec179ee7_.log 2025-10-10T01:48:37.5752794Z Running 292 items in this shard: test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test__int_mm_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test__weight_int4pack_mm_m_32_n_64_q_group_32_num_groups_1_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test__weight_int4pack_mm_m_32_n_64_q_group_32_num_groups_2_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test__weight_int4pack_mm_m_32_n_64_q_group_64_num_groups_1_cpu_with_stack_allocation, 
test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test__weight_int4pack_mm_m_32_n_64_q_group_64_num_groups_2_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test__weight_int4pack_mm_with_scales_and_zeros_m_32_n_64_q_group_32_num_groups_1_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test__weight_int4pack_mm_with_scales_and_zeros_m_32_n_64_q_group_32_num_groups_2_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test__weight_int4pack_mm_with_scales_and_zeros_m_32_n_64_q_group_64_num_groups_1_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test__weight_int4pack_mm_with_scales_and_zeros_m_32_n_64_q_group_64_num_groups_2_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_add_complex_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_addmm_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_addmm_multiple_dynamic_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_aliased_buffer_reuse_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_amp_fallback_random_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_aot_inductor_consts_cpp_build_cpu_with_stack_allocation, 
test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_aoti_constant_tensor_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_aoti_constant_tensor_name_collision_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_aoti_debug_printer_codegen_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_aoti_debug_printer_cpp_kernel_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_aoti_debug_printer_fp8_dtype_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_aoti_debug_printer_sym_inputs_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_aoti_debug_printer_user_defined_triton_kernel_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_aoti_debug_printing_model_inputs_codegen_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_aoti_profiler_enable_kernel_profile_False_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_aoti_profiler_enable_kernel_profile_True_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_aoti_runtime_asserts_backed_symint_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_aoti_runtime_asserts_cpu_with_stack_allocation, 
test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_aoti_user_defined_triton_kernel_profiling_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_assert_async_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_assert_tensor_meta_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_autotune_int64_user_defined_triton_kernel_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_autotune_with_constant_folding_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_autotuning_args_reuse_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_backward_no_op_logging_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_bmm_multiple_dynamic_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_bool_input_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_boolean_indexing_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_buffer_mutation_1_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_buffer_mutation_2_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_buffer_mutation_3_cpu_with_stack_allocation, 
test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_buffer_mutation_4_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_buffer_mutation_and_force_mmap_weights_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_buffer_reuse_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_clamp_decomposition_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_composed_dynamic_size_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_cond_mismatched_branch_output_dynamic_False_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_cond_mismatched_branch_output_dynamic_True_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_cond_nested_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_cond_non_tensor_predicates_dynamic_False_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_cond_non_tensor_predicates_dynamic_True_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_cond_share_predicte_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_cond_simple_cpu_with_stack_allocation, 
test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_cond_symint_input_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_cond_unbacked_symint_closure_dynamic_False_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_cond_unbacked_symint_closure_dynamic_True_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_cond_use_buffers_from_outer_scope_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_cond_with_multiple_outputs_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_cond_with_outer_code_before_after_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_cond_with_parameters_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_cond_with_reinterpret_view_inputs_outputs_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_consecutive_compiles_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_constant_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_constant_folding_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_constant_folding_with_update_cpu_with_stack_allocation, 
test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_constant_original_fqn_and_dtype_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_constant_type_propagation_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_conv3d_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_conv_freezing_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_convolution_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_copy_non_blocking_is_pinned_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_custom_op_in_subgraph_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_d2h_copy_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_deconv_freezing_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_dup_unbacked_sym_decl_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_dup_unbacked_sym_decl_with_refinement_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_duplicate_constant_folding_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_duplicated_params_cpu_with_stack_allocation, 
test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_dynamic_cat_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_dynamic_scalar_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_dynamic_smem_above_default_limit_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_embedding_bag_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_empty_cat_dtype_promotion_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_empty_constant_folding_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_empty_graph_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_extract_constants_map_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_fake_tensor_device_validation_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_fallback_kernel_with_symexpr_output_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_fallback_mem_leak_fix_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_fft_c2c_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_fill__fallback_cpu_with_stack_allocation, 
test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_foreach_multiple_dynamic_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_fp8_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_fp8_view_of_param_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_fqn_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_free_inactive_buffer_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_freezing_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_fx_gm_return_tuple_validation_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_index_put_fallback_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_index_put_with_none_index_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_inf_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_input_codegen_with_sympy_expr_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_int_list_input_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_issue_140766_cpu_with_stack_allocation, 
test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_large_dynamic_dim_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_large_grid_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_large_mmaped_weights_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_large_weight_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_libtorch_free_so_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_linear_dynamic_maxautotune_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_linear_freezing_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_load_package_multiple_gpus_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_masked_select_dynamic_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_misaligned_input_1_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_misaligned_input_2_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_misc_1_max_autotune_False_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_misc_1_max_autotune_True_cpu_with_stack_allocation, 
test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_missing_cubin_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_missing_output_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_model_modified_weights_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_multi_device_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_multiple_output_alias_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_nan_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_narrow_fallback_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_nested_tensor_from_jagged_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_no_args_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_non_contiguous_output_alias_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_non_default_gpu_device_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_non_tensor_input_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_none_args_aot_codegen_cpu_with_stack_allocation, 
test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_normal_functional_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_on_gpu_device1_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_output_misaligned_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_output_path_1_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_output_path_2_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_pad_fallback_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_pad_non_zero_memory_leak_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_poi_multiple_dynamic_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_profile_benchmark_harness_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_proxy_executor_abs_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_proxy_executor_hann_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_proxy_executor_permute_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_proxy_executor_squeeze_cpu_with_stack_allocation, 
test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_pytree_inputs_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_quanatized_int8_linear_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_quantized_linear_bias_none_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_quantized_linear_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_repeat_interleave_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_repeat_output_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_repeated_calling_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_repeated_user_defined_triton_kernel_embed_kernel_binary_False_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_repeated_user_defined_triton_kernel_embed_kernel_binary_True_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_replace_unbacked_symbol_with_backed_expr_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_replicate_on_devices_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_return_constant_cpu_with_stack_allocation, 
test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_return_view_constant_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_reuse_kernel_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_reuse_kernel_dynamic_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_rocm_triton_autotuning_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_run_with_grad_enabled_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_runtime_checks_complex_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_runtime_checks_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_runtime_checks_device_type_failed_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_runtime_checks_dtype_failed_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_runtime_checks_fp8_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_runtime_checks_large_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_runtime_checks_shape_failed_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_same_backing_cpu_with_stack_allocation, 
test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_scaled_dot_product_efficient_attention_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_scaled_grouped_mm_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_scatter_fallback_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_scatter_reduce_fallback_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_sdpa_2_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_sdpa_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_seq_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_shifted_constraint_ranges_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_simple_dynamic_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_simple_embed_kernel_binary_False_max_autotune_False_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_simple_embed_kernel_binary_False_max_autotune_True_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_simple_embed_kernel_binary_True_max_autotune_False_cpu_with_stack_allocation, 
test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_simple_embed_kernel_binary_True_max_autotune_True_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_simple_multi_arch_embed_kernel_binary_False_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_simple_split_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_size_from_multi_output_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_size_with_unbacked_add_and_mul_expr_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_size_with_unbacked_add_expr_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_size_with_unbacked_add_expr_transitive_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_small_constant_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_so_without_weight_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_stft_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_stride_with_unbacked_expr_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_subclasses_cpu_with_stack_allocation, 
test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_sym_expr_indexing_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_sym_i64_input_codegen_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_symbool_item_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_symfloat_item_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_symint_item_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_sympy_cpp_printer_min_max_minmax0_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_sympy_cpp_printer_min_max_minmax1_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_torchvision_transforms_functional_tensor_resize_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_autotuning_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_dynamic_launcher_grid_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_dynamic_launcher_grid_infer_from_tensor_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_bool_param_cpu_with_stack_allocation, 
test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_dynamic_grid_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_dynamic_shape_with_div_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_equal_to_1_arg_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_equal_to_1_float_arg_dynamic_False_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_equal_to_1_float_arg_dynamic_True_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_extern_kernel_arg_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_grid_type_1_num_dims_1_dynamic_False_autotune_False_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_grid_type_1_num_dims_1_dynamic_False_autotune_True_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_grid_type_1_num_dims_1_dynamic_True_autotune_False_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_grid_type_1_num_dims_1_dynamic_True_autotune_True_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_grid_type_1_num_dims_2_dynamic_False_autotune_False_cpu_with_stack_allocation, 
test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_grid_type_1_num_dims_2_dynamic_False_autotune_True_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_grid_type_1_num_dims_2_dynamic_True_autotune_False_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_grid_type_1_num_dims_2_dynamic_True_autotune_True_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_grid_type_2_num_dims_1_dynamic_False_autotune_False_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_grid_type_2_num_dims_1_dynamic_False_autotune_True_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_grid_type_2_num_dims_1_dynamic_True_autotune_False_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_grid_type_2_num_dims_1_dynamic_True_autotune_True_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_grid_type_2_num_dims_2_dynamic_False_autotune_False_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_grid_type_2_num_dims_2_dynamic_False_autotune_True_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_grid_type_2_num_dims_2_dynamic_True_autotune_False_cpu_with_stack_allocation, 
test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_grid_type_2_num_dims_2_dynamic_True_autotune_True_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_grid_type_3_num_dims_1_dynamic_False_autotune_False_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_grid_type_3_num_dims_1_dynamic_False_autotune_True_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_grid_type_3_num_dims_1_dynamic_True_autotune_False_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_grid_type_3_num_dims_1_dynamic_True_autotune_True_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_grid_type_3_num_dims_2_dynamic_False_autotune_False_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_grid_type_3_num_dims_2_dynamic_False_autotune_True_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_grid_type_3_num_dims_2_dynamic_True_autotune_False_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_grid_type_3_num_dims_2_dynamic_True_autotune_True_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_multi_output_arg_cpu_with_stack_allocation, 
test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_on_device_tma_dynamic_False_tma_version_new_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_on_device_tma_dynamic_False_tma_version_old_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_on_device_tma_dynamic_True_tma_version_new_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_on_device_tma_dynamic_True_tma_version_old_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_reinterpret_view_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_reinterpret_view_mem_leak_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_sympy_expr_arg_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_sympy_fn_like_arg_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_tma_descriptor_1d_dynamic_False_tma_version_new_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_tma_descriptor_1d_dynamic_False_tma_version_old_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_tma_descriptor_1d_dynamic_True_tma_version_new_cpu_with_stack_allocation, 
test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_tma_descriptor_1d_dynamic_True_tma_version_old_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_tma_descriptor_2d_dynamic_False_tma_version_new_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_tma_descriptor_2d_dynamic_False_tma_version_old_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_tma_descriptor_2d_dynamic_True_tma_version_new_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_tma_descriptor_2d_dynamic_True_tma_version_old_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_unbacked_symint_in_grid_dynamic_False_autotuning_False_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_unbacked_symint_in_grid_dynamic_False_autotuning_True_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_unbacked_symint_in_grid_dynamic_True_autotuning_False_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_unbacked_symint_in_grid_dynamic_True_autotuning_True_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_weird_param_order_cpu_with_stack_allocation, 
test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_with_none_input_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_with_none_inputs_and_equal_to_1_arg_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_mutated_autotuning_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_next_power_of_2_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_unbacked_equals_input_size_runtime_assertion_mark_unbacked_False_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_unbacked_equals_input_size_runtime_assertion_mark_unbacked_True_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_unbounded_expr_substitutions_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_update_constant_buffer_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_update_constant_buffer_simple_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_update_inactive_constant_buffer_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_update_user_managed_buffer_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_upper_bound_i64_cpu_with_stack_allocation, 
test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_using_model_name_for_files_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_view_outputs_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_weight_on_disk_legacy_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_while_loop_nested_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_while_loop_simple_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_while_loop_with_conv_dynamic_False_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_while_loop_with_conv_dynamic_True_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_while_loop_with_mixed_device_dynamic_False_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_while_loop_with_mixed_device_dynamic_True_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_while_loop_with_outer_buffers_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_while_loop_with_outer_code_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_while_loop_with_parameters_cpu_with_stack_allocation, 
test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_while_loop_with_pytree_inputs_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_while_loop_with_sym_expr_cond_dynamic_False_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_while_loop_with_sym_expr_cond_dynamic_True_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_while_loop_with_unbacked_symint_closure_dynamic_False_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_while_loop_with_unbacked_symint_closure_dynamic_True_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_with_cudagraphs_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_with_no_triton_profiler_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_with_offset_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_with_profiler_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_zero_grid_with_backed_symbols_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_zero_grid_with_unbacked_symbols_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_zero_size_buffer_cpu_with_stack_allocation, 
test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_zero_size_weight_cpu_with_stack_allocation
2025-10-10T01:48:37.5934158Z
2025-10-10T01:48:41.4168178Z Running inductor/test_metrics 1/1 ... [2025-10-10 01:48:41.416279]
2025-10-10T01:48:41.4168731Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set
2025-10-10T01:48:41.4170148Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_metrics.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:48:41.416654]
2025-10-10T01:48:48.7465374Z
2025-10-10T01:48:48.7466802Z inductor/test_metrics 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_metrics_1.1_2736f0076b1f6312_.log
2025-10-10T01:48:48.7470260Z Running 6 items in this shard: test/inductor/test_metrics.py::TestMetrics::test_atomic_add, test/inductor/test_metrics.py::TestMetrics::test_count_args, test/inductor/test_metrics.py::TestMetrics::test_count_pattern, test/inductor/test_metrics.py::TestMetrics::test_kernel_args_num_gb, test/inductor/test_metrics.py::TestMetrics::test_parse_proper_kernel_fn_code, test/inductor/test_metrics.py::TestMetrics::test_parse_reduction_hint
2025-10-10T01:48:48.7472904Z
2025-10-10T01:48:52.6427115Z Running inductor/test_custom_post_grad_passes 1/1 ... [2025-10-10 01:48:52.642169]
2025-10-10T01:48:52.6427624Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set
2025-10-10T01:48:52.6429416Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_custom_post_grad_passes.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:48:52.642541]
2025-10-10T01:48:59.9234505Z
2025-10-10T01:48:59.9236049Z inductor/test_custom_post_grad_passes 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_custom_post_grad_passes_1.1_c3f5292f4425e102_.log
2025-10-10T01:48:59.9240603Z Running 6 items in this shard: test/inductor/test_custom_post_grad_passes.py::TestPostGradCustomPrePostPass::test_custom_backend_pass, test/inductor/test_custom_post_grad_passes.py::TestPostGradCustomPrePostPass::test_custom_joint_pass_post, test/inductor/test_custom_post_grad_passes.py::TestPostGradCustomPrePostPass::test_custom_joint_pass_pre, test/inductor/test_custom_post_grad_passes.py::TestPostGradCustomPrePostPass::test_custom_post_pass, test/inductor/test_custom_post_grad_passes.py::TestPostGradCustomPrePostPass::test_custom_pre_grad_pass, test/inductor/test_custom_post_grad_passes.py::TestPostGradCustomPrePostPass::test_custom_pre_pass
2025-10-10T01:48:59.9244466Z
2025-10-10T01:49:03.8076322Z
2025-10-10T01:49:03.8077854Z inductor/test_torchinductor_opinfo 6/11 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_opinfo_6.11_49c72f5e054f3bbb_.log
2025-10-10T01:49:03.8301276Z Running 322 items in this shard: test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_H_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_H_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_H_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_T_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rdiv___cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rmul___cuda_int32, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__batch_norm_with_update_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__chunk_cat_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__chunk_cat_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__unsafe_masked_index_put_accumulate_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__unsafe_masked_index_put_accumulate_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_abs_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_acosh_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_acosh_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_acosh_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_add_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_addcdiv_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_addr_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_alias_copy_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_all_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_amin_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_aminmax_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_any_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_any_cuda_uint8, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_arange_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_argmax_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_argmin_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_argmin_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_argsort_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_argsort_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_argwhere_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_as_strided_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_as_strided_scatter_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_as_strided_scatter_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_as_strided_scatter_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_asin_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_asinh_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atanh_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atleast_2d_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atleast_3d_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bfloat16_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bfloat16_cuda_int32, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bfloat16_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bincount_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bitwise_and_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bmm_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_broadcast_tensors_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_broadcast_tensors_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_broadcast_to_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bucketize_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_byte_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cartesian_prod_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cat_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ceil_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_char_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cholesky_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_clamp_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_clamp_min_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_clamp_min_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_clone_cuda_int32, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_combinations_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_conj_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_copysign_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cosh_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cross_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cummax_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cumprod_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_deg2rad_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diag_embed_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diagflat_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diagonal_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_digamma_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_empty_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_empty_permuted_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_empty_strided_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_equal_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_erfinv_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_erfinv_cuda_int32, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_exp_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_exp_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_expand_as_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_expand_copy_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_fft2_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_fft2_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_fft_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_fft_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_fftn_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_fftn_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_fftshift_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_fftshift_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_hfft2_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ifft2_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ifftn_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ifftn_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ifftshift_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ifftshift_cuda_uint8, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ihfft2_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ihfft_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_irfft2_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_irfft_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_rfftn_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_flip_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fliplr_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_flipud_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_flipud_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_floor_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_floor_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_floor_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_full_like_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_gcd_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_geometric_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_grid_sampler_3d_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_half_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_hash_tensor_cuda_float16, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_histc_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_i0_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_copy_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_reduce_amax_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_reduce_amin_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isin_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isinf_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_item_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_2inputs_2outputs_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_4inputs_with_extra_args_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_binary_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_binary_return_by_ref_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_binary_return_by_ref_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_binary_return_by_ref_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_binary_return_by_ref_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_unary_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_kron_cuda_float32, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_kthvalue_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ldexp_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ldexp_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_cross_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_diagonal_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_ldl_factor_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_norm_subgradients_at_zero_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_qr_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_solve_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_svdvals_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_vander_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_vecdot_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_vector_norm_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linspace_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linspace_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linspace_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linspace_cuda_uint8, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linspace_tensor_overload_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log_softmax_with_dtype_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logdet_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logical_not_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logical_not_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logical_or_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logical_or_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logit_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logit_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logsumexp_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mT_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_amin_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_argmin_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_logsumexp_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_mean_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_max_binary_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_max_binary_cuda_int64, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_maximum_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_median_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_meshgrid_variadic_tensors_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_min_binary_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_min_reduction_no_dim_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_min_reduction_with_dim_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_minimum_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_movedim_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mvlgamma_mvlgamma_p_1_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mvlgamma_mvlgamma_p_1_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_narrow_copy_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ne_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ne_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_new_empty_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_new_empty_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_new_full_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nextafter_cuda_float32, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nextafter_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_adaptive_max_pool3d_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_alpha_dropout_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_avg_pool2d_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_conv_transpose2d_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_dropout2d_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_embedding_bag_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_fractional_max_pool2d_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_interpolate_nearest-exact_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_interpolate_trilinear_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_layer_norm_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_max_pool2d_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_max_unpool1d_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_mse_loss_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_circular_cuda_int64, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_constant_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pairwise_distance_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pairwise_distance_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pixel_shuffle_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_relu6_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_relu_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_relu_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_selu_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_smooth_l1_loss_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_softmin_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_softmin_with_dtype_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_softmin_with_dtype_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_softplus_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_softsign_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_softsign_cuda_int64, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_triplet_margin_loss_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_triplet_margin_with_distance_loss_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_triplet_margin_with_distance_loss_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_upsample_bilinear_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_upsample_nearest_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nonzero_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nonzero_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nonzero_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nonzero_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_norm_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_normal_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ones_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ones_like_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_outer_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_outer_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_permute_copy_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_permute_cuda_bool, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_positive_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_put_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_qr_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_rand_like_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_randint_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_randint_like_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ravel_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_repeat_interleave_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_resize__cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_resize_as__cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_resolve_conj_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_roll_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_roll_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_round_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_round_decimals_3_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_add_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_reduce_amax_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_reduce_sum_cuda_bool, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_select_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_short_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sigmoid_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sin_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sinc_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_slice_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_slice_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_softmax_with_dtype_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_softmax_with_dtype_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sort_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sparse_mm_reduce_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sparse_mm_reduce_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_bessel_j1_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_bessel_y1_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_chebyshev_polynomial_t_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_chebyshev_polynomial_t_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_chebyshev_polynomial_u_cuda_int32, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_entr_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_erfcx_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_hermite_polynomial_h_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_hermite_polynomial_h_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_hermite_polynomial_he_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_i0e_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_i1_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_laguerre_polynomial_l_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_legendre_polynomial_p_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_legendre_polynomial_p_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_legendre_polynomial_p_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_modified_bessel_i1_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_modified_bessel_k0_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_ndtr_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_t_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_t_cuda_int64, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_u_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_v_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_v_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_split_list_args_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_split_with_sizes_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_split_with_sizes_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_squeeze_copy_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_stack_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_stack_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_stft_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_t_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_take_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tan_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tan_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tanh_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tensor_split_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tensordot_cuda_float64, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_to_sparse_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_topk_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_transpose_copy_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_transpose_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_trapz_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_triangular_solve_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tril_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tril_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tril_indices_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_trunc_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unflatten_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unflatten_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unfold_copy_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unfold_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unique_consecutive_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unique_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unsqueeze_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_view_as_complex_cuda_float32, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_view_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_view_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_where_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_xlogy_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_zeros_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_zeros_like_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_zeros_like_cuda_float64 2025-10-10T01:49:03.8511097Z 2025-10-10T01:49:03.8511374Z Running inductor/test_aot_inductor_package 1/1 ... [2025-10-10 01:49:03.838250] 2025-10-10T01:49:03.8511844Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:49:03.8512880Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_aot_inductor_package.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:49:03.838698] 2025-10-10T01:49:07.6466782Z Running inductor/test_provenance_tracing 1/1 ... [2025-10-10 01:49:07.646016] 2025-10-10T01:49:07.6467434Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:49:07.6468627Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_provenance_tracing.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:49:07.646438] 2025-10-10T01:49:11.2689484Z 2025-10-10T01:49:11.2690794Z inductor/test_aot_inductor_package 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_aot_inductor_package_1.1_d6ff9edb97f5fbc0_.log 2025-10-10T01:49:11.2723969Z Running 88 items in this shard: test/inductor/test_aot_inductor_package.py::TestAOTInductorPackage_cpu::test_add, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackage_cpu::test_bool_input, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackage_cpu::test_compile_after_package, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackage_cpu::test_compile_after_package_multi_arch, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackage_cpu::test_compile_after_package_static, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackage_cpu::test_compile_standalone_cos, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackage_cpu::test_compile_with_exporter, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackage_cpu::test_compile_with_exporter_weights, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackage_cpu::test_deepcopy_compiled_model, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackage_cpu::test_duplicate_calls, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackage_cpu::test_linear, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackage_cpu::test_loading_wrong_model, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackage_cpu::test_metadata, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackage_cpu::test_multiple_methods, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackage_cpu::test_package_shared_weights, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackage_cpu::test_package_user_managed_weight, 
test/inductor/test_aot_inductor_package.py::TestAOTInductorPackage_cpu::test_package_weights_on_disk_nested_module, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackage_cpu::test_package_without_weight, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackage_cpu::test_remove_intermediate_files, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackage_cpu::test_save_buffer, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackage_cpu::test_specified_output_dir, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackage_cpu::test_update_weights, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackageCpp_cpu::test_add, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackageCpp_cpu::test_bool_input, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackageCpp_cpu::test_compile_after_package, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackageCpp_cpu::test_compile_after_package_multi_arch, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackageCpp_cpu::test_compile_after_package_static, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackageCpp_cpu::test_compile_standalone_cos, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackageCpp_cpu::test_compile_with_exporter, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackageCpp_cpu::test_compile_with_exporter_weights, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackageCpp_cpu::test_deepcopy_compiled_model, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackageCpp_cpu::test_duplicate_calls, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackageCpp_cpu::test_linear, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackageCpp_cpu::test_loading_wrong_model, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackageCpp_cpu::test_metadata, 
test/inductor/test_aot_inductor_package.py::TestAOTInductorPackageCpp_cpu::test_multiple_methods, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackageCpp_cpu::test_package_shared_weights, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackageCpp_cpu::test_package_user_managed_weight, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackageCpp_cpu::test_package_weights_on_disk_nested_module, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackageCpp_cpu::test_package_without_weight, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackageCpp_cpu::test_remove_intermediate_files, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackageCpp_cpu::test_save_buffer, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackageCpp_cpu::test_specified_output_dir, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackageCpp_cpu::test_update_weights, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackage_cuda::test_add, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackage_cuda::test_bool_input, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackage_cuda::test_compile_after_package, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackage_cuda::test_compile_after_package_multi_arch, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackage_cuda::test_compile_after_package_static, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackage_cuda::test_compile_standalone_cos, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackage_cuda::test_compile_with_exporter, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackage_cuda::test_compile_with_exporter_weights, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackage_cuda::test_deepcopy_compiled_model, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackage_cuda::test_duplicate_calls, 
test/inductor/test_aot_inductor_package.py::TestAOTInductorPackage_cuda::test_linear, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackage_cuda::test_loading_wrong_model, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackage_cuda::test_metadata, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackage_cuda::test_multiple_methods, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackage_cuda::test_package_shared_weights, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackage_cuda::test_package_user_managed_weight, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackage_cuda::test_package_weights_on_disk_nested_module, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackage_cuda::test_package_without_weight, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackage_cuda::test_remove_intermediate_files, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackage_cuda::test_save_buffer, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackage_cuda::test_specified_output_dir, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackage_cuda::test_update_weights, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackageCpp_cuda::test_add, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackageCpp_cuda::test_bool_input, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackageCpp_cuda::test_compile_after_package, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackageCpp_cuda::test_compile_after_package_multi_arch, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackageCpp_cuda::test_compile_after_package_static, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackageCpp_cuda::test_compile_standalone_cos, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackageCpp_cuda::test_compile_with_exporter, 
test/inductor/test_aot_inductor_package.py::TestAOTInductorPackageCpp_cuda::test_compile_with_exporter_weights, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackageCpp_cuda::test_deepcopy_compiled_model, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackageCpp_cuda::test_duplicate_calls, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackageCpp_cuda::test_linear, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackageCpp_cuda::test_loading_wrong_model, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackageCpp_cuda::test_metadata, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackageCpp_cuda::test_multiple_methods, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackageCpp_cuda::test_package_shared_weights, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackageCpp_cuda::test_package_user_managed_weight, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackageCpp_cuda::test_package_weights_on_disk_nested_module, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackageCpp_cuda::test_package_without_weight, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackageCpp_cuda::test_remove_intermediate_files, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackageCpp_cuda::test_save_buffer, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackageCpp_cuda::test_specified_output_dir, test/inductor/test_aot_inductor_package.py::TestAOTInductorPackageCpp_cuda::test_update_weights 2025-10-10T01:49:11.2756218Z 2025-10-10T01:49:15.0267131Z 2025-10-10T01:49:15.0268232Z inductor/test_provenance_tracing 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_provenance_tracing_1.1_0c5bb1bcda5cff74_.log 2025-10-10T01:49:15.0274141Z Running 11 items in this shard: test/inductor/test_provenance_tracing.py::TestProvenanceTracingArtifact::test_triton_kernel_to_post_grad_tracing_combo_kernel, 
test/inductor/test_provenance_tracing.py::TestProvenanceTracingArtifact::test_triton_kernel_to_post_grad_tracing_cpu, test/inductor/test_provenance_tracing.py::TestProvenanceTracingArtifact::test_triton_kernel_to_post_grad_tracing_cuda, test/inductor/test_provenance_tracing.py::TestProvenanceTracingArtifact::test_triton_kernel_to_post_grad_tracing_extern_kernel, test/inductor/test_provenance_tracing.py::TestProvenanceTracingNodeMapping::test_create_node_mapping, test/inductor/test_provenance_tracing.py::TestProvenanceTracingNodeMeta::test_pattern_matcher_transfer_meta, test/inductor/test_provenance_tracing.py::TestProvenanceTracingStackTraces::test_cpu_extern_kernel, test/inductor/test_provenance_tracing.py::TestProvenanceTracingStackTraces::test_create_kernel_information_json_function, test/inductor/test_provenance_tracing.py::TestProvenanceTracingStackTraces::test_kernel_information_generation, test/inductor/test_provenance_tracing.py::TestProvenanceTracingStackTraces::test_no_kernel_information_without_provenance_tracking, test/inductor/test_provenance_tracing.py::TestProvenanceTracingStackTraces::test_tlparse_kernel_stack_traces 2025-10-10T01:49:15.0279354Z 2025-10-10T01:49:15.0717558Z Running inductor/test_fx_fusion 1/1 ... [2025-10-10 01:49:15.071290] 2025-10-10T01:49:15.0718137Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:49:15.0721056Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_fx_fusion.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:49:15.071753] 2025-10-10T01:49:18.9174174Z Running inductor/test_loop_ordering 1/1 ... 
[2025-10-10 01:49:18.916771] 2025-10-10T01:49:18.9174665Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:49:18.9177115Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_loop_ordering.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:49:18.917159] 2025-10-10T01:49:20.5477423Z 2025-10-10T01:49:20.5478902Z inductor/test_fx_fusion 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_fx_fusion_1.1_ba8b15390ffb14b4_.log 2025-10-10T01:49:20.5480781Z Running 4 items in this shard: test/inductor/test_fx_fusion.py::TestFxFusion::test_linear_permute_fusion, test/inductor/test_fx_fusion.py::TestFxFusion::test_permute_bmm_fusion, test/inductor/test_fx_fusion.py::TestFxFusion::test_permute_linear_fusion, test/inductor/test_fx_fusion.py::TestFxFusion::test_sink_cat_after_pointwise 2025-10-10T01:49:20.5482225Z 2025-10-10T01:49:24.3728307Z Running export/test_functionalized_assertions 1/1 ... [2025-10-10 01:49:24.372272] 2025-10-10T01:49:24.3729025Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:49:24.3730887Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_functionalized_assertions.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:49:24.372681] 2025-10-10T01:49:26.3464317Z 2025-10-10T01:49:26.3465467Z inductor/test_loop_ordering 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_loop_ordering_1.1_9ae8f1f2b3081ac0_.log 2025-10-10T01:49:26.3480336Z Running 49 items in this shard: test/inductor/test_loop_ordering.py::ImplDetailTest::test_merge_loops_invalidate_pw_dep_cache, test/inductor/test_loop_ordering.py::ImplDetailTest::test_reorder_and_merge_loops, test/inductor/test_loop_ordering.py::ImplDetailTest::test_reorder_modular_indexing, test/inductor/test_loop_ordering.py::ImplDetailTest::test_reorder_twice, test/inductor/test_loop_ordering.py::LoopOrderingTest::test_3dred_pw_2d_outer_red, test/inductor/test_loop_ordering.py::LoopOrderingTest::test_apbt_realize, test/inductor/test_loop_ordering.py::LoopOrderingTest::test_different_broadcast_shapes, test/inductor/test_loop_ordering.py::LoopOrderingTest::test_different_reduction_order, test/inductor/test_loop_ordering.py::LoopOrderingTest::test_for_reordering_reindex, test/inductor/test_loop_ordering.py::LoopOrderingTest::test_fp8_cast_and_t, test/inductor/test_loop_ordering.py::LoopOrderingTest::test_fp8_pattern_2, test/inductor/test_loop_ordering.py::LoopOrderingTest::test_fuse_reduction_with_tiled_pw, test/inductor/test_loop_ordering.py::LoopOrderingTest::test_fuse_with_scalar_shared_memory, test/inductor/test_loop_ordering.py::LoopOrderingTest::test_interaction_with_triton_template, test/inductor/test_loop_ordering.py::LoopOrderingTest::test_keep_fake_dep, test/inductor/test_loop_ordering.py::LoopOrderingTest::test_outer_dimension_softmax, test/inductor/test_loop_ordering.py::LoopOrderingTest::test_outer_dimension_sum_fuse_with_pw, test/inductor/test_loop_ordering.py::LoopOrderingTest::test_pw_outer_red, test/inductor/test_loop_ordering.py::LoopOrderingTest::test_pw_outer_red_2, test/inductor/test_loop_ordering.py::LoopOrderingTest::test_sum_and_t, 
test/inductor/test_loop_ordering.py::LoopOrderingTest::test_view, test/inductor/test_loop_ordering.py::MemoryCoalescingTest::test_coalescing, test/inductor/test_loop_ordering.py::MemoryCoalescingTest::test_induced_fused_tiling, test/inductor/test_loop_ordering.py::MemoryCoalescingTest::test_inferred_splits_inps0, test/inductor/test_loop_ordering.py::MemoryCoalescingTest::test_inferred_splits_inps1, test/inductor/test_loop_ordering.py::MemoryCoalescingTest::test_inferred_splits_inps2, test/inductor/test_loop_ordering.py::MemoryCoalescingTest::test_inferred_splits_inps3, test/inductor/test_loop_ordering.py::MemoryCoalescingTest::test_reduction_no_pointwise, test/inductor/test_loop_ordering.py::MemoryCoalescingTest::test_reduction_pointwise, test/inductor/test_loop_ordering.py::MemoryCoalescingTest::test_remapped_reads, test/inductor/test_loop_ordering.py::MemoryCoalescingTest::test_remapped_reads_split, test/inductor/test_loop_ordering.py::MemoryCoalescingTest::test_solve_for_tiling, test/inductor/test_loop_ordering.py::MemoryCoalescingTest::test_solve_for_zero, test/inductor/test_loop_ordering.py::MemoryCoalescingTest::test_tiled_coalesce_analysis_downcast_transposed_v_False, test/inductor/test_loop_ordering.py::MemoryCoalescingTest::test_tiled_coalesce_analysis_downcast_transposed_v_True, test/inductor/test_loop_ordering.py::TestTiling::test_3d_pointwise, test/inductor/test_loop_ordering.py::TestTiling::test_cat, test/inductor/test_loop_ordering.py::TestTiling::test_mutation_deps, test/inductor/test_loop_ordering.py::TestTiling::test_penalized_small_dim, test/inductor/test_loop_ordering.py::TestTiling::test_pointwise_a_NHWC_b_NHWC, test/inductor/test_loop_ordering.py::TestTiling::test_pointwise_a_NHWC_b_T, test/inductor/test_loop_ordering.py::TestTiling::test_pointwise_a_NHWC_b_cont, test/inductor/test_loop_ordering.py::TestTiling::test_pointwise_a_T_b_NHWC, test/inductor/test_loop_ordering.py::TestTiling::test_pointwise_a_T_b_T, 
test/inductor/test_loop_ordering.py::TestTiling::test_pointwise_a_T_b_cont, test/inductor/test_loop_ordering.py::TestTiling::test_pointwise_a_cont_b_NHWC, test/inductor/test_loop_ordering.py::TestTiling::test_pointwise_a_cont_b_T, test/inductor/test_loop_ordering.py::TestTiling::test_pointwise_a_cont_b_cont, test/inductor/test_loop_ordering.py::TestTiling::test_tiled_reduction 2025-10-10T01:49:26.3494955Z 2025-10-10T01:49:28.2451981Z 2025-10-10T01:49:28.2453329Z export/test_functionalized_assertions 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_functionalized_assertions_1.1_924a8cfb1a37b86d_.log 2025-10-10T01:49:28.2456124Z Running 2 items in this shard: test/export/test_functionalized_assertions.py::TestFuntionalAssertions::test_functional_assert_async_msg, test/export/test_functionalized_assertions.py::TestFuntionalAssertions::test_functional_sym_constrain_range 2025-10-10T01:49:28.2457765Z 2025-10-10T01:49:30.2265763Z Running inductor/test_segmented_tree 1/1 ... [2025-10-10 01:49:30.226087] 2025-10-10T01:49:30.2266423Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:49:30.2271451Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_segmented_tree.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:49:30.226473] 2025-10-10T01:49:32.1727786Z Running inductor/test_compiled_optimizers 1/1 ... [2025-10-10 01:49:32.172225] 2025-10-10T01:49:32.1728426Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:49:32.1730275Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_compiled_optimizers.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:49:32.172665] 2025-10-10T01:49:34.0992744Z 2025-10-10T01:49:34.0993783Z inductor/test_segmented_tree 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_segmented_tree_1.1_b3c8364b31241fe7_.log 2025-10-10T01:49:34.0998012Z Running 12 items in this shard: test/inductor/test_segmented_tree.py::TestSegmentedTree::test_basic_construction, test/inductor/test_segmented_tree.py::TestSegmentedTree::test_boundary_conditions, test/inductor/test_segmented_tree.py::TestSegmentedTree::test_empty_array, test/inductor/test_segmented_tree.py::TestSegmentedTree::test_full_array_range, test/inductor/test_segmented_tree.py::TestSegmentedTree::test_invalid_ranges, test/inductor/test_segmented_tree.py::TestSegmentedTree::test_max_query_matches_naive, test/inductor/test_segmented_tree.py::TestSegmentedTree::test_multiple_operations, test/inductor/test_segmented_tree.py::TestSegmentedTree::test_out_of_bounds, test/inductor/test_segmented_tree.py::TestSegmentedTree::test_overlapping_updates, test/inductor/test_segmented_tree.py::TestSegmentedTree::test_range_update, test/inductor/test_segmented_tree.py::TestSegmentedTree::test_sequential_updates_and_queries, test/inductor/test_segmented_tree.py::TestSegmentedTree::test_single_element_ranges 2025-10-10T01:49:34.1001812Z 2025-10-10T01:49:38.0189624Z Running inductor/test_decompose_mem_bound_mm 1/1 ... [2025-10-10 01:49:38.018454] 2025-10-10T01:49:38.0190304Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:49:38.0192110Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_decompose_mem_bound_mm.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:49:38.018839] 2025-10-10T01:49:45.4503475Z 2025-10-10T01:49:45.4504987Z inductor/test_decompose_mem_bound_mm 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_decompose_mem_bound_mm_1.1_a391d2cf64967b9d_.log 2025-10-10T01:49:45.4538929Z Running 37 items in this shard: test/inductor/test_decompose_mem_bound_mm.py::TestDecomposeMemMM::test_check_device, test/inductor/test_decompose_mem_bound_mm.py::TestDecomposeMemMM::test_decompose_bmm_b_10240_m_2_k_2_n_2_should_decompose_True, test/inductor/test_decompose_mem_bound_mm.py::TestDecomposeMemMM::test_decompose_bmm_b_10240_m_2_k_32_n_32_should_decompose_False, test/inductor/test_decompose_mem_bound_mm.py::TestDecomposeMemMM::test_decompose_bmm_b_2000_m_2_k_2_n_2_should_decompose_False, test/inductor/test_decompose_mem_bound_mm.py::TestDecomposeMemMM::test_decompose_bmm_cpu_b_1_m_2_k_2_n_2_should_decompose_True, test/inductor/test_decompose_mem_bound_mm.py::TestDecomposeMemMM::test_decompose_bmm_cpu_b_2_m_2_k_2_n_2_should_decompose_False, test/inductor/test_decompose_mem_bound_mm.py::TestDecomposeMemMM::test_decompose_linear_m_20480_k_32_n_2_should_decompose_False_has_bias_False, test/inductor/test_decompose_mem_bound_mm.py::TestDecomposeMemMM::test_decompose_linear_m_20480_k_32_n_2_should_decompose_False_has_bias_True, test/inductor/test_decompose_mem_bound_mm.py::TestDecomposeMemMM::test_decompose_linear_m_20480_k_5_n_2_should_decompose_True_has_bias_False, test/inductor/test_decompose_mem_bound_mm.py::TestDecomposeMemMM::test_decompose_linear_m_20480_k_5_n_2_should_decompose_True_has_bias_True, test/inductor/test_decompose_mem_bound_mm.py::TestDecomposeMemMM::test_decompose_linear_m_2048_k_2_n_2_should_decompose_False_has_bias_False, test/inductor/test_decompose_mem_bound_mm.py::TestDecomposeMemMM::test_decompose_linear_m_2048_k_2_n_2_should_decompose_False_has_bias_True, 
test/inductor/test_decompose_mem_bound_mm.py::TestDecomposeMemMM::test_decompose_linear_mixed_precision_m_20480_k_32_n_2_should_decompose_False_has_bias_False, test/inductor/test_decompose_mem_bound_mm.py::TestDecomposeMemMM::test_decompose_linear_mixed_precision_m_20480_k_32_n_2_should_decompose_False_has_bias_True, test/inductor/test_decompose_mem_bound_mm.py::TestDecomposeMemMM::test_decompose_linear_mixed_precision_m_20480_k_5_n_2_should_decompose_True_has_bias_False, test/inductor/test_decompose_mem_bound_mm.py::TestDecomposeMemMM::test_decompose_linear_mixed_precision_m_20480_k_5_n_2_should_decompose_True_has_bias_True, test/inductor/test_decompose_mem_bound_mm.py::TestDecomposeMemMM::test_decompose_linear_mixed_precision_m_2048_k_2_n_2_should_decompose_False_has_bias_False, test/inductor/test_decompose_mem_bound_mm.py::TestDecomposeMemMM::test_decompose_linear_mixed_precision_m_2048_k_2_n_2_should_decompose_False_has_bias_True, test/inductor/test_decompose_mem_bound_mm.py::TestDecomposeMemMM::test_decompose_mm_cpu_m_1_k_64_n_16_should_decompose_True, test/inductor/test_decompose_mem_bound_mm.py::TestDecomposeMemMM::test_decompose_mm_cpu_m_1_k_64_n_32_should_decompose_True, test/inductor/test_decompose_mem_bound_mm.py::TestDecomposeMemMM::test_decompose_mm_cpu_m_2_k_64_n_16_should_decompose_False, test/inductor/test_decompose_mem_bound_mm.py::TestDecomposeMemMM::test_decompose_mm_m_20480_k_32_n_2_should_decompose_False_has_bias_False, test/inductor/test_decompose_mem_bound_mm.py::TestDecomposeMemMM::test_decompose_mm_m_20480_k_32_n_2_should_decompose_False_has_bias_True, test/inductor/test_decompose_mem_bound_mm.py::TestDecomposeMemMM::test_decompose_mm_m_20480_k_5_n_2_should_decompose_True_has_bias_False, test/inductor/test_decompose_mem_bound_mm.py::TestDecomposeMemMM::test_decompose_mm_m_20480_k_5_n_2_should_decompose_True_has_bias_True, 
test/inductor/test_decompose_mem_bound_mm.py::TestDecomposeMemMM::test_decompose_mm_m_2048_k_2_n_2_should_decompose_False_has_bias_False, test/inductor/test_decompose_mem_bound_mm.py::TestDecomposeMemMM::test_decompose_mm_m_2048_k_2_n_2_should_decompose_False_has_bias_True, test/inductor/test_decompose_mem_bound_mm.py::TestDecomposeMemMM::test_decompose_mm_mixed_precision_m_20480_k_32_n_2_should_decompose_False_has_bias_False, test/inductor/test_decompose_mem_bound_mm.py::TestDecomposeMemMM::test_decompose_mm_mixed_precision_m_20480_k_32_n_2_should_decompose_False_has_bias_True, test/inductor/test_decompose_mem_bound_mm.py::TestDecomposeMemMM::test_decompose_mm_mixed_precision_m_20480_k_5_n_2_should_decompose_True_has_bias_False, test/inductor/test_decompose_mem_bound_mm.py::TestDecomposeMemMM::test_decompose_mm_mixed_precision_m_20480_k_5_n_2_should_decompose_True_has_bias_True, test/inductor/test_decompose_mem_bound_mm.py::TestDecomposeMemMM::test_decompose_mm_mixed_precision_m_2048_k_2_n_2_should_decompose_False_has_bias_False, test/inductor/test_decompose_mem_bound_mm.py::TestDecomposeMemMM::test_decompose_mm_mixed_precision_m_2048_k_2_n_2_should_decompose_False_has_bias_True, test/inductor/test_decompose_mem_bound_mm.py::TestDecomposeMemMM::test_dynamic_shape_decompose_addmm, test/inductor/test_decompose_mem_bound_mm.py::TestDecomposeMemMM::test_dynamic_shape_m_20480_k_5_n_2_should_decompose_True_has_bias_False, test/inductor/test_decompose_mem_bound_mm.py::TestDecomposeMemMM::test_dynamic_shape_m_20480_k_5_n_2_should_decompose_True_has_bias_True, test/inductor/test_decompose_mem_bound_mm.py::TestDecomposeMemMM::test_realize_input 2025-10-10T01:49:45.4571991Z 2025-10-10T01:49:49.3621179Z Running dynamo/test_base_output 1/1 ... 
[2025-10-10 01:49:49.361532] 2025-10-10T01:49:49.3621789Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:49:49.3623883Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_base_output.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:49:49.361961] 2025-10-10T01:49:53.0842200Z 2025-10-10T01:49:53.0843336Z dynamo/test_base_output 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_base_output_1.1_47daba58d208aa02_.log 2025-10-10T01:49:53.0845337Z Running 6 items in this shard: test/dynamo/test_base_output.py::TestBaseOutput::test_assign, test/dynamo/test_base_output.py::TestBaseOutput::test_create, test/dynamo/test_base_output.py::TestBaseOutput::test_getattr, test/dynamo/test_base_output.py::TestBaseOutput::test_getitem, test/dynamo/test_base_output.py::TestBaseOutput::test_index, test/dynamo/test_base_output.py::TestBaseOutput::test_tuple 2025-10-10T01:49:53.0846777Z 2025-10-10T01:49:56.8793058Z Running dynamo/test_backends 1/1 ... [2025-10-10 01:49:56.878768] 2025-10-10T01:49:56.8793634Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:49:56.8795697Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_backends.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:49:56.879202] 2025-10-10T01:50:04.2586465Z 2025-10-10T01:50:04.2587472Z dynamo/test_backends 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_backends_1.1_8303145e9a12b9c7_.log 2025-10-10T01:50:04.2597682Z Running 21 items in this shard: test/dynamo/test_backends.py::NormalizeIRTests::test_inplace_normalize, test/dynamo/test_backends.py::MPSSupportedTest::test_mps_supported, test/dynamo/test_backends.py::TestExplainWithBackend::test_explain_with_backend, test/dynamo/test_backends.py::TestCustomBackendAPI::test_aot_autograd_api, test/dynamo/test_backends.py::TestCustomBackendAPI::test_backend_graph_freeze, test/dynamo/test_backends.py::TestCustomBackendAPI::test_backend_recompilation, test/dynamo/test_backends.py::TestCustomBackendAPI::test_lookup_backend, test/dynamo/test_backends.py::TestCustomBackendAPI::test_lookup_custom_backend, test/dynamo/test_backends.py::TestCustomBackendAPI::test_register_backend_api, test/dynamo/test_backends.py::TestOptimizationsCUDA::test_aot_cudagraphs_cuda, test/dynamo/test_backends.py::TestOptimizationsCUDA::test_aot_eager_cuda, test/dynamo/test_backends.py::TestOptimizationsCUDA::test_aot_eager_decomp_partition_cuda, test/dynamo/test_backends.py::TestOptimizationsCUDA::test_aot_ts_cuda, test/dynamo/test_backends.py::TestOptimizationsCUDA::test_eager_cuda, test/dynamo/test_backends.py::TestOptimizationsCUDA::test_eager_noexcept_cuda, test/dynamo/test_backends.py::TestOptimizationsCUDA::test_example_inputs_cuda, test/dynamo/test_backends.py::TestOptimizationsCUDA::test_example_inputs_runtime_use_cuda, test/dynamo/test_backends.py::TestOptimizationsCUDA::test_intel_gaudi_backend_cuda, test/dynamo/test_backends.py::TestOptimizationsCUDA::test_list_backends_cuda, test/dynamo/test_backends.py::TestOptimizationsCUDA::test_torchscript_cuda, test/dynamo/test_backends.py::TestOptimizationsCUDA::test_tvm_cuda 2025-10-10T01:50:04.2608686Z 2025-10-10T01:50:08.0906844Z Running 
dynamo/test_fx_graph_runnable 1/1 ... [2025-10-10 01:50:08.090149] 2025-10-10T01:50:08.0907439Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:50:08.0909660Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_fx_graph_runnable.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:50:08.090568] 2025-10-10T01:50:15.5211234Z 2025-10-10T01:50:15.5212271Z dynamo/test_fx_graph_runnable 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_fx_graph_runnable_1.1_47dd4b448534f9a4_.log 2025-10-10T01:50:15.5218094Z Running 15 items in this shard: test/dynamo/test_fx_graph_runnable.py::FxGraphRunnableTest::test_all_gather_collective, test/dynamo/test_fx_graph_runnable.py::FxGraphRunnableTest::test_all_reduce_collective, test/dynamo/test_fx_graph_runnable.py::FxGraphRunnableTest::test_basic_tensor_add, test/dynamo/test_fx_graph_runnable.py::FxGraphRunnableTest::test_broadcast_add_dynamic, test/dynamo/test_fx_graph_runnable.py::FxGraphRunnableTest::test_broadcast_collective, test/dynamo/test_fx_graph_runnable.py::FxGraphRunnableTest::test_dtensor_compile_redistribute, test/dynamo/test_fx_graph_runnable.py::FxGraphRunnableTest::test_dynamic_shapes_run, test/dynamo/test_fx_graph_runnable.py::FxGraphRunnableTest::test_reduce_scatter_collective, test/dynamo/test_fx_graph_runnable.py::FxGraphRunnableTest::test_scalar_multiply, test/dynamo/test_fx_graph_runnable.py::FxGraphRunnableTest::test_toy_model_basic, test/dynamo/test_fx_graph_runnable.py::FxGraphRunnableTest::test_toy_model_batch_processing, test/dynamo/test_fx_graph_runnable.py::FxGraphRunnableTest::test_toy_model_dynamic_batch, test/dynamo/test_fx_graph_runnable.py::FxGraphRunnableTest::test_two_inputs_matmul, test/dynamo/test_fx_graph_runnable.py::FxGraphRunnableTest::test_user_defined_triton_kernel, 
test/dynamo/test_fx_graph_runnable.py::FxGraphRunnableTest::test_user_defined_triton_kernel_autotune 2025-10-10T01:50:15.5223111Z 2025-10-10T01:50:19.4503475Z Running inductor/test_compile_worker 1/1 ... [2025-10-10 01:50:19.449695] 2025-10-10T01:50:19.4504181Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:50:19.4505891Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_compile_worker.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:50:19.450174] 2025-10-10T01:50:26.6795288Z 2025-10-10T01:50:26.6796474Z inductor/test_compile_worker 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_compile_worker_1.1_154742363e488171_.log 2025-10-10T01:50:26.6802050Z Running 5 items in this shard: test/inductor/test_compile_worker.py::TestCompileWorker::test_basic_jobs, test/inductor/test_compile_worker.py::TestCompileWorker::test_crash, test/inductor/test_compile_worker.py::TestCompileWorker::test_exception, test/inductor/test_compile_worker.py::TestCompileWorker::test_logging, test/inductor/test_compile_worker.py::TestCompileWorker::test_quiesce 2025-10-10T01:50:26.6803854Z 2025-10-10T01:50:30.5332827Z Running inductor/test_move_constructors_to_cuda 1/1 ... [2025-10-10 01:50:30.532771] 2025-10-10T01:50:30.5333490Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:50:30.5335480Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_move_constructors_to_cuda.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:50:30.533146] 2025-10-10T01:50:37.6116410Z 2025-10-10T01:50:37.6117782Z inductor/test_move_constructors_to_cuda 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_move_constructors_to_cuda_1.1_12d7a03e610f0010_.log 2025-10-10T01:50:37.6123831Z Running 7 items in this shard: test/inductor/test_move_constructors_to_cuda.py::TestMoveConstructorsToCuda::test_multi_gpu, test/inductor/test_move_constructors_to_cuda.py::TestMoveConstructorsToCuda::test_multiple_constructors, test/inductor/test_move_constructors_to_cuda.py::TestMoveConstructorsToCuda::test_no_gpu, test/inductor/test_move_constructors_to_cuda.py::TestMoveConstructorsToCuda::test_non_convertable_op_failure, test/inductor/test_move_constructors_to_cuda.py::TestMoveConstructorsToCuda::test_output_failure, test/inductor/test_move_constructors_to_cuda.py::TestMoveConstructorsToCuda::test_sets_equiv, test/inductor/test_move_constructors_to_cuda.py::TestMoveConstructorsToCuda::test_simple 2025-10-10T01:50:37.6128517Z 2025-10-10T01:50:41.3765312Z Running inductor/test_subgraph_choice 1/1 ... [2025-10-10 01:50:41.375881] 2025-10-10T01:50:41.3766084Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:50:41.3767860Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_subgraph_choice.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:50:41.376255] 2025-10-10T01:50:48.3047334Z 2025-10-10T01:50:48.3048210Z inductor/test_subgraph_choice 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_subgraph_choice_1.1_27f3d74e8c36f856_.log 2025-10-10T01:50:48.3049580Z Running 2 items in this shard: test/inductor/test_subgraph_choice.py::TestSubgraphChoice::test_subgraph_decompose_k, test/inductor/test_subgraph_choice.py::TestSubgraphChoice::test_subgraph_freeze_layout 2025-10-10T01:50:48.3050356Z 2025-10-10T01:50:52.0606958Z Running export/test_export_strict 1/1 ... [2025-10-10 01:50:52.060170] 2025-10-10T01:50:52.0607775Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:50:52.0609648Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_export_strict.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:50:52.060535] 2025-10-10T01:50:59.9410570Z 2025-10-10T01:50:59.9411380Z export/test_export_strict 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_export_strict_1.1_a3e324381a9966d5_.log 2025-10-10T01:50:59.9576202Z Running 433 items in this shard: test/export/test_export_strict.py::StrictExportTestDynamismExpression::test_export_assume_static_by_default_strict, test/export/test_export_strict.py::StrictExportTestDynamismExpression::test_export_constraints_error_not_in_range_strict, test/export/test_export_strict.py::StrictExportTestDynamismExpression::test_export_constraints_error_strict, test/export/test_export_strict.py::StrictExportTestDynamismExpression::test_export_inline_constraints_strict, test/export/test_export_strict.py::StrictExportTestDynamismExpression::test_export_slice_maxsize_strict, test/export/test_export_strict.py::StrictExportTestDynamismExpression::test_export_slice_unbacked_dim1_strict, 
test/export/test_export_strict.py::StrictExportTestDynamismExpression::test_export_strict_narrow_unbacked_expr_strict, test/export/test_export_strict.py::StrictExportTestDynamismExpression::test_no_grad_param_inplace_strict, test/export/test_export_strict.py::StrictExportTestDynamismExpression::test_reshape_view_backed_size_oblivious_strict, test/export/test_export_strict.py::StrictExportTestExport::test__scaled_dot_product_flash_attention_strict, test/export/test_export_strict.py::StrictExportTestExport::test_additional_inputs_constants_strict, test/export/test_export_strict.py::StrictExportTestExport::test_allow_explicit_guards_as_runtime_asserts_strict, test/export/test_export_strict.py::StrictExportTestExport::test_args_type_checked_strict, test/export/test_export_strict.py::StrictExportTestExport::test_aten_lift_fresh_copy_strict, test/export/test_export_strict.py::StrictExportTestExport::test_attention_strict, test/export/test_export_strict.py::StrictExportTestExport::test_attr_assignment_extra_strict, test/export/test_export_strict.py::StrictExportTestExport::test_automatic_constrain_size_strict, test/export/test_export_strict.py::StrictExportTestExport::test_automatic_dynamic_shapes_constant_relation_strict, test/export/test_export_strict.py::StrictExportTestExport::test_automatic_dynamic_shapes_linear_relation_strict, test/export/test_export_strict.py::StrictExportTestExport::test_automatic_dynamic_shapes_simple_equality_strict, test/export/test_export_strict.py::StrictExportTestExport::test_baddbmm_strict, test/export/test_export_strict.py::StrictExportTestExport::test_basic_non_strict_fake_tensor_strict, test/export/test_export_strict.py::StrictExportTestExport::test_basic_non_strict_real_tensor_strict, test/export/test_export_strict.py::StrictExportTestExport::test_basic_strict, test/export/test_export_strict.py::StrictExportTestExport::test_bincount_strict, test/export/test_export_strict.py::StrictExportTestExport::test_buffer_util_strict, 
test/export/test_export_strict.py::StrictExportTestExport::test_capture_subclass_constructor_strict, test/export/test_export_strict.py::StrictExportTestExport::test_capture_subclass_constructor_torch_ir_strict, test/export/test_export_strict.py::StrictExportTestExport::test_capture_subclass_wrong_strict, test/export/test_export_strict.py::StrictExportTestExport::test_ccode_python_mod_strict, test/export/test_export_strict.py::StrictExportTestExport::test_cdist_forward_compute_mode_zero_export_strict, test/export/test_export_strict.py::StrictExportTestExport::test_check_specialized_int_strict, test/export/test_export_strict.py::StrictExportTestExport::test_checks_to_constrain_range_strict, test/export/test_export_strict.py::StrictExportTestExport::test_cleanup_dynamic_markers_strict, test/export/test_export_strict.py::StrictExportTestExport::test_colin_unbacked_backed_vr_sub_strict, test/export/test_export_strict.py::StrictExportTestExport::test_colon_parameter_strict, test/export/test_export_strict.py::StrictExportTestExport::test_compiling_state_strict, test/export/test_export_strict.py::StrictExportTestExport::test_cond_access_identical_symint_closure_strict, test/export/test_export_strict.py::StrictExportTestExport::test_cond_branches_return_constant_int_strict, test/export/test_export_strict.py::StrictExportTestExport::test_cond_branches_return_same_int_strict, test/export/test_export_strict.py::StrictExportTestExport::test_cond_buffers_strict, test/export/test_export_strict.py::StrictExportTestExport::test_cond_contains_unbacked_no_escape_strict, test/export/test_export_strict.py::StrictExportTestExport::test_cond_int_closure_strict, test/export/test_export_strict.py::StrictExportTestExport::test_cond_unflatten_strict, test/export/test_export_strict.py::StrictExportTestExport::test_cond_with_module_stack_export_with_strict, test/export/test_export_strict.py::StrictExportTestExport::test_cond_with_module_stack_export_with_unflatten_strict, 
test/export/test_export_strict.py::StrictExportTestExport::test_constant_aliasing_strict, test/export/test_export_strict.py::StrictExportTestExport::test_constant_input_naming_strict, test/export/test_export_strict.py::StrictExportTestExport::test_constant_no_user_inp_strict, test/export/test_export_strict.py::StrictExportTestExport::test_constant_output_dup_strict, test/export/test_export_strict.py::StrictExportTestExport::test_constant_output_strict, test/export/test_export_strict.py::StrictExportTestExport::test_constant_requires_grad_const_strict, test/export/test_export_strict.py::StrictExportTestExport::test_constant_return_strict, test/export/test_export_strict.py::StrictExportTestExport::test_constant_tensor_mutation_strict, test/export/test_export_strict.py::StrictExportTestExport::test_constant_tensor_with_non_functional_nested_strict, test/export/test_export_strict.py::StrictExportTestExport::test_constant_tensor_with_non_functional_strict, test/export/test_export_strict.py::StrictExportTestExport::test_constrain_decomp_strict, test/export/test_export_strict.py::StrictExportTestExport::test_constrain_size_in_eager_strict, test/export/test_export_strict.py::StrictExportTestExport::test_constrain_size_with_constrain_value_strict, test/export/test_export_strict.py::StrictExportTestExport::test_constrain_size_with_various_cases_strict, test/export/test_export_strict.py::StrictExportTestExport::test_conv_dynamic_strict, test/export/test_export_strict.py::StrictExportTestExport::test_crop_like_strict, test/export/test_export_strict.py::StrictExportTestExport::test_cse_for_symint_strict, test/export/test_export_strict.py::StrictExportTestExport::test_custom_op_auto_functionalize_pre_dispatch_strict, test/export/test_export_strict.py::StrictExportTestExport::test_custom_op_auto_functionalize_strict, test/export/test_export_strict.py::StrictExportTestExport::test_custom_op_auto_warn_pre_dispatch_strict, 
test/export/test_export_strict.py::StrictExportTestExport::test_custom_op_preserve_strict, test/export/test_export_strict.py::StrictExportTestExport::test_custom_pytree_strict, test/export/test_export_strict.py::StrictExportTestExport::test_custom_tag_metadata_re_export_strict, test/export/test_export_strict.py::StrictExportTestExport::test_decomp_batch_norm_functional_predispatch_strict, test/export/test_export_strict.py::StrictExportTestExport::test_decomp_item_in_prim_after_decomposition_strict, test/export/test_export_strict.py::StrictExportTestExport::test_decomp_item_in_prim_before_decomposition_strict, test/export/test_export_strict.py::StrictExportTestExport::test_default_decomposition_core_cia_ops_strict, test/export/test_export_strict.py::StrictExportTestExport::test_derived_dim_1_2_strict, test/export/test_export_strict.py::StrictExportTestExport::test_derived_dim_basic_strict, test/export/test_export_strict.py::StrictExportTestExport::test_derived_dim_integer_strict, test/export/test_export_strict.py::StrictExportTestExport::test_derived_dim_nested_strict, test/export/test_export_strict.py::StrictExportTestExport::test_derived_dim_out_of_order_repeat_derived_strict, test/export/test_export_strict.py::StrictExportTestExport::test_derived_dim_out_of_order_simplified_repeat_non_derived_strict, test/export/test_export_strict.py::StrictExportTestExport::test_derived_dim_out_of_order_simplified_strict, test/export/test_export_strict.py::StrictExportTestExport::test_derived_dim_out_of_order_strict, test/export/test_export_strict.py::StrictExportTestExport::test_derived_dim_repeat_derived_strict, test/export/test_export_strict.py::StrictExportTestExport::test_detect_leak_nonstrict_strict, test/export/test_export_strict.py::StrictExportTestExport::test_detect_leak_nonstrict_with_stacktrace_strict, test/export/test_export_strict.py::StrictExportTestExport::test_detect_leak_strict_strict, 
test/export/test_export_strict.py::StrictExportTestExport::test_device_to_dynamic_strict, test/export/test_export_strict.py::StrictExportTestExport::test_device_to_gpu_strict, test/export/test_export_strict.py::StrictExportTestExport::test_device_to_mutation_float_strict, test/export/test_export_strict.py::StrictExportTestExport::test_device_to_mutation_strict, test/export/test_export_strict.py::StrictExportTestExport::test_device_to_static_strict, test/export/test_export_strict.py::StrictExportTestExport::test_dim_1_2_strict, test/export/test_export_strict.py::StrictExportTestExport::test_dim_auto_and_dim_strict, test/export/test_export_strict.py::StrictExportTestExport::test_dim_dynamic_divisibility_strict, test/export/test_export_strict.py::StrictExportTestExport::test_dim_dynamic_specialization_strict, test/export/test_export_strict.py::StrictExportTestExport::test_dim_dynamic_strict, test/export/test_export_strict.py::StrictExportTestExport::test_dim_hint_range_violations_strict, test/export/test_export_strict.py::StrictExportTestExport::test_dim_hint_ranges_strict, test/export/test_export_strict.py::StrictExportTestExport::test_disable_forced_specializations_errors_strict, test/export/test_export_strict.py::StrictExportTestExport::test_disable_forced_specializations_ok_strict, test/export/test_export_strict.py::StrictExportTestExport::test_distributed_all_gather_into_tensor_strict, test/export/test_export_strict.py::StrictExportTestExport::test_distributed_all_gather_strict, test/export/test_export_strict.py::StrictExportTestExport::test_distributed_all_reduce_strict, test/export/test_export_strict.py::StrictExportTestExport::test_distributed_all_to_all_single_strict, test/export/test_export_strict.py::StrictExportTestExport::test_distributed_reduce_scatter_tensor_strict, test/export/test_export_strict.py::StrictExportTestExport::test_dont_duck_size_for_auto_dynamic_strict, 
test/export/test_export_strict.py::StrictExportTestExport::test_double_lifted_constants_strict, test/export/test_export_strict.py::StrictExportTestExport::test_draft_export_checks_aliasing_strict, test/export/test_export_strict.py::StrictExportTestExport::test_draft_export_checks_mutation_list_strict, test/export/test_export_strict.py::StrictExportTestExport::test_draft_export_checks_mutation_strict, test/export/test_export_strict.py::StrictExportTestExport::test_draft_export_checks_mutation_with_nan_strict, test/export/test_export_strict.py::StrictExportTestExport::test_draft_export_fake_kernel_inference_errors_strict, test/export/test_export_strict.py::StrictExportTestExport::test_draft_export_infers_fake_kernel_strict, test/export/test_export_strict.py::StrictExportTestExport::test_duplicate_modules_with_non_persistent_buffers_strict, test/export/test_export_strict.py::StrictExportTestExport::test_dynamic_lr_shift_strict, test/export/test_export_strict.py::StrictExportTestExport::test_dynamic_shapes_bounds_strict, test/export/test_export_strict.py::StrictExportTestExport::test_dynamic_shapes_builder_basic_strict, test/export/test_export_strict.py::StrictExportTestExport::test_dynamic_shapes_builder_kwargs_strict, test/export/test_export_strict.py::StrictExportTestExport::test_dynamic_shapes_builder_pytree_strict, test/export/test_export_strict.py::StrictExportTestExport::test_dynamic_shapes_dataclass_strict, test/export/test_export_strict.py::StrictExportTestExport::test_dynamic_shapes_inferred_basic_strict, test/export/test_export_strict.py::StrictExportTestExport::test_dynamic_shapes_serdes_generic_strict, test/export/test_export_strict.py::StrictExportTestExport::test_dynamic_shapes_serdes_user_errors_strict, test/export/test_export_strict.py::StrictExportTestExport::test_dynamic_shapes_serdes_various_strict, test/export/test_export_strict.py::StrictExportTestExport::test_dynamic_shapes_spec_with_pytree_strict, 
test/export/test_export_strict.py::StrictExportTestExport::test_dynamic_shapes_wrapped_with_shape_guards_strict, test/export/test_export_strict.py::StrictExportTestExport::test_dynamic_sym_round_strict, test/export/test_export_strict.py::StrictExportTestExport::test_ends_of_bounds_oblivious_strict, test/export/test_export_strict.py::StrictExportTestExport::test_error_does_not_reference_eager_fallback_strict, test/export/test_export_strict.py::StrictExportTestExport::test_error_when_passing_mutating_primitive_op_strict, test/export/test_export_strict.py::StrictExportTestExport::test_exception_strict, test/export/test_export_strict.py::StrictExportTestExport::test_expand_copy_export_handles_implicit_true_strict, test/export/test_export_strict.py::StrictExportTestExport::test_export_api_with_dynamic_shapes_strict, test/export/test_export_strict.py::StrictExportTestExport::test_export_as_backend_strict, test/export/test_export_strict.py::StrictExportTestExport::test_export_associative_scan_lifted_buffers_strict, test/export/test_export_strict.py::StrictExportTestExport::test_export_associative_scan_symbol_dim_strict, test/export/test_export_strict.py::StrictExportTestExport::test_export_associative_scan_symbol_scandim_strict, test/export/test_export_strict.py::StrictExportTestExport::test_export_aten_to_unflatten_strict, test/export/test_export_strict.py::StrictExportTestExport::test_export_aten_to_unflatten_subclass_pre_dispatch_strict, test/export/test_export_strict.py::StrictExportTestExport::test_export_aten_to_unflatten_subclass_strict, test/export/test_export_strict.py::StrictExportTestExport::test_export_cond_preserve_torch_fn_for_subgraphs_strict, test/export/test_export_strict.py::StrictExportTestExport::test_export_cond_symbool_pred_strict, test/export/test_export_strict.py::StrictExportTestExport::test_export_cond_warns_constant_pred_strict, test/export/test_export_strict.py::StrictExportTestExport::test_export_custom_decomp_table_basic_pop_strict, 
test/export/test_export_strict.py::StrictExportTestExport::test_export_custom_decomp_table_container_methods_strict, test/export/test_export_strict.py::StrictExportTestExport::test_export_custom_op_lib_strict, test/export/test_export_strict.py::StrictExportTestExport::test_export_custom_triton_kernel_mutable_strict, test/export/test_export_strict.py::StrictExportTestExport::test_export_custom_triton_kernel_strict, test/export/test_export_strict.py::StrictExportTestExport::test_export_cyclic_reference_leak_strict, test/export/test_export_strict.py::StrictExportTestExport::test_export_decomp_torture_case_1_strict, test/export/test_export_strict.py::StrictExportTestExport::test_export_decomp_torture_case_2_strict, test/export/test_export_strict.py::StrictExportTestExport::test_export_decomps_dynamic_strict, test/export/test_export_strict.py::StrictExportTestExport::test_export_decomps_simple_strict, test/export/test_export_strict.py::StrictExportTestExport::test_export_dynamo_config_strict, test/export/test_export_strict.py::StrictExportTestExport::test_export_for_training_run_decomp_strict, test/export/test_export_strict.py::StrictExportTestExport::test_export_for_training_with_container_type_strict, test/export/test_export_strict.py::StrictExportTestExport::test_export_for_training_with_dynamic_shapes_strict, test/export/test_export_strict.py::StrictExportTestExport::test_export_for_training_with_mutation_strict, test/export/test_export_strict.py::StrictExportTestExport::test_export_for_training_with_state_dict_hooks_strict, test/export/test_export_strict.py::StrictExportTestExport::test_export_func_with_default_kwargs_strict, test/export/test_export_strict.py::StrictExportTestExport::test_export_func_with_keyword_only_args_strict, test/export/test_export_strict.py::StrictExportTestExport::test_export_func_with_kwargs_strict, test/export/test_export_strict.py::StrictExportTestExport::test_export_func_with_pytree_kwargs_strict, 
test/export/test_export_strict.py::StrictExportTestExport::test_export_func_with_var_keyword_args_strict, test/export/test_export_strict.py::StrictExportTestExport::test_export_func_with_var_keyword_pytree_args_strict, test/export/test_export_strict.py::StrictExportTestExport::test_export_func_with_var_postional_args_strict, test/export/test_export_strict.py::StrictExportTestExport::test_export_function_schema_strict, test/export/test_export_strict.py::StrictExportTestExport::test_export_graph_with_no_inputs_strict, test/export/test_export_strict.py::StrictExportTestExport::test_export_input_mutation_bug_strict, test/export/test_export_strict.py::StrictExportTestExport::test_export_input_mutation_dynamic_shape_strict, test/export/test_export_strict.py::StrictExportTestExport::test_export_input_mutation_static_shape_strict, test/export/test_export_strict.py::StrictExportTestExport::test_export_leak_compile_strict, test/export/test_export_strict.py::StrictExportTestExport::test_export_linear_preserve_dynamic_shape_strict, test/export/test_export_strict.py::StrictExportTestExport::test_export_max_nonstrict_strict, test/export/test_export_strict.py::StrictExportTestExport::test_export_max_onnx_reported_strict, test/export/test_export_strict.py::StrictExportTestExport::test_export_method_strict, test/export/test_export_strict.py::StrictExportTestExport::test_export_mod_constraints_strict, test/export/test_export_strict.py::StrictExportTestExport::test_export_module_strict, test/export/test_export_strict.py::StrictExportTestExport::test_export_preserve_linear_at_aot_level_strict, test/export/test_export_strict.py::StrictExportTestExport::test_export_preserve_linear_but_not_custom_op_strict, test/export/test_export_strict.py::StrictExportTestExport::test_export_rnn_variants_with_warning_strict, test/export/test_export_strict.py::StrictExportTestExport::test_export_scan_pytree_output_strict, 
test/export/test_export_strict.py::StrictExportTestExport::test_export_script_module_strict, test/export/test_export_strict.py::StrictExportTestExport::test_export_statically_known_true_strict, test/export/test_export_strict.py::StrictExportTestExport::test_export_then_compile_tensor_ctor_strict, test/export/test_export_strict.py::StrictExportTestExport::test_export_with_autocast_strict, test/export/test_export_strict.py::StrictExportTestExport::test_export_with_fake_tensor_inputs_on_cuda_devices_strict, test/export/test_export_strict.py::StrictExportTestExport::test_export_with_fake_tensor_inputs_strict, test/export/test_export_strict.py::StrictExportTestExport::test_export_with_inline_constraints_complex_strict, test/export/test_export_strict.py::StrictExportTestExport::test_export_with_inline_constraints_strict, test/export/test_export_strict.py::StrictExportTestExport::test_export_with_set_grad_enabled_strict, test/export/test_export_strict.py::StrictExportTestExport::test_export_with_wrong_inputs_strict, test/export/test_export_strict.py::StrictExportTestExport::test_external_call_non_strict_real_tensor_strict, test/export/test_export_strict.py::StrictExportTestExport::test_fake_inputs_strict, test/export/test_export_strict.py::StrictExportTestExport::test_fake_weights_strict, test/export/test_export_strict.py::StrictExportTestExport::test_filter_traceback_frames_strict, test/export/test_export_strict.py::StrictExportTestExport::test_flex_attention_export_strict, test/export/test_export_strict.py::StrictExportTestExport::test_float_conversion_from_int_strict, test/export/test_export_strict.py::StrictExportTestExport::test_float_conversion_strict, test/export/test_export_strict.py::StrictExportTestExport::test_fqn_strict, test/export/test_export_strict.py::StrictExportTestExport::test_from_node_metadata_export_strict, test/export/test_export_strict.py::StrictExportTestExport::test_full_on_scalar_tensor_strict, 
test/export/test_export_strict.py::StrictExportTestExport::test_function_holding_tensor_strict, test/export/test_export_strict.py::StrictExportTestExport::test_hints_wrapper_strict, test/export/test_export_strict.py::StrictExportTestExport::test_hoo_inline_users_issue_strict, test/export/test_export_strict.py::StrictExportTestExport::test_if_functional_strict, test/export/test_export_strict.py::StrictExportTestExport::test_if_post_autograd_op_preserved_strict, test/export/test_export_strict.py::StrictExportTestExport::test_inductor_backend_inside_nonstrict_strict, test/export/test_export_strict.py::StrictExportTestExport::test_inline_script_class_method_recursive_strict, test/export/test_export_strict.py::StrictExportTestExport::test_inline_script_class_method_strict, test/export/test_export_strict.py::StrictExportTestExport::test_inline_script_function_strict, test/export/test_export_strict.py::StrictExportTestExport::test_inline_script_method_strict, test/export/test_export_strict.py::StrictExportTestExport::test_int_shape_specialization_strict, test/export/test_export_strict.py::StrictExportTestExport::test_intermediate_shape_comp_strict, test/export/test_export_strict.py::StrictExportTestExport::test_is_exporting_strict, test/export/test_export_strict.py::StrictExportTestExport::test_is_nonzero_strict, test/export/test_export_strict.py::StrictExportTestExport::test_isnonzero_strict, test/export/test_export_strict.py::StrictExportTestExport::test_issue_113041_strict, test/export/test_export_strict.py::StrictExportTestExport::test_issue_157289_strict, test/export/test_export_strict.py::StrictExportTestExport::test_issue_161902_strict, test/export/test_export_strict.py::StrictExportTestExport::test_istft_op_strict, test/export/test_export_strict.py::StrictExportTestExport::test_keep_composite_ops_invalid_strict, test/export/test_export_strict.py::StrictExportTestExport::test_keep_composite_ops_linear_convd_for_training_ir_strict, 
test/export/test_export_strict.py::StrictExportTestExport::test_keep_composite_ops_linear_convd_strict, test/export/test_export_strict.py::StrictExportTestExport::test_kwarg_dynamic_shapes_diff_order_strict, test/export/test_export_strict.py::StrictExportTestExport::test_kwargs_reorder_strict, test/export/test_export_strict.py::StrictExportTestExport::test_layer_norm_unbacked_normalized_shape_strict, test/export/test_export_strict.py::StrictExportTestExport::test_layer_sharing_strict, test/export/test_export_strict.py::StrictExportTestExport::test_lazy_module_kwargs_strict, test/export/test_export_strict.py::StrictExportTestExport::test_lifted_constants_strict, test/export/test_export_strict.py::StrictExportTestExport::test_linear_conv_strict, test/export/test_export_strict.py::StrictExportTestExport::test_malformed_fqn_from_source_name_strict, test/export/test_export_strict.py::StrictExportTestExport::test_map_buffers_strict, test/export/test_export_strict.py::StrictExportTestExport::test_map_strict, test/export/test_export_strict.py::StrictExportTestExport::test_mask_nonzero_static_strict, test/export/test_export_strict.py::StrictExportTestExport::test_masked_select_dynamic_strict, test/export/test_export_strict.py::StrictExportTestExport::test_math_pow_strict, test/export/test_export_strict.py::StrictExportTestExport::test_mismatched_dynamic_shapes_strict, test/export/test_export_strict.py::StrictExportTestExport::test_mixed_input_strict, test/export/test_export_strict.py::StrictExportTestExport::test_module_dict_key_strict, test/export/test_export_strict.py::StrictExportTestExport::test_module_input_strict, test/export/test_export_strict.py::StrictExportTestExport::test_module_input_subclasses_parameterization_nested_strict, test/export/test_export_strict.py::StrictExportTestExport::test_module_list_slice_strict, test/export/test_export_strict.py::StrictExportTestExport::test_module_strict, 
test/export/test_export_strict.py::StrictExportTestExport::test_module_with_dict_container_inp_out_strict, test/export/test_export_strict.py::StrictExportTestExport::test_modules_access_for_deleted_submodule_strict, test/export/test_export_strict.py::StrictExportTestExport::test_more_multidimensional_slicing_strict, test/export/test_export_strict.py::StrictExportTestExport::test_multidimensional_slicing_strict, test/export/test_export_strict.py::StrictExportTestExport::test_multinomial_dynamic_strict, test/export/test_export_strict.py::StrictExportTestExport::test_multiple_definitions_same_name_dim_strict, test/export/test_export_strict.py::StrictExportTestExport::test_namedtuple_input_export_strict, test/export/test_export_strict.py::StrictExportTestExport::test_native_multi_attention_head_strict, test/export/test_export_strict.py::StrictExportTestExport::test_nested_dynamic_shapes_spec_strict, test/export/test_export_strict.py::StrictExportTestExport::test_nested_module_fake_tensor_leak_strict, test/export/test_export_strict.py::StrictExportTestExport::test_nested_module_strict, test/export/test_export_strict.py::StrictExportTestExport::test_nested_module_with_constant_buffer_strict, test/export/test_export_strict.py::StrictExportTestExport::test_nested_module_with_init_buffer_strict, test/export/test_export_strict.py::StrictExportTestExport::test_nested_module_with_parameter_strict, test/export/test_export_strict.py::StrictExportTestExport::test_nn_module_stack_shared_submodule_strict, test/export/test_export_strict.py::StrictExportTestExport::test_nn_module_stack_strict, test/export/test_export_strict.py::StrictExportTestExport::test_no_check_is_size_error_strict, test/export/test_export_strict.py::StrictExportTestExport::test_no_suggested_fixes_for_data_dependent_errors_strict, test/export/test_export_strict.py::StrictExportTestExport::test_no_tensor_computation_2_strict, 
test/export/test_export_strict.py::StrictExportTestExport::test_no_tensor_computation_3_strict, test/export/test_export_strict.py::StrictExportTestExport::test_no_tensor_computation_4_strict, test/export/test_export_strict.py::StrictExportTestExport::test_no_tensor_computation_strict, test/export/test_export_strict.py::StrictExportTestExport::test_non_arg_name_dynamic_shapes_api_strict, test/export/test_export_strict.py::StrictExportTestExport::test_non_arg_name_dynamic_shapes_api_with_container_type_strict, test/export/test_export_strict.py::StrictExportTestExport::test_non_arg_name_dynamic_shapes_api_with_kwarg_strict, test/export/test_export_strict.py::StrictExportTestExport::test_non_persistent_buffer_strict, test/export/test_export_strict.py::StrictExportTestExport::test_non_strict_dynamic_shapes_strict, test/export/test_export_strict.py::StrictExportTestExport::test_non_strict_dynamic_shapes_suggested_fixes_strict, test/export/test_export_strict.py::StrictExportTestExport::test_none_buffers_strict, test/export/test_export_strict.py::StrictExportTestExport::test_nonstrict_retrace_preserves_metadata_strict, test/export/test_export_strict.py::StrictExportTestExport::test_nonzero_2_strict, test/export/test_export_strict.py::StrictExportTestExport::test_nonzero_dynamic_strict, test/export/test_export_strict.py::StrictExportTestExport::test_not_registered_parameter_strict, test/export/test_export_strict.py::StrictExportTestExport::test_operator_aten_tensor_mode_variant_strict, test/export/test_export_strict.py::StrictExportTestExport::test_output_node_name_strict, test/export/test_export_strict.py::StrictExportTestExport::test_pad_sequence_strict, test/export/test_export_strict.py::StrictExportTestExport::test_param_util_strict, test/export/test_export_strict.py::StrictExportTestExport::test_partial_patched_forward_strict, test/export/test_export_strict.py::StrictExportTestExport::test_placeholder_naming_collisions_hoo_subgraphs_strict, 
test/export/test_export_strict.py::StrictExportTestExport::test_placeholder_naming_collisions_strict, test/export/test_export_strict.py::StrictExportTestExport::test_placeholder_naming_order_strict, test/export/test_export_strict.py::StrictExportTestExport::test_placeholder_naming_order_variadic_strict, test/export/test_export_strict.py::StrictExportTestExport::test_placeholder_update_preserving_strict, test/export/test_export_strict.py::StrictExportTestExport::test_predispatch_cond_strict, test/export/test_export_strict.py::StrictExportTestExport::test_predispatch_grad_wrappers_strict, test/export/test_export_strict.py::StrictExportTestExport::test_preserve_annotation_strict, test/export/test_export_strict.py::StrictExportTestExport::test_preserve_module_call_signature_unflatten_specialization_strict, test/export/test_export_strict.py::StrictExportTestExport::test_preserve_requires_grad_placeholders_strict, test/export/test_export_strict.py::StrictExportTestExport::test_preserve_shape_dynamism_for_unused_inputs_strict, test/export/test_export_strict.py::StrictExportTestExport::test_profiling_code_strict, test/export/test_export_strict.py::StrictExportTestExport::test_python_asserts_with_sym_int_strict, test/export/test_export_strict.py::StrictExportTestExport::test_pytree_register_data_class_strict, test/export/test_export_strict.py::StrictExportTestExport::test_pytree_register_nested_data_class_strict, test/export/test_export_strict.py::StrictExportTestExport::test_raise_user_error_when_guard_on_data_dependent_operation_strict, test/export/test_export_strict.py::StrictExportTestExport::test_range_constraints_with_replacement_strict, test/export/test_export_strict.py::StrictExportTestExport::test_real_tensor_alias_dtype_mismatch_strict, test/export/test_export_strict.py::StrictExportTestExport::test_real_tensor_bool_cast_strict, test/export/test_export_strict.py::StrictExportTestExport::test_real_tensor_errors_on_aliasing_custom_op_strict, 
test/export/test_export_strict.py::StrictExportTestExport::test_real_tensor_for_max_op_strict, test/export/test_export_strict.py::StrictExportTestExport::test_real_tensor_size_mismatch_strict, test/export/test_export_strict.py::StrictExportTestExport::test_redundant_assert_max_upper_bound_strict, test/export/test_export_strict.py::StrictExportTestExport::test_redundant_asserts_strict, test/export/test_export_strict.py::StrictExportTestExport::test_refine_dynamic_shapes_from_suggested_fixes_strict, test/export/test_export_strict.py::StrictExportTestExport::test_register_constant_strict, test/export/test_export_strict.py::StrictExportTestExport::test_repeat_interleave_strict, test/export/test_export_strict.py::StrictExportTestExport::test_replace_unbacked_with_very_large_upperbound_strict, test/export/test_export_strict.py::StrictExportTestExport::test_replaced_unbacked_bindings_strict, test/export/test_export_strict.py::StrictExportTestExport::test_reshape_view_helper_strict, test/export/test_export_strict.py::StrictExportTestExport::test_retracable_ep_strict, test/export/test_export_strict.py::StrictExportTestExport::test_retrace_pre_autograd_strict, test/export/test_export_strict.py::StrictExportTestExport::test_run_decomposition_supports_user_input_mutation_strict, test/export/test_export_strict.py::StrictExportTestExport::test_run_decompositions_keep_metadata_strict, test/export/test_export_strict.py::StrictExportTestExport::test_run_decompositions_keep_tensor_constant_metadata_strict, test/export/test_export_strict.py::StrictExportTestExport::test_runtime_assert_for_prim_strict, test/export/test_export_strict.py::StrictExportTestExport::test_runtime_assert_for_prm_str_strict, test/export/test_export_strict.py::StrictExportTestExport::test_runtime_assert_with_size_strict, test/export/test_export_strict.py::StrictExportTestExport::test_sdpa_gqa_strict, test/export/test_export_strict.py::StrictExportTestExport::test_sequential_slicing_strict, 
test/export/test_export_strict.py::StrictExportTestExport::test_set_example_inputs_strict, test/export/test_export_strict.py::StrictExportTestExport::test_set_grad_as_side_effect_strict, test/export/test_export_strict.py::StrictExportTestExport::test_set_grad_empty_strict, test/export/test_export_strict.py::StrictExportTestExport::test_set_grad_unflatten_strict, test/export/test_export_strict.py::StrictExportTestExport::test_setgrad_lifted_tensor_strict, test/export/test_export_strict.py::StrictExportTestExport::test_shared_submodule_nn_module_stack_strict, test/export/test_export_strict.py::StrictExportTestExport::test_simple_export_for_training_strict, test/export/test_export_strict.py::StrictExportTestExport::test_simple_unbacked_view_strict, test/export/test_export_strict.py::StrictExportTestExport::test_size_input_strict, test/export/test_export_strict.py::StrictExportTestExport::test_slice_nn_module_stack_strict, test/export/test_export_strict.py::StrictExportTestExport::test_solver_unsupported_sympy_function_strict, test/export/test_export_strict.py::StrictExportTestExport::test_specialize_derived_dim_roots_strict, test/export/test_export_strict.py::StrictExportTestExport::test_split_const_gm_with_lifted_constants_strict, test/export/test_export_strict.py::StrictExportTestExport::test_stack_trace_make_fx_strict, test/export/test_export_strict.py::StrictExportTestExport::test_stack_trace_strict, test/export/test_export_strict.py::StrictExportTestExport::test_state_primitives_strict, test/export/test_export_strict.py::StrictExportTestExport::test_state_shape_attribute_assignment_strict, test/export/test_export_strict.py::StrictExportTestExport::test_state_tensors_strict, test/export/test_export_strict.py::StrictExportTestExport::test_static_dim_constraints_strict, test/export/test_export_strict.py::StrictExportTestExport::test_subclass_context_strict, 
test/export/test_export_strict.py::StrictExportTestExport::test_subclass_nested_attr_access_complicated_metadata_strict, test/export/test_export_strict.py::StrictExportTestExport::test_subclass_nested_attr_access_const_metadata_not_top_level_strict, test/export/test_export_strict.py::StrictExportTestExport::test_subclass_nested_attr_access_const_metadata_strict, test/export/test_export_strict.py::StrictExportTestExport::test_subclass_nested_attr_access_strict, test/export/test_export_strict.py::StrictExportTestExport::test_subclass_nested_attr_access_submodule_strict, test/export/test_export_strict.py::StrictExportTestExport::test_subclasses_parameterization_nested_strict, test/export/test_export_strict.py::StrictExportTestExport::test_subclasses_parameterization_strict, test/export/test_export_strict.py::StrictExportTestExport::test_suggest_torch_checks_with_non_negative_check_strict, test/export/test_export_strict.py::StrictExportTestExport::test_suggest_torch_checks_with_regular_check_strict, test/export/test_export_strict.py::StrictExportTestExport::test_suggested_fixes_for_data_dependent_errors_basic_strict, test/export/test_export_strict.py::StrictExportTestExport::test_suggested_fixes_for_data_dependent_errors_puzzlers_strict, test/export/test_export_strict.py::StrictExportTestExport::test_suggested_fixes_new_roots_strict, test/export/test_export_strict.py::StrictExportTestExport::test_sym_float_operators_strict, test/export/test_export_strict.py::StrictExportTestExport::test_sym_or_sym_and_strict, test/export/test_export_strict.py::StrictExportTestExport::test_sym_sqrt_strict, test/export/test_export_strict.py::StrictExportTestExport::test_symbool_item_strict, test/export/test_export_strict.py::StrictExportTestExport::test_symfloat_item_strict, test/export/test_export_strict.py::StrictExportTestExport::test_symint_input_additional_inputs_strict, test/export/test_export_strict.py::StrictExportTestExport::test_symint_input_basic_strict, 
test/export/test_export_strict.py::StrictExportTestExport::test_symint_input_ranges_strict, test/export/test_export_strict.py::StrictExportTestExport::test_symint_input_shapes_collection_strict, test/export/test_export_strict.py::StrictExportTestExport::test_symint_input_specialization_strict, test/export/test_export_strict.py::StrictExportTestExport::test_symint_item_strict, test/export/test_export_strict.py::StrictExportTestExport::test_symint_output_strict, test/export/test_export_strict.py::StrictExportTestExport::test_symint_tensor_return_strict, test/export/test_export_strict.py::StrictExportTestExport::test_tag_ac_export_strict, test/export/test_export_strict.py::StrictExportTestExport::test_tensor_attribute_zero_args_strict, test/export/test_export_strict.py::StrictExportTestExport::test_tensor_constant_aten_to_strict, test/export/test_export_strict.py::StrictExportTestExport::test_tensor_constant_with_wrapped_method_strict, test/export/test_export_strict.py::StrictExportTestExport::test_to_module_with_mutated_buffer_multiple_strict, test/export/test_export_strict.py::StrictExportTestExport::test_to_module_with_mutated_buffer_multiple_update_sub_later_strict, test/export/test_export_strict.py::StrictExportTestExport::test_to_module_with_mutated_buffer_strict, test/export/test_export_strict.py::StrictExportTestExport::test_tolist_strict, test/export/test_export_strict.py::StrictExportTestExport::test_torch_check_eq_commutativity_strict, test/export/test_export_strict.py::StrictExportTestExport::test_torch_fn_strict, test/export/test_export_strict.py::StrictExportTestExport::test_trace_under_fake_strict, test/export/test_export_strict.py::StrictExportTestExport::test_train_eval_on_exported_preautograd_module_strict, test/export/test_export_strict.py::StrictExportTestExport::test_unbacked_3d_matmul_strict, test/export/test_export_strict.py::StrictExportTestExport::test_unbacked_bincount_strict, 
test/export/test_export_strict.py::StrictExportTestExport::test_unbacked_bindings_for_divisible_u_symint_strict, test/export/test_export_strict.py::StrictExportTestExport::test_unbacked_deferred_runtime_retrace_strict, test/export/test_export_strict.py::StrictExportTestExport::test_unbacked_expand_strict, test/export/test_export_strict.py::StrictExportTestExport::test_unbacked_infer_size_strict, test/export/test_export_strict.py::StrictExportTestExport::test_unbacked_kth_value_strict, test/export/test_export_strict.py::StrictExportTestExport::test_unbacked_linear_layer_norm_input_strict, test/export/test_export_strict.py::StrictExportTestExport::test_unbacked_noncontig_lin_strict, test/export/test_export_strict.py::StrictExportTestExport::test_unbacked_pad_strict, test/export/test_export_strict.py::StrictExportTestExport::test_unbacked_scalar_constructor_strict, test/export/test_export_strict.py::StrictExportTestExport::test_unbacked_slice_forward_strict, test/export/test_export_strict.py::StrictExportTestExport::test_unbacked_slice_simple_strict, test/export/test_export_strict.py::StrictExportTestExport::test_unbacked_stack_strict, test/export/test_export_strict.py::StrictExportTestExport::test_unbacked_to_cond_passthrough_strict, test/export/test_export_strict.py::StrictExportTestExport::test_unbacked_to_cond_strict, test/export/test_export_strict.py::StrictExportTestExport::test_unbacked_unsqueeze_strict, test/export/test_export_strict.py::StrictExportTestExport::test_unflatten_asserts_strict, test/export/test_export_strict.py::StrictExportTestExport::test_unflatten_buffer_update_child2parent_swap_strict, test/export/test_export_strict.py::StrictExportTestExport::test_unflatten_closure_strict, test/export/test_export_strict.py::StrictExportTestExport::test_unflatten_isinstance_strict, test/export/test_export_strict.py::StrictExportTestExport::test_unflatten_multiple_graphs_dispatch_strict, 
test/export/test_export_strict.py::StrictExportTestExport::test_unflatten_multiple_graphs_preserve_signature_no_error_strict, test/export/test_export_strict.py::StrictExportTestExport::test_unflatten_multiple_graphs_shared_submodule_strict, test/export/test_export_strict.py::StrictExportTestExport::test_unflatten_multiple_graphs_state_strict, test/export/test_export_strict.py::StrictExportTestExport::test_unflatten_no_unroll_strict, test/export/test_export_strict.py::StrictExportTestExport::test_unflatten_placeholder_update_child2parent_swap_strict, test/export/test_export_strict.py::StrictExportTestExport::test_unflatten_placeholder_update_grandchild2cousin_swap_strict, test/export/test_export_strict.py::StrictExportTestExport::test_unflatten_random_dag_5_strict, test/export/test_export_strict.py::StrictExportTestExport::test_unflatten_random_dag_6_strict, test/export/test_export_strict.py::StrictExportTestExport::test_unflatten_random_dag_buf_8_strict, test/export/test_export_strict.py::StrictExportTestExport::test_unflatten_random_dag_const_preserving_3_1_strict, test/export/test_export_strict.py::StrictExportTestExport::test_unflatten_random_dag_const_preserving_3_strict, test/export/test_export_strict.py::StrictExportTestExport::test_unflatten_random_dag_mutating_buf_4_strict, test/export/test_export_strict.py::StrictExportTestExport::test_unflatten_random_dag_mutating_buf_6_strict, test/export/test_export_strict.py::StrictExportTestExport::test_unflatten_random_dag_mutating_buf_9_strict, test/export/test_export_strict.py::StrictExportTestExport::test_unflatten_random_dag_mutating_buf_preserving_10_strict, test/export/test_export_strict.py::StrictExportTestExport::test_unflatten_random_dag_mutating_buf_preserving_4_1_strict, test/export/test_export_strict.py::StrictExportTestExport::test_unflatten_random_dag_mutating_buf_preserving_4_strict, test/export/test_export_strict.py::StrictExportTestExport::test_unflatten_random_dag_mutating_buf_preserving_5_strict, 
test/export/test_export_strict.py::StrictExportTestExport::test_unflatten_random_dag_mutating_buf_preserving_7_strict, test/export/test_export_strict.py::StrictExportTestExport::test_unflatten_random_dag_preserving_4_strict, test/export/test_export_strict.py::StrictExportTestExport::test_unused_aliases_strict, test/export/test_export_strict.py::StrictExportTestExport::test_unused_constant_strict, test/export/test_export_strict.py::StrictExportTestExport::test_use_embedding_twice_strict, test/export/test_export_strict.py::StrictExportTestExport::test_user_input_and_buffer_mutation_strict, test/export/test_export_strict.py::StrictExportTestExport::test_vmap_custom_autograd_function_strict, test/export/test_export_strict.py::StrictExportTestExport::test_vmap_strict, test/export/test_export_strict.py::StrictExportTestExport::test_vmap_to_assert_strict, test/export/test_export_strict.py::StrictExportTestExport::test_where_decomp_strict, test/export/test_export_strict.py::StrictExportTestExport::test_while_loop_assert_separation_strict, test/export/test_export_strict.py::StrictExportTestExport::test_while_loop_index_assertions_strict, test/export/test_export_strict.py::StrictExportTestExport::test_while_loop_simple_strict, test/export/test_export_strict.py::StrictExportTestExport::test_while_loop_tensor_constant_idx_strict, test/export/test_export_strict.py::StrictExportTestExport::test_wrapper_module_strict 2025-10-10T01:50:59.9732179Z 2025-10-10T01:51:03.7078552Z Running inductor/test_cutedsl_template 1/1 ... [2025-10-10 01:51:03.707261] 2025-10-10T01:51:03.7079200Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:51:03.7080936Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_cutedsl_template.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:51:03.707628] 2025-10-10T01:51:10.7866456Z 2025-10-10T01:51:10.7867457Z inductor/test_cutedsl_template 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_cutedsl_template_1.1_08913ad265b5d1c7_.log 2025-10-10T01:51:10.7872421Z Running 13 items in this shard: test/inductor/test_cutedsl_template.py::TestCuteDSLTemplate::test_cse_integration, test/inductor/test_cutedsl_template.py::TestCuteDSLTemplate::test_cutedsl_add_e2e, test/inductor/test_cutedsl_template.py::TestCuteDSLTemplate::test_cutedsl_add_e2e_autotune, test/inductor/test_cutedsl_template.py::TestCuteDSLTemplate::test_cutedsl_op_overrides, test/inductor/test_cutedsl_template.py::TestCuteDSLTemplate::test_gen_defines, test/inductor/test_cutedsl_template.py::TestCuteDSLTemplate::test_gen_imports, test/inductor/test_cutedsl_template.py::TestCuteDSLTemplate::test_get_output_hook, test/inductor/test_cutedsl_template.py::TestCuteDSLTemplate::test_indented_buffer_usage, test/inductor/test_cutedsl_template.py::TestCuteDSLTemplate::test_modification_subgraph, test/inductor/test_cutedsl_template.py::TestCuteDSLTemplate::test_multiple_templates_unique_names, test/inductor/test_cutedsl_template.py::TestCuteDSLTemplate::test_render_includes_imports, test/inductor/test_cutedsl_template.py::TestCuteDSLTemplate::test_template_aliasing, test/inductor/test_cutedsl_template.py::TestCuteDSLTemplate::test_template_env_contains_hooks 2025-10-10T01:51:10.7876938Z 2025-10-10T01:51:14.6531018Z Running dynamo/test_inline_and_install 1/1 ... [2025-10-10 01:51:14.652418] 2025-10-10T01:51:14.6532273Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:51:14.6535186Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_inline_and_install.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:51:14.652814] 2025-10-10T01:51:19.0774194Z 2025-10-10T01:51:19.0775419Z dynamo/test_inline_and_install 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_inline_and_install_1.1_b99f352f2c845190_.log 2025-10-10T01:51:19.0862032Z Running 184 items in this shard: test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_access_class_method_from_user_class_attr_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_access_class_method_from_user_class_builtin_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_byte_tensor_does_not_crash_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_capture_symbolic_tracing_simple_within_fake_mode_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_capture_symbolic_tracing_within_fake_mode_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_cond_free_variables_overlapping_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_cond_op_param_buffer_lifted_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_cond_raise_user_error_on_branch_args_mismatch_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_cond_raise_user_error_on_branch_return_multiple_tensors_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_cond_raise_user_error_on_branch_return_non_tensor_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_cond_raise_user_error_on_mismatch_return_length_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_cond_raise_user_error_on_mismatch_return_tensor_meta_inline_and_install, 
test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_cond_raise_user_error_on_missing_args_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_cond_raise_user_error_on_non_list_operands_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_cond_raise_user_error_on_non_tensor_operands_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_cond_raise_user_error_on_unsupported_pred_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_cond_supported_pred_types_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_constraint_violation_error_messages_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_dataclass_input_output_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_dict_return_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_dict_return_with_aten_graph_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_dupes_2_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_dupes_2_with_aten_graph_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_dupes_and_bypass_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_dupes_and_bypass_reorder_with_non_tensor_arg_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_dupes_and_bypass_reorder_with_non_tensor_arg_with_aten_graph_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_dupes_and_bypass_with_aten_graph_inline_and_install, 
test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_dupes_and_bypass_with_non_tensor_arg_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_dupes_and_bypass_with_non_tensor_arg_with_aten_graph_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_dupes_and_bypass_with_non_tensor_output_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_dupes_and_bypass_with_non_tensor_output_with_aten_graph_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_dupes_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_dupes_with_aten_graph_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_dynamic_slicing_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_dynamic_slicing_invalid_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_dynamic_slicing_simple_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_dynamo_enum_in_tuple_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_dynamo_list_index_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_empty_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_enforce_equalities_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_compare_optimize_with_make_fx_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_cond_in_aten_symbolic_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_control_flow_with_getattr_inline_and_install, 
test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_decomp_asserts_bad_args_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_decomp_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_defaults_ok_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_dynamic_control_flow_error_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_dynamic_dim_cleanup_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_dynamic_dim_not_1_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_dynamic_dim_range_constraint_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_graph_bypass_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_graph_bypass_with_aten_graph_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_graph_with_complex_reorder_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_graph_with_complex_reorder_with_aten_graph_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_graph_with_list_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_graph_with_list_with_aten_graph_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_identity_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_masking_with_no_grad_inline_and_install, 
test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_meta_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_meta_val_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_mismatched_out_2_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_mismatched_out_2_with_aten_graph_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_mismatched_out_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_mismatched_out_with_aten_graph_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_module_specify_constraints_signature_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_multi_dynamic_dim_constraint_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_multi_dynamic_dim_unsafe_relationship_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_nn_module_stack_patched_module_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_no_raise_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_no_tensor_computation_with_aten_graph_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_pass_arg_by_name_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_pass_arg_by_name_star_args_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_persist_assert_inline_and_install, 
test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_preserve_constraints_as_metadata_tensor_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_preserves_nn_module_stack_for_get_attr_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_raise_guard_full_constraint_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_raise_guard_partial_constraint_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_raise_on_relationship_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_shape_control_flow_1_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_specialized_int_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_symbolic_shape_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_with_args_and_empty_kwargs_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_with_args_with_default_None_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_with_args_with_default_float_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_with_args_with_default_tensor_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_with_args_with_default_tuple_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_with_aten_graph_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_with_builtin_op_on_assume_constant_inline_and_install, 
test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_with_cond_branches_calling_methods_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_with_cond_closure_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_with_cond_dynamic_shape_pred_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_with_cond_with_closed_function_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_with_constant_dict_values_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_with_constant_free_function_and_class_method_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_with_constant_free_function_and_class_method_multiarg_diff_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_with_constant_free_function_and_class_method_multiarg_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_with_constant_free_function_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_with_constant_global_function_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_with_constant_in_unspecialized_nn_module_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_with_constant_list_nonzero_free_function_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_with_constant_list_nonzero_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_with_constant_method_on_module_inline_and_install, 
test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_with_constant_method_on_module_invoke_twice_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_with_constant_none_control_flow_free_func_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_with_constant_none_control_flow_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_with_constant_not_none_control_flow_free_func_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_with_constant_not_none_control_flow_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_with_constant_not_none_control_flow_pos_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_with_constant_not_return_const_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_with_constant_tuple_nonzero_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_with_functools_wrapped_fn_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_with_functools_wrapped_method_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_with_kwargs_and_empty_args_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_with_kwargs_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_with_kwargs_with_default_None_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_with_kwargs_with_default_float_inline_and_install, 
test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_with_kwargs_with_default_tensor_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_with_kwargs_with_default_tuple_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_with_map_cond_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_with_map_zero_sized_tensor_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_with_map_zero_sized_tensor_suppress_errors_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_with_module_layer_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_with_nonzero_static_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_with_shallow_list_copy_with_side_effects_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_with_shallow_list_copy_wo_side_effects_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_with_stack_trace_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_with_symbool_inputs_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_export_with_wrapped_fn_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_exported_graph_serialization_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_func_return_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_func_return_with_aten_graph_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_fx_pytree_inline_and_install, 
test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_immutable_list_dict_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_input_container_type_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_invalid_input_global_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_invalid_input_global_multiple_access_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_invalid_input_nonlocal_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_invalid_input_unused_nonlocal_ok_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_list_contains_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_list_not_contains_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_list_unpack_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_list_unpack_with_aten_graph_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_map_cond_param_buffer_lifted_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_mixed_real_and_fake_inputs_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_multiple_outputs_op_with_evaluator_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_nested_cond_op_param_buffer_lifted_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_no_tensor_computation_2_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_no_tensor_computation_2_with_aten_graph_inline_and_install, 
test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_no_tensor_computation_fail_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_no_tensor_computation_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_not_functionalize_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_param_buffer_safe_from_mutation_recurse_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_param_buffer_safe_from_mutation_simple_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_pre_dispatch_simple_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_predispatch_with_for_out_dtype_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_predispatch_with_for_out_dtype_nested_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_predispatch_with_higher_order_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_predispatch_with_higher_order_nested_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_preserve_fx_node_metadata_graph_break_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_preserve_fx_node_metadata_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_preserve_fx_node_metadata_inline_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_preserve_fx_node_metadata_recompile_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_remove_redundant_dynamic_dim_in_error_message_inline_and_install, 
test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_retracibility_dict_container_inp_out_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_retracibility_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_retracibility_nested_list_out_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_round_dynamic_shapes_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_strict_fake_tensor_prop_real_tensors_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_subclass_parameters_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_sum_param_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_sym_contains_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_symbolic_tracing_within_fake_mode_with_constraints_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_symbolic_tracing_within_fake_mode_with_constraints_with_parameters_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_symbool_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_torch_inference_mode_ctx_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_trivial_constraint_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_uncaptured_higher_order_op_error_not_suppresed_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_untracked_inputs_in_constraints_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_zeroes_in_and_out_different_shape_on_test_inline_and_install, 
test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_zeroes_in_and_out_different_shape_on_test_with_aten_graph_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_zeroes_in_new_shape_scalar_out_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_zeroes_in_new_shape_scalar_out_permute_dupe_and_bypass_inline_and_install, test/dynamo/test_inline_and_install.py::InlineAndInstallExportTests::test_zeroes_in_new_shape_scalar_out_permute_inline_and_install
2025-10-10T01:51:19.0944776Z
2025-10-10T01:51:22.9807306Z Running export/test_tree_utils 1/1 ... [2025-10-10 01:51:22.980231]
2025-10-10T01:51:22.9807838Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set
2025-10-10T01:51:22.9811013Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_tree_utils.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:51:22.980664]
2025-10-10T01:51:26.8545733Z
2025-10-10T01:51:26.8546675Z export/test_tree_utils 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_tree_utils_1.1_043345f8551ff430_.log
2025-10-10T01:51:26.8548165Z Running 2 items in this shard: test/export/test_tree_utils.py::TestTreeUtils::test_equivalence_check, test/export/test_tree_utils.py::TestTreeUtils::test_reorder_kwargs
2025-10-10T01:51:26.8548879Z
2025-10-10T01:51:30.7856784Z Running dynamo/test_recompiles 1/1 ... [2025-10-10 01:51:30.785109]
2025-10-10T01:51:30.7857227Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set
2025-10-10T01:51:30.7859208Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_recompiles.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:51:30.785526]
2025-10-10T01:51:34.8596616Z
2025-10-10T01:51:34.8597580Z dynamo/test_recompiles 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_recompiles_1.1_521468b3edd9c16b_.log
2025-10-10T01:51:34.8606009Z Running 18 items in this shard: test/dynamo/test_recompiles.py::RecompileTests::test_aliasing_guard_failures, test/dynamo/test_recompiles.py::RecompileTests::test_aliasing_guard_failures_with_globals, test/dynamo/test_recompiles.py::RecompileTests::test_ambient_autocast_recompile, test/dynamo/test_recompiles.py::RecompileTests::test_autocast_constant_fold, test/dynamo/test_recompiles.py::RecompileTests::test_automatic_dynamic_on_closed_ints, test/dynamo/test_recompiles.py::RecompileTests::test_automatic_dynamic_reduce_recompiles, test/dynamo/test_recompiles.py::RecompileTests::test_automatic_dynamic_shapes_mark_as_oblivious, test/dynamo/test_recompiles.py::RecompileTests::test_automatic_dynamic_shapes_mark_as_oblivious_fail_counterfactual, test/dynamo/test_recompiles.py::RecompileTests::test_automatic_dynamic_shapes_mark_as_unbacked, test/dynamo/test_recompiles.py::RecompileTests::test_automatic_dynamic_tensor_scalar_change, test/dynamo/test_recompiles.py::RecompileTests::test_dunder_call_recompile, test/dynamo/test_recompiles.py::RecompileTests::test_dynamic_shape_parameter_recompile, test/dynamo/test_recompiles.py::RecompileTests::test_inline_inbuilt_nn_modules_candidate, test/dynamo/test_recompiles.py::RecompileTests::test_no_recompile_over_unused_objects, test/dynamo/test_recompiles.py::RecompileTests::test_no_recursive_compile_after_cache_limit_hit, test/dynamo/test_recompiles.py::RecompileTests::test_recompiles_true_false_flop, test/dynamo/test_recompiles.py::RecompileTests::test_run_mode_after_cache_limit_hit, test/dynamo/test_recompiles.py::RecompileTests::test_simple_module_recompile
2025-10-10T01:51:34.8612007Z
2025-10-10T01:51:38.7690051Z Running dynamo/test_einops 1/1 ... [2025-10-10 01:51:38.768506]
2025-10-10T01:51:38.7690640Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set
2025-10-10T01:51:38.7693099Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_einops.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:51:38.768915]
2025-10-10T01:51:42.3410206Z
2025-10-10T01:51:42.3411026Z dynamo/test_einops 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_einops_1.1_42e8d60b337a2722_.log
2025-10-10T01:51:42.3412442Z Running 3 items in this shard: test/dynamo/test_einops.py::TestEinops::test_functions_version_none, test/dynamo/test_einops.py::TestEinops::test_layers_version_none, test/dynamo/test_einops.py::TestEinops::test_no_recompile_on_lazy_state_version_none
2025-10-10T01:51:42.3413377Z
2025-10-10T01:51:46.2435577Z Running inductor/test_foreach 1/1 ... [2025-10-10 01:51:46.242944]
2025-10-10T01:51:46.2436167Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set
2025-10-10T01:51:46.2439061Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_foreach.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ...
[2025-10-10 01:51:46.243347] 2025-10-10T01:51:54.8270859Z 2025-10-10T01:51:54.8271780Z inductor/test_foreach 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_foreach_1.1_134f2aabe256d0e2_.log 2025-10-10T01:51:54.8447678Z Running 534 items in this shard: test/inductor/test_foreach.py::ForeachTests::test_2d_block_mixed_sizes_with_mask, test/inductor/test_foreach.py::ForeachTests::test_2d_block_no_mixed_sizes_no_mask, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking__foreach_add, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking__foreach_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking__foreach_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking__foreach_copy, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking__foreach_div, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking__foreach_maximum, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking__foreach_minimum, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking__foreach_mul, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking__foreach_sub, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_foreach_map_add, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_foreach_map_add_op, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_foreach_map_addrecip_op, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_foreach_map_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_foreach_map_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_foreach_map_copy, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_foreach_map_div, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_foreach_map_maximum, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_foreach_map_minimum, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_foreach_map_mul, 
test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_foreach_map_sub, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning__foreach_add, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning__foreach_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning__foreach_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning__foreach_copy, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning__foreach_div, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning__foreach_maximum, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning__foreach_minimum, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning__foreach_mul, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning__foreach_sub, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_elems__foreach_add, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_elems__foreach_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_elems__foreach_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_elems__foreach_copy, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_elems__foreach_div, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_elems__foreach_maximum, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_elems__foreach_minimum, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_elems__foreach_mul, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_elems__foreach_sub, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_elems_foreach_map_add, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_elems_foreach_map_add_op, 
test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_elems_foreach_map_addrecip_op, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_elems_foreach_map_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_elems_foreach_map_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_elems_foreach_map_copy, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_elems_foreach_map_div, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_elems_foreach_map_maximum, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_elems_foreach_map_minimum, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_elems_foreach_map_mul, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_elems_foreach_map_sub, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_foreach_map_add, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_foreach_map_add_op, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_foreach_map_addrecip_op, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_foreach_map_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_foreach_map_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_foreach_map_copy, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_foreach_map_div, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_foreach_map_maximum, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_foreach_map_minimum, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_foreach_map_mul, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_foreach_map_sub, 
test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_mixed_sizes__foreach_add, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_mixed_sizes__foreach_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_mixed_sizes__foreach_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_mixed_sizes__foreach_copy, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_mixed_sizes__foreach_div, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_mixed_sizes__foreach_maximum, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_mixed_sizes__foreach_minimum, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_mixed_sizes__foreach_mul, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_mixed_sizes__foreach_sub, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_mixed_sizes_foreach_map_add, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_mixed_sizes_foreach_map_add_op, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_mixed_sizes_foreach_map_addrecip_op, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_mixed_sizes_foreach_map_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_mixed_sizes_foreach_map_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_mixed_sizes_foreach_map_copy, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_mixed_sizes_foreach_map_div, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_mixed_sizes_foreach_map_maximum, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_mixed_sizes_foreach_map_minimum, 
test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_mixed_sizes_foreach_map_mul, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_mixed_sizes_foreach_map_sub, test/inductor/test_foreach.py::ForeachTests::test_aliasing, test/inductor/test_foreach.py::ForeachTests::test_broadcasting__foreach_add, test/inductor/test_foreach.py::ForeachTests::test_broadcasting__foreach_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_broadcasting__foreach_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_broadcasting__foreach_div, test/inductor/test_foreach.py::ForeachTests::test_broadcasting__foreach_maximum, test/inductor/test_foreach.py::ForeachTests::test_broadcasting__foreach_minimum, test/inductor/test_foreach.py::ForeachTests::test_broadcasting__foreach_mul, test/inductor/test_foreach.py::ForeachTests::test_broadcasting__foreach_sub, test/inductor/test_foreach.py::ForeachTests::test_broadcasting_foreach_map_add, test/inductor/test_foreach.py::ForeachTests::test_broadcasting_foreach_map_add_op, test/inductor/test_foreach.py::ForeachTests::test_broadcasting_foreach_map_addrecip_op, test/inductor/test_foreach.py::ForeachTests::test_broadcasting_foreach_map_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_broadcasting_foreach_map_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_broadcasting_foreach_map_div, test/inductor/test_foreach.py::ForeachTests::test_broadcasting_foreach_map_mul, test/inductor/test_foreach.py::ForeachTests::test_broadcasting_foreach_map_sub, test/inductor/test_foreach.py::ForeachTests::test_cpu_cpp_fallback__foreach_add, test/inductor/test_foreach.py::ForeachTests::test_cpu_cpp_fallback__foreach_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_cpu_cpp_fallback__foreach_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_cpu_cpp_fallback__foreach_copy, test/inductor/test_foreach.py::ForeachTests::test_cpu_cpp_fallback__foreach_div, 
test/inductor/test_foreach.py::ForeachTests::test_cpu_cpp_fallback__foreach_maximum, test/inductor/test_foreach.py::ForeachTests::test_cpu_cpp_fallback__foreach_minimum, test/inductor/test_foreach.py::ForeachTests::test_cpu_cpp_fallback__foreach_mul, test/inductor/test_foreach.py::ForeachTests::test_cpu_cpp_fallback__foreach_sub, test/inductor/test_foreach.py::ForeachTests::test_cpu_cpp_fallback_foreach_map_add, test/inductor/test_foreach.py::ForeachTests::test_cpu_cpp_fallback_foreach_map_add_op, test/inductor/test_foreach.py::ForeachTests::test_cpu_cpp_fallback_foreach_map_addrecip_op, test/inductor/test_foreach.py::ForeachTests::test_cpu_cpp_fallback_foreach_map_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_cpu_cpp_fallback_foreach_map_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_cpu_cpp_fallback_foreach_map_copy, test/inductor/test_foreach.py::ForeachTests::test_cpu_cpp_fallback_foreach_map_div, test/inductor/test_foreach.py::ForeachTests::test_cpu_cpp_fallback_foreach_map_maximum, test/inductor/test_foreach.py::ForeachTests::test_cpu_cpp_fallback_foreach_map_minimum, test/inductor/test_foreach.py::ForeachTests::test_cpu_cpp_fallback_foreach_map_mul, test/inductor/test_foreach.py::ForeachTests::test_cpu_cpp_fallback_foreach_map_sub, test/inductor/test_foreach.py::ForeachTests::test_decomp__foreach_addcdiv, test/inductor/test_foreach.py::ForeachTests::test_decomp__foreach_addcmul, test/inductor/test_foreach.py::ForeachTests::test_dynamic_shapes_fallback__foreach_add, test/inductor/test_foreach.py::ForeachTests::test_dynamic_shapes_fallback__foreach_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_dynamic_shapes_fallback__foreach_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_dynamic_shapes_fallback__foreach_copy, test/inductor/test_foreach.py::ForeachTests::test_dynamic_shapes_fallback__foreach_div, test/inductor/test_foreach.py::ForeachTests::test_dynamic_shapes_fallback__foreach_maximum, 
test/inductor/test_foreach.py::ForeachTests::test_dynamic_shapes_fallback__foreach_minimum, test/inductor/test_foreach.py::ForeachTests::test_dynamic_shapes_fallback__foreach_mul, test/inductor/test_foreach.py::ForeachTests::test_dynamic_shapes_fallback__foreach_sub, test/inductor/test_foreach.py::ForeachTests::test_dynamic_shapes_fallback_foreach_map_add, test/inductor/test_foreach.py::ForeachTests::test_dynamic_shapes_fallback_foreach_map_add_op, test/inductor/test_foreach.py::ForeachTests::test_dynamic_shapes_fallback_foreach_map_addrecip_op, test/inductor/test_foreach.py::ForeachTests::test_dynamic_shapes_fallback_foreach_map_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_dynamic_shapes_fallback_foreach_map_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_dynamic_shapes_fallback_foreach_map_copy, test/inductor/test_foreach.py::ForeachTests::test_dynamic_shapes_fallback_foreach_map_div, test/inductor/test_foreach.py::ForeachTests::test_dynamic_shapes_fallback_foreach_map_maximum, test/inductor/test_foreach.py::ForeachTests::test_dynamic_shapes_fallback_foreach_map_minimum, test/inductor/test_foreach.py::ForeachTests::test_dynamic_shapes_fallback_foreach_map_mul, test/inductor/test_foreach.py::ForeachTests::test_dynamic_shapes_fallback_foreach_map_sub, test/inductor/test_foreach.py::ForeachTests::test_enable_dynamic_shapes_cpp_wrapper_cuda, test/inductor/test_foreach.py::ForeachTests::test_enable_dynamic_shapes_python_wrapper, test/inductor/test_foreach.py::ForeachTests::test_foreach_cpp_wrapper_cuda, test/inductor/test_foreach.py::ForeachTests::test_foreach_map_backward_binary_foreach_map_add, test/inductor/test_foreach.py::ForeachTests::test_foreach_map_backward_binary_foreach_map_add_op, test/inductor/test_foreach.py::ForeachTests::test_foreach_map_backward_binary_foreach_map_addrecip_op, test/inductor/test_foreach.py::ForeachTests::test_foreach_map_backward_binary_foreach_map_clamp_max, 
test/inductor/test_foreach.py::ForeachTests::test_foreach_map_backward_binary_foreach_map_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_foreach_map_backward_binary_foreach_map_div, test/inductor/test_foreach.py::ForeachTests::test_foreach_map_backward_binary_foreach_map_maximum, test/inductor/test_foreach.py::ForeachTests::test_foreach_map_backward_binary_foreach_map_minimum, test/inductor/test_foreach.py::ForeachTests::test_foreach_map_backward_binary_foreach_map_mul, test/inductor/test_foreach.py::ForeachTests::test_foreach_map_backward_binary_foreach_map_sub, test/inductor/test_foreach.py::ForeachTests::test_foreach_map_backward_unary_foreach_map_abs, test/inductor/test_foreach.py::ForeachTests::test_foreach_map_backward_unary_foreach_map_neg, test/inductor/test_foreach.py::ForeachTests::test_foreach_map_backward_unary_foreach_map_reciprocal, test/inductor/test_foreach.py::ForeachTests::test_foreach_map_backward_unary_foreach_map_sign, test/inductor/test_foreach.py::ForeachTests::test_foreach_map_input_mutation, test/inductor/test_foreach.py::ForeachTests::test_fuse_concat, test/inductor/test_foreach.py::ForeachTests::test_fusion_duplicate_buffer_list__foreach_add, test/inductor/test_foreach.py::ForeachTests::test_fusion_duplicate_buffer_list__foreach_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_fusion_duplicate_buffer_list__foreach_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_fusion_duplicate_buffer_list__foreach_copy, test/inductor/test_foreach.py::ForeachTests::test_fusion_duplicate_buffer_list__foreach_div, test/inductor/test_foreach.py::ForeachTests::test_fusion_duplicate_buffer_list__foreach_maximum, test/inductor/test_foreach.py::ForeachTests::test_fusion_duplicate_buffer_list__foreach_minimum, test/inductor/test_foreach.py::ForeachTests::test_fusion_duplicate_buffer_list__foreach_mul, test/inductor/test_foreach.py::ForeachTests::test_fusion_duplicate_buffer_list__foreach_sub, 
test/inductor/test_foreach.py::ForeachTests::test_fusion_duplicate_buffer_list_foreach_map_add, test/inductor/test_foreach.py::ForeachTests::test_fusion_duplicate_buffer_list_foreach_map_add_op, test/inductor/test_foreach.py::ForeachTests::test_fusion_duplicate_buffer_list_foreach_map_addrecip_op, test/inductor/test_foreach.py::ForeachTests::test_fusion_duplicate_buffer_list_foreach_map_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_fusion_duplicate_buffer_list_foreach_map_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_fusion_duplicate_buffer_list_foreach_map_copy, test/inductor/test_foreach.py::ForeachTests::test_fusion_duplicate_buffer_list_foreach_map_div, test/inductor/test_foreach.py::ForeachTests::test_fusion_duplicate_buffer_list_foreach_map_maximum, test/inductor/test_foreach.py::ForeachTests::test_fusion_duplicate_buffer_list_foreach_map_minimum, test/inductor/test_foreach.py::ForeachTests::test_fusion_duplicate_buffer_list_foreach_map_mul, test/inductor/test_foreach.py::ForeachTests::test_fusion_duplicate_buffer_list_foreach_map_sub, test/inductor/test_foreach.py::ForeachTests::test_kernel_split_arg_limit_list__foreach_add, test/inductor/test_foreach.py::ForeachTests::test_kernel_split_arg_limit_list__foreach_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_kernel_split_arg_limit_list__foreach_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_kernel_split_arg_limit_list__foreach_div, test/inductor/test_foreach.py::ForeachTests::test_kernel_split_arg_limit_list__foreach_maximum, test/inductor/test_foreach.py::ForeachTests::test_kernel_split_arg_limit_list__foreach_minimum, test/inductor/test_foreach.py::ForeachTests::test_kernel_split_arg_limit_list__foreach_mul, test/inductor/test_foreach.py::ForeachTests::test_kernel_split_arg_limit_list__foreach_sub, test/inductor/test_foreach.py::ForeachTests::test_kernel_split_arg_limit_list_foreach_map_add, 
test/inductor/test_foreach.py::ForeachTests::test_kernel_split_arg_limit_list_foreach_map_add_op, test/inductor/test_foreach.py::ForeachTests::test_kernel_split_arg_limit_list_foreach_map_addrecip_op, test/inductor/test_foreach.py::ForeachTests::test_kernel_split_arg_limit_list_foreach_map_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_kernel_split_arg_limit_list_foreach_map_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_kernel_split_arg_limit_list_foreach_map_div, test/inductor/test_foreach.py::ForeachTests::test_kernel_split_arg_limit_list_foreach_map_mul, test/inductor/test_foreach.py::ForeachTests::test_kernel_split_arg_limit_list_foreach_map_sub, test/inductor/test_foreach.py::ForeachTests::test_kernel_split_arg_limit_scalar__foreach_add, test/inductor/test_foreach.py::ForeachTests::test_kernel_split_arg_limit_scalar__foreach_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_kernel_split_arg_limit_scalar__foreach_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_kernel_split_arg_limit_scalar__foreach_div, test/inductor/test_foreach.py::ForeachTests::test_kernel_split_arg_limit_scalar__foreach_maximum, test/inductor/test_foreach.py::ForeachTests::test_kernel_split_arg_limit_scalar__foreach_minimum, test/inductor/test_foreach.py::ForeachTests::test_kernel_split_arg_limit_scalar__foreach_mul, test/inductor/test_foreach.py::ForeachTests::test_kernel_split_arg_limit_scalar__foreach_sub, test/inductor/test_foreach.py::ForeachTests::test_kernel_split_arg_limit_scalar_foreach_map_add, test/inductor/test_foreach.py::ForeachTests::test_kernel_split_arg_limit_scalar_foreach_map_add_op, test/inductor/test_foreach.py::ForeachTests::test_kernel_split_arg_limit_scalar_foreach_map_addrecip_op, test/inductor/test_foreach.py::ForeachTests::test_kernel_split_arg_limit_scalar_foreach_map_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_kernel_split_arg_limit_scalar_foreach_map_clamp_min, 
test/inductor/test_foreach.py::ForeachTests::test_kernel_split_arg_limit_scalar_foreach_map_div, test/inductor/test_foreach.py::ForeachTests::test_kernel_split_arg_limit_scalar_foreach_map_mul, test/inductor/test_foreach.py::ForeachTests::test_kernel_split_arg_limit_scalar_foreach_map_sub, test/inductor/test_foreach.py::ForeachTests::test_multi_device, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_list__foreach_abs, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_list__foreach_add, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_list__foreach_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_list__foreach_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_list__foreach_copy, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_list__foreach_div, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_list__foreach_maximum, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_list__foreach_minimum, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_list__foreach_mul, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_list__foreach_neg, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_list__foreach_reciprocal, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_list__foreach_rsqrt, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_list__foreach_sign, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_list__foreach_sqrt, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_list__foreach_sub, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_list_foreach_map_abs, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_list_foreach_map_add, 
test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_list_foreach_map_add_op, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_list_foreach_map_addcmul_op, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_list_foreach_map_addrecip_op, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_list_foreach_map_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_list_foreach_map_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_list_foreach_map_copy, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_list_foreach_map_div, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_list_foreach_map_maximum, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_list_foreach_map_minimum, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_list_foreach_map_mul, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_list_foreach_map_neg, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_list_foreach_map_recipaddmul_op, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_list_foreach_map_reciprocal, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_list_foreach_map_sign, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_list_foreach_map_sub, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_list__foreach_abs, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_list__foreach_add, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_list__foreach_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_list__foreach_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_list__foreach_copy, 
test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_list__foreach_div, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_list__foreach_maximum, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_list__foreach_minimum, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_list__foreach_mul, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_list__foreach_neg, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_list__foreach_reciprocal, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_list__foreach_rsqrt, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_list__foreach_sign, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_list__foreach_sqrt, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_list__foreach_sub, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_list_foreach_map_abs, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_list_foreach_map_add, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_list_foreach_map_add_op, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_list_foreach_map_addcmul_op, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_list_foreach_map_addrecip_op, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_list_foreach_map_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_list_foreach_map_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_list_foreach_map_copy, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_list_foreach_map_div, 
test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_list_foreach_map_maximum, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_list_foreach_map_minimum, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_list_foreach_map_mul, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_list_foreach_map_neg, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_list_foreach_map_recipaddmul_op, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_list_foreach_map_reciprocal, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_list_foreach_map_sign, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_list_foreach_map_sub, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_scalar__foreach_add, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_scalar__foreach_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_scalar__foreach_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_scalar__foreach_div, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_scalar__foreach_maximum, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_scalar__foreach_minimum, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_scalar__foreach_mul, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_scalar__foreach_sub, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_scalar_foreach_map_add, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_scalar_foreach_map_add_op, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_scalar_foreach_map_addrecip_op, 
test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_scalar_foreach_map_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_scalar_foreach_map_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_scalar_foreach_map_div, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_scalar_foreach_map_mul, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_scalar_foreach_map_sub, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_scalar__foreach_add, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_scalar__foreach_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_scalar__foreach_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_scalar__foreach_div, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_scalar__foreach_maximum, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_scalar__foreach_minimum, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_scalar__foreach_mul, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_scalar__foreach_sub, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_scalar_foreach_map_add, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_scalar_foreach_map_add_op, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_scalar_foreach_map_addrecip_op, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_scalar_foreach_map_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_scalar_foreach_map_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_scalar_foreach_map_div, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_scalar_foreach_map_mul, 
test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_scalar_foreach_map_sub, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_list__foreach_abs, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_list__foreach_add, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_list__foreach_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_list__foreach_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_list__foreach_copy, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_list__foreach_div, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_list__foreach_maximum, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_list__foreach_minimum, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_list__foreach_mul, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_list__foreach_neg, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_list__foreach_reciprocal, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_list__foreach_rsqrt, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_list__foreach_sign, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_list__foreach_sqrt, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_list__foreach_sub, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_list_foreach_map_abs, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_list_foreach_map_add, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_list_foreach_map_add_op, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_list_foreach_map_addcmul_op, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_list_foreach_map_addrecip_op, 
test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_list_foreach_map_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_list_foreach_map_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_list_foreach_map_copy, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_list_foreach_map_div, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_list_foreach_map_maximum, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_list_foreach_map_minimum, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_list_foreach_map_mul, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_list_foreach_map_neg, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_list_foreach_map_recipaddmul_op, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_list_foreach_map_reciprocal, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_list_foreach_map_sign, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_list_foreach_map_sub, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_scalar__foreach_add, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_scalar__foreach_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_scalar__foreach_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_scalar__foreach_div, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_scalar__foreach_maximum, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_scalar__foreach_minimum, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_scalar__foreach_mul, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_scalar__foreach_sub, 
test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_scalar_foreach_map_add, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_scalar_foreach_map_add_op, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_scalar_foreach_map_addrecip_op, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_scalar_foreach_map_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_scalar_foreach_map_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_scalar_foreach_map_div, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_scalar_foreach_map_mul, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_scalar_foreach_map_sub, test/inductor/test_foreach.py::ForeachTests::test_reinplacing__foreach_add_, test/inductor/test_foreach.py::ForeachTests::test_reinplacing__foreach_div_, test/inductor/test_foreach.py::ForeachTests::test_reinplacing__foreach_mul_, test/inductor/test_foreach.py::ForeachTests::test_reinplacing__foreach_sub_, test/inductor/test_foreach.py::ForeachTests::test_reinplacing_mut_after__foreach_add_, test/inductor/test_foreach.py::ForeachTests::test_reinplacing_mut_after__foreach_div_, test/inductor/test_foreach.py::ForeachTests::test_reinplacing_mut_after__foreach_mul_, test/inductor/test_foreach.py::ForeachTests::test_reinplacing_mut_after__foreach_sub_, test/inductor/test_foreach.py::ForeachTests::test_reinplacing_mut_before__foreach_add_, test/inductor/test_foreach.py::ForeachTests::test_reinplacing_mut_before__foreach_div_, test/inductor/test_foreach.py::ForeachTests::test_reinplacing_mut_before__foreach_mul_, test/inductor/test_foreach.py::ForeachTests::test_reinplacing_mut_before__foreach_sub_, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_list__foreach_abs, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_list__foreach_add, 
test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_list__foreach_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_list__foreach_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_list__foreach_copy, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_list__foreach_div, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_list__foreach_maximum, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_list__foreach_minimum, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_list__foreach_mul, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_list__foreach_neg, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_list__foreach_reciprocal, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_list__foreach_rsqrt, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_list__foreach_sign, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_list__foreach_sqrt, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_list__foreach_sub, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_list_foreach_map_abs, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_list_foreach_map_add, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_list_foreach_map_add_op, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_list_foreach_map_addcmul_op, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_list_foreach_map_addrecip_op, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_list_foreach_map_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_list_foreach_map_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_list_foreach_map_copy, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_list_foreach_map_div, 
test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_list_foreach_map_maximum, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_list_foreach_map_minimum, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_list_foreach_map_mul, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_list_foreach_map_neg, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_list_foreach_map_recipaddmul_op, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_list_foreach_map_reciprocal, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_list_foreach_map_sign, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_list_foreach_map_sub, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_scalar__foreach_add, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_scalar__foreach_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_scalar__foreach_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_scalar__foreach_div, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_scalar__foreach_maximum, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_scalar__foreach_minimum, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_scalar__foreach_mul, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_scalar__foreach_sub, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_scalar_foreach_map_add, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_scalar_foreach_map_add_op, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_scalar_foreach_map_addrecip_op, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_scalar_foreach_map_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_scalar_foreach_map_clamp_min, 
test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_scalar_foreach_map_div, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_scalar_foreach_map_mul, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_scalar_foreach_map_sub, test/inductor/test_foreach.py::ForeachTests::test_single_list__foreach_abs, test/inductor/test_foreach.py::ForeachTests::test_single_list__foreach_add, test/inductor/test_foreach.py::ForeachTests::test_single_list__foreach_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_single_list__foreach_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_single_list__foreach_copy, test/inductor/test_foreach.py::ForeachTests::test_single_list__foreach_div, test/inductor/test_foreach.py::ForeachTests::test_single_list__foreach_maximum, test/inductor/test_foreach.py::ForeachTests::test_single_list__foreach_minimum, test/inductor/test_foreach.py::ForeachTests::test_single_list__foreach_mul, test/inductor/test_foreach.py::ForeachTests::test_single_list__foreach_neg, test/inductor/test_foreach.py::ForeachTests::test_single_list__foreach_reciprocal, test/inductor/test_foreach.py::ForeachTests::test_single_list__foreach_rsqrt, test/inductor/test_foreach.py::ForeachTests::test_single_list__foreach_sign, test/inductor/test_foreach.py::ForeachTests::test_single_list__foreach_sqrt, test/inductor/test_foreach.py::ForeachTests::test_single_list__foreach_sub, test/inductor/test_foreach.py::ForeachTests::test_single_list_foreach_map_abs, test/inductor/test_foreach.py::ForeachTests::test_single_list_foreach_map_add, test/inductor/test_foreach.py::ForeachTests::test_single_list_foreach_map_add_op, test/inductor/test_foreach.py::ForeachTests::test_single_list_foreach_map_addcmul_op, test/inductor/test_foreach.py::ForeachTests::test_single_list_foreach_map_addrecip_op, test/inductor/test_foreach.py::ForeachTests::test_single_list_foreach_map_clamp_max, 
test/inductor/test_foreach.py::ForeachTests::test_single_list_foreach_map_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_single_list_foreach_map_copy, test/inductor/test_foreach.py::ForeachTests::test_single_list_foreach_map_div, test/inductor/test_foreach.py::ForeachTests::test_single_list_foreach_map_maximum, test/inductor/test_foreach.py::ForeachTests::test_single_list_foreach_map_minimum, test/inductor/test_foreach.py::ForeachTests::test_single_list_foreach_map_mul, test/inductor/test_foreach.py::ForeachTests::test_single_list_foreach_map_neg, test/inductor/test_foreach.py::ForeachTests::test_single_list_foreach_map_recipaddmul_op, test/inductor/test_foreach.py::ForeachTests::test_single_list_foreach_map_reciprocal, test/inductor/test_foreach.py::ForeachTests::test_single_list_foreach_map_sign, test/inductor/test_foreach.py::ForeachTests::test_single_list_foreach_map_sub, test/inductor/test_foreach.py::ForeachTests::test_single_scalar__foreach_add, test/inductor/test_foreach.py::ForeachTests::test_single_scalar__foreach_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_single_scalar__foreach_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_single_scalar__foreach_div, test/inductor/test_foreach.py::ForeachTests::test_single_scalar__foreach_maximum, test/inductor/test_foreach.py::ForeachTests::test_single_scalar__foreach_minimum, test/inductor/test_foreach.py::ForeachTests::test_single_scalar__foreach_mul, test/inductor/test_foreach.py::ForeachTests::test_single_scalar__foreach_sub, test/inductor/test_foreach.py::ForeachTests::test_single_scalar_foreach_map_add, test/inductor/test_foreach.py::ForeachTests::test_single_scalar_foreach_map_add_op, test/inductor/test_foreach.py::ForeachTests::test_single_scalar_foreach_map_addrecip_op, test/inductor/test_foreach.py::ForeachTests::test_single_scalar_foreach_map_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_single_scalar_foreach_map_clamp_min, 
test/inductor/test_foreach.py::ForeachTests::test_single_scalar_foreach_map_div, test/inductor/test_foreach.py::ForeachTests::test_single_scalar_foreach_map_mul, test/inductor/test_foreach.py::ForeachTests::test_single_scalar_foreach_map_sub, test/inductor/test_foreach.py::ForeachTests::test_single_scalar_tensor__foreach_add, test/inductor/test_foreach.py::ForeachTests::test_single_scalar_tensor__foreach_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_single_scalar_tensor__foreach_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_single_scalar_tensor__foreach_div, test/inductor/test_foreach.py::ForeachTests::test_single_scalar_tensor__foreach_maximum, test/inductor/test_foreach.py::ForeachTests::test_single_scalar_tensor__foreach_minimum, test/inductor/test_foreach.py::ForeachTests::test_single_scalar_tensor__foreach_mul, test/inductor/test_foreach.py::ForeachTests::test_single_scalar_tensor__foreach_sub, test/inductor/test_foreach.py::ForeachTests::test_single_scalar_tensor_foreach_map_add, test/inductor/test_foreach.py::ForeachTests::test_single_scalar_tensor_foreach_map_add_op, test/inductor/test_foreach.py::ForeachTests::test_single_scalar_tensor_foreach_map_addrecip_op, test/inductor/test_foreach.py::ForeachTests::test_single_scalar_tensor_foreach_map_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_single_scalar_tensor_foreach_map_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_single_scalar_tensor_foreach_map_div, test/inductor/test_foreach.py::ForeachTests::test_single_scalar_tensor_foreach_map_mul, test/inductor/test_foreach.py::ForeachTests::test_single_scalar_tensor_foreach_map_sub, test/inductor/test_foreach.py::ForeachTests::test_singleton_lists__foreach_abs, test/inductor/test_foreach.py::ForeachTests::test_singleton_lists__foreach_add, test/inductor/test_foreach.py::ForeachTests::test_singleton_lists__foreach_clamp_max, 
test/inductor/test_foreach.py::ForeachTests::test_singleton_lists__foreach_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_singleton_lists__foreach_copy, test/inductor/test_foreach.py::ForeachTests::test_singleton_lists__foreach_div, test/inductor/test_foreach.py::ForeachTests::test_singleton_lists__foreach_maximum, test/inductor/test_foreach.py::ForeachTests::test_singleton_lists__foreach_minimum, test/inductor/test_foreach.py::ForeachTests::test_singleton_lists__foreach_mul, test/inductor/test_foreach.py::ForeachTests::test_singleton_lists__foreach_neg, test/inductor/test_foreach.py::ForeachTests::test_singleton_lists__foreach_reciprocal, test/inductor/test_foreach.py::ForeachTests::test_singleton_lists__foreach_rsqrt, test/inductor/test_foreach.py::ForeachTests::test_singleton_lists__foreach_sign, test/inductor/test_foreach.py::ForeachTests::test_singleton_lists__foreach_sqrt, test/inductor/test_foreach.py::ForeachTests::test_singleton_lists__foreach_sub, test/inductor/test_foreach.py::ForeachTests::test_singleton_lists_foreach_map_abs, test/inductor/test_foreach.py::ForeachTests::test_singleton_lists_foreach_map_add, test/inductor/test_foreach.py::ForeachTests::test_singleton_lists_foreach_map_add_op, test/inductor/test_foreach.py::ForeachTests::test_singleton_lists_foreach_map_addcmul_op, test/inductor/test_foreach.py::ForeachTests::test_singleton_lists_foreach_map_addrecip_op, test/inductor/test_foreach.py::ForeachTests::test_singleton_lists_foreach_map_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_singleton_lists_foreach_map_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_singleton_lists_foreach_map_copy, test/inductor/test_foreach.py::ForeachTests::test_singleton_lists_foreach_map_div, test/inductor/test_foreach.py::ForeachTests::test_singleton_lists_foreach_map_maximum, test/inductor/test_foreach.py::ForeachTests::test_singleton_lists_foreach_map_minimum, 
test/inductor/test_foreach.py::ForeachTests::test_singleton_lists_foreach_map_mul, test/inductor/test_foreach.py::ForeachTests::test_singleton_lists_foreach_map_neg, test/inductor/test_foreach.py::ForeachTests::test_singleton_lists_foreach_map_recipaddmul_op, test/inductor/test_foreach.py::ForeachTests::test_singleton_lists_foreach_map_reciprocal, test/inductor/test_foreach.py::ForeachTests::test_singleton_lists_foreach_map_sign, test/inductor/test_foreach.py::ForeachTests::test_singleton_lists_foreach_map_sub, test/inductor/test_foreach.py::ForeachTests::test_type_promotion__foreach_add, test/inductor/test_foreach.py::ForeachTests::test_type_promotion__foreach_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_type_promotion__foreach_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_type_promotion__foreach_copy, test/inductor/test_foreach.py::ForeachTests::test_type_promotion__foreach_div, test/inductor/test_foreach.py::ForeachTests::test_type_promotion__foreach_maximum, test/inductor/test_foreach.py::ForeachTests::test_type_promotion__foreach_minimum, test/inductor/test_foreach.py::ForeachTests::test_type_promotion__foreach_mul, test/inductor/test_foreach.py::ForeachTests::test_type_promotion__foreach_sub, test/inductor/test_foreach.py::ForeachTests::test_type_promotion_foreach_map_add, test/inductor/test_foreach.py::ForeachTests::test_type_promotion_foreach_map_add_op, test/inductor/test_foreach.py::ForeachTests::test_type_promotion_foreach_map_addrecip_op, test/inductor/test_foreach.py::ForeachTests::test_type_promotion_foreach_map_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_type_promotion_foreach_map_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_type_promotion_foreach_map_copy, test/inductor/test_foreach.py::ForeachTests::test_type_promotion_foreach_map_div, test/inductor/test_foreach.py::ForeachTests::test_type_promotion_foreach_map_maximum, 
test/inductor/test_foreach.py::ForeachTests::test_type_promotion_foreach_map_minimum, test/inductor/test_foreach.py::ForeachTests::test_type_promotion_foreach_map_mul, test/inductor/test_foreach.py::ForeachTests::test_type_promotion_foreach_map_sub, test/inductor/test_foreach.py::ForeachTests::test_zero_elems 2025-10-10T01:51:54.8615124Z 2025-10-10T01:51:58.7685962Z Running inductor/test_minifier_utils 1/1 ... [2025-10-10 01:51:58.768010] 2025-10-10T01:51:58.7686445Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:51:58.7688200Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_minifier_utils.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:51:58.768432] 2025-10-10T01:52:02.8932621Z 2025-10-10T01:52:02.8933672Z inductor/test_minifier_utils 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_minifier_utils_1.1_75acc1efa6172974_.log 2025-10-10T01:52:02.8935336Z Running 3 items in this shard: test/inductor/test_minifier_utils.py::MinifierUtilsTests::test_convert_module_to_string, test/inductor/test_minifier_utils.py::MinifierUtilsTests::test_invalid_output, test/inductor/test_minifier_utils.py::MinifierUtilsTests::test_non_exportable 2025-10-10T01:52:02.8936401Z 2025-10-10T01:52:06.8290005Z Running dynamo/test_sdpa 1/1 ... [2025-10-10 01:52:06.828490] 2025-10-10T01:52:06.8290520Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:52:06.8293226Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_sdpa.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:52:06.828903] 2025-10-10T01:52:11.0023584Z 2025-10-10T01:52:11.0024331Z dynamo/test_sdpa 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_sdpa_1.1_8e823cdb223e529d_.log 2025-10-10T01:52:11.0026672Z Running 6 items in this shard: test/dynamo/test_sdpa.py::TestSDPA::test_graph_break_SDPAParams, test/dynamo/test_sdpa.py::TestSDPA::test_input_SDPAParams, test/dynamo/test_sdpa.py::TestSDPA::test_intermediate_attr_access_SDPAParams, test/dynamo/test_sdpa.py::TestSDPA::test_returns_SDPAParams, test/dynamo/test_sdpa.py::TestSDPA::test_sdpa_c_functions_no_graph_break, test/dynamo/test_sdpa.py::TestSDPA::test_sdpa_kernel_decorator_with_compile 2025-10-10T01:52:11.0028250Z 2025-10-10T01:52:14.9022931Z Running inductor/test_compile_subprocess 1/1 ... [2025-10-10 01:52:14.901677] 2025-10-10T01:52:14.9023585Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:52:14.9024862Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_compile_subprocess.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:52:14.902075] 2025-10-10T01:52:22.6946445Z 2025-10-10T01:52:22.6947886Z inductor/test_compiled_optimizers 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_compiled_optimizers_1.1_8320ff453d4d719c_.log 2025-10-10T01:52:22.7463404Z Running 682 items in this shard: test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_S429861, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_capturable_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_capturable_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_maximize_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_maximize_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_maximize_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_recompile, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_rho_weight_decay_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_rho_weight_decay_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_rho_weight_decay_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_tensor_lr_capturable_cuda_constantlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_tensor_lr_capturable_cuda_cosineannealinglr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_tensor_lr_capturable_cuda_cosineannealingwarmrestarts, 
test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_tensor_lr_capturable_cuda_cycliclr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_tensor_lr_capturable_cuda_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_tensor_lr_capturable_cuda_lambdalr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_tensor_lr_capturable_cuda_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_tensor_lr_capturable_cuda_multiplicativelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_tensor_lr_capturable_cuda_multisteplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_tensor_lr_capturable_cuda_onecyclelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_tensor_lr_capturable_cuda_polynomiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_tensor_lr_capturable_cuda_reducelronplateau, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_tensor_lr_capturable_cuda_steplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_tensor_lr_capturable_foreach_cuda_constantlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_tensor_lr_capturable_foreach_cuda_cosineannealinglr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_tensor_lr_capturable_foreach_cuda_cosineannealingwarmrestarts, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_tensor_lr_capturable_foreach_cuda_cycliclr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_tensor_lr_capturable_foreach_cuda_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_tensor_lr_capturable_foreach_cuda_lambdalr, 
test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_tensor_lr_capturable_foreach_cuda_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_tensor_lr_capturable_foreach_cuda_multiplicativelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_tensor_lr_capturable_foreach_cuda_multisteplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_tensor_lr_capturable_foreach_cuda_onecyclelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_tensor_lr_capturable_foreach_cuda_polynomiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_tensor_lr_capturable_foreach_cuda_reducelronplateau, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_tensor_lr_capturable_foreach_cuda_steplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_weight_decay_capturable_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_weight_decay_capturable_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_weight_decay_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_weight_decay_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_weight_decay_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_weight_decay_maximize_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_weight_decay_maximize_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_weight_decay_maximize_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_cuda, 
test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_initial_accumulator_value_weight_decay_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_initial_accumulator_value_weight_decay_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_initial_accumulator_value_weight_decay_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_lr_decay_weight_decay_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_lr_decay_weight_decay_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_lr_decay_weight_decay_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_recompile, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_cpu_constantlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_cpu_cosineannealinglr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_cpu_cosineannealingwarmrestarts, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_cpu_cycliclr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_cpu_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_cpu_lambdalr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_cpu_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_cpu_multiplicativelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_cpu_multisteplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_cpu_onecyclelr, 
test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_cpu_polynomiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_cpu_reducelronplateau, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_cpu_steplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_cuda_constantlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_cuda_cosineannealinglr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_cuda_cosineannealingwarmrestarts, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_cuda_cycliclr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_cuda_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_cuda_lambdalr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_cuda_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_cuda_multiplicativelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_cuda_multisteplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_cuda_onecyclelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_cuda_polynomiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_cuda_reducelronplateau, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_cuda_steplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_foreach_cuda_constantlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_foreach_cuda_cosineannealinglr, 
test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_foreach_cuda_cosineannealingwarmrestarts, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_foreach_cuda_cycliclr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_foreach_cuda_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_foreach_cuda_lambdalr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_foreach_cuda_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_foreach_cuda_multiplicativelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_foreach_cuda_multisteplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_foreach_cuda_onecyclelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_foreach_cuda_polynomiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_foreach_cuda_reducelronplateau, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_foreach_cuda_steplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_weight_decay_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_weight_decay_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_weight_decay_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_weight_decay_maximize_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_weight_decay_maximize_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_weight_decay_maximize_foreach_cuda, 
test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_capturable_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_capturable_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_recompile, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_amsgrad_capturable_cuda_constantlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_amsgrad_capturable_cuda_cosineannealinglr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_amsgrad_capturable_cuda_cosineannealingwarmrestarts, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_amsgrad_capturable_cuda_cycliclr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_amsgrad_capturable_cuda_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_amsgrad_capturable_cuda_lambdalr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_amsgrad_capturable_cuda_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_amsgrad_capturable_cuda_multiplicativelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_amsgrad_capturable_cuda_multisteplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_amsgrad_capturable_cuda_onecyclelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_amsgrad_capturable_cuda_polynomiallr, 
test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_amsgrad_capturable_cuda_reducelronplateau, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_amsgrad_capturable_cuda_steplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_amsgrad_capturable_foreach_cuda_constantlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_amsgrad_capturable_foreach_cuda_cosineannealinglr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_amsgrad_capturable_foreach_cuda_cosineannealingwarmrestarts, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_amsgrad_capturable_foreach_cuda_cycliclr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_amsgrad_capturable_foreach_cuda_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_amsgrad_capturable_foreach_cuda_lambdalr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_amsgrad_capturable_foreach_cuda_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_amsgrad_capturable_foreach_cuda_multiplicativelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_amsgrad_capturable_foreach_cuda_multisteplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_amsgrad_capturable_foreach_cuda_onecyclelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_amsgrad_capturable_foreach_cuda_polynomiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_amsgrad_capturable_foreach_cuda_reducelronplateau, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_amsgrad_capturable_foreach_cuda_steplr, 
test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_amsgrad_capturable_cuda_constantlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_amsgrad_capturable_cuda_cosineannealinglr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_amsgrad_capturable_cuda_cosineannealingwarmrestarts, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_amsgrad_capturable_cuda_cycliclr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_amsgrad_capturable_cuda_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_amsgrad_capturable_cuda_lambdalr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_amsgrad_capturable_cuda_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_amsgrad_capturable_cuda_multiplicativelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_amsgrad_capturable_cuda_multisteplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_amsgrad_capturable_cuda_onecyclelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_amsgrad_capturable_cuda_polynomiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_amsgrad_capturable_cuda_reducelronplateau, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_amsgrad_capturable_cuda_steplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_amsgrad_capturable_foreach_cuda_constantlr, 
test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_amsgrad_capturable_foreach_cuda_cosineannealinglr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_amsgrad_capturable_foreach_cuda_cosineannealingwarmrestarts, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_amsgrad_capturable_foreach_cuda_cycliclr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_amsgrad_capturable_foreach_cuda_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_amsgrad_capturable_foreach_cuda_lambdalr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_amsgrad_capturable_foreach_cuda_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_amsgrad_capturable_foreach_cuda_multiplicativelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_amsgrad_capturable_foreach_cuda_multisteplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_amsgrad_capturable_foreach_cuda_onecyclelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_amsgrad_capturable_foreach_cuda_polynomiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_amsgrad_capturable_foreach_cuda_reducelronplateau, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_amsgrad_capturable_foreach_cuda_steplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_capturable_cuda_constantlr, 
test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_capturable_cuda_cosineannealinglr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_capturable_cuda_cosineannealingwarmrestarts, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_capturable_cuda_cycliclr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_capturable_cuda_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_capturable_cuda_lambdalr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_capturable_cuda_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_capturable_cuda_multiplicativelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_capturable_cuda_multisteplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_capturable_cuda_onecyclelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_capturable_cuda_polynomiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_capturable_cuda_reducelronplateau, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_capturable_cuda_steplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_capturable_foreach_cuda_constantlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_capturable_foreach_cuda_cosineannealinglr, 
test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_capturable_foreach_cuda_cosineannealingwarmrestarts, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_capturable_foreach_cuda_cycliclr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_capturable_foreach_cuda_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_capturable_foreach_cuda_lambdalr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_capturable_foreach_cuda_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_capturable_foreach_cuda_multiplicativelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_capturable_foreach_cuda_multisteplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_capturable_foreach_cuda_onecyclelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_capturable_foreach_cuda_polynomiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_capturable_foreach_cuda_reducelronplateau, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_capturable_foreach_cuda_steplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_weight_decay_amsgrad_capturable_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_weight_decay_amsgrad_capturable_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_weight_decay_amsgrad_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_weight_decay_amsgrad_cuda, 
test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_weight_decay_amsgrad_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_weight_decay_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_weight_decay_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_weight_decay_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_weight_decay_maximize_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_weight_decay_maximize_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_weight_decay_maximize_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_capturable_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_capturable_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_maximize_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_maximize_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_maximize_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_recompile, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_tensor_lr_weight_decay_capturable_cuda_constantlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_tensor_lr_weight_decay_capturable_cuda_cosineannealinglr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_tensor_lr_weight_decay_capturable_cuda_cosineannealingwarmrestarts, 
test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_tensor_lr_weight_decay_capturable_cuda_cycliclr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_tensor_lr_weight_decay_capturable_cuda_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_tensor_lr_weight_decay_capturable_cuda_lambdalr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_tensor_lr_weight_decay_capturable_cuda_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_tensor_lr_weight_decay_capturable_cuda_multiplicativelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_tensor_lr_weight_decay_capturable_cuda_multisteplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_tensor_lr_weight_decay_capturable_cuda_onecyclelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_tensor_lr_weight_decay_capturable_cuda_polynomiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_tensor_lr_weight_decay_capturable_cuda_reducelronplateau, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_tensor_lr_weight_decay_capturable_cuda_steplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_tensor_lr_weight_decay_capturable_foreach_cuda_constantlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_tensor_lr_weight_decay_capturable_foreach_cuda_cosineannealinglr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_tensor_lr_weight_decay_capturable_foreach_cuda_cosineannealingwarmrestarts, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_tensor_lr_weight_decay_capturable_foreach_cuda_cycliclr, 
test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_tensor_lr_weight_decay_capturable_foreach_cuda_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_tensor_lr_weight_decay_capturable_foreach_cuda_lambdalr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_tensor_lr_weight_decay_capturable_foreach_cuda_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_tensor_lr_weight_decay_capturable_foreach_cuda_multiplicativelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_tensor_lr_weight_decay_capturable_foreach_cuda_multisteplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_tensor_lr_weight_decay_capturable_foreach_cuda_onecyclelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_tensor_lr_weight_decay_capturable_foreach_cuda_polynomiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_tensor_lr_weight_decay_capturable_foreach_cuda_reducelronplateau, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_tensor_lr_weight_decay_capturable_foreach_cuda_steplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_weight_decay_capturable_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_weight_decay_capturable_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_weight_decay_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_weight_decay_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_weight_decay_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_weight_decay_maximize_capturable_cuda, 
test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_weight_decay_maximize_capturable_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_weight_decay_maximize_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_weight_decay_maximize_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_weight_decay_maximize_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_capturable_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_capturable_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_recompile, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_amsgrad_capturable_cuda_constantlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_amsgrad_capturable_cuda_cosineannealinglr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_amsgrad_capturable_cuda_cosineannealingwarmrestarts, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_amsgrad_capturable_cuda_cycliclr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_amsgrad_capturable_cuda_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_amsgrad_capturable_cuda_lambdalr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_amsgrad_capturable_cuda_linearlr, 
test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_amsgrad_capturable_cuda_multiplicativelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_amsgrad_capturable_cuda_multisteplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_amsgrad_capturable_cuda_onecyclelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_amsgrad_capturable_cuda_polynomiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_amsgrad_capturable_cuda_reducelronplateau, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_amsgrad_capturable_cuda_steplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_amsgrad_capturable_foreach_cuda_constantlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_amsgrad_capturable_foreach_cuda_cosineannealinglr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_amsgrad_capturable_foreach_cuda_cosineannealingwarmrestarts, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_amsgrad_capturable_foreach_cuda_cycliclr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_amsgrad_capturable_foreach_cuda_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_amsgrad_capturable_foreach_cuda_lambdalr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_amsgrad_capturable_foreach_cuda_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_amsgrad_capturable_foreach_cuda_multiplicativelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_amsgrad_capturable_foreach_cuda_multisteplr, 
test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_amsgrad_capturable_foreach_cuda_onecyclelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_amsgrad_capturable_foreach_cuda_polynomiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_amsgrad_capturable_foreach_cuda_reducelronplateau, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_amsgrad_capturable_foreach_cuda_steplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_amsgrad_capturable_cuda_constantlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_amsgrad_capturable_cuda_cosineannealinglr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_amsgrad_capturable_cuda_cosineannealingwarmrestarts, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_amsgrad_capturable_cuda_cycliclr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_amsgrad_capturable_cuda_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_amsgrad_capturable_cuda_lambdalr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_amsgrad_capturable_cuda_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_amsgrad_capturable_cuda_multiplicativelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_amsgrad_capturable_cuda_multisteplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_amsgrad_capturable_cuda_onecyclelr, 
test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_amsgrad_capturable_cuda_polynomiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_amsgrad_capturable_cuda_reducelronplateau, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_amsgrad_capturable_cuda_steplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_amsgrad_capturable_foreach_cuda_constantlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_amsgrad_capturable_foreach_cuda_cosineannealinglr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_amsgrad_capturable_foreach_cuda_cosineannealingwarmrestarts, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_amsgrad_capturable_foreach_cuda_cycliclr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_amsgrad_capturable_foreach_cuda_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_amsgrad_capturable_foreach_cuda_lambdalr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_amsgrad_capturable_foreach_cuda_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_amsgrad_capturable_foreach_cuda_multiplicativelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_amsgrad_capturable_foreach_cuda_multisteplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_amsgrad_capturable_foreach_cuda_onecyclelr, 
test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_amsgrad_capturable_foreach_cuda_polynomiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_amsgrad_capturable_foreach_cuda_reducelronplateau, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_amsgrad_capturable_foreach_cuda_steplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_capturable_cuda_constantlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_capturable_cuda_cosineannealinglr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_capturable_cuda_cosineannealingwarmrestarts, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_capturable_cuda_cycliclr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_capturable_cuda_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_capturable_cuda_lambdalr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_capturable_cuda_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_capturable_cuda_multiplicativelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_capturable_cuda_multisteplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_capturable_cuda_onecyclelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_capturable_cuda_polynomiallr, 
test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_capturable_cuda_reducelronplateau, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_capturable_cuda_steplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_capturable_foreach_cuda_constantlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_capturable_foreach_cuda_cosineannealinglr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_capturable_foreach_cuda_cosineannealingwarmrestarts, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_capturable_foreach_cuda_cycliclr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_capturable_foreach_cuda_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_capturable_foreach_cuda_lambdalr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_capturable_foreach_cuda_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_capturable_foreach_cuda_multiplicativelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_capturable_foreach_cuda_multisteplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_capturable_foreach_cuda_onecyclelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_capturable_foreach_cuda_polynomiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_capturable_foreach_cuda_reducelronplateau, 
test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_capturable_foreach_cuda_steplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_weight_decay_amsgrad_capturable_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_weight_decay_amsgrad_capturable_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_weight_decay_amsgrad_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_weight_decay_amsgrad_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_weight_decay_amsgrad_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_weight_decay_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_weight_decay_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_weight_decay_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_weight_decay_maximize_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_weight_decay_maximize_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_weight_decay_maximize_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_capturable_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_capturable_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_lambd_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_lambd_cuda, 
test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_lambd_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_maximize_capturable_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_maximize_capturable_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_maximize_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_maximize_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_maximize_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_recompile_default, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_recompile_foreach, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_recompile_single, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_t0_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_t0_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_t0_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_tensor_lr_weight_decay_maximize_capturable_cuda_constantlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_tensor_lr_weight_decay_maximize_capturable_cuda_cosineannealinglr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_tensor_lr_weight_decay_maximize_capturable_cuda_cosineannealingwarmrestarts, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_tensor_lr_weight_decay_maximize_capturable_cuda_cycliclr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_tensor_lr_weight_decay_maximize_capturable_cuda_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_tensor_lr_weight_decay_maximize_capturable_cuda_lambdalr, 
test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_tensor_lr_weight_decay_maximize_capturable_cuda_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_tensor_lr_weight_decay_maximize_capturable_cuda_multiplicativelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_tensor_lr_weight_decay_maximize_capturable_cuda_multisteplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_tensor_lr_weight_decay_maximize_capturable_cuda_onecyclelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_tensor_lr_weight_decay_maximize_capturable_cuda_polynomiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_tensor_lr_weight_decay_maximize_capturable_cuda_reducelronplateau, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_tensor_lr_weight_decay_maximize_capturable_cuda_steplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_tensor_lr_weight_decay_maximize_capturable_foreach_cuda_constantlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_tensor_lr_weight_decay_maximize_capturable_foreach_cuda_cosineannealinglr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_tensor_lr_weight_decay_maximize_capturable_foreach_cuda_cosineannealingwarmrestarts, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_tensor_lr_weight_decay_maximize_capturable_foreach_cuda_cycliclr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_tensor_lr_weight_decay_maximize_capturable_foreach_cuda_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_tensor_lr_weight_decay_maximize_capturable_foreach_cuda_lambdalr, 
test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_tensor_lr_weight_decay_maximize_capturable_foreach_cuda_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_tensor_lr_weight_decay_maximize_capturable_foreach_cuda_multiplicativelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_tensor_lr_weight_decay_maximize_capturable_foreach_cuda_multisteplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_tensor_lr_weight_decay_maximize_capturable_foreach_cuda_onecyclelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_tensor_lr_weight_decay_maximize_capturable_foreach_cuda_polynomiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_tensor_lr_weight_decay_maximize_capturable_foreach_cuda_reducelronplateau, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_tensor_lr_weight_decay_maximize_capturable_foreach_cuda_steplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_weight_decay_capturable_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_weight_decay_capturable_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_weight_decay_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_weight_decay_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_weight_decay_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_weight_decay_maximize_capturable_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_weight_decay_maximize_capturable_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_weight_decay_maximize_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_weight_decay_maximize_cuda, 
test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_weight_decay_maximize_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_basic_shampoo, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_closure_graph_break, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_compile_time_smoketest, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_foreach_map_adam, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_get_value_on_static_address, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_guard_on_none_grads, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_capturable_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_capturable_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_momentum_decay_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_momentum_decay_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_momentum_decay_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_recompile, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_tensor_lr_weight_decay_momentum_decay_decoupled_weight_decay_capturable_cuda_constantlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_tensor_lr_weight_decay_momentum_decay_decoupled_weight_decay_capturable_cuda_cosineannealinglr, 
test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_tensor_lr_weight_decay_momentum_decay_decoupled_weight_decay_capturable_cuda_cosineannealingwarmrestarts, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_tensor_lr_weight_decay_momentum_decay_decoupled_weight_decay_capturable_cuda_cycliclr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_tensor_lr_weight_decay_momentum_decay_decoupled_weight_decay_capturable_cuda_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_tensor_lr_weight_decay_momentum_decay_decoupled_weight_decay_capturable_cuda_lambdalr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_tensor_lr_weight_decay_momentum_decay_decoupled_weight_decay_capturable_cuda_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_tensor_lr_weight_decay_momentum_decay_decoupled_weight_decay_capturable_cuda_multiplicativelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_tensor_lr_weight_decay_momentum_decay_decoupled_weight_decay_capturable_cuda_multisteplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_tensor_lr_weight_decay_momentum_decay_decoupled_weight_decay_capturable_cuda_onecyclelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_tensor_lr_weight_decay_momentum_decay_decoupled_weight_decay_capturable_cuda_polynomiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_tensor_lr_weight_decay_momentum_decay_decoupled_weight_decay_capturable_cuda_reducelronplateau, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_tensor_lr_weight_decay_momentum_decay_decoupled_weight_decay_capturable_cuda_steplr, 
test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_tensor_lr_weight_decay_momentum_decay_decoupled_weight_decay_capturable_foreach_cuda_constantlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_tensor_lr_weight_decay_momentum_decay_decoupled_weight_decay_capturable_foreach_cuda_cosineannealinglr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_tensor_lr_weight_decay_momentum_decay_decoupled_weight_decay_capturable_foreach_cuda_cosineannealingwarmrestarts, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_tensor_lr_weight_decay_momentum_decay_decoupled_weight_decay_capturable_foreach_cuda_cycliclr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_tensor_lr_weight_decay_momentum_decay_decoupled_weight_decay_capturable_foreach_cuda_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_tensor_lr_weight_decay_momentum_decay_decoupled_weight_decay_capturable_foreach_cuda_lambdalr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_tensor_lr_weight_decay_momentum_decay_decoupled_weight_decay_capturable_foreach_cuda_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_tensor_lr_weight_decay_momentum_decay_decoupled_weight_decay_capturable_foreach_cuda_multiplicativelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_tensor_lr_weight_decay_momentum_decay_decoupled_weight_decay_capturable_foreach_cuda_multisteplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_tensor_lr_weight_decay_momentum_decay_decoupled_weight_decay_capturable_foreach_cuda_onecyclelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_tensor_lr_weight_decay_momentum_decay_decoupled_weight_decay_capturable_foreach_cuda_polynomiallr, 
test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_tensor_lr_weight_decay_momentum_decay_decoupled_weight_decay_capturable_foreach_cuda_reducelronplateau, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_tensor_lr_weight_decay_momentum_decay_decoupled_weight_decay_capturable_foreach_cuda_steplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_weight_decay_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_weight_decay_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_weight_decay_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_weight_decay_maximize_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_weight_decay_maximize_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_weight_decay_maximize_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_weight_decay_momentum_decay_capturable_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_weight_decay_momentum_decay_capturable_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_weight_decay_momentum_decay_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_weight_decay_momentum_decay_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_weight_decay_momentum_decay_decoupled_weight_decay_capturable_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_weight_decay_momentum_decay_decoupled_weight_decay_capturable_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_weight_decay_momentum_decay_decoupled_weight_decay_cpu, 
test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_weight_decay_momentum_decay_decoupled_weight_decay_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_weight_decay_momentum_decay_decoupled_weight_decay_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_weight_decay_momentum_decay_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_capturable_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_capturable_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_capturable_weight_decay_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_capturable_weight_decay_decoupled_weight_decay_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_capturable_weight_decay_decoupled_weight_decay_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_capturable_weight_decay_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_eps_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_eps_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_eps_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_tensor_lr_capturable_weight_decay_decoupled_weight_decay_cuda_constantlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_tensor_lr_capturable_weight_decay_decoupled_weight_decay_cuda_cosineannealinglr, 
test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_tensor_lr_capturable_weight_decay_decoupled_weight_decay_cuda_cosineannealingwarmrestarts, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_tensor_lr_capturable_weight_decay_decoupled_weight_decay_cuda_cycliclr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_tensor_lr_capturable_weight_decay_decoupled_weight_decay_cuda_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_tensor_lr_capturable_weight_decay_decoupled_weight_decay_cuda_lambdalr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_tensor_lr_capturable_weight_decay_decoupled_weight_decay_cuda_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_tensor_lr_capturable_weight_decay_decoupled_weight_decay_cuda_multiplicativelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_tensor_lr_capturable_weight_decay_decoupled_weight_decay_cuda_multisteplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_tensor_lr_capturable_weight_decay_decoupled_weight_decay_cuda_onecyclelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_tensor_lr_capturable_weight_decay_decoupled_weight_decay_cuda_polynomiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_tensor_lr_capturable_weight_decay_decoupled_weight_decay_cuda_reducelronplateau, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_tensor_lr_capturable_weight_decay_decoupled_weight_decay_cuda_steplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_tensor_lr_capturable_weight_decay_decoupled_weight_decay_foreach_cuda_constantlr, 
test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_tensor_lr_capturable_weight_decay_decoupled_weight_decay_foreach_cuda_cosineannealinglr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_tensor_lr_capturable_weight_decay_decoupled_weight_decay_foreach_cuda_cosineannealingwarmrestarts, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_tensor_lr_capturable_weight_decay_decoupled_weight_decay_foreach_cuda_cycliclr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_tensor_lr_capturable_weight_decay_decoupled_weight_decay_foreach_cuda_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_tensor_lr_capturable_weight_decay_decoupled_weight_decay_foreach_cuda_lambdalr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_tensor_lr_capturable_weight_decay_decoupled_weight_decay_foreach_cuda_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_tensor_lr_capturable_weight_decay_decoupled_weight_decay_foreach_cuda_multiplicativelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_tensor_lr_capturable_weight_decay_decoupled_weight_decay_foreach_cuda_multisteplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_tensor_lr_capturable_weight_decay_decoupled_weight_decay_foreach_cuda_onecyclelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_tensor_lr_capturable_weight_decay_decoupled_weight_decay_foreach_cuda_polynomiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_tensor_lr_capturable_weight_decay_decoupled_weight_decay_foreach_cuda_reducelronplateau, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_tensor_lr_capturable_weight_decay_decoupled_weight_decay_foreach_cuda_steplr, 
test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_weight_decay_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_weight_decay_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_weight_decay_decoupled_weight_decay_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_weight_decay_decoupled_weight_decay_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_weight_decay_decoupled_weight_decay_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_weight_decay_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_weight_decay_maximize_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_weight_decay_maximize_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_weight_decay_maximize_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_capturable_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_capturable_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_maximize_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_maximize_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_maximize_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_maximize_weight_decay_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_maximize_weight_decay_cuda, 
test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_maximize_weight_decay_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_recompile, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_tensor_lr_capturable_cuda_constantlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_tensor_lr_capturable_cuda_cosineannealinglr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_tensor_lr_capturable_cuda_cosineannealingwarmrestarts, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_tensor_lr_capturable_cuda_cycliclr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_tensor_lr_capturable_cuda_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_tensor_lr_capturable_cuda_lambdalr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_tensor_lr_capturable_cuda_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_tensor_lr_capturable_cuda_multiplicativelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_tensor_lr_capturable_cuda_multisteplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_tensor_lr_capturable_cuda_onecyclelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_tensor_lr_capturable_cuda_polynomiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_tensor_lr_capturable_cuda_reducelronplateau, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_tensor_lr_capturable_cuda_steplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_tensor_lr_capturable_foreach_cuda_constantlr, 
test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_tensor_lr_capturable_foreach_cuda_cosineannealinglr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_tensor_lr_capturable_foreach_cuda_cosineannealingwarmrestarts, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_tensor_lr_capturable_foreach_cuda_cycliclr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_tensor_lr_capturable_foreach_cuda_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_tensor_lr_capturable_foreach_cuda_lambdalr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_tensor_lr_capturable_foreach_cuda_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_tensor_lr_capturable_foreach_cuda_multiplicativelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_tensor_lr_capturable_foreach_cuda_multisteplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_tensor_lr_capturable_foreach_cuda_onecyclelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_tensor_lr_capturable_foreach_cuda_polynomiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_tensor_lr_capturable_foreach_cuda_reducelronplateau, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_tensor_lr_capturable_foreach_cuda_steplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_weight_decay_centered_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_weight_decay_centered_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_weight_decay_centered_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_weight_decay_centered_momentum_cpu, 
test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_weight_decay_centered_momentum_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_weight_decay_centered_momentum_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_weight_decay_centered_momentum_maximize_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_weight_decay_centered_momentum_maximize_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_weight_decay_centered_momentum_maximize_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_weight_decay_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_weight_decay_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_weight_decay_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_weight_decay_maximize_capturable_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_weight_decay_maximize_capturable_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_capturable_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_capturable_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_etas_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_etas_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_etas_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_maximize_cpu, 
test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_maximize_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_maximize_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_recompile, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_step_sizes_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_step_sizes_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_step_sizes_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_tensor_lr_capturable_cuda_constantlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_tensor_lr_capturable_cuda_cosineannealinglr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_tensor_lr_capturable_cuda_cosineannealingwarmrestarts, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_tensor_lr_capturable_cuda_cycliclr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_tensor_lr_capturable_cuda_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_tensor_lr_capturable_cuda_lambdalr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_tensor_lr_capturable_cuda_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_tensor_lr_capturable_cuda_multiplicativelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_tensor_lr_capturable_cuda_multisteplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_tensor_lr_capturable_cuda_onecyclelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_tensor_lr_capturable_cuda_polynomiallr, 
test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_tensor_lr_capturable_cuda_reducelronplateau, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_tensor_lr_capturable_cuda_steplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_tensor_lr_capturable_foreach_cuda_constantlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_tensor_lr_capturable_foreach_cuda_cosineannealinglr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_tensor_lr_capturable_foreach_cuda_cosineannealingwarmrestarts, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_tensor_lr_capturable_foreach_cuda_cycliclr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_tensor_lr_capturable_foreach_cuda_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_tensor_lr_capturable_foreach_cuda_lambdalr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_tensor_lr_capturable_foreach_cuda_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_tensor_lr_capturable_foreach_cuda_multiplicativelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_tensor_lr_capturable_foreach_cuda_multisteplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_tensor_lr_capturable_foreach_cuda_onecyclelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_tensor_lr_capturable_foreach_cuda_polynomiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_tensor_lr_capturable_foreach_cuda_reducelronplateau, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_tensor_lr_capturable_foreach_cuda_steplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_cpu, 
test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_momentum_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_momentum_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_momentum_dampening_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_momentum_dampening_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_momentum_dampening_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_momentum_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_momentum_nesterov_weight_decay_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_momentum_nesterov_weight_decay_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_momentum_nesterov_weight_decay_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_momentum_weight_decay_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_momentum_weight_decay_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_momentum_weight_decay_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_recompile_foreach, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_recompile_single, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_cpu_constantlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_cpu_cosineannealinglr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_cpu_cosineannealingwarmrestarts, 
test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_cpu_cycliclr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_cpu_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_cpu_lambdalr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_cpu_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_cpu_multiplicativelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_cpu_multisteplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_cpu_onecyclelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_cpu_polynomiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_cpu_reducelronplateau, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_cpu_steplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_cuda_constantlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_cuda_cosineannealinglr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_cuda_cosineannealingwarmrestarts, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_cuda_cycliclr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_cuda_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_cuda_lambdalr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_cuda_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_cuda_multiplicativelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_cuda_multisteplr, 
test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_cuda_onecyclelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_cuda_polynomiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_cuda_reducelronplateau, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_cuda_steplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_foreach_cuda_constantlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_foreach_cuda_cosineannealinglr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_foreach_cuda_cosineannealingwarmrestarts, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_foreach_cuda_cycliclr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_foreach_cuda_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_foreach_cuda_lambdalr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_foreach_cuda_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_foreach_cuda_multiplicativelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_foreach_cuda_multisteplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_foreach_cuda_onecyclelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_foreach_cuda_polynomiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_foreach_cuda_reducelronplateau, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_foreach_cuda_steplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_weight_decay_cpu, 
test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_weight_decay_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_weight_decay_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_weight_decay_maximize_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_weight_decay_maximize_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_weight_decay_maximize_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_static_address_finalizer, test/inductor/test_compiled_optimizers.py::CompiledOptimizerParityTestsCUDA::test_correctness_ASGD_use_closure_False_cuda_float32, test/inductor/test_compiled_optimizers.py::CompiledOptimizerParityTestsCUDA::test_correctness_ASGD_use_closure_True_cuda_float32, test/inductor/test_compiled_optimizers.py::CompiledOptimizerParityTestsCUDA::test_correctness_Adadelta_use_closure_False_cuda_float32, test/inductor/test_compiled_optimizers.py::CompiledOptimizerParityTestsCUDA::test_correctness_Adadelta_use_closure_True_cuda_float32, test/inductor/test_compiled_optimizers.py::CompiledOptimizerParityTestsCUDA::test_correctness_Adafactor_use_closure_False_cuda_float32, test/inductor/test_compiled_optimizers.py::CompiledOptimizerParityTestsCUDA::test_correctness_Adafactor_use_closure_True_cuda_float32, test/inductor/test_compiled_optimizers.py::CompiledOptimizerParityTestsCUDA::test_correctness_Adagrad_use_closure_False_cuda_float32, test/inductor/test_compiled_optimizers.py::CompiledOptimizerParityTestsCUDA::test_correctness_Adagrad_use_closure_True_cuda_float32, test/inductor/test_compiled_optimizers.py::CompiledOptimizerParityTestsCUDA::test_correctness_AdamW_use_closure_False_cuda_float32, test/inductor/test_compiled_optimizers.py::CompiledOptimizerParityTestsCUDA::test_correctness_AdamW_use_closure_True_cuda_float32, 
test/inductor/test_compiled_optimizers.py::CompiledOptimizerParityTestsCUDA::test_correctness_Adam_use_closure_False_cuda_float32, test/inductor/test_compiled_optimizers.py::CompiledOptimizerParityTestsCUDA::test_correctness_Adam_use_closure_True_cuda_float32, test/inductor/test_compiled_optimizers.py::CompiledOptimizerParityTestsCUDA::test_correctness_Adamax_use_closure_False_cuda_float32, test/inductor/test_compiled_optimizers.py::CompiledOptimizerParityTestsCUDA::test_correctness_Adamax_use_closure_True_cuda_float32, test/inductor/test_compiled_optimizers.py::CompiledOptimizerParityTestsCUDA::test_correctness_LBFGS_use_closure_False_cuda_float32, test/inductor/test_compiled_optimizers.py::CompiledOptimizerParityTestsCUDA::test_correctness_LBFGS_use_closure_True_cuda_float32, test/inductor/test_compiled_optimizers.py::CompiledOptimizerParityTestsCUDA::test_correctness_Muon_use_closure_False_cuda_float32, test/inductor/test_compiled_optimizers.py::CompiledOptimizerParityTestsCUDA::test_correctness_Muon_use_closure_True_cuda_float32, test/inductor/test_compiled_optimizers.py::CompiledOptimizerParityTestsCUDA::test_correctness_NAdam_use_closure_False_cuda_float32, test/inductor/test_compiled_optimizers.py::CompiledOptimizerParityTestsCUDA::test_correctness_NAdam_use_closure_True_cuda_float32, test/inductor/test_compiled_optimizers.py::CompiledOptimizerParityTestsCUDA::test_correctness_RAdam_use_closure_False_cuda_float32, test/inductor/test_compiled_optimizers.py::CompiledOptimizerParityTestsCUDA::test_correctness_RAdam_use_closure_True_cuda_float32, test/inductor/test_compiled_optimizers.py::CompiledOptimizerParityTestsCUDA::test_correctness_RMSprop_use_closure_False_cuda_float32, test/inductor/test_compiled_optimizers.py::CompiledOptimizerParityTestsCUDA::test_correctness_RMSprop_use_closure_True_cuda_float32, test/inductor/test_compiled_optimizers.py::CompiledOptimizerParityTestsCUDA::test_correctness_Rprop_use_closure_False_cuda_float32, 
test/inductor/test_compiled_optimizers.py::CompiledOptimizerParityTestsCUDA::test_correctness_Rprop_use_closure_True_cuda_float32, test/inductor/test_compiled_optimizers.py::CompiledOptimizerParityTestsCUDA::test_correctness_SGD_use_closure_False_cuda_float32, test/inductor/test_compiled_optimizers.py::CompiledOptimizerParityTestsCUDA::test_correctness_SGD_use_closure_True_cuda_float32, test/inductor/test_compiled_optimizers.py::CompiledOptimizerParityTestsCUDA::test_correctness_SparseAdam_use_closure_False_cuda_float32, test/inductor/test_compiled_optimizers.py::CompiledOptimizerParityTestsCUDA::test_correctness_SparseAdam_use_closure_True_cuda_float32 2025-10-10T01:52:22.7939910Z 2025-10-10T01:52:26.7230915Z Running export/test_cpp_serdes 1/1 ... [2025-10-10 01:52:26.722483] 2025-10-10T01:52:26.7231795Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:52:26.7233759Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_cpp_serdes.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:52:26.722898] 2025-10-10T01:52:34.8533601Z 2025-10-10T01:52:34.8534714Z export/test_cpp_serdes 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_cpp_serdes_1.1_e86dd5e5bb44350b_.log 2025-10-10T01:52:34.8685189Z Running 424 items in this shard: test/export/test_cpp_serdes.py::CppSerdesTestExport::test__scaled_dot_product_flash_attention_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_additional_inputs_constants_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_allow_explicit_guards_as_runtime_asserts_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_args_type_checked_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_aten_lift_fresh_copy_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_attention_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_attr_assignment_extra_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_automatic_constrain_size_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_automatic_dynamic_shapes_constant_relation_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_automatic_dynamic_shapes_linear_relation_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_automatic_dynamic_shapes_simple_equality_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_baddbmm_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_basic_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_basic_non_strict_fake_tensor_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_basic_non_strict_real_tensor_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_bincount_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_buffer_util_cpp_serdes, 
test/export/test_cpp_serdes.py::CppSerdesTestExport::test_capture_subclass_constructor_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_capture_subclass_constructor_torch_ir_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_capture_subclass_wrong_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_ccode_python_mod_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_cdist_forward_compute_mode_zero_export_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_check_specialized_int_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_checks_to_constrain_range_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_cleanup_dynamic_markers_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_colin_unbacked_backed_vr_sub_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_colon_parameter_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_compiling_state_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_cond_access_identical_symint_closure_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_cond_branches_return_constant_int_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_cond_branches_return_same_int_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_cond_buffers_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_cond_contains_unbacked_no_escape_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_cond_int_closure_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_cond_unflatten_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_cond_with_module_stack_export_with_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_cond_with_module_stack_export_with_unflatten_cpp_serdes, 
test/export/test_cpp_serdes.py::CppSerdesTestExport::test_constant_aliasing_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_constant_input_naming_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_constant_no_user_inp_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_constant_output_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_constant_output_dup_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_constant_requires_grad_const_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_constant_return_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_constant_tensor_mutation_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_constant_tensor_with_non_functional_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_constant_tensor_with_non_functional_nested_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_constrain_decomp_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_constrain_size_in_eager_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_constrain_size_with_constrain_value_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_constrain_size_with_various_cases_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_conv_dynamic_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_crop_like_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_cse_for_symint_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_custom_op_auto_functionalize_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_custom_op_auto_functionalize_pre_dispatch_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_custom_op_auto_warn_pre_dispatch_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_custom_op_preserve_cpp_serdes, 
test/export/test_cpp_serdes.py::CppSerdesTestExport::test_custom_pytree_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_custom_tag_metadata_re_export_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_decomp_batch_norm_functional_predispatch_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_decomp_item_in_prim_after_decomposition_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_decomp_item_in_prim_before_decomposition_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_default_decomposition_core_cia_ops_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_derived_dim_1_2_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_derived_dim_basic_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_derived_dim_integer_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_derived_dim_nested_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_derived_dim_out_of_order_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_derived_dim_out_of_order_repeat_derived_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_derived_dim_out_of_order_simplified_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_derived_dim_out_of_order_simplified_repeat_non_derived_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_derived_dim_repeat_derived_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_detect_leak_nonstrict_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_detect_leak_nonstrict_with_stacktrace_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_detect_leak_strict_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_device_to_dynamic_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_device_to_gpu_cpp_serdes, 
test/export/test_cpp_serdes.py::CppSerdesTestExport::test_device_to_mutation_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_device_to_mutation_float_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_device_to_static_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_dim_1_2_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_dim_auto_and_dim_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_dim_dynamic_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_dim_dynamic_divisibility_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_dim_dynamic_specialization_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_dim_hint_range_violations_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_dim_hint_ranges_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_disable_forced_specializations_errors_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_disable_forced_specializations_ok_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_distributed_all_gather_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_distributed_all_gather_into_tensor_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_distributed_all_reduce_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_distributed_all_to_all_single_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_distributed_reduce_scatter_tensor_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_dont_duck_size_for_auto_dynamic_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_double_lifted_constants_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_draft_export_checks_aliasing_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_draft_export_checks_mutation_cpp_serdes, 
test/export/test_cpp_serdes.py::CppSerdesTestExport::test_draft_export_checks_mutation_list_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_draft_export_checks_mutation_with_nan_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_draft_export_fake_kernel_inference_errors_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_draft_export_infers_fake_kernel_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_duplicate_modules_with_non_persistent_buffers_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_dynamic_lr_shift_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_dynamic_shapes_bounds_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_dynamic_shapes_builder_basic_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_dynamic_shapes_builder_kwargs_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_dynamic_shapes_builder_pytree_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_dynamic_shapes_dataclass_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_dynamic_shapes_inferred_basic_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_dynamic_shapes_serdes_generic_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_dynamic_shapes_serdes_user_errors_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_dynamic_shapes_serdes_various_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_dynamic_shapes_spec_with_pytree_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_dynamic_shapes_wrapped_with_shape_guards_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_dynamic_sym_round_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_ends_of_bounds_oblivious_cpp_serdes, 
test/export/test_cpp_serdes.py::CppSerdesTestExport::test_error_does_not_reference_eager_fallback_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_error_when_passing_mutating_primitive_op_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_exception_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_expand_copy_export_handles_implicit_true_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_export_api_with_dynamic_shapes_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_export_as_backend_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_export_associative_scan_lifted_buffers_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_export_associative_scan_symbol_dim_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_export_associative_scan_symbol_scandim_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_export_aten_to_unflatten_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_export_aten_to_unflatten_subclass_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_export_aten_to_unflatten_subclass_pre_dispatch_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_export_cond_preserve_torch_fn_for_subgraphs_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_export_cond_symbool_pred_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_export_cond_warns_constant_pred_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_export_custom_decomp_table_basic_pop_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_export_custom_decomp_table_container_methods_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_export_custom_op_lib_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_export_custom_triton_kernel_cpp_serdes, 
test/export/test_cpp_serdes.py::CppSerdesTestExport::test_export_custom_triton_kernel_mutable_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_export_cyclic_reference_leak_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_export_decomp_torture_case_1_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_export_decomp_torture_case_2_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_export_decomps_dynamic_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_export_decomps_simple_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_export_dynamo_config_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_export_for_training_run_decomp_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_export_for_training_with_container_type_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_export_for_training_with_dynamic_shapes_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_export_for_training_with_mutation_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_export_for_training_with_state_dict_hooks_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_export_func_with_default_kwargs_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_export_func_with_keyword_only_args_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_export_func_with_kwargs_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_export_func_with_pytree_kwargs_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_export_func_with_var_keyword_args_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_export_func_with_var_keyword_pytree_args_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_export_func_with_var_postional_args_cpp_serdes, 
test/export/test_cpp_serdes.py::CppSerdesTestExport::test_export_function_schema_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_export_graph_with_no_inputs_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_export_input_mutation_bug_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_export_input_mutation_dynamic_shape_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_export_input_mutation_static_shape_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_export_leak_compile_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_export_linear_preserve_dynamic_shape_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_export_max_nonstrict_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_export_max_onnx_reported_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_export_method_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_export_mod_constraints_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_export_module_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_export_preserve_linear_at_aot_level_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_export_preserve_linear_but_not_custom_op_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_export_rnn_variants_with_warning_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_export_scan_pytree_output_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_export_script_module_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_export_statically_known_true_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_export_then_compile_tensor_ctor_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_export_with_autocast_cpp_serdes, 
test/export/test_cpp_serdes.py::CppSerdesTestExport::test_export_with_fake_tensor_inputs_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_export_with_fake_tensor_inputs_on_cuda_devices_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_export_with_inline_constraints_complex_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_export_with_inline_constraints_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_export_with_set_grad_enabled_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_export_with_wrong_inputs_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_external_call_non_strict_real_tensor_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_fake_inputs_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_fake_weights_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_filter_traceback_frames_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_flex_attention_export_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_float_conversion_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_float_conversion_from_int_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_fqn_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_from_node_metadata_export_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_full_on_scalar_tensor_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_function_holding_tensor_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_hints_wrapper_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_hoo_inline_users_issue_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_if_functional_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_if_post_autograd_op_preserved_cpp_serdes, 
test/export/test_cpp_serdes.py::CppSerdesTestExport::test_inductor_backend_inside_nonstrict_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_inline_script_class_method_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_inline_script_class_method_recursive_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_inline_script_function_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_inline_script_method_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_int_shape_specialization_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_intermediate_shape_comp_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_is_exporting_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_is_nonzero_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_isnonzero_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_issue_113041_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_issue_157289_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_issue_161902_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_istft_op_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_keep_composite_ops_invalid_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_keep_composite_ops_linear_convd_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_keep_composite_ops_linear_convd_for_training_ir_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_kwarg_dynamic_shapes_diff_order_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_kwargs_reorder_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_layer_norm_unbacked_normalized_shape_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_layer_sharing_cpp_serdes, 
test/export/test_cpp_serdes.py::CppSerdesTestExport::test_lazy_module_kwargs_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_lifted_constants_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_linear_conv_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_malformed_fqn_from_source_name_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_map_buffers_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_map_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_mask_nonzero_static_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_masked_select_dynamic_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_math_pow_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_mismatched_dynamic_shapes_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_mixed_input_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_module_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_module_dict_key_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_module_input_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_module_input_subclasses_parameterization_nested_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_module_list_slice_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_module_with_dict_container_inp_out_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_modules_access_for_deleted_submodule_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_more_multidimensional_slicing_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_multidimensional_slicing_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_multinomial_dynamic_cpp_serdes, 
test/export/test_cpp_serdes.py::CppSerdesTestExport::test_multiple_definitions_same_name_dim_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_namedtuple_input_export_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_native_multi_attention_head_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_nested_dynamic_shapes_spec_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_nested_module_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_nested_module_fake_tensor_leak_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_nested_module_with_constant_buffer_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_nested_module_with_init_buffer_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_nested_module_with_parameter_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_nn_module_stack_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_nn_module_stack_shared_submodule_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_no_check_is_size_error_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_no_suggested_fixes_for_data_dependent_errors_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_no_tensor_computation_2_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_no_tensor_computation_3_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_no_tensor_computation_4_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_no_tensor_computation_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_non_arg_name_dynamic_shapes_api_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_non_arg_name_dynamic_shapes_api_with_container_type_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_non_arg_name_dynamic_shapes_api_with_kwarg_cpp_serdes, 
test/export/test_cpp_serdes.py::CppSerdesTestExport::test_non_persistent_buffer_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_non_strict_dynamic_shapes_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_non_strict_dynamic_shapes_suggested_fixes_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_none_buffers_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_nonstrict_retrace_preserves_metadata_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_nonzero_2_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_nonzero_dynamic_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_not_registered_parameter_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_operator_aten_tensor_mode_variant_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_output_node_name_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_pad_sequence_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_param_util_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_partial_patched_forward_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_placeholder_naming_collisions_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_placeholder_naming_collisions_hoo_subgraphs_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_placeholder_naming_order_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_placeholder_naming_order_variadic_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_placeholder_update_preserving_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_predispatch_cond_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_predispatch_grad_wrappers_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_preserve_annotation_cpp_serdes, 
test/export/test_cpp_serdes.py::CppSerdesTestExport::test_preserve_module_call_signature_unflatten_specialization_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_preserve_requires_grad_placeholders_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_preserve_shape_dynamism_for_unused_inputs_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_profiling_code_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_python_asserts_with_sym_int_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_pytree_register_data_class_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_pytree_register_nested_data_class_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_raise_user_error_when_guard_on_data_dependent_operation_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_range_constraints_with_replacement_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_real_tensor_alias_dtype_mismatch_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_real_tensor_bool_cast_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_real_tensor_errors_on_aliasing_custom_op_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_real_tensor_for_max_op_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_real_tensor_size_mismatch_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_redundant_assert_max_upper_bound_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_redundant_asserts_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_refine_dynamic_shapes_from_suggested_fixes_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_register_constant_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_repeat_interleave_cpp_serdes, 
test/export/test_cpp_serdes.py::CppSerdesTestExport::test_replace_unbacked_with_very_large_upperbound_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_replaced_unbacked_bindings_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_reshape_view_helper_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_retracable_ep_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_retrace_pre_autograd_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_run_decomposition_supports_user_input_mutation_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_run_decompositions_keep_metadata_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_run_decompositions_keep_tensor_constant_metadata_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_runtime_assert_for_prim_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_runtime_assert_for_prm_str_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_runtime_assert_with_size_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_sdpa_gqa_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_sequential_slicing_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_set_example_inputs_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_set_grad_as_side_effect_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_set_grad_empty_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_set_grad_unflatten_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_setgrad_lifted_tensor_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_shared_submodule_nn_module_stack_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_simple_export_for_training_cpp_serdes, 
test/export/test_cpp_serdes.py::CppSerdesTestExport::test_simple_unbacked_view_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_size_input_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_slice_nn_module_stack_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_solver_unsupported_sympy_function_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_specialize_derived_dim_roots_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_split_const_gm_with_lifted_constants_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_stack_trace_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_stack_trace_make_fx_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_state_primitives_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_state_shape_attribute_assignment_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_state_tensors_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_static_dim_constraints_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_subclass_context_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_subclass_nested_attr_access_complicated_metadata_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_subclass_nested_attr_access_const_metadata_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_subclass_nested_attr_access_const_metadata_not_top_level_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_subclass_nested_attr_access_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_subclass_nested_attr_access_submodule_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_subclasses_parameterization_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_subclasses_parameterization_nested_cpp_serdes, 
test/export/test_cpp_serdes.py::CppSerdesTestExport::test_suggest_torch_checks_with_non_negative_check_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_suggest_torch_checks_with_regular_check_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_suggested_fixes_for_data_dependent_errors_basic_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_suggested_fixes_for_data_dependent_errors_puzzlers_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_suggested_fixes_new_roots_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_sym_float_operators_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_sym_or_sym_and_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_sym_sqrt_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_symbool_item_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_symfloat_item_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_symint_input_additional_inputs_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_symint_input_basic_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_symint_input_ranges_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_symint_input_shapes_collection_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_symint_input_specialization_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_symint_item_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_symint_output_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_symint_tensor_return_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_tag_ac_export_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_tensor_attribute_zero_args_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_tensor_constant_aten_to_cpp_serdes, 
test/export/test_cpp_serdes.py::CppSerdesTestExport::test_tensor_constant_with_wrapped_method_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_to_module_with_mutated_buffer_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_to_module_with_mutated_buffer_multiple_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_to_module_with_mutated_buffer_multiple_update_sub_later_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_tolist_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_torch_check_eq_commutativity_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_torch_fn_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_trace_under_fake_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_train_eval_on_exported_preautograd_module_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_unbacked_3d_matmul_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_unbacked_bincount_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_unbacked_bindings_for_divisible_u_symint_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_unbacked_deferred_runtime_retrace_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_unbacked_expand_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_unbacked_infer_size_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_unbacked_kth_value_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_unbacked_linear_layer_norm_input_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_unbacked_noncontig_lin_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_unbacked_pad_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_unbacked_scalar_constructor_cpp_serdes, 
test/export/test_cpp_serdes.py::CppSerdesTestExport::test_unbacked_slice_forward_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_unbacked_slice_simple_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_unbacked_stack_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_unbacked_to_cond_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_unbacked_to_cond_passthrough_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_unbacked_unsqueeze_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_unflatten_asserts_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_unflatten_buffer_update_child2parent_swap_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_unflatten_closure_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_unflatten_isinstance_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_unflatten_multiple_graphs_dispatch_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_unflatten_multiple_graphs_preserve_signature_no_error_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_unflatten_multiple_graphs_shared_submodule_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_unflatten_multiple_graphs_state_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_unflatten_no_unroll_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_unflatten_placeholder_update_child2parent_swap_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_unflatten_placeholder_update_grandchild2cousin_swap_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_unflatten_random_dag_5_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_unflatten_random_dag_6_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_unflatten_random_dag_buf_8_cpp_serdes, 
test/export/test_cpp_serdes.py::CppSerdesTestExport::test_unflatten_random_dag_const_preserving_3_1_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_unflatten_random_dag_const_preserving_3_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_unflatten_random_dag_mutating_buf_4_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_unflatten_random_dag_mutating_buf_6_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_unflatten_random_dag_mutating_buf_9_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_unflatten_random_dag_mutating_buf_preserving_10_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_unflatten_random_dag_mutating_buf_preserving_4_1_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_unflatten_random_dag_mutating_buf_preserving_4_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_unflatten_random_dag_mutating_buf_preserving_5_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_unflatten_random_dag_mutating_buf_preserving_7_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_unflatten_random_dag_preserving_4_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_unused_aliases_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_unused_constant_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_use_embedding_twice_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_user_input_and_buffer_mutation_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_vmap_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_vmap_custom_autograd_function_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_vmap_to_assert_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_where_decomp_cpp_serdes, 
test/export/test_cpp_serdes.py::CppSerdesTestExport::test_while_loop_assert_separation_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_while_loop_index_assertions_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_while_loop_simple_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_while_loop_tensor_constant_idx_cpp_serdes, test/export/test_cpp_serdes.py::CppSerdesTestExport::test_wrapper_module_cpp_serdes 2025-10-10T01:52:34.8829880Z 2025-10-10T01:52:38.6309120Z Running inductor/test_debug_trace 1/1 ... [2025-10-10 01:52:38.630325] 2025-10-10T01:52:38.6309861Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:52:38.6312053Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_debug_trace.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:52:38.630715] 2025-10-10T01:52:46.0109975Z 2025-10-10T01:52:46.0111012Z inductor/test_debug_trace 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_debug_trace_1.1_8370b203a1624b92_.log 2025-10-10T01:52:46.0112534Z Running 3 items in this shard: test/inductor/test_debug_trace.py::TestDebugTrace::test_debug_multi_tempalte, test/inductor/test_debug_trace.py::TestDebugTrace::test_debug_printer_const, test/inductor/test_debug_trace.py::TestDebugTrace::test_debug_trace 2025-10-10T01:52:46.0113494Z 2025-10-10T01:52:49.8713310Z Running inductor/test_memory 1/1 ... [2025-10-10 01:52:49.870763] 2025-10-10T01:52:49.8713748Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:52:49.8715050Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_memory.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:52:49.871160] 2025-10-10T01:52:56.9488721Z 2025-10-10T01:52:56.9490176Z inductor/test_memory 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_memory_1.1_93bb42c254f3d933_.log 2025-10-10T01:52:56.9496274Z Running 7 items in this shard: test/inductor/test_memory.py::TestOperatorReorderForPeakMemory::test_fusing_reductions_increase_peak_memory, test/inductor/test_memory.py::TestOperatorReorderForPeakMemory::test_multiple_mutations_of_buf, test/inductor/test_memory.py::TestOperatorReorderForPeakMemory::test_mutation_size_propogation, test/inductor/test_memory.py::TestOperatorReorderForPeakMemory::test_reorder_peak_memory, test/inductor/test_memory.py::TestOperatorReorderForPeakMemory::test_reorder_peak_memory_bfs, test/inductor/test_memory.py::TestOperatorReorderForPeakMemory::test_reorder_peak_memory_dfs, test/inductor/test_memory.py::TestOperatorReorderForPeakMemory::test_reorder_peak_memory_lpmf 2025-10-10T01:52:56.9501215Z 2025-10-10T01:53:00.8512481Z Running dynamo/test_frame_init 1/1 ... [2025-10-10 01:53:00.850678] 2025-10-10T01:53:00.8512949Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:53:00.8514718Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_frame_init.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:53:00.851103] 2025-10-10T01:53:04.7232524Z 2025-10-10T01:53:04.7234042Z dynamo/test_frame_init 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_frame_init_1.1_b9ced74bf7b0c34e_.log 2025-10-10T01:53:04.7235171Z Running 1 items in this shard: test/dynamo/test_frame_init.py::FrameInitTests::test_frame_init 2025-10-10T01:53:04.7235558Z 2025-10-10T01:53:08.6091750Z Running inductor/test_kernel_optimization 1/1 ... 
[2025-10-10 01:53:08.608526] 2025-10-10T01:53:08.6092259Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:53:08.6093366Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_kernel_optimization.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:53:08.608922] 2025-10-10T01:53:15.8887120Z 2025-10-10T01:53:15.8888590Z inductor/test_kernel_optimization 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_kernel_optimization_1.1_e3492474b9ed8890_.log 2025-10-10T01:53:15.8889840Z Running 0 items in this shard: 2025-10-10T01:53:15.8890180Z 2025-10-10T01:53:19.6704557Z Running inductor/test_combo_kernels 1/1 ... [2025-10-10 01:53:19.669819] 2025-10-10T01:53:19.6705322Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:53:19.6707505Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_combo_kernels.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:53:19.670212] 2025-10-10T01:53:27.2015472Z 2025-10-10T01:53:27.2016659Z inductor/test_combo_kernels 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_combo_kernels_1.1_071e2ab93f2c127a_.log 2025-10-10T01:53:27.2025217Z Running 21 items in this shard: test/inductor/test_combo_kernels.py::ComboKernelTests::test_2d_blocking_partitioning, test/inductor/test_combo_kernels.py::ComboKernelTests::test_activation_functions, test/inductor/test_combo_kernels.py::ComboKernelTests::test_mutated_args, test/inductor/test_combo_kernels.py::ComboKernelTests::test_reduce_functions, test/inductor/test_combo_kernels.py::ComboKernelTests::test_reduce_split, test/inductor/test_combo_kernels.py::ComboKernelBenchmarkTests::test_2d_blocking_benchmark, test/inductor/test_combo_kernels.py::ComboKernelBenchmarkTests::test_activation_benchmark, test/inductor/test_combo_kernels.py::ComboKernelBenchmarkTests::test_mutated_benchmark, test/inductor/test_combo_kernels.py::ComboKernelBenchmarkTests::test_persistent_reduction_no_x_dim, test/inductor/test_combo_kernels.py::ComboKernelBenchmarkTests::test_reduce_benchmark, test/inductor/test_combo_kernels.py::ComboKernelBenchmarkTests::test_round_robin_dispatch, test/inductor/test_combo_kernels.py::ComboKernelDynamicShapesTests::test_dynamic_shapes_2d_blocking, test/inductor/test_combo_kernels.py::ComboKernelDynamicShapesTests::test_dynamic_shapes_2d_blocking_round_robin, test/inductor/test_combo_kernels.py::ComboKernelDynamicShapesTests::test_dynamic_shapes_activations, test/inductor/test_combo_kernels.py::ComboKernelDynamicShapesTests::test_dynamic_shapes_activations_no_autotune, test/inductor/test_combo_kernels.py::ComboKernelDynamicShapesTests::test_dynamic_shapes_mutated, test/inductor/test_combo_kernels.py::ComboKernelDynamicShapesTests::test_dynamic_shapes_persistent_reduction_mixed_x_dim_cuda, 
test/inductor/test_combo_kernels.py::ComboKernelDynamicShapesTests::test_dynamic_shapes_persistent_reduction_no_x_dim, test/inductor/test_combo_kernels.py::ComboKernelDynamicShapesTests::test_dynamic_shapes_persistent_reduction_no_x_dim_2, test/inductor/test_combo_kernels.py::ComboKernelDynamicShapesTests::test_dynamic_shapes_reduce, test/inductor/test_combo_kernels.py::ComboKernelDynamicShapesTests::test_helper_fn_defined 2025-10-10T01:53:27.2033393Z 2025-10-10T01:53:31.0581112Z Running inductor/test_inplacing_pass 1/1 ... [2025-10-10 01:53:31.057523] 2025-10-10T01:53:31.0581752Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:53:31.0583375Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_inplacing_pass.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:53:31.057899] 2025-10-10T01:53:38.1347323Z 2025-10-10T01:53:38.1348582Z inductor/test_inplacing_pass 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_inplacing_pass_1.1_09815c94dc7811ce_.log 2025-10-10T01:53:38.1359165Z Running 22 items in this shard: test/inductor/test_inplacing_pass.py::TestReinplacingPassCorrectness::test_counters_functionalize_old, test/inductor/test_inplacing_pass.py::TestReinplacingPassCorrectness::test_counters_functionalize_v2, test/inductor/test_inplacing_pass.py::TestReinplacingPassCorrectness::test_dont_modify_input, test/inductor/test_inplacing_pass.py::TestReinplacingPassCorrectness::test_dont_modify_live, test/inductor/test_inplacing_pass.py::TestReinplacingPassCorrectness::test_dont_modify_view_of_live, test/inductor/test_inplacing_pass.py::TestReinplacingPassCorrectness::test_generalized_scatter, test/inductor/test_inplacing_pass.py::TestReinplacingPassCorrectness::test_lists_functionalize_v2, 
test/inductor/test_inplacing_pass.py::TestReinplacingPassCorrectness::test_lists_old_functionalize, test/inductor/test_inplacing_pass.py::TestReinplacingPassCorrectness::test_multi_output_intermediate, test/inductor/test_inplacing_pass.py::TestReinplacingPassCorrectness::test_multiple_intermediate, test/inductor/test_inplacing_pass.py::TestReinplacingPassCorrectness::test_multiple_mutations, test/inductor/test_inplacing_pass.py::TestReinplacingPassCorrectness::test_partitioner_recomputes_factory_empty_like_sin_op, test/inductor/test_inplacing_pass.py::TestReinplacingPassCorrectness::test_partitioner_recomputes_factory_empty_like_sin_triton, test/inductor/test_inplacing_pass.py::TestReinplacingPassCorrectness::test_partitioner_recomputes_factory_ones_like_sin_op, test/inductor/test_inplacing_pass.py::TestReinplacingPassCorrectness::test_partitioner_recomputes_factory_ones_like_sin_triton, test/inductor/test_inplacing_pass.py::TestReinplacingPassCorrectness::test_should_modify_inner, test/inductor/test_inplacing_pass.py::TestReinplacingPassCorrectness::test_should_modify_input, test/inductor/test_inplacing_pass.py::TestReinplacingPassCorrectness::test_view_inplaced2_functionalize_v2, test/inductor/test_inplacing_pass.py::TestReinplacingPassCorrectness::test_view_inplaced_functionalize_v2, test/inductor/test_inplacing_pass.py::TestReinplacingPassCorrectness::test_views_not_inplaced2_functionalize_v2, test/inductor/test_inplacing_pass.py::TestReinplacingPassCorrectness::test_views_not_inplaced3_functionalize_v2, test/inductor/test_inplacing_pass.py::TestReinplacingPassCorrectness::test_views_not_inplaced_functionalize_v2 2025-10-10T01:53:38.1368500Z 2025-10-10T01:53:41.9620455Z Running dynamo/test_skip_non_tensor 1/1 ... 
[2025-10-10 01:53:41.961453] 2025-10-10T01:53:41.9621070Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:53:41.9622529Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_skip_non_tensor.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:53:41.961839] 2025-10-10T01:53:45.9343128Z 2025-10-10T01:53:45.9344046Z dynamo/test_skip_non_tensor 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_skip_non_tensor_1.1_e54bcb5cd15b165c_.log 2025-10-10T01:53:45.9346917Z Running 8 items in this shard: test/dynamo/test_skip_non_tensor.py::SkipNonTensorTests::test_add_skip, test/dynamo/test_skip_non_tensor.py::SkipNonTensorTests::test_add_tensor1, test/dynamo/test_skip_non_tensor.py::SkipNonTensorTests::test_add_tensor2, test/dynamo/test_skip_non_tensor.py::SkipNonTensorTests::test_add_tensor_dict, test/dynamo/test_skip_non_tensor.py::SkipNonTensorTests::test_add_tensor_list, test/dynamo/test_skip_non_tensor.py::SkipNonTensorTests::test_custom_list, test/dynamo/test_skip_non_tensor.py::SkipNonTensorTests::test_do_not_skip_side_effects, test/dynamo/test_skip_non_tensor.py::SkipNonTensorTests::test_recursive_list 2025-10-10T01:53:45.9349734Z 2025-10-10T01:53:49.8278642Z Running inductor/test_op_dtype_prop 1/1 ... [2025-10-10 01:53:49.827224] 2025-10-10T01:53:49.8279105Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:53:49.8280112Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_op_dtype_prop.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:53:49.827627] 2025-10-10T01:53:59.1122995Z 2025-10-10T01:53:59.1123966Z inductor/test_op_dtype_prop 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_op_dtype_prop_1.1_e430cf9b6161cf23_.log 2025-10-10T01:53:59.1336865Z Running 567 items in this shard: test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_any_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_assoc_scan_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_binary_math_mixed_precision_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_codegen_upcast_to_fp32_upcast_to_fp32_False_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_codegen_upcast_to_fp32_upcast_to_fp32_True_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_constant_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_downcast_div_mod_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_abs_load_upcast_to_fp32_False_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_abs_load_upcast_to_fp32_False_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_abs_load_upcast_to_fp32_True_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_abs_load_upcast_to_fp32_True_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_acos_load_upcast_to_fp32_False_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_acos_load_upcast_to_fp32_False_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_acos_load_upcast_to_fp32_True_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_acos_load_upcast_to_fp32_True_float16_cuda, 
test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_acosh_load_upcast_to_fp32_False_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_acosh_load_upcast_to_fp32_False_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_acosh_load_upcast_to_fp32_True_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_acosh_load_upcast_to_fp32_True_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_asin_load_upcast_to_fp32_False_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_asin_load_upcast_to_fp32_False_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_asin_load_upcast_to_fp32_True_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_asin_load_upcast_to_fp32_True_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_asinh_load_upcast_to_fp32_False_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_asinh_load_upcast_to_fp32_False_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_asinh_load_upcast_to_fp32_True_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_asinh_load_upcast_to_fp32_True_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_atan2_load_upcast_to_fp32_False_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_atan2_load_upcast_to_fp32_False_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_atan2_load_upcast_to_fp32_True_bfloat16_cuda, 
test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_atan2_load_upcast_to_fp32_True_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_atan_load_upcast_to_fp32_False_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_atan_load_upcast_to_fp32_False_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_atan_load_upcast_to_fp32_True_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_atan_load_upcast_to_fp32_True_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_atanh_load_upcast_to_fp32_False_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_atanh_load_upcast_to_fp32_False_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_atanh_load_upcast_to_fp32_True_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_atanh_load_upcast_to_fp32_True_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_ceil_load_upcast_to_fp32_False_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_ceil_load_upcast_to_fp32_False_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_ceil_load_upcast_to_fp32_True_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_ceil_load_upcast_to_fp32_True_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_copysign_load_upcast_to_fp32_False_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_copysign_load_upcast_to_fp32_False_float16_cuda, 
test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_copysign_load_upcast_to_fp32_True_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_copysign_load_upcast_to_fp32_True_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_cos_load_upcast_to_fp32_False_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_cos_load_upcast_to_fp32_False_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_cos_load_upcast_to_fp32_True_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_cos_load_upcast_to_fp32_True_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_cosh_load_upcast_to_fp32_False_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_cosh_load_upcast_to_fp32_False_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_cosh_load_upcast_to_fp32_True_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_cosh_load_upcast_to_fp32_True_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_erf_load_upcast_to_fp32_False_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_erf_load_upcast_to_fp32_False_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_erf_load_upcast_to_fp32_True_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_erf_load_upcast_to_fp32_True_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_erfc_load_upcast_to_fp32_False_bfloat16_cuda, 
test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_erfc_load_upcast_to_fp32_False_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_erfc_load_upcast_to_fp32_True_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_erfc_load_upcast_to_fp32_True_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_erfinv_load_upcast_to_fp32_False_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_erfinv_load_upcast_to_fp32_False_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_erfinv_load_upcast_to_fp32_True_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_erfinv_load_upcast_to_fp32_True_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_exp2_load_upcast_to_fp32_False_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_exp2_load_upcast_to_fp32_False_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_exp2_load_upcast_to_fp32_True_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_exp2_load_upcast_to_fp32_True_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_exp_load_upcast_to_fp32_False_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_exp_load_upcast_to_fp32_False_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_exp_load_upcast_to_fp32_True_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_exp_load_upcast_to_fp32_True_float16_cuda, 
test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_expm1_load_upcast_to_fp32_False_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_expm1_load_upcast_to_fp32_False_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_expm1_load_upcast_to_fp32_True_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_expm1_load_upcast_to_fp32_True_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_floor_load_upcast_to_fp32_False_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_floor_load_upcast_to_fp32_False_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_floor_load_upcast_to_fp32_True_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_floor_load_upcast_to_fp32_True_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_fmod_load_upcast_to_fp32_False_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_fmod_load_upcast_to_fp32_False_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_fmod_load_upcast_to_fp32_True_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_fmod_load_upcast_to_fp32_True_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_hypot_load_upcast_to_fp32_False_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_hypot_load_upcast_to_fp32_False_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_hypot_load_upcast_to_fp32_True_bfloat16_cuda, 
test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_hypot_load_upcast_to_fp32_True_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_isinf_load_upcast_to_fp32_False_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_isinf_load_upcast_to_fp32_False_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_isinf_load_upcast_to_fp32_True_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_isinf_load_upcast_to_fp32_True_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_isnan_load_upcast_to_fp32_False_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_isnan_load_upcast_to_fp32_False_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_isnan_load_upcast_to_fp32_True_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_isnan_load_upcast_to_fp32_True_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_lgamma_load_upcast_to_fp32_False_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_lgamma_load_upcast_to_fp32_False_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_lgamma_load_upcast_to_fp32_True_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_lgamma_load_upcast_to_fp32_True_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_log10_load_upcast_to_fp32_False_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_log10_load_upcast_to_fp32_False_float16_cuda, 
test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_log10_load_upcast_to_fp32_True_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_log10_load_upcast_to_fp32_True_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_log1p_load_upcast_to_fp32_False_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_log1p_load_upcast_to_fp32_False_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_log1p_load_upcast_to_fp32_True_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_log1p_load_upcast_to_fp32_True_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_log2_load_upcast_to_fp32_False_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_log2_load_upcast_to_fp32_False_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_log2_load_upcast_to_fp32_True_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_log2_load_upcast_to_fp32_True_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_log_load_upcast_to_fp32_False_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_log_load_upcast_to_fp32_False_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_log_load_upcast_to_fp32_True_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_log_load_upcast_to_fp32_True_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_nextafter_load_upcast_to_fp32_False_bfloat16_cuda, 
test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_nextafter_load_upcast_to_fp32_False_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_nextafter_load_upcast_to_fp32_True_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_nextafter_load_upcast_to_fp32_True_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_pow_load_upcast_to_fp32_False_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_pow_load_upcast_to_fp32_False_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_pow_load_upcast_to_fp32_True_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_pow_load_upcast_to_fp32_True_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_round_load_upcast_to_fp32_False_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_round_load_upcast_to_fp32_False_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_round_load_upcast_to_fp32_True_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_round_load_upcast_to_fp32_True_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_rsqrt_load_upcast_to_fp32_False_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_rsqrt_load_upcast_to_fp32_False_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_rsqrt_load_upcast_to_fp32_True_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_rsqrt_load_upcast_to_fp32_True_float16_cuda, 
test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_sigmoid_load_upcast_to_fp32_False_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_sigmoid_load_upcast_to_fp32_False_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_sigmoid_load_upcast_to_fp32_True_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_sigmoid_load_upcast_to_fp32_True_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_sin_load_upcast_to_fp32_False_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_sin_load_upcast_to_fp32_False_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_sin_load_upcast_to_fp32_True_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_sin_load_upcast_to_fp32_True_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_sinh_load_upcast_to_fp32_False_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_sinh_load_upcast_to_fp32_False_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_sinh_load_upcast_to_fp32_True_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_sinh_load_upcast_to_fp32_True_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_sqrt_load_upcast_to_fp32_False_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_sqrt_load_upcast_to_fp32_False_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_sqrt_load_upcast_to_fp32_True_bfloat16_cuda, 
test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_sqrt_load_upcast_to_fp32_True_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_tan_load_upcast_to_fp32_False_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_tan_load_upcast_to_fp32_False_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_tan_load_upcast_to_fp32_True_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_tan_load_upcast_to_fp32_True_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_tanh_load_upcast_to_fp32_False_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_tanh_load_upcast_to_fp32_False_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_tanh_load_upcast_to_fp32_True_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_tanh_load_upcast_to_fp32_True_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_trunc_load_upcast_to_fp32_False_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_trunc_load_upcast_to_fp32_False_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_trunc_load_upcast_to_fp32_True_bfloat16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_dtype_aware_codegen_op_name_trunc_load_upcast_to_fp32_True_float16_cuda, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_abs_cuda_bool, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_abs_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_abs_cuda_float64, 
test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_abs_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_abs_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_acos_cuda_bool, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_acos_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_acos_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_acos_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_acos_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_acosh_cuda_bool, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_acosh_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_acosh_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_acosh_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_acosh_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_add_cuda_bool, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_add_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_add_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_add_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_add_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_angle_cuda_bool, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_angle_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_angle_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_angle_cuda_int32, 
test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_angle_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_asin_cuda_bool, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_asin_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_asin_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_asin_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_asin_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_asinh_cuda_bool, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_asinh_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_asinh_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_asinh_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_asinh_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_atan2_cuda_bool, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_atan2_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_atan2_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_atan2_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_atan2_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_atan_cuda_bool, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_atan_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_atan_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_atan_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_atan_cuda_int64, 
test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_atanh_cuda_bool, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_atanh_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_atanh_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_atanh_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_atanh_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_bitwise_and_cuda_bool, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_bitwise_and_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_bitwise_and_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_bitwise_left_shift_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_bitwise_left_shift_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_bitwise_not_cuda_bool, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_bitwise_not_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_bitwise_not_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_bitwise_or_cuda_bool, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_bitwise_or_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_bitwise_or_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_bitwise_right_shift_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_bitwise_right_shift_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_bitwise_xor_cuda_bool, 
test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_bitwise_xor_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_bitwise_xor_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_ceil_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_ceil_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_ceil_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_ceil_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_clamp_max_cuda_bool, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_clamp_max_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_clamp_max_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_clamp_max_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_clamp_max_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_clamp_min_cuda_bool, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_clamp_min_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_clamp_min_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_clamp_min_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_clamp_min_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_clone_cuda_bool, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_clone_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_clone_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_clone_cuda_int32, 
test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_clone_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_copysign_cuda_bool, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_copysign_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_copysign_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_copysign_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_copysign_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_cos_cuda_bool, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_cos_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_cos_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_cos_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_cos_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_cosh_cuda_bool, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_cosh_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_cosh_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_cosh_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_cosh_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_digamma_cuda_bool, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_digamma_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_digamma_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_digamma_cuda_int32, 
test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_digamma_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_div_floor_rounding_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_div_floor_rounding_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_div_floor_rounding_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_div_floor_rounding_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_div_no_rounding_mode_cuda_bool, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_div_no_rounding_mode_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_div_no_rounding_mode_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_div_no_rounding_mode_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_div_no_rounding_mode_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_div_trunc_rounding_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_div_trunc_rounding_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_div_trunc_rounding_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_div_trunc_rounding_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_eq_cuda_bool, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_eq_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_eq_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_eq_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_eq_cuda_int64, 
test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_erf_cuda_bool, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_erf_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_erf_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_erf_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_erf_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_erfc_cuda_bool, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_erfc_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_erfc_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_erfc_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_erfc_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_erfinv_cuda_bool, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_erfinv_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_erfinv_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_erfinv_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_erfinv_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_exp2_cuda_bool, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_exp2_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_exp2_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_exp2_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_exp2_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_exp_cuda_bool, 
test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_exp_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_exp_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_exp_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_exp_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_expm1_cuda_bool, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_expm1_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_expm1_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_expm1_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_expm1_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_floor_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_floor_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_floor_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_floor_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_fmod_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_fmod_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_fmod_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_fmod_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_frexp_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_frexp_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_gcd_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_gcd_cuda_int64, 
test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_ge_cuda_bool, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_ge_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_ge_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_ge_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_ge_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_gt_cuda_bool, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_gt_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_gt_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_gt_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_gt_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_hypot_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_hypot_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_i0_cuda_bool, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_i0_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_i0_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_i0_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_i0_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_igamma_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_igamma_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_igammac_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_igammac_cuda_float64, 
test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_isinf_cuda_bool, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_isinf_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_isinf_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_isinf_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_isinf_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_isnan_cuda_bool, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_isnan_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_isnan_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_isnan_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_isnan_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_le_cuda_bool, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_le_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_le_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_le_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_le_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_lgamma_cuda_bool, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_lgamma_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_lgamma_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_lgamma_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_lgamma_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_log10_cuda_bool, 
test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_log10_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_log10_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_log10_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_log10_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_log1p_cuda_bool, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_log1p_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_log1p_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_log1p_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_log1p_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_log2_cuda_bool, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_log2_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_log2_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_log2_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_log2_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_log_cuda_bool, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_log_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_log_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_log_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_log_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_logical_and_cuda_bool, 
test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_logical_and_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_logical_and_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_logical_and_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_logical_and_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_logical_not_cuda_bool, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_logical_not_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_logical_not_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_logical_not_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_logical_not_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_logical_or_cuda_bool, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_logical_or_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_logical_or_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_logical_or_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_logical_or_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_logical_xor_cuda_bool, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_logical_xor_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_logical_xor_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_logical_xor_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_logical_xor_cuda_int64, 
test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_lt_cuda_bool, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_lt_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_lt_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_lt_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_lt_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_max_binary_cuda_bool, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_max_binary_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_max_binary_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_max_binary_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_max_binary_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_maximum_cuda_bool, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_maximum_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_maximum_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_maximum_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_maximum_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_min_binary_cuda_bool, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_min_binary_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_min_binary_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_min_binary_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_min_binary_cuda_int64, 
test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_minimum_cuda_bool, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_minimum_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_minimum_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_minimum_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_minimum_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_mul_cuda_bool, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_mul_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_mul_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_mul_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_mul_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_ne_cuda_bool, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_ne_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_ne_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_ne_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_ne_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_neg_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_neg_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_neg_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_neg_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_nextafter_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_nextafter_cuda_float64, 
test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_polygamma_polygamma_n_0_cuda_bool, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_polygamma_polygamma_n_0_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_polygamma_polygamma_n_0_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_polygamma_polygamma_n_0_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_polygamma_polygamma_n_0_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_polygamma_polygamma_n_1_cuda_bool, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_polygamma_polygamma_n_1_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_polygamma_polygamma_n_1_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_polygamma_polygamma_n_1_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_polygamma_polygamma_n_1_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_polygamma_polygamma_n_2_cuda_bool, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_polygamma_polygamma_n_2_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_polygamma_polygamma_n_2_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_polygamma_polygamma_n_2_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_polygamma_polygamma_n_2_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_polygamma_polygamma_n_3_cuda_bool, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_polygamma_polygamma_n_3_cuda_float32, 
test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_polygamma_polygamma_n_3_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_polygamma_polygamma_n_3_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_polygamma_polygamma_n_3_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_polygamma_polygamma_n_4_cuda_bool, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_polygamma_polygamma_n_4_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_polygamma_polygamma_n_4_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_polygamma_polygamma_n_4_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_polygamma_polygamma_n_4_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_pow_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_pow_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_pow_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_pow_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_reciprocal_cuda_bool, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_reciprocal_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_reciprocal_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_reciprocal_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_reciprocal_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_remainder_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_remainder_cuda_float64, 
test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_remainder_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_remainder_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_round_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_round_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_round_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_round_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_round_decimals_0_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_round_decimals_0_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_round_decimals_3_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_round_decimals_3_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_round_decimals_neg_3_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_round_decimals_neg_3_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_rsqrt_cuda_bool, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_rsqrt_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_rsqrt_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_rsqrt_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_rsqrt_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_sigmoid_cuda_bool, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_sigmoid_cuda_float32, 
test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_sigmoid_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_sigmoid_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_sigmoid_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_sign_cuda_bool, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_sign_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_sign_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_sign_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_sign_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_signbit_cuda_bool, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_signbit_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_signbit_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_signbit_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_signbit_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_sin_cuda_bool, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_sin_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_sin_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_sin_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_sin_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_sinh_cuda_bool, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_sinh_cuda_float32, 
test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_sinh_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_sinh_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_sinh_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_sqrt_cuda_bool, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_sqrt_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_sqrt_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_sqrt_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_sqrt_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_square_cuda_bool, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_square_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_square_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_square_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_square_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_sub_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_sub_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_sub_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_sub_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_tan_cuda_bool, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_tan_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_tan_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_tan_cuda_int32, 
test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_tan_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_tanh_cuda_bool, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_tanh_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_tanh_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_tanh_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_tanh_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_true_divide_cuda_bool, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_true_divide_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_true_divide_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_true_divide_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_true_divide_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_trunc_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_trunc_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_trunc_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_trunc_cuda_int64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_where_cuda_bool, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_where_cuda_float32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_where_cuda_float64, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_where_cuda_int32, test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_propagation_where_cuda_int64, 
test/inductor/test_op_dtype_prop.py::TestCaseCUDA::test_op_dtype_support_cuda 2025-10-10T01:53:59.1537905Z 2025-10-10T01:54:02.9588225Z Running dynamo/test_reconstruct 1/1 ... [2025-10-10 01:54:02.958254] 2025-10-10T01:54:02.9588850Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:54:02.9591197Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_reconstruct.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:54:02.958700] 2025-10-10T01:54:10.1378049Z 2025-10-10T01:54:10.1379080Z dynamo/test_reconstruct 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_reconstruct_1.1_e3846bbbc7021b42_.log 2025-10-10T01:54:10.1385186Z Running 16 items in this shard: test/dynamo/test_reconstruct.py::ReconstructTest::test_ConstDict_clear_reconstruct, test/dynamo/test_reconstruct.py::ReconstructTest::test_ConstDict_del_reconstruct, test/dynamo/test_reconstruct.py::ReconstructTest::test_ConstDict_get_reconstruct, test/dynamo/test_reconstruct.py::ReconstructTest::test_ConstDict_optimize_reconstruct, test/dynamo/test_reconstruct.py::ReconstructTest::test_ConstDict_pop_reconstruct, test/dynamo/test_reconstruct.py::ReconstructTest::test_ConstDict_popitem_reconstruct, test/dynamo/test_reconstruct.py::ReconstructTest::test_ConstDict_popitem_reconstruct_graph_break, test/dynamo/test_reconstruct.py::ReconstructTest::test_create_dict_reconstruct, test/dynamo/test_reconstruct.py::ReconstructTest::test_functional_call_reconstruct, test/dynamo/test_reconstruct.py::ReconstructTest::test_functional_call_reconstruct_2, test/dynamo/test_reconstruct.py::ReconstructTest::test_graph_break_in_wrapped_nested_function, test/dynamo/test_reconstruct.py::ReconstructTest::test_graph_break_in_wrapped_skipped_function, 
test/dynamo/test_reconstruct.py::ReconstructTest::test_graph_break_in_wrapped_user_function, test/dynamo/test_reconstruct.py::ReconstructTest::test_graph_break_in_wrapped_user_method, test/dynamo/test_reconstruct.py::ReconstructTest::test_tma_experimental_reconstruct, test/dynamo/test_reconstruct.py::ReconstructTest::test_tma_stable_reconstruct 2025-10-10T01:54:10.1390547Z 2025-10-10T01:54:13.9363881Z Running export/test_dynamic_shapes 1/1 ... [2025-10-10 01:54:13.935865] 2025-10-10T01:54:13.9364330Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:54:13.9368570Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_dynamic_shapes.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:54:13.936241] 2025-10-10T01:54:17.6651266Z 2025-10-10T01:54:17.6652580Z export/test_dynamic_shapes 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_dynamic_shapes_1.1_cc27332ded76651d_.log 2025-10-10T01:54:17.6654043Z Running 2 items in this shard: test/export/test_dynamic_shapes.py::TestDimHint::test_dimhint_factory, test/export/test_dynamic_shapes.py::TestDimHint::test_dimhint_repr 2025-10-10T01:54:17.6654956Z 2025-10-10T01:54:21.4712989Z Running inductor/test_remote_cache 1/1 ... [2025-10-10 01:54:21.470761] 2025-10-10T01:54:21.4713595Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:54:21.4716017Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_remote_cache.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:54:21.471153] 2025-10-10T01:54:25.3436564Z 2025-10-10T01:54:25.3437533Z inductor/test_remote_cache 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_remote_cache_1.1_da154ba4977184df_.log 2025-10-10T01:54:25.3439447Z Running 3 items in this shard: test/inductor/test_remote_cache.py::TestRemoteCache::test_failure_logging, test/inductor/test_remote_cache.py::TestRemoteCache::test_failure_no_sample, test/inductor/test_remote_cache.py::TestRemoteCache::test_normal_logging 2025-10-10T01:54:25.3440418Z 2025-10-10T01:54:29.1833179Z Running dynamo/test_interop 1/1 ... [2025-10-10 01:54:29.182781] 2025-10-10T01:54:29.1833596Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:54:29.1836650Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_interop.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:54:29.183177] 2025-10-10T01:54:33.2558627Z 2025-10-10T01:54:33.2559393Z dynamo/test_interop 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_interop_1.1_17fde887b302c825_.log 2025-10-10T01:54:33.2561501Z Running 5 items in this shard: test/dynamo/test_interop.py::InteropTests::test_fx_fn, test/dynamo/test_interop.py::InteropTests::test_script_fn, test/dynamo/test_interop.py::InteropTests::test_staticmethod_script_fn, test/dynamo/test_interop.py::InteropTests::test_trace_fn, test/dynamo/test_interop.py::InteropTests::test_vmap_in_graph 2025-10-10T01:54:33.2562834Z 2025-10-10T01:54:37.0218957Z Running inductor/test_device_assert 1/1 ... 
[2025-10-10 01:54:37.021295] 2025-10-10T01:54:37.0219501Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:54:37.0220926Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_device_assert.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:54:37.021679] 2025-10-10T01:54:44.2002263Z 2025-10-10T01:54:44.2003537Z inductor/test_device_assert 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_device_assert_1.1_f5dbfde030818860_.log 2025-10-10T01:54:44.2009101Z Running 8 items in this shard: test/inductor/test_device_assert.py::TestTorchDeviceAssertTrigger::test_assert_fusion, test/inductor/test_device_assert.py::TestTorchDeviceAssertTrigger::test_assert_should_not_throw_backend_aot_eager, test/inductor/test_device_assert.py::TestTorchDeviceAssertTrigger::test_assert_should_not_throw_backend_eager, test/inductor/test_device_assert.py::TestTorchDeviceAssertTrigger::test_assert_should_not_throw_backend_inductor, test/inductor/test_device_assert.py::TestTorchDeviceAssertTrigger::test_assert_should_throw_backend_aot_eager, test/inductor/test_device_assert.py::TestTorchDeviceAssertTrigger::test_assert_should_throw_backend_eager, test/inductor/test_device_assert.py::TestTorchDeviceAssertTrigger::test_assert_should_throw_backend_inductor, test/inductor/test_device_assert.py::TestTorchDeviceAssertTrigger::test_run_assert_triton 2025-10-10T01:54:44.2012399Z 2025-10-10T01:54:48.0815820Z Running inductor/test_smoke 1/1 ... 
[2025-10-10 01:54:48.081016] 2025-10-10T01:54:48.0816552Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:54:48.0818084Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_smoke.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:54:48.081412] 2025-10-10T01:54:55.1107842Z 2025-10-10T01:54:55.1108755Z inductor/test_smoke 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_smoke_1.1_a2bd5e7e4ebc2a0d_.log 2025-10-10T01:54:55.1109476Z 2025-10-10T01:54:59.0081634Z Running dynamo/test_skip_guard_eval_unsafe 1/1 ... [2025-10-10 01:54:59.007619] 2025-10-10T01:54:59.0082285Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:54:59.0084716Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_skip_guard_eval_unsafe.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:54:59.008007] 2025-10-10T01:55:03.0813830Z 2025-10-10T01:55:03.0814871Z dynamo/test_skip_guard_eval_unsafe 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_skip_guard_eval_unsafe_1.1_c4c5134a33f1089a_.log 2025-10-10T01:55:03.0817171Z Running 5 items in this shard: test/dynamo/test_skip_guard_eval_unsafe.py::RunDiffGuardTests::test_bool_recompile, test/dynamo/test_skip_guard_eval_unsafe.py::RunDiffGuardTests::test_cache_line_pickup, test/dynamo/test_skip_guard_eval_unsafe.py::RunDiffGuardTests::test_fail_on_tensor_shape_change, test/dynamo/test_skip_guard_eval_unsafe.py::RunDiffGuardTests::test_post_recompile, test/dynamo/test_skip_guard_eval_unsafe.py::RunDiffGuardTests::test_tensor_recompile 2025-10-10T01:55:03.0818841Z 2025-10-10T01:55:06.9104499Z Running export/test_tools 1/1 ... 
[2025-10-10 01:55:06.909782] 2025-10-10T01:55:06.9105186Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:55:06.9106528Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_tools.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:55:06.910219] 2025-10-10T01:55:07.0590195Z 2025-10-10T01:55:07.0591464Z inductor/test_compile_subprocess 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_compile_subprocess_1.1_7f685742ec8c765f_.log 2025-10-10T01:55:07.0850497Z Running 878 items in this shard: test/inductor/test_compile_subprocess.py::TestSubprocess::test_async, test/inductor/test_compile_subprocess.py::TestSubprocess::test_progressive, test/inductor/test_compile_subprocess.py::GPUTests::test_AllenaiLongformerBase_repro_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test__dyn_quant_matmul_4bit_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test__dyn_quant_pack_4bit_weight_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test__unsafe_masked_index_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test__unsafe_masked_index_put_accumulate_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_abs_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_adaptive_avg_pool1d_argmax_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_adaptive_avg_pool2d1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_adaptive_avg_pool2d2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_adaptive_avg_pool2d_low_prec_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_adaptive_avg_pool_errors_with_long_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_adaptive_avg_pool_with_output_size_0_cuda, 
test/inductor/test_compile_subprocess.py::GPUTests::test_adaptive_max_pool2d1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_adaptive_max_pool2d2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_adaptive_max_pool2d3_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_adaptive_pool_errors_with_long_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_add_complex10_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_add_complex3_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_add_complex4_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_add_complex5_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_add_complex6_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_add_complex7_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_add_complex8_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_add_complex9_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_add_complex_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_add_complex_strided_fallback_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_add_const_float_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_add_const_int_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_add_inplace_permuted_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_adding_tensor_offsets_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_addmm_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_addmv_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_alexnet_prefix_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_aliased_buffer_reuse_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_allow_reuse_active_if_under_peak_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_allow_reuse_disable_if_exceed_peak_cuda, 
test/inductor/test_compile_subprocess.py::GPUTests::test_angle_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_any_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_aoti_eager_cache_hit_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_aoti_eager_dtype_device_layout_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_aoti_eager_override_registration_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_aoti_eager_support_out_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_aoti_eager_support_str_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_aoti_eager_with_persistent_cache_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_aoti_eager_with_scalar_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_arange1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_arange2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_arange3_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_arange4_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_arange5_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_arange6_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_argmax_argmin1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_argmax_argmin2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_argmax_argmin3_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_argmax_argmin_with_duplicates_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_argmax_argmin_with_nan_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_argmax_min_int32_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_argmax_to_float_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_as_strided_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_as_strided_on_views_cuda, 
test/inductor/test_compile_subprocess.py::GPUTests::test_as_strided_scatter_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_assert_alignment_op_name_fail_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_assert_alignment_op_name_pass_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_assert_size_stride_op_name_fail_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_assert_size_stride_op_name_pass_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_async, test/inductor/test_compile_subprocess.py::GPUTests::test_avg_pool2d1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_avg_pool2d2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_avg_pool2d3_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_avg_pool2d4_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_avg_pool2d5_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_avg_pool2d6_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_avg_pool2d7_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_avg_pool2d8_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_avg_pool2d_backward2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_avg_pool2d_backward3_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_avg_pool2d_backward4_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_avg_pool2d_backward_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_avg_pool3d_backward2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_avg_pool3d_backward3_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_avg_pool3d_backward4_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_avg_pool3d_backward_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_avg_pool_errors_with_uint_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_baddbmm_cuda, 
test/inductor/test_compile_subprocess.py::GPUTests::test_batch_norm_2d_2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_batch_norm_2d_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_bernoulli1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_bernoulli2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_bfloat16_to_int16_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_bitwise2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_bitwise3_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_bitwise_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_bmm1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_bmm2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_bool_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_both_scalars_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_bucketize_add_autotune_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_bucketize_broadcast_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_bucketize_computed_offsets_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_bucketize_default_kwargs_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_bucketize_int_int16_int16_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_bucketize_int_int16_int32_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_bucketize_int_int16_int64_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_bucketize_int_int16_int8_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_bucketize_int_int16_uint8_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_bucketize_int_int32_int16_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_bucketize_int_int32_int32_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_bucketize_int_int32_int64_cuda, 
test/inductor/test_compile_subprocess.py::GPUTests::test_bucketize_int_int32_int8_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_bucketize_int_int32_uint8_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_bucketize_int_int64_int16_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_bucketize_int_int64_int32_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_bucketize_int_int64_int64_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_bucketize_int_int64_int8_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_bucketize_int_int64_uint8_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_bucketize_int_int8_int16_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_bucketize_int_int8_int32_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_bucketize_int_int8_int64_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_bucketize_int_int8_int8_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_bucketize_int_int8_uint8_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_bucketize_int_uint8_int16_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_bucketize_int_uint8_int32_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_bucketize_int_uint8_int64_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_bucketize_int_uint8_int8_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_bucketize_int_uint8_uint8_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_bucketize_nd_tiling_False_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_bucketize_nd_tiling_True_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_buffer_batch_norm_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_buffer_copied_in_graph_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_buffer_copied_in_graph_with_different_shapes_cuda, 
test/inductor/test_compile_subprocess.py::GPUTests::test_buffer_use_after_remove_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_builtins_round_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_builtins_round_float_ndigits_neg_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_builtins_round_float_ndigits_pos_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_builtins_round_float_ndigits_zero_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_builtins_round_int_ndigits_pos_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_builtins_round_int_ndigits_zero_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_cat_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_cat_empty_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_cat_empty_index_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_cat_extern_kernel_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_cat_inplace_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_cat_negative_dim_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_cat_of_loops_and_extern_kernel_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_cat_single_empty_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_cat_uint8_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_cat_unbacked_2d_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_cat_unbacked_empty_1d_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_cat_unbacked_legacy_empty_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_cat_upcasting_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_cauchy_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_check_stack_no_cycles_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_chunk_recompiles_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_clamp_cuda, 
test/inductor/test_compile_subprocess.py::GPUTests::test_clamp_type_promotion_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_clamp_type_promotion_non_tensor_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_clone_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_compar_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_complex_fallback_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_complex_from_real_imag_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_complex_memory_overlap_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_computed_buffer_inlining_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_concat_add_inplace_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_config_option_dont_assume_alignment_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_config_option_dont_assume_alignment_cudagraphs_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_config_option_dont_assume_alignment_recompiles_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_consecutive_split_cumprod_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_consecutive_split_cumsum_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_const_int32_to_float_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_constant_pad_1d_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_constant_pad_2d_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_constant_pad_2d_strides_nonpositive_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_constant_pad_3d_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_constant_pad_fill_dtype_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_constant_pad_float64_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_constant_pad_nd_inplace_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_conv1d_depthwise_cuda, 
test/inductor/test_compile_subprocess.py::GPUTests::test_conv1d_with_permute_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_conv2d_backward_channels_last_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_conv2d_channels_last_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_conv3d_channels_last_use_block_ptr_False_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_conv3d_channels_last_use_block_ptr_True_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_conv3d_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_conv_backward_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_conv_bn_fuse_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_conv_functional_bn_fuse_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_conv_inference_heuristics_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_conv_shape_check_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_conv_with_as_strided_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_convolution1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_convolution2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_convolution3_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_convolution4_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_convolution5_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_copy_non_blocking_is_pinned_use_cat_False_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_copy_non_blocking_is_pinned_use_cat_True_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_copy_with_scalar_src_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_cos_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_cpu_scalar_with_cpu_scalar_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_cpu_scalar_with_cpu_tensor_cuda, 
test/inductor/test_compile_subprocess.py::GPUTests::test_cpu_scalar_with_gpu_tensor_cpp_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_cpu_scalar_with_gpu_tensor_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_cpu_scalar_with_gpu_tensor_dynamic_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_cpu_tensor_with_cpu_tensor_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_cpu_tensor_with_gpu_tensor_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_cudnn_rnn_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_cummin_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_cumprod_zero_dim_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_cumsum_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_cumsum_inf_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_cumsum_no_mask_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_cumsum_pattern_matcher_issue_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_cumsum_zero_dim_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_custom_op_1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_custom_op_2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_custom_op_3_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_custom_op_default_layout_constraint_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_custom_op_fixed_layout_channels_last_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_custom_op_fixed_layout_sequential_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_custom_op_unbacked_symints_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_custom_scan_op_compiled_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_custom_scan_op_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_custom_scan_op_multi_input_cuda, 
test/inductor/test_compile_subprocess.py::GPUTests::test_custom_scan_would_split_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_data_type_propogation_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dense_mask_index_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_deterministic_codegen_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_deterministic_codegen_on_graph_break_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_deterministic_codegen_with_suffix_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_device_assert_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_diagonal_copy_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dist_bf16_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dist_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_div1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_div2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_div3_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_div4_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_div5_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_div6_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_div7_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_div8_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_div9_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_div_by_zero_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_div_precision_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_div_presicion_accuracy_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_div_prim_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_div_softmax_symfloat_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_div_zero_dim_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dont_constant_fold_cuda, 
test/inductor/test_compile_subprocess.py::GPUTests::test_dropout2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dropout3_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dropout_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dropout_deterministic_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dropout_trivial_0_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dropout_trivial_1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtype_mismatch_issue_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtype_sympy_expr_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_bfloat16_bfloat16_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_bfloat16_float16_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_bfloat16_float32_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_bfloat16_float64_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_bfloat16_int16_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_bfloat16_int32_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_bfloat16_int64_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_bfloat16_int8_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_bfloat16_uint8_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_float16_bfloat16_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_float16_float16_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_float16_float32_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_float16_float64_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_float16_int16_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_float16_int32_cuda, 
test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_float16_int64_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_float16_int8_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_float16_uint8_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_float32_bfloat16_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_float32_float16_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_float32_float32_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_float32_float64_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_float32_int16_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_float32_int32_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_float32_int64_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_float32_int8_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_float32_uint8_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_float64_bfloat16_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_float64_float16_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_float64_float32_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_float64_float64_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_float64_int16_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_float64_int32_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_float64_int64_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_float64_int8_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_float64_uint8_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_fusion_cuda, 
test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int16_bfloat16_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int16_float16_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int16_float32_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int16_float64_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int16_int16_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int16_int32_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int16_int64_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int16_int8_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int16_uint8_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int32_bfloat16_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int32_float16_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int32_float32_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int32_float64_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int32_int16_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int32_int32_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int32_int64_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int32_int8_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int32_uint8_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int64_bfloat16_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int64_float16_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int64_float32_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int64_float64_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int64_int16_cuda, 
test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int64_int32_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int64_int64_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int64_int8_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int64_uint8_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int8_bfloat16_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int8_float16_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int8_float32_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int8_float64_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int8_int16_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int8_int32_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int8_int64_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int8_int8_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int8_uint8_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_uint8_bfloat16_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_uint8_float16_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_uint8_float32_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_uint8_float64_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_uint8_int16_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_uint8_int32_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_uint8_int64_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_uint8_int8_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_uint8_uint8_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_elu_cuda, 
test/inductor/test_compile_subprocess.py::GPUTests::test_embedding_bag_byte_unpack_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_embedding_bag_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_embedding_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_embedding_sparse_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_empty1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_empty2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_empty_strided_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_emulate_precision_triton_fp_fusion_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_erfc_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_erfinv_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_exact_stride_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_exp2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_exp_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_expand_as_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_expand_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_expanded_reduction_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_expm1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_fallback_mutable_op_basic_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_fallback_mutable_op_list_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_fallback_mutable_op_list_tensor_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_fallback_mutable_op_no_mutated_tensors_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_fallback_mutable_op_with_return_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_fft_real_input_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_fft_real_input_real_output_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_fill1_cuda, 
test/inductor/test_compile_subprocess.py::GPUTests::test_fill2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_flip_cat_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_flip_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_float16_to_int16_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_float32_to_int32_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_float_index_expression_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_float_index_expression_type_promotion_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_float_repr_dynamic_shapes_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_floordiv_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_fmin_fmax_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_fmod_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_fmod_zero_dim_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_forced_buffer_realize_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_fractional_max_pool2d1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_fractional_max_pool2d2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_fractional_max_pool2d3_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_fractional_max_pool2d4_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_fractional_max_pool2d5_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_full_boolean_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_full_like_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_full_like_sliced_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_full_like_transposed_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_full_truncation_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_functionalize_rng_wrappers_cuda, 
test/inductor/test_compile_subprocess.py::GPUTests::test_fuse_large_params_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_fuse_tiled_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_fusing_write_into_disjoint_read_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_gather1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_gather2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_gather3_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_gather_scatter_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_gelu_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_generate_rand_fp8_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_generated_code_has_alignment_assert_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_generated_code_has_size_stride_assert_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_getitem_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_glu_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_gpu_scalar_with_cpu_tensor_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_gpu_scalar_with_gpu_tensor_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_graph_partition_arange1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_graph_partition_arange2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_graph_partition_argmax_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_graph_partition_both_scalars_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_graph_partition_constant_tensor1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_graph_partition_constant_tensor2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_graph_partition_misaligned_input_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_graph_partition_mutation_real_name_cuda, 
test/inductor/test_compile_subprocess.py::GPUTests::test_graph_partition_no_inputs_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_graph_partition_pad_dynamic_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_graph_partition_refcount_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_graph_partition_scalar_inputs_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_graph_partition_unbacked_symint_as_output_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_grid_sampler_2d_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_grid_sampler_expand_preserves_view_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_hardsigmoid_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_hardswish_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_hardtanh_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_horizonal_fusion1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_horizonal_fusion2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_index1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_index2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_index3_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_index_dynamic_shapes_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_index_propagation_abs_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_index_propagation_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_index_propagation_device_assert_masked_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_index_propagation_flip_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_index_propagation_floordiv_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_index_propagation_nested_indirect_indexing_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_index_propagation_remainder_cuda, 
test/inductor/test_compile_subprocess.py::GPUTests::test_index_put1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_index_put2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_index_put3_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_index_put4_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_index_put_as_masked_fill_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_index_put_deterministic_fallback_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_index_put_failed_reinplace_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_index_put_fallback1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_index_put_fallback2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_index_put_index_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_index_put_reinplace_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_index_remainder_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_index_select_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_index_tensor_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_indirect_load_broadcast_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_inductor_assert_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_inductor_layout_optimization_input_mutations_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_inductor_multiple_specializations_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_inductor_triton_bucketize_respects_masking_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_inf_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_inner_fn_str_and_stride_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_inplace_activations_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_inplace_add_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_inplace_flip_cuda, 
test/inductor/test_compile_subprocess.py::GPUTests::test_inplace_mixed_dtype_ops_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_inplace_resize_as_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_inplace_where_pointwise_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_input_mutation1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_input_mutation2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_input_mutation3_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_input_mutation4_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_input_mutation5_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_insignificant_strides_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_int8_weight_only_quant_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_int_input_dynamic_shapes_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_invalid_operand_issue1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_isin_tensor_scalar_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_isinf2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_isinf_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_issue102546_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_kernel_names_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_kwargs_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_l1_loss_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_large_broadcast_reduction_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_large_grid_use_block_ptr_False_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_large_grid_use_block_ptr_True_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_large_offset_pointwise_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_large_pointwise_cuda, 
test/inductor/test_compile_subprocess.py::GPUTests::test_large_strided_reduction_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_large_tensor_reduction_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_layer_norm_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_leaky_relu_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_lerp_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_lgamma_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_like_channels_last_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_like_rands2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_like_rands3_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_like_rands_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_like_rands_sliced_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_linear1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_linear2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_linear_dynamic_maxautotune_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_linear_float64_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_linear_mixed_dtype_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_linspace1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_linspace2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_linspace3_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_linspace4_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_list_clearing_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_log1p_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_log2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_log_fp64_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_log_softmax_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_logaddexp_cuda, 
test/inductor/test_compile_subprocess.py::GPUTests::test_logcumsumexp_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_logcumsumexp_zero_dim_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_logsumexp_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_long_tensor_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_low_memory_max_pool_dilation_1_dim_2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_low_memory_max_pool_dilation_1_dim_3_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_low_memory_max_pool_dilation_2_dim_2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_low_memory_max_pool_dilation_2_dim_3_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_mark_dynamic_with_hint_override_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_mark_unbacked_with_hint_override_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_masked_fill_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_masked_fill_promotion_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_masked_scatter_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_matmul_layer_norm_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_max_min_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_max_pool2d1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_max_pool2d2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_max_pool2d3_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_max_pool2d4_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_max_pool2d5_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_max_pool2d6_dilation_1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_max_pool2d6_dilation_2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_max_pool2d7_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_max_pool2d8_cuda, 
test/inductor/test_compile_subprocess.py::GPUTests::test_max_pool2d_with_indices_backward2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_max_pool2d_with_indices_backward3_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_max_pool2d_with_indices_backward4_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_max_pool2d_with_indices_backward5_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_max_pool2d_with_indices_backward6_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_max_pool2d_with_indices_backward_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_mean_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_min_max_reduction_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_min_max_reduction_nan_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_misaligned_address_issue1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_mix_device_index_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_mixed_mm2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_mixed_mm3_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_mixed_mm_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_mm_mixed_dtype_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_mm_views_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_move_arange_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_mul_index_expr_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_mul_softmax_symfloat_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_multi_device_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_multi_gpu_device_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_multi_gpu_recompile_on_index_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_multi_threading_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_multilayer_any_cuda, 
test/inductor/test_compile_subprocess.py::GPUTests::test_multilayer_prime_size_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_multilayer_sum_low_prec_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_multilayer_var_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_multilayer_var_lowp_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_mutable_custom_op_fixed_layout2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_mutable_custom_op_fixed_layout_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_mutations_loop_fusion_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_nan_sort_stable_False_descending_False_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_nan_sort_stable_False_descending_True_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_nan_sort_stable_True_descending_False_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_nan_sort_stable_True_descending_True_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_nan_to_num_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_narrow_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_needs_contiguous_strides_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_neg_index_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_neg_max_uint8_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_new_empty_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_new_empty_strided_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_new_ones_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_nll_loss_backward_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_nll_loss_forward_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_no_mega_fusion_during_lowering_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_no_op_reduction_cuda, 
test/inductor/test_compile_subprocess.py::GPUTests::test_no_specization_over_symbolic_value_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_nonzero_unbacked_refinement_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_norm_constant_overflow_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_one_hot_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_output_strides_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pad_cast_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pad_single_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pad_view_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pattern_matcher_multi_user_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pattern_matcher_unbacked_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_permute1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_permute2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_philox_rand_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pixel_shuffle_channels_last_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_airy_ai_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_bessel_j0_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_bessel_j1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_bessel_y0_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_bessel_y1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_chebyshev_polynomial_t_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_chebyshev_polynomial_u_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_chebyshev_polynomial_v_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_chebyshev_polynomial_w_cuda, 
test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_digamma_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_entr_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_erf_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_erfc_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_erfcx_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_erfinv_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_exp2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_expit_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_expm1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_gammainc_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_gammaincc_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_gammaln_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_hermite_polynomial_h_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_hermite_polynomial_he_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_i0_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_i0e_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_i1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_i1e_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_laguerre_polynomial_l_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_legendre_polynomial_p_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_log1p_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_log_ndtr_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_logit_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_modified_bessel_i0_cuda, 
test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_modified_bessel_i1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_modified_bessel_k0_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_modified_bessel_k1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_multigammaln_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_ndtr_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_ndtri_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_polygamma_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_psi_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_round_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_scaled_modified_bessel_k0_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_scaled_modified_bessel_k1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_shifted_chebyshev_polynomial_t_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_shifted_chebyshev_polynomial_u_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_shifted_chebyshev_polynomial_v_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_shifted_chebyshev_polynomial_w_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_sinc_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_spherical_bessel_j0_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_xlog1py_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_xlogy_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_zeta_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_polar_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pow1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pow2_cuda, 
test/inductor/test_compile_subprocess.py::GPUTests::test_pow3_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pow_by_natural_log2_dynamic_shapes_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pow_int_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pow_symfloat_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_prepare_softmax_with_fast_math_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_prod_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_profiler_mark_wrapper_call_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_progressive, test/inductor/test_compile_subprocess.py::GPUTests::test_rand_like_deterministic_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_randint_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_randint_distribution_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_randint_int64_mod_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_randint_kernel_count_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_randn_generator_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_randn_like_empty_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_randn_with_dtype_and_device_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_reduction1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_reduction2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_reduction3_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_reduction4_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_reduction5_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_reduction_config_limit_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_reflection_pad2d_backward_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_reflection_pad2d_cuda, 
test/inductor/test_compile_subprocess.py::GPUTests::test_reinterpret_dtypeview_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_relu_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_remainder_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_remove_no_ops_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_remove_noop_clone_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_remove_noop_copy_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_remove_noop_slice1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_remove_noop_slice_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_remove_noop_slice_scatter_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_remove_noop_view_default_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_remove_noop_view_dtype_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_repeat_as_strided_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_repeat_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_repeat_interleave_2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_repeat_interleave_Tensor_decomp_int32_nd_1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_repeat_interleave_Tensor_decomp_int32_nd_2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_repeat_interleave_Tensor_decomp_int64_nd_1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_repeat_interleave_Tensor_decomp_int64_nd_2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_repeat_interleave_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_replication_pad_errors_with_bool_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_require_stride_expanded_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_resize_as_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_resize_cuda, 
test/inductor/test_compile_subprocess.py::GPUTests::test_reuse_buffers_with_aliasing_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_roi_align_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_roll_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_round_correctness_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_round_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_rsqrt_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_rsqrt_dynamic_shapes_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_scalar_cpu_tensor_arg_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_scalar_input_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_scalar_output_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_scaled_dot_product_attention_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_scaled_dot_product_efficient_attention_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_scatter1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_scatter2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_scatter3_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_scatter4_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_scatter5_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_scatter6_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_scatter_add1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_scatter_add2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_scatter_add3_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_scatter_bf16_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_scatter_reduce1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_scatter_reduce2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_scatter_reduce3_cuda, 
test/inductor/test_compile_subprocess.py::GPUTests::test_scheduler_vertical_fusion1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_sdpa_prefer_nd_tiling_False_use_block_ptr_False_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_sdpa_prefer_nd_tiling_False_use_block_ptr_True_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_sdpa_prefer_nd_tiling_True_use_block_ptr_False_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_sdpa_prefer_nd_tiling_True_use_block_ptr_True_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_sdpa_unaligned_mask_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_sdpa_unaligned_mask_freezing_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_searchsorted_broadcast_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_searchsorted_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_select_scatter_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_setitem_with_int_parameter_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_sgn_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_sgn_extremal_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_shape_padding_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_shape_prop_torch_ones_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_should_pad_bench_for_bmm_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_sigmoid_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_sign_dtype_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_signbit_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_silu_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_simplify_loops_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_sin_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_single_elem_cuda, 
test/inductor/test_compile_subprocess.py::GPUTests::test_single_elem_indirect_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_size_asserts_for_multi_output_fallback_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_sizehint_issue1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_slice1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_slice2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_slice3_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_slice4_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_slice_mutation1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_slice_mutation2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_slice_mutation3_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_slice_scatter2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_slice_scatter3_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_slice_scatter4_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_slice_scatter5_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_slice_scatter_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_slice_scatter_dtype_consistency_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_slice_scatter_reinplace_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_slice_view_with_graph_break_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_softmax_backward_data_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_softmax_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_softmax_one_kernel_loop_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_softmax_one_kernel_persist_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_sort_bool_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_sort_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_sort_stable_cuda, 
test/inductor/test_compile_subprocess.py::GPUTests::test_sort_transpose_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_special_polygamma_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_split_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_split_cumprod_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_split_cumprod_low_prec_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_split_cumsum_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_split_cumsum_index_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_split_cumsum_low_prec_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_split_failed_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_split_reduction_dynamic_shape_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_split_reduction_with_int64_size_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_split_with_integer_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_split_with_list_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_split_with_sizes_with_unbacked_symints_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_split_with_unbacked_symints_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_sqrt_dynamic_shapes_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_squeeze1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_squeeze2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_squeeze_varargs_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_stack_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_std_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_stride_preservation_with_stride_modifying_fx_pass_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_strided_inputs_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_sum1_cuda, 
test/inductor/test_compile_subprocess.py::GPUTests::test_sum2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_sum3_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_sum4_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_sum5_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_sum_dtype_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_sum_int_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_sum_keepdims_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_tan_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_tanh_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_tensor1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_tensor2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_tensor3_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_tensor_index_put_slice_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_tensor_index_slice_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_tmp_not_defined_issue1_use_block_ptr_True_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_tmp_not_defined_issue2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_tmp_not_defined_issue3_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_to_device_constant_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_to_device_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_to_dtype_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_to_memory_format_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_topk_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_torch_device_split_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_transpose_add_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_transpose_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_transposed_propagates_cuda, 
test/inductor/test_compile_subprocess.py::GPUTests::test_triton_kernel_bool_param_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_triu_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_uint4x2_mixed_mm_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_uint_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_unbacked_floordiv_simplify_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_unbacked_floordiv_simplify_errors_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_unbind_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_unfold_zero_dimension_tensor_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_unroll_small_reduction_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_unsigned_constant_tensors_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_unspec_inputs_bfloat16_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_unspec_inputs_float16_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_unspec_inputs_float32_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_unspec_inputs_float64_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_unspec_inputs_int16_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_unspec_inputs_int32_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_unspec_inputs_int64_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_unspec_inputs_int8_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_unspec_inputs_uint8_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_unsqueeze_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_unsqueeze_inplace_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_upsample_bicubic2d_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_upsample_bilinear2d_a_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_upsample_bilinear2d_b_cuda, 
test/inductor/test_compile_subprocess.py::GPUTests::test_upsample_cat_conv_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_upsample_nearest1d_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_upsample_nearest2d_backward_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_upsample_nearest2d_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_upsample_nearest3d_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_var_correction_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_var_mean_div_by_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_var_mean_tile_reduction_False_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_var_mean_tile_reduction_True_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_vdd_clamp_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_vectorized_ops_masked_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_vectorized_ops_masked_var_novec_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_vertical_fusion1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_view_as_complex_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_view_as_real_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_view_detach_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_view_on_aliased_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_view_uint8_through_differing_bitwidths_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_views1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_views2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_views3_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_views4_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_views5_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_views6_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_views7_cuda, 
test/inductor/test_compile_subprocess.py::GPUTests::test_weight_norm_bwd_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_where_broadcast_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_where_with_logical_op_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_xblock_divides_xnumel_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_zero_dim_reductions_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_zero_element_mutation_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_zeros_cuda 2025-10-10T01:55:07.1103107Z 2025-10-10T01:55:10.9314198Z Running inductor/test_gpu_cpp_wrapper 1/1 ... [2025-10-10 01:55:10.930872] 2025-10-10T01:55:10.9314979Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:55:10.9316885Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_gpu_cpp_wrapper.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:55:10.931282] 2025-10-10T01:55:11.0343600Z 2025-10-10T01:55:11.0345208Z export/test_tools 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_tools_1.1_8d0107df3e998ceb_.log 2025-10-10T01:55:11.0347514Z Running 2 items in this shard: test/export/test_tools.py::TestExportTools::test_report_exportability_basic, test/export/test_tools.py::TestExportTools::test_report_exportability_with_issues 2025-10-10T01:55:11.0348850Z 2025-10-10T01:55:14.8692982Z Running export/test_export_with_inline_and_install 1/1 ... 
[2025-10-10 01:55:14.868689] 2025-10-10T01:55:14.8693492Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:55:14.8695239Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_export_with_inline_and_install.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:55:14.869138] 2025-10-10T01:55:19.5130925Z 2025-10-10T01:55:19.5131928Z inductor/test_gpu_cpp_wrapper 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_gpu_cpp_wrapper_1.1_c98ea84e751491e2_.log 2025-10-10T01:55:19.5262299Z Running 295 items in this shard: test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_add_complex4_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_add_complex_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_adding_tensor_offsets_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_addmm_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_aoti_debug_printer_works_on_constants, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_as_strided_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_batch_norm_2d_2_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_bernoulli1_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_bitwise_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_bmm1_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_bmm2_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_buffer_use_after_remove_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_cat_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_cat_slice_cat_cuda_gpu_wrapper, 
test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_consecutive_split_cumprod_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_conv_backward_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_convolution1_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_custom_op_1_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_custom_op_2_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_custom_op_3_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_bfloat16_bfloat16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_bfloat16_float16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_bfloat16_float32_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_bfloat16_float64_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_bfloat16_int16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_bfloat16_int32_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_bfloat16_int64_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_bfloat16_int8_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_bfloat16_uint8_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float16_bfloat16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float16_float16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float16_float32_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float16_float64_cuda_gpu_wrapper, 
test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float16_int16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float16_int32_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float16_int64_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float16_int8_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float16_uint8_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float32_bfloat16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float32_float16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float32_float32_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float32_float64_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float32_int16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float32_int32_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float32_int64_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float32_int8_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float32_uint8_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float64_bfloat16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float64_float16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float64_float32_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float64_float64_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float64_int16_cuda_gpu_wrapper, 
test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float64_int32_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float64_int64_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float64_int8_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float64_uint8_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_fusion_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int16_bfloat16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int16_float16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int16_float32_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int16_float64_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int16_int16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int16_int32_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int16_int64_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int16_int8_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int16_uint8_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int32_bfloat16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int32_float16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int32_float32_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int32_float64_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int32_int16_cuda_gpu_wrapper, 
test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int32_int32_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int32_int64_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int32_int8_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int32_uint8_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int64_bfloat16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int64_float16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int64_float32_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int64_float64_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int64_int16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int64_int32_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int64_int64_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int64_int8_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int64_uint8_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int8_bfloat16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int8_float16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int8_float32_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int8_float64_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int8_int16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int8_int32_cuda_gpu_wrapper, 
test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int8_int64_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int8_int8_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int8_uint8_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_uint8_bfloat16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_uint8_float16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_uint8_float32_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_uint8_float64_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_uint8_int16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_uint8_int32_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_uint8_int64_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_uint8_int8_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_uint8_uint8_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dynamic_shapes_persistent_reduction_mixed_x_dim_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_embedding_bag_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_enable_dynamic_shapes_cpp_wrapper_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_fft_real_input_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_fft_real_input_real_output_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_foreach_cpp_wrapper_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_index_put_deterministic_fallback_cuda_gpu_wrapper, 
test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_index_tensor_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_inductor_layout_optimization_input_mutations_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_insignificant_strides_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_layer_norm_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_linear1_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_linear2_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_linear_relu_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_mm_plus_mm2_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_mm_plus_mm3_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_mm_views_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_multi_device_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_multi_threading_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_non_tensor_args_wrapped_on_cpu, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_pointwise_hermite_polynomial_h_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_pointwise_hermite_polynomial_he_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_pow3_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_profiler_mark_wrapper_call_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_randint_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_reduction1_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_relu_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_repeat_interleave_2_cuda_gpu_wrapper, 
test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_roi_align_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_scalar_input_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_scaled_dot_product_attention_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_scaled_dot_product_efficient_attention_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_silu_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_sort_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_sum_dtype_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_sum_int_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_transpose_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_unspec_inputs_bfloat16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_unspec_inputs_float16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_unspec_inputs_float32_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_unspec_inputs_float64_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_unspec_inputs_int16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_unspec_inputs_int32_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_unspec_inputs_int64_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_unspec_inputs_int8_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_unspec_inputs_uint8_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_add_complex4_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_add_complex_cuda_dynamic_shapes_gpu_wrapper, 
test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_adding_tensor_offsets_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_addmm_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_annotation_training, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_as_strided_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_batch_norm_2d_2_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_bernoulli1_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_bitwise_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_bmm1_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_bmm2_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_buffer_use_after_remove_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_cat_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_cat_slice_cat_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_consecutive_split_cumprod_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_conv_backward_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_convolution1_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_custom_op_1_cuda_dynamic_shapes_gpu_wrapper, 
test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_custom_op_2_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_custom_op_3_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_bfloat16_bfloat16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_bfloat16_float16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_bfloat16_float32_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_bfloat16_float64_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_bfloat16_int16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_bfloat16_int32_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_bfloat16_int64_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_bfloat16_int8_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_bfloat16_uint8_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float16_bfloat16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float16_float16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float16_float32_cuda_dynamic_shapes_gpu_wrapper, 
test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float16_float64_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float16_int16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float16_int32_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float16_int64_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float16_int8_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float16_uint8_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float32_bfloat16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float32_float16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float32_float32_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float32_float64_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float32_int16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float32_int32_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float32_int64_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float32_int8_cuda_dynamic_shapes_gpu_wrapper, 
test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float32_uint8_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float64_bfloat16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float64_float16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float64_float32_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float64_float64_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float64_int16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float64_int32_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float64_int64_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float64_int8_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float64_uint8_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_fusion_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int16_bfloat16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int16_float16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int16_float32_cuda_dynamic_shapes_gpu_wrapper, 
test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int16_float64_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int16_int16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int16_int32_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int16_int64_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int16_int8_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int16_uint8_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int32_bfloat16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int32_float16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int32_float32_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int32_float64_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int32_int16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int32_int32_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int32_int64_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int32_int8_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int32_uint8_cuda_dynamic_shapes_gpu_wrapper, 
test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int64_bfloat16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int64_float16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int64_float32_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int64_float64_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int64_int16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int64_int32_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int64_int64_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int64_int8_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int64_uint8_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int8_bfloat16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int8_float16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int8_float32_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int8_float64_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int8_int16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int8_int32_cuda_dynamic_shapes_gpu_wrapper, 
test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int8_int64_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int8_int8_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int8_uint8_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_uint8_bfloat16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_uint8_float16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_uint8_float32_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_uint8_float64_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_uint8_int16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_uint8_int32_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_uint8_int64_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_uint8_int8_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_uint8_uint8_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dynamic_shapes_persistent_reduction_mixed_x_dim_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_embedding_bag_cuda_dynamic_shapes_gpu_wrapper, 
test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_enable_dynamic_shapes_cpp_wrapper_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_fft_real_input_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_fft_real_input_real_output_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_foreach_cpp_wrapper_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_index_put_deterministic_fallback_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_index_tensor_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_inductor_layout_optimization_input_mutations_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_insignificant_strides_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_layer_norm_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_linear1_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_linear2_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_linear_relu_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_mm_plus_mm2_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_mm_plus_mm3_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_mm_views_cuda_dynamic_shapes_gpu_wrapper, 
test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_multi_device_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_multi_threading_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_pointwise_hermite_polynomial_h_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_pointwise_hermite_polynomial_he_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_pow3_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_profiler_mark_wrapper_call_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_randint_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_reduction1_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_relu_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_repeat_interleave_2_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_roi_align_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_scalar_input_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_scaled_dot_product_attention_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_scaled_dot_product_efficient_attention_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_silu_cuda_dynamic_shapes_gpu_wrapper, 
test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_sort_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_sum_dtype_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_sum_int_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_transpose_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_unspec_inputs_bfloat16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_unspec_inputs_float16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_unspec_inputs_float32_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_unspec_inputs_float64_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_unspec_inputs_int16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_unspec_inputs_int32_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_unspec_inputs_int64_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_unspec_inputs_int8_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_unspec_inputs_uint8_cuda_dynamic_shapes_gpu_wrapper
2025-10-10T01:55:19.5382885Z
2025-10-10T01:55:23.1015100Z
2025-10-10T01:55:23.1018702Z export/test_export_with_inline_and_install 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_export_with_inline_and_install_1.1_2000574adcf3ce6d_.log
2025-10-10T01:55:23.1306535Z Running 433 items in this shard:
test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestDynamismExpression::test_export_assume_static_by_default_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestDynamismExpression::test_export_constraints_error_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestDynamismExpression::test_export_constraints_error_not_in_range_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestDynamismExpression::test_export_inline_constraints_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestDynamismExpression::test_export_slice_maxsize_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestDynamismExpression::test_export_slice_unbacked_dim1_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestDynamismExpression::test_export_strict_narrow_unbacked_expr_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestDynamismExpression::test_no_grad_param_inplace_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestDynamismExpression::test_reshape_view_backed_size_oblivious_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test__scaled_dot_product_flash_attention_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_additional_inputs_constants_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_allow_explicit_guards_as_runtime_asserts_inline_and_install_strict, 
test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_args_type_checked_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_aten_lift_fresh_copy_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_attention_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_attr_assignment_extra_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_automatic_constrain_size_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_automatic_dynamic_shapes_constant_relation_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_automatic_dynamic_shapes_linear_relation_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_automatic_dynamic_shapes_simple_equality_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_baddbmm_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_basic_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_basic_non_strict_fake_tensor_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_basic_non_strict_real_tensor_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_bincount_inline_and_install_strict, 
test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_buffer_util_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_capture_subclass_constructor_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_capture_subclass_constructor_torch_ir_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_capture_subclass_wrong_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_ccode_python_mod_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_cdist_forward_compute_mode_zero_export_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_check_specialized_int_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_checks_to_constrain_range_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_cleanup_dynamic_markers_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_colin_unbacked_backed_vr_sub_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_colon_parameter_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_compiling_state_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_cond_access_identical_symint_closure_inline_and_install_strict, 
test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_cond_branches_return_constant_int_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_cond_branches_return_same_int_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_cond_buffers_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_cond_contains_unbacked_no_escape_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_cond_int_closure_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_cond_unflatten_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_cond_with_module_stack_export_with_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_cond_with_module_stack_export_with_unflatten_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_constant_aliasing_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_constant_input_naming_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_constant_no_user_inp_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_constant_output_dup_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_constant_output_inline_and_install_strict, 
test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_constant_requires_grad_const_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_constant_return_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_constant_tensor_mutation_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_constant_tensor_with_non_functional_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_constant_tensor_with_non_functional_nested_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_constrain_decomp_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_constrain_size_in_eager_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_constrain_size_with_constrain_value_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_constrain_size_with_various_cases_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_conv_dynamic_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_crop_like_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_cse_for_symint_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_custom_op_auto_functionalize_inline_and_install_strict, 
test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_custom_op_auto_functionalize_pre_dispatch_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_custom_op_auto_warn_pre_dispatch_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_custom_op_preserve_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_custom_pytree_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_custom_tag_metadata_re_export_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_decomp_batch_norm_functional_predispatch_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_decomp_item_in_prim_after_decomposition_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_decomp_item_in_prim_before_decomposition_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_default_decomposition_core_cia_ops_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_derived_dim_1_2_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_derived_dim_basic_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_derived_dim_integer_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_derived_dim_nested_inline_and_install_strict, 
test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_derived_dim_out_of_order_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_derived_dim_out_of_order_repeat_derived_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_derived_dim_out_of_order_simplified_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_derived_dim_out_of_order_simplified_repeat_non_derived_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_derived_dim_repeat_derived_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_detect_leak_nonstrict_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_detect_leak_nonstrict_with_stacktrace_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_detect_leak_strict_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_device_to_dynamic_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_device_to_gpu_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_device_to_mutation_float_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_device_to_mutation_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_device_to_static_inline_and_install_strict, 
test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_dim_1_2_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_dim_auto_and_dim_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_dim_dynamic_divisibility_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_dim_dynamic_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_dim_dynamic_specialization_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_dim_hint_range_violations_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_dim_hint_ranges_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_disable_forced_specializations_errors_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_disable_forced_specializations_ok_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_distributed_all_gather_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_distributed_all_gather_into_tensor_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_distributed_all_reduce_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_distributed_all_to_all_single_inline_and_install_strict, 
test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_distributed_reduce_scatter_tensor_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_dont_duck_size_for_auto_dynamic_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_double_lifted_constants_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_draft_export_checks_aliasing_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_draft_export_checks_mutation_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_draft_export_checks_mutation_list_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_draft_export_checks_mutation_with_nan_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_draft_export_fake_kernel_inference_errors_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_draft_export_infers_fake_kernel_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_duplicate_modules_with_non_persistent_buffers_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_dynamic_lr_shift_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_dynamic_shapes_bounds_inline_and_install_strict, 
test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_dynamic_shapes_builder_basic_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_dynamic_shapes_builder_kwargs_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_dynamic_shapes_builder_pytree_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_dynamic_shapes_dataclass_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_dynamic_shapes_inferred_basic_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_dynamic_shapes_serdes_generic_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_dynamic_shapes_serdes_user_errors_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_dynamic_shapes_serdes_various_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_dynamic_shapes_spec_with_pytree_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_dynamic_shapes_wrapped_with_shape_guards_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_dynamic_sym_round_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_ends_of_bounds_oblivious_inline_and_install_strict, 
test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_error_does_not_reference_eager_fallback_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_error_when_passing_mutating_primitive_op_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_exception_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_expand_copy_export_handles_implicit_true_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_export_api_with_dynamic_shapes_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_export_as_backend_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_export_associative_scan_lifted_buffers_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_export_associative_scan_symbol_dim_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_export_associative_scan_symbol_scandim_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_export_aten_to_unflatten_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_export_aten_to_unflatten_subclass_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_export_aten_to_unflatten_subclass_pre_dispatch_inline_and_install_strict, 
test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_export_cond_preserve_torch_fn_for_subgraphs_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_export_cond_symbool_pred_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_export_cond_warns_constant_pred_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_export_custom_decomp_table_basic_pop_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_export_custom_decomp_table_container_methods_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_export_custom_op_lib_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_export_custom_triton_kernel_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_export_custom_triton_kernel_mutable_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_export_cyclic_reference_leak_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_export_decomp_torture_case_1_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_export_decomp_torture_case_2_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_export_decomps_dynamic_inline_and_install_strict, 
test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_export_decomps_simple_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_export_dynamo_config_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_export_for_training_run_decomp_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_export_for_training_with_container_type_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_export_for_training_with_dynamic_shapes_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_export_for_training_with_mutation_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_export_for_training_with_state_dict_hooks_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_export_func_with_default_kwargs_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_export_func_with_keyword_only_args_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_export_func_with_kwargs_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_export_func_with_pytree_kwargs_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_export_func_with_var_keyword_args_inline_and_install_strict, 
test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_export_func_with_var_keyword_pytree_args_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_export_func_with_var_postional_args_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_export_function_schema_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_export_graph_with_no_inputs_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_export_input_mutation_bug_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_export_input_mutation_dynamic_shape_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_export_input_mutation_static_shape_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_export_leak_compile_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_export_linear_preserve_dynamic_shape_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_export_max_nonstrict_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_export_max_onnx_reported_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_export_method_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_export_mod_constraints_inline_and_install_strict, 
test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_export_module_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_export_preserve_linear_at_aot_level_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_export_preserve_linear_but_not_custom_op_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_export_rnn_variants_with_warning_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_export_scan_pytree_output_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_export_script_module_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_export_statically_known_true_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_export_then_compile_tensor_ctor_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_export_with_autocast_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_export_with_fake_tensor_inputs_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_export_with_fake_tensor_inputs_on_cuda_devices_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_export_with_inline_constraints_complex_inline_and_install_strict, 
test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_export_with_inline_constraints_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_export_with_set_grad_enabled_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_export_with_wrong_inputs_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_external_call_non_strict_real_tensor_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_fake_inputs_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_fake_weights_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_filter_traceback_frames_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_flex_attention_export_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_float_conversion_from_int_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_float_conversion_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_fqn_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_from_node_metadata_export_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_full_on_scalar_tensor_inline_and_install_strict, 
test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_function_holding_tensor_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_hints_wrapper_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_hoo_inline_users_issue_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_if_functional_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_if_post_autograd_op_preserved_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_inductor_backend_inside_nonstrict_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_inline_script_class_method_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_inline_script_class_method_recursive_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_inline_script_function_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_inline_script_method_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_int_shape_specialization_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_intermediate_shape_comp_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_is_exporting_inline_and_install_strict, 
test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_is_nonzero_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_isnonzero_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_issue_113041_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_issue_157289_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_issue_161902_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_istft_op_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_keep_composite_ops_invalid_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_keep_composite_ops_linear_convd_for_training_ir_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_keep_composite_ops_linear_convd_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_kwarg_dynamic_shapes_diff_order_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_kwargs_reorder_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_layer_norm_unbacked_normalized_shape_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_layer_sharing_inline_and_install_strict, 
test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_lazy_module_kwargs_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_lifted_constants_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_linear_conv_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_malformed_fqn_from_source_name_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_map_buffers_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_map_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_mask_nonzero_static_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_masked_select_dynamic_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_math_pow_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_mismatched_dynamic_shapes_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_mixed_input_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_module_dict_key_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_module_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_module_input_inline_and_install_strict, 
test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_module_input_subclasses_parameterization_nested_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_module_list_slice_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_module_with_dict_container_inp_out_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_modules_access_for_deleted_submodule_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_more_multidimensional_slicing_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_multidimensional_slicing_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_multinomial_dynamic_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_multiple_definitions_same_name_dim_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_namedtuple_input_export_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_native_multi_attention_head_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_nested_dynamic_shapes_spec_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_nested_module_fake_tensor_leak_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_nested_module_inline_and_install_strict, 
test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_nested_module_with_constant_buffer_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_nested_module_with_init_buffer_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_nested_module_with_parameter_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_nn_module_stack_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_nn_module_stack_shared_submodule_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_no_check_is_size_error_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_no_suggested_fixes_for_data_dependent_errors_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_no_tensor_computation_2_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_no_tensor_computation_3_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_no_tensor_computation_4_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_no_tensor_computation_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_non_arg_name_dynamic_shapes_api_inline_and_install_strict, 
test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_non_arg_name_dynamic_shapes_api_with_container_type_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_non_arg_name_dynamic_shapes_api_with_kwarg_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_non_persistent_buffer_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_non_strict_dynamic_shapes_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_non_strict_dynamic_shapes_suggested_fixes_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_none_buffers_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_nonstrict_retrace_preserves_metadata_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_nonzero_2_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_nonzero_dynamic_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_not_registered_parameter_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_operator_aten_tensor_mode_variant_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_output_node_name_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_pad_sequence_inline_and_install_strict, 
test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_param_util_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_partial_patched_forward_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_placeholder_naming_collisions_hoo_subgraphs_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_placeholder_naming_collisions_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_placeholder_naming_order_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_placeholder_naming_order_variadic_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_placeholder_update_preserving_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_predispatch_cond_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_predispatch_grad_wrappers_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_preserve_annotation_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_preserve_module_call_signature_unflatten_specialization_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_preserve_requires_grad_placeholders_inline_and_install_strict, 
test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_preserve_shape_dynamism_for_unused_inputs_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_profiling_code_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_python_asserts_with_sym_int_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_pytree_register_data_class_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_pytree_register_nested_data_class_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_raise_user_error_when_guard_on_data_dependent_operation_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_range_constraints_with_replacement_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_real_tensor_alias_dtype_mismatch_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_real_tensor_bool_cast_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_real_tensor_errors_on_aliasing_custom_op_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_real_tensor_for_max_op_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_real_tensor_size_mismatch_inline_and_install_strict, 
test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_redundant_assert_max_upper_bound_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_redundant_asserts_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_refine_dynamic_shapes_from_suggested_fixes_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_register_constant_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_repeat_interleave_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_replace_unbacked_with_very_large_upperbound_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_replaced_unbacked_bindings_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_reshape_view_helper_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_retracable_ep_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_retrace_pre_autograd_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_run_decomposition_supports_user_input_mutation_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_run_decompositions_keep_metadata_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_run_decompositions_keep_tensor_constant_metadata_inline_and_install_strict, 
test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_runtime_assert_for_prim_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_runtime_assert_for_prm_str_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_runtime_assert_with_size_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_sdpa_gqa_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_sequential_slicing_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_set_example_inputs_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_set_grad_as_side_effect_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_set_grad_empty_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_set_grad_unflatten_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_setgrad_lifted_tensor_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_shared_submodule_nn_module_stack_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_simple_export_for_training_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_simple_unbacked_view_inline_and_install_strict, 
test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_size_input_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_slice_nn_module_stack_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_solver_unsupported_sympy_function_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_specialize_derived_dim_roots_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_split_const_gm_with_lifted_constants_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_stack_trace_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_stack_trace_make_fx_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_state_primitives_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_state_shape_attribute_assignment_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_state_tensors_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_static_dim_constraints_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_subclass_context_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_subclass_nested_attr_access_complicated_metadata_inline_and_install_strict, 
test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_subclass_nested_attr_access_const_metadata_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_subclass_nested_attr_access_const_metadata_not_top_level_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_subclass_nested_attr_access_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_subclass_nested_attr_access_submodule_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_subclasses_parameterization_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_subclasses_parameterization_nested_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_suggest_torch_checks_with_non_negative_check_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_suggest_torch_checks_with_regular_check_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_suggested_fixes_for_data_dependent_errors_basic_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_suggested_fixes_for_data_dependent_errors_puzzlers_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_suggested_fixes_new_roots_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_sym_float_operators_inline_and_install_strict, 
test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_sym_or_sym_and_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_sym_sqrt_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_symbool_item_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_symfloat_item_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_symint_input_additional_inputs_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_symint_input_basic_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_symint_input_ranges_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_symint_input_shapes_collection_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_symint_input_specialization_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_symint_item_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_symint_output_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_symint_tensor_return_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_tag_ac_export_inline_and_install_strict, 
test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_tensor_attribute_zero_args_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_tensor_constant_aten_to_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_tensor_constant_with_wrapped_method_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_to_module_with_mutated_buffer_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_to_module_with_mutated_buffer_multiple_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_to_module_with_mutated_buffer_multiple_update_sub_later_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_tolist_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_torch_check_eq_commutativity_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_torch_fn_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_trace_under_fake_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_train_eval_on_exported_preautograd_module_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_unbacked_3d_matmul_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_unbacked_bincount_inline_and_install_strict, 
test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_unbacked_bindings_for_divisible_u_symint_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_unbacked_deferred_runtime_retrace_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_unbacked_expand_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_unbacked_infer_size_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_unbacked_kth_value_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_unbacked_linear_layer_norm_input_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_unbacked_noncontig_lin_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_unbacked_pad_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_unbacked_scalar_constructor_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_unbacked_slice_forward_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_unbacked_slice_simple_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_unbacked_stack_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_unbacked_to_cond_inline_and_install_strict, 
test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_unbacked_to_cond_passthrough_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_unbacked_unsqueeze_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_unflatten_asserts_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_unflatten_buffer_update_child2parent_swap_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_unflatten_closure_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_unflatten_isinstance_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_unflatten_multiple_graphs_dispatch_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_unflatten_multiple_graphs_preserve_signature_no_error_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_unflatten_multiple_graphs_shared_submodule_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_unflatten_multiple_graphs_state_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_unflatten_no_unroll_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_unflatten_placeholder_update_child2parent_swap_inline_and_install_strict, 
test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_unflatten_placeholder_update_grandchild2cousin_swap_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_unflatten_random_dag_5_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_unflatten_random_dag_6_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_unflatten_random_dag_buf_8_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_unflatten_random_dag_const_preserving_3_1_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_unflatten_random_dag_const_preserving_3_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_unflatten_random_dag_mutating_buf_4_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_unflatten_random_dag_mutating_buf_6_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_unflatten_random_dag_mutating_buf_9_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_unflatten_random_dag_mutating_buf_preserving_10_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_unflatten_random_dag_mutating_buf_preserving_4_1_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_unflatten_random_dag_mutating_buf_preserving_4_inline_and_install_strict, 
test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_unflatten_random_dag_mutating_buf_preserving_5_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_unflatten_random_dag_mutating_buf_preserving_7_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_unflatten_random_dag_preserving_4_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_unused_aliases_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_unused_constant_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_use_embedding_twice_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_user_input_and_buffer_mutation_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_vmap_custom_autograd_function_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_vmap_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_vmap_to_assert_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_where_decomp_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_while_loop_assert_separation_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_while_loop_index_assertions_inline_and_install_strict, 
test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_while_loop_simple_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_while_loop_tensor_constant_idx_inline_and_install_strict, test/export/test_export_with_inline_and_install.py::InlineAndInstallStrictExportTestExport::test_wrapper_module_inline_and_install_strict 2025-10-10T01:55:23.1547358Z 2025-10-10T01:55:23.4094920Z Running export/test_serialize 1/1 ... [2025-10-10 01:55:23.408907] 2025-10-10T01:55:23.4095437Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:55:23.4096469Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_serialize.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:55:23.409329] 2025-10-10T01:55:27.0108930Z Running dynamo/test_functions 1/1 ... [2025-10-10 01:55:27.010304] 2025-10-10T01:55:27.0109485Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:55:27.0112464Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_functions.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:55:27.010734] 2025-10-10T01:55:30.8894045Z 2025-10-10T01:55:30.8895100Z export/test_serialize 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_serialize_1.1_95f8a40fb26f9781_.log 2025-10-10T01:55:30.8928909Z Running 113 items in this shard: test/export/test_serialize.py::TestSerialize::test_1D_tensor_slicing, test/export/test_serialize.py::TestSerialize::test_2D_tensor_slicing, test/export/test_serialize.py::TestSerialize::test_canonicalize, test/export/test_serialize.py::TestSerialize::test_complex_constant, test/export/test_serialize.py::TestSerialize::test_empty_constant, test/export/test_serialize.py::TestSerialize::test_empty_state_dict, test/export/test_serialize.py::TestSerialize::test_export_example_inputs_preserved, test/export/test_serialize.py::TestSerialize::test_export_with_extension_op_serialization, test/export/test_serialize.py::TestSerialize::test_int_list, test/export/test_serialize.py::TestSerialize::test_kwargs_default, test/export/test_serialize.py::TestSerialize::test_metadata_parsing_with_layer_split, test/export/test_serialize.py::TestSerialize::test_metadata_run_decomp_serder, test/export/test_serialize.py::TestSerialize::test_multi_return_some_unused, test/export/test_serialize.py::TestSerialize::test_nested_layer_split, test/export/test_serialize.py::TestSerialize::test_non_float_weight, test/export/test_serialize.py::TestSerialize::test_nonfinite_inputs, test/export/test_serialize.py::TestSerialize::test_predispatch_export_with_autograd_op, test/export/test_serialize.py::TestSerialize::test_preserve_aliasing, test/export/test_serialize.py::TestSerialize::test_rational_ranges, test/export/test_serialize.py::TestSerialize::test_serialize_constant_outputs, test/export/test_serialize.py::TestSerialize::test_serialize_infinite_sym_int, test/export/test_serialize.py::TestSerialize::test_serialize_list_returns, 
test/export/test_serialize.py::TestSerialize::test_serialize_multiple_returns_from_node, test/export/test_serialize.py::TestSerialize::test_serialize_param_mutation, test/export/test_serialize.py::TestSerialize::test_serialize_sym_float, test/export/test_serialize.py::TestSerialize::test_serialize_sym_int, test/export/test_serialize.py::TestSerialize::test_storage_offset, test/export/test_serialize.py::TestSerialize::test_symint_list, test/export/test_serialize.py::TestSerialize::test_triton_hop, test/export/test_serialize.py::TestSerialize::test_weight_sharing_gpu, test/export/test_serialize.py::TestDeserialize::test_arg_from, test/export/test_serialize.py::TestDeserialize::test_auto_functionalize, test/export/test_serialize.py::TestDeserialize::test_basic, test/export/test_serialize.py::TestDeserialize::test_cond, test/export/test_serialize.py::TestDeserialize::test_constraints, test/export/test_serialize.py::TestDeserialize::test_custom_obj, test/export/test_serialize.py::TestDeserialize::test_custom_obj_list_out, test/export/test_serialize.py::TestDeserialize::test_custom_obj_tuple_out, test/export/test_serialize.py::TestDeserialize::test_device, test/export/test_serialize.py::TestDeserialize::test_dynamic, test/export/test_serialize.py::TestDeserialize::test_export_no_inputs, test/export/test_serialize.py::TestDeserialize::test_exportdb_supported_case_assume_constant_result, test/export/test_serialize.py::TestDeserialize::test_exportdb_supported_case_autograd_function, test/export/test_serialize.py::TestDeserialize::test_exportdb_supported_case_class_method, test/export/test_serialize.py::TestDeserialize::test_exportdb_supported_case_cond_branch_class_method, test/export/test_serialize.py::TestDeserialize::test_exportdb_supported_case_cond_branch_nested_function, test/export/test_serialize.py::TestDeserialize::test_exportdb_supported_case_cond_branch_nonlocal_variables, 
test/export/test_serialize.py::TestDeserialize::test_exportdb_supported_case_cond_closed_over_variable, test/export/test_serialize.py::TestDeserialize::test_exportdb_supported_case_cond_operands, test/export/test_serialize.py::TestDeserialize::test_exportdb_supported_case_cond_predicate, test/export/test_serialize.py::TestDeserialize::test_exportdb_supported_case_constrain_as_size_example, test/export/test_serialize.py::TestDeserialize::test_exportdb_supported_case_constrain_as_value_example, test/export/test_serialize.py::TestDeserialize::test_exportdb_supported_case_decorator, test/export/test_serialize.py::TestDeserialize::test_exportdb_supported_case_dictionary, test/export/test_serialize.py::TestDeserialize::test_exportdb_supported_case_dynamic_shape_assert, test/export/test_serialize.py::TestDeserialize::test_exportdb_supported_case_dynamic_shape_constructor, test/export/test_serialize.py::TestDeserialize::test_exportdb_supported_case_dynamic_shape_if_guard, test/export/test_serialize.py::TestDeserialize::test_exportdb_supported_case_dynamic_shape_map, test/export/test_serialize.py::TestDeserialize::test_exportdb_supported_case_dynamic_shape_slicing, test/export/test_serialize.py::TestDeserialize::test_exportdb_supported_case_dynamic_shape_view, test/export/test_serialize.py::TestDeserialize::test_exportdb_supported_case_fn_with_kwargs, test/export/test_serialize.py::TestDeserialize::test_exportdb_supported_case_list_contains, test/export/test_serialize.py::TestDeserialize::test_exportdb_supported_case_list_unpack, test/export/test_serialize.py::TestDeserialize::test_exportdb_supported_case_nested_function, test/export/test_serialize.py::TestDeserialize::test_exportdb_supported_case_null_context_manager, test/export/test_serialize.py::TestDeserialize::test_exportdb_supported_case_optional_input, test/export/test_serialize.py::TestDeserialize::test_exportdb_supported_case_pytree_flatten, 
test/export/test_serialize.py::TestDeserialize::test_exportdb_supported_case_scalar_output, test/export/test_serialize.py::TestDeserialize::test_exportdb_supported_case_specialized_attribute, test/export/test_serialize.py::TestDeserialize::test_exportdb_supported_case_static_for_loop, test/export/test_serialize.py::TestDeserialize::test_exportdb_supported_case_static_if, test/export/test_serialize.py::TestDeserialize::test_exportdb_supported_case_tensor_setattr, test/export/test_serialize.py::TestDeserialize::test_exportdb_supported_case_type_reflection_method, test/export/test_serialize.py::TestDeserialize::test_exportdb_supported_case_user_input_mutation, test/export/test_serialize.py::TestDeserialize::test_forward_compatibility, test/export/test_serialize.py::TestDeserialize::test_get_attr, test/export/test_serialize.py::TestDeserialize::test_get_attr_list, test/export/test_serialize.py::TestDeserialize::test_hoo_symint_input, test/export/test_serialize.py::TestDeserialize::test_list_of_optional_tensors, test/export/test_serialize.py::TestDeserialize::test_map, test/export/test_serialize.py::TestDeserialize::test_module, test/export/test_serialize.py::TestDeserialize::test_module_meta, test/export/test_serialize.py::TestDeserialize::test_multi_return, test/export/test_serialize.py::TestDeserialize::test_multiple_getitem, test/export/test_serialize.py::TestDeserialize::test_none_input, test/export/test_serialize.py::TestDeserialize::test_optional_tuple, test/export/test_serialize.py::TestDeserialize::test_positional_argument_with_default_value, test/export/test_serialize.py::TestDeserialize::test_pytree_namedtuple, test/export/test_serialize.py::TestDeserialize::test_serialize_float8, test/export/test_serialize.py::TestDeserialize::test_shape, test/export/test_serialize.py::TestDeserialize::test_sym_bool, test/export/test_serialize.py::TestDeserialize::test_sym_bool_dynamic_shapes, test/export/test_serialize.py::TestDeserialize::test_sym_bool_torch_check_equal, 
test/export/test_serialize.py::TestDeserialize::test_sym_float, test/export/test_serialize.py::TestDeserialize::test_sym_int_torch_check_equal, test/export/test_serialize.py::TestDeserialize::test_sym_ite, test/export/test_serialize.py::TestDeserialize::test_tensor_tensor_list, test/export/test_serialize.py::TestDeserialize::test_unbacked_bindings_serialize, test/export/test_serialize.py::TestSchemaVersioning::test_error, test/export/test_serialize.py::TestSaveLoad::test_save_buffer, test/export/test_serialize.py::TestSaveLoad::test_save_constants, test/export/test_serialize.py::TestSaveLoad::test_save_extra, test/export/test_serialize.py::TestSaveLoad::test_save_file, test/export/test_serialize.py::TestSaveLoad::test_save_path, test/export/test_serialize.py::TestSaveLoad::test_version_error, test/export/test_serialize.py::TestSerializeCustomClass::test_backed_size_oblivious_serdes, test/export/test_serialize.py::TestSerializeCustomClass::test_custom_class, test/export/test_serialize.py::TestSerializeCustomClass::test_custom_class_containing_fake_tensor, test/export/test_serialize.py::TestSerializeCustomClass::test_custom_class_input_to_function, test/export/test_serialize.py::TestSerializeCustomClass::test_custom_tag_metadata_copy, test/export/test_serialize.py::TestSerializeCustomClass::test_custom_tag_metadata_decomp, test/export/test_serialize.py::TestSerializeCustomClass::test_custom_tag_metadata_serialization, test/export/test_serialize.py::TestSerializeCustomClass::test_unbacked_range_serdes 2025-10-10T01:55:30.8960863Z 2025-10-10T01:55:34.7991920Z Running inductor/test_benchmarking 1/1 ... 
[2025-10-10 01:55:34.798619] 2025-10-10T01:55:34.7992772Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:55:34.7994403Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_benchmarking.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:55:34.799094] 2025-10-10T01:55:35.2935231Z 2025-10-10T01:55:35.2936520Z dynamo/test_functions 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_functions_1.1_b2cf2692cce73a00_.log 2025-10-10T01:55:35.3055301Z Running 469 items in this shard: test/dynamo/test_functions.py::FunctionTests::test_T, test/dynamo/test_functions.py::FunctionTests::test_add, test/dynamo/test_functions.py::FunctionTests::test_add_, test/dynamo/test_functions.py::FunctionTests::test_addcdiv, test/dynamo/test_functions.py::FunctionTests::test_addcdiv_, test/dynamo/test_functions.py::FunctionTests::test_addcmul_, test/dynamo/test_functions.py::FunctionTests::test_are_functorch_transforms_active, test/dynamo/test_functions.py::FunctionTests::test_attrgetter, test/dynamo/test_functions.py::FunctionTests::test_broadcast_foreach_pow, test/dynamo/test_functions.py::FunctionTests::test_build_list_unpack, test/dynamo/test_functions.py::FunctionTests::test_call_dict1, test/dynamo/test_functions.py::FunctionTests::test_call_dict2, test/dynamo/test_functions.py::FunctionTests::test_call_dict3, test/dynamo/test_functions.py::FunctionTests::test_call_dict4, test/dynamo/test_functions.py::FunctionTests::test_call_dict5, test/dynamo/test_functions.py::FunctionTests::test_callable_builtin, test/dynamo/test_functions.py::FunctionTests::test_callable_class, test/dynamo/test_functions.py::FunctionTests::test_callable_lambda, test/dynamo/test_functions.py::FunctionTests::test_callable_list, test/dynamo/test_functions.py::FunctionTests::test_callable_torch, 
test/dynamo/test_functions.py::FunctionTests::test_chunks1, test/dynamo/test_functions.py::FunctionTests::test_class_dict, test/dynamo/test_functions.py::FunctionTests::test_cls_eq, test/dynamo/test_functions.py::FunctionTests::test_cls_hasattr, test/dynamo/test_functions.py::FunctionTests::test_cls_is, test/dynamo/test_functions.py::FunctionTests::test_compare_constant_and_tensor, test/dynamo/test_functions.py::FunctionTests::test_complex_closure, test/dynamo/test_functions.py::FunctionTests::test_const_tuple_add1, test/dynamo/test_functions.py::FunctionTests::test_const_tuple_add2, test/dynamo/test_functions.py::FunctionTests::test_constant1, test/dynamo/test_functions.py::FunctionTests::test_constant2, test/dynamo/test_functions.py::FunctionTests::test_constant3, test/dynamo/test_functions.py::FunctionTests::test_constant4, test/dynamo/test_functions.py::FunctionTests::test_constant_set, test/dynamo/test_functions.py::FunctionTests::test_context_wrapping_nested_functions_no_closure, test/dynamo/test_functions.py::FunctionTests::test_cublas_allow_tf32, test/dynamo/test_functions.py::FunctionTests::test_custom_dict_kwargs, test/dynamo/test_functions.py::FunctionTests::test_default_dict_closure, test/dynamo/test_functions.py::FunctionTests::test_default_dict_constr, test/dynamo/test_functions.py::FunctionTests::test_default_dict_dict, test/dynamo/test_functions.py::FunctionTests::test_default_dict_lambda, test/dynamo/test_functions.py::FunctionTests::test_default_dict_list, test/dynamo/test_functions.py::FunctionTests::test_default_dict_set, test/dynamo/test_functions.py::FunctionTests::test_default_dict_tuple, test/dynamo/test_functions.py::FunctionTests::test_defaultdict_setdefault1, test/dynamo/test_functions.py::FunctionTests::test_defaultdict_setdefault2, test/dynamo/test_functions.py::FunctionTests::test_defaultdict_setdefault3, test/dynamo/test_functions.py::FunctionTests::test_del, test/dynamo/test_functions.py::FunctionTests::test_deque, 
test/dynamo/test_functions.py::FunctionTests::test_device, test/dynamo/test_functions.py::FunctionTests::test_device_constant, test/dynamo/test_functions.py::FunctionTests::test_dict_copy, test/dynamo/test_functions.py::FunctionTests::test_dict_fromkeys, test/dynamo/test_functions.py::FunctionTests::test_dict_hasattr, test/dynamo/test_functions.py::FunctionTests::test_dict_id_guard, test/dynamo/test_functions.py::FunctionTests::test_dict_items_sorted, test/dynamo/test_functions.py::FunctionTests::test_dict_key_set1, test/dynamo/test_functions.py::FunctionTests::test_dict_key_set2, test/dynamo/test_functions.py::FunctionTests::test_dict_key_set3, test/dynamo/test_functions.py::FunctionTests::test_dict_keys, test/dynamo/test_functions.py::FunctionTests::test_dict_kwargs, test/dynamo/test_functions.py::FunctionTests::test_dict_mutable_map, test/dynamo/test_functions.py::FunctionTests::test_dict_ops, test/dynamo/test_functions.py::FunctionTests::test_dict_param_keys, test/dynamo/test_functions.py::FunctionTests::test_dict_setdefault1, test/dynamo/test_functions.py::FunctionTests::test_dict_setdefault2, test/dynamo/test_functions.py::FunctionTests::test_dict_setdefault3, test/dynamo/test_functions.py::FunctionTests::test_dict_sorted, test/dynamo/test_functions.py::FunctionTests::test_dict_tuple_lazy_guard, test/dynamo/test_functions.py::FunctionTests::test_dict_update, test/dynamo/test_functions.py::FunctionTests::test_dict_update_kwargs, test/dynamo/test_functions.py::FunctionTests::test_dict_values, test/dynamo/test_functions.py::FunctionTests::test_distributed_is_available, test/dynamo/test_functions.py::FunctionTests::test_distributed_is_initialized, test/dynamo/test_functions.py::FunctionTests::test_dtype, test/dynamo/test_functions.py::FunctionTests::test_dtype_compare, test/dynamo/test_functions.py::FunctionTests::test_elipsis, test/dynamo/test_functions.py::FunctionTests::test_enumerate, test/dynamo/test_functions.py::FunctionTests::test_enumerate_custom, 
test/dynamo/test_functions.py::FunctionTests::test_enumerate_reconstruct, test/dynamo/test_functions.py::FunctionTests::test_filter, test/dynamo/test_functions.py::FunctionTests::test_filter_fallback, test/dynamo/test_functions.py::FunctionTests::test_filter_graph_break_reconstruct, test/dynamo/test_functions.py::FunctionTests::test_filter_infinite_iterator, test/dynamo/test_functions.py::FunctionTests::test_filter_reconstruct, test/dynamo/test_functions.py::FunctionTests::test_filter_with_graph_break, test/dynamo/test_functions.py::FunctionTests::test_finfo, test/dynamo/test_functions.py::FunctionTests::test_flat_param_same_storage_size, test/dynamo/test_functions.py::FunctionTests::test_float, test/dynamo/test_functions.py::FunctionTests::test_fn_with_self_set, test/dynamo/test_functions.py::FunctionTests::test_foreach_lerp_, test/dynamo/test_functions.py::FunctionTests::test_fstrings1, test/dynamo/test_functions.py::FunctionTests::test_fstrings2, test/dynamo/test_functions.py::FunctionTests::test_fstrings3, test/dynamo/test_functions.py::FunctionTests::test_fstrings4, test/dynamo/test_functions.py::FunctionTests::test_fstrings5, test/dynamo/test_functions.py::FunctionTests::test_fstrings6, test/dynamo/test_functions.py::FunctionTests::test_funcdef_closure, test/dynamo/test_functions.py::FunctionTests::test_functools_cache_guard, test/dynamo/test_functions.py::FunctionTests::test_functools_partial, test/dynamo/test_functions.py::FunctionTests::test_functools_partial_binding, test/dynamo/test_functions.py::FunctionTests::test_generic_namedtuple_hasattr, test/dynamo/test_functions.py::FunctionTests::test_generic_namedtuple_subclass, test/dynamo/test_functions.py::FunctionTests::test_generic_namedtuple_user_methods, test/dynamo/test_functions.py::FunctionTests::test_get_autocast_gpu_dtype, test/dynamo/test_functions.py::FunctionTests::test_get_calculate_correct_fan, test/dynamo/test_functions.py::FunctionTests::test_get_default_dtype, 
test/dynamo/test_functions.py::FunctionTests::test_get_device_properties_tensor_device, test/dynamo/test_functions.py::FunctionTests::test_get_privateuse1_name, test/dynamo/test_functions.py::FunctionTests::test_getattr, test/dynamo/test_functions.py::FunctionTests::test_getattr_metaclass, test/dynamo/test_functions.py::FunctionTests::test_globalfn, test/dynamo/test_functions.py::FunctionTests::test_globalmodule, test/dynamo/test_functions.py::FunctionTests::test_globalvar, test/dynamo/test_functions.py::FunctionTests::test_import1, test/dynamo/test_functions.py::FunctionTests::test_in_not_in, test/dynamo/test_functions.py::FunctionTests::test_index, test/dynamo/test_functions.py::FunctionTests::test_indexed_range, test/dynamo/test_functions.py::FunctionTests::test_indirect1, test/dynamo/test_functions.py::FunctionTests::test_indirect2, test/dynamo/test_functions.py::FunctionTests::test_indirect3, test/dynamo/test_functions.py::FunctionTests::test_inline_jit__unwrap_optional, test/dynamo/test_functions.py::FunctionTests::test_inline_jit_annotations, test/dynamo/test_functions.py::FunctionTests::test_inline_lru_cache_fn_with_default_args, test/dynamo/test_functions.py::FunctionTests::test_inline_script_if_tracing_fn_with_default_args, test/dynamo/test_functions.py::FunctionTests::test_inline_softmax, test/dynamo/test_functions.py::FunctionTests::test_inline_with_default, test/dynamo/test_functions.py::FunctionTests::test_inner_function, test/dynamo/test_functions.py::FunctionTests::test_is, test/dynamo/test_functions.py::FunctionTests::test_is_any_autocast_enabled, test/dynamo/test_functions.py::FunctionTests::test_is_checkpoint_valid, test/dynamo/test_functions.py::FunctionTests::test_is_complex, test/dynamo/test_functions.py::FunctionTests::test_is_contiguous_frame_counts, test/dynamo/test_functions.py::FunctionTests::test_is_contiguous_memory_format, test/dynamo/test_functions.py::FunctionTests::test_is_floating_point, 
test/dynamo/test_functions.py::FunctionTests::test_is_fx_tracing, test/dynamo/test_functions.py::FunctionTests::test_is_in_onnx_export, test/dynamo/test_functions.py::FunctionTests::test_is_inference_mode_global_recompilation, test/dynamo/test_functions.py::FunctionTests::test_is_inference_recompilation, test/dynamo/test_functions.py::FunctionTests::test_is_integer, test/dynamo/test_functions.py::FunctionTests::test_is_not, test/dynamo/test_functions.py::FunctionTests::test_is_not_null, test/dynamo/test_functions.py::FunctionTests::test_is_quantized, test/dynamo/test_functions.py::FunctionTests::test_is_sparse, test/dynamo/test_functions.py::FunctionTests::test_isinstance, test/dynamo/test_functions.py::FunctionTests::test_islice_chain, test/dynamo/test_functions.py::FunctionTests::test_itemgetter, test/dynamo/test_functions.py::FunctionTests::test_itertools_chain, test/dynamo/test_functions.py::FunctionTests::test_itertools_chain_from_iterable, test/dynamo/test_functions.py::FunctionTests::test_itertools_combinations, test/dynamo/test_functions.py::FunctionTests::test_itertools_compress, test/dynamo/test_functions.py::FunctionTests::test_itertools_compress_tensors, test/dynamo/test_functions.py::FunctionTests::test_itertools_filterfalse_basic, test/dynamo/test_functions.py::FunctionTests::test_itertools_pairwise, test/dynamo/test_functions.py::FunctionTests::test_itertools_permutations_args, test/dynamo/test_functions.py::FunctionTests::test_itertools_permutations_basic, test/dynamo/test_functions.py::FunctionTests::test_itertools_permutations_various_iterators, test/dynamo/test_functions.py::FunctionTests::test_itertools_product, test/dynamo/test_functions.py::FunctionTests::test_itertools_product_args, test/dynamo/test_functions.py::FunctionTests::test_itertools_product_various_iterators, test/dynamo/test_functions.py::FunctionTests::test_itertools_reconstruct, test/dynamo/test_functions.py::FunctionTests::test_jit_annotate, 
test/dynamo/test_functions.py::FunctionTests::test_len_constant_dict, test/dynamo/test_functions.py::FunctionTests::test_len_constant_list, test/dynamo/test_functions.py::FunctionTests::test_len_constant_misc_iterables, test/dynamo/test_functions.py::FunctionTests::test_len_tensor, test/dynamo/test_functions.py::FunctionTests::test_list_add, test/dynamo/test_functions.py::FunctionTests::test_list_add_then_mutate, test/dynamo/test_functions.py::FunctionTests::test_list_clear, test/dynamo/test_functions.py::FunctionTests::test_list_compare_polyfill, test/dynamo/test_functions.py::FunctionTests::test_list_compare_polyfill_non_lists, test/dynamo/test_functions.py::FunctionTests::test_list_convert, test/dynamo/test_functions.py::FunctionTests::test_list_expand_lhs, test/dynamo/test_functions.py::FunctionTests::test_list_index_with_constant_tensor, test/dynamo/test_functions.py::FunctionTests::test_list_reversed, test/dynamo/test_functions.py::FunctionTests::test_list_setitem, test/dynamo/test_functions.py::FunctionTests::test_list_setitem_slice, test/dynamo/test_functions.py::FunctionTests::test_list_slice, test/dynamo/test_functions.py::FunctionTests::test_list_slice_assignment, test/dynamo/test_functions.py::FunctionTests::test_list_sorted1, test/dynamo/test_functions.py::FunctionTests::test_list_sorted2, test/dynamo/test_functions.py::FunctionTests::test_list_truth, test/dynamo/test_functions.py::FunctionTests::test_listarg1, test/dynamo/test_functions.py::FunctionTests::test_listarg2, test/dynamo/test_functions.py::FunctionTests::test_listarg3, test/dynamo/test_functions.py::FunctionTests::test_listarg4, test/dynamo/test_functions.py::FunctionTests::test_listarg5, test/dynamo/test_functions.py::FunctionTests::test_load_global_bool, test/dynamo/test_functions.py::FunctionTests::test_lru_cache_warning_issued_during_tracing, test/dynamo/test_functions.py::FunctionTests::test_mT, test/dynamo/test_functions.py::FunctionTests::test_manual_seed, 
test/dynamo/test_functions.py::FunctionTests::test_map_call_function_ex, test/dynamo/test_functions.py::FunctionTests::test_map_deque_extendleft, test/dynamo/test_functions.py::FunctionTests::test_map_dict_fromkeys, test/dynamo/test_functions.py::FunctionTests::test_map_enumerate, test/dynamo/test_functions.py::FunctionTests::test_map_infinite, test/dynamo/test_functions.py::FunctionTests::test_map_iter, test/dynamo/test_functions.py::FunctionTests::test_map_list, test/dynamo/test_functions.py::FunctionTests::test_map_list_extend, test/dynamo/test_functions.py::FunctionTests::test_map_list_slice_assign, test/dynamo/test_functions.py::FunctionTests::test_map_max, test/dynamo/test_functions.py::FunctionTests::test_map_max_const, test/dynamo/test_functions.py::FunctionTests::test_map_partial_unpack, test/dynamo/test_functions.py::FunctionTests::test_map_reconstruct, test/dynamo/test_functions.py::FunctionTests::test_map_reduce, test/dynamo/test_functions.py::FunctionTests::test_map_return, test/dynamo/test_functions.py::FunctionTests::test_map_set, test/dynamo/test_functions.py::FunctionTests::test_map_sorted, test/dynamo/test_functions.py::FunctionTests::test_map_str_join, test/dynamo/test_functions.py::FunctionTests::test_map_sum, test/dynamo/test_functions.py::FunctionTests::test_map_tuple, test/dynamo/test_functions.py::FunctionTests::test_map_unpack_twice, test/dynamo/test_functions.py::FunctionTests::test_map_unpack_vars, test/dynamo/test_functions.py::FunctionTests::test_map_with_graph_break, test/dynamo/test_functions.py::FunctionTests::test_map_zip_dict, test/dynamo/test_functions.py::FunctionTests::test_math_radians, test/dynamo/test_functions.py::FunctionTests::test_mean_sum_np, test/dynamo/test_functions.py::FunctionTests::test_methodcall1, test/dynamo/test_functions.py::FunctionTests::test_methodcall2, test/dynamo/test_functions.py::FunctionTests::test_methodcall3, test/dynamo/test_functions.py::FunctionTests::test_methodcaller, 
test/dynamo/test_functions.py::FunctionTests::test_min_max, test/dynamo/test_functions.py::FunctionTests::test_module_constant, test/dynamo/test_functions.py::FunctionTests::test_namedtuple, test/dynamo/test_functions.py::FunctionTests::test_namedtuple_defaults, test/dynamo/test_functions.py::FunctionTests::test_namedtuple_fields, test/dynamo/test_functions.py::FunctionTests::test_namedtuple_hasattr, test/dynamo/test_functions.py::FunctionTests::test_namedtuple_replace, test/dynamo/test_functions.py::FunctionTests::test_namedtuple_subclass, test/dynamo/test_functions.py::FunctionTests::test_namedtuple_user_methods, test/dynamo/test_functions.py::FunctionTests::test_ndarray_builtin_functions, test/dynamo/test_functions.py::FunctionTests::test_ndarray_method, test/dynamo/test_functions.py::FunctionTests::test_ndarray_methods_returning_scalar, test/dynamo/test_functions.py::FunctionTests::test_ndarray_reshape, test/dynamo/test_functions.py::FunctionTests::test_ndarray_transpose, test/dynamo/test_functions.py::FunctionTests::test_ndim, test/dynamo/test_functions.py::FunctionTests::test_no_recompile_inner_function, test/dynamo/test_functions.py::FunctionTests::test_no_recompile_inner_lambda, test/dynamo/test_functions.py::FunctionTests::test_non_inlined_closure, test/dynamo/test_functions.py::FunctionTests::test_not_list, test/dynamo/test_functions.py::FunctionTests::test_np_constant_collections_as_input_int_or_float_float, test/dynamo/test_functions.py::FunctionTests::test_np_constant_collections_as_input_int_or_float_int, test/dynamo/test_functions.py::FunctionTests::test_np_constant_collections_guards_float, test/dynamo/test_functions.py::FunctionTests::test_np_constant_collections_guards_int, test/dynamo/test_functions.py::FunctionTests::test_np_finfo, test/dynamo/test_functions.py::FunctionTests::test_np_iinfo, test/dynamo/test_functions.py::FunctionTests::test_number_method_method_as_integer_ratio_num_type0, 
test/dynamo/test_functions.py::FunctionTests::test_number_method_method_as_integer_ratio_num_type3, test/dynamo/test_functions.py::FunctionTests::test_number_method_method_bit_length_num_type1, test/dynamo/test_functions.py::FunctionTests::test_number_method_method_conjugate_num_type2, test/dynamo/test_functions.py::FunctionTests::test_number_method_method_conjugate_num_type4, test/dynamo/test_functions.py::FunctionTests::test_number_method_method_hex_num_type5, test/dynamo/test_functions.py::FunctionTests::test_number_method_method_is_integer_num_type6, test/dynamo/test_functions.py::FunctionTests::test_numpy_attributes, test/dynamo/test_functions.py::FunctionTests::test_numpy_dtype_argument_to_function, test/dynamo/test_functions.py::FunctionTests::test_numpy_dtype_call_in_function, test/dynamo/test_functions.py::FunctionTests::test_numpy_fft, test/dynamo/test_functions.py::FunctionTests::test_numpy_linalg, test/dynamo/test_functions.py::FunctionTests::test_numpy_meshgrid, test/dynamo/test_functions.py::FunctionTests::test_numpy_random, test/dynamo/test_functions.py::FunctionTests::test_numpy_size, test/dynamo/test_functions.py::FunctionTests::test_obj_eq, test/dynamo/test_functions.py::FunctionTests::test_obj_is, test/dynamo/test_functions.py::FunctionTests::test_ordered_dict_kwargs, test/dynamo/test_functions.py::FunctionTests::test_partial_across_graph_break_uninvoked, test/dynamo/test_functions.py::FunctionTests::test_partials_as_input_UDF, test/dynamo/test_functions.py::FunctionTests::test_partials_as_input_partials_lambda, test/dynamo/test_functions.py::FunctionTests::test_partials_as_input_partials_mod, test/dynamo/test_functions.py::FunctionTests::test_partials_graph_break_reconstruct, test/dynamo/test_functions.py::FunctionTests::test_partials_graph_break_reconstruct_args_and_kwargs, test/dynamo/test_functions.py::FunctionTests::test_partials_graph_break_reconstruct_mix, 
test/dynamo/test_functions.py::FunctionTests::test_partials_graph_break_reconstruct_mix_no_source, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___annotations__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___builtins__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___call__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___class__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___closure__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___code__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___defaults__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___delattr__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___dict__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___dir__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___doc__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___eq__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___format__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___ge__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___get__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___getattribute__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___globals__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___gt__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___hash__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___init__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___init_subclass__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___kwdefaults__, 
test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___le__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___lt__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___module__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___name__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___ne__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___new__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___qualname__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___reduce__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___reduce_ex__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___repr__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___setattr__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___sizeof__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___str__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___subclasshook__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr_args, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr_func, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr_keywords, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_set_attr, test/dynamo/test_functions.py::FunctionTests::test_partials_lambda, test/dynamo/test_functions.py::FunctionTests::test_partials_recompilation, test/dynamo/test_functions.py::FunctionTests::test_partials_torch_op_arg, test/dynamo/test_functions.py::FunctionTests::test_partials_torch_op_kwarg, test/dynamo/test_functions.py::FunctionTests::test_partials_udf_arg, test/dynamo/test_functions.py::FunctionTests::test_partials_udf_kwarg, 
test/dynamo/test_functions.py::FunctionTests::test_partials_udf_kwarg_method, test/dynamo/test_functions.py::FunctionTests::test_partials_udf_kwarg_module, test/dynamo/test_functions.py::FunctionTests::test_pop, test/dynamo/test_functions.py::FunctionTests::test_pos, test/dynamo/test_functions.py::FunctionTests::test_pow_int, test/dynamo/test_functions.py::FunctionTests::test_promote_types, test/dynamo/test_functions.py::FunctionTests::test_rand_inlined, test/dynamo/test_functions.py::FunctionTests::test_rand_tensor_partial, test/dynamo/test_functions.py::FunctionTests::test_range1, test/dynamo/test_functions.py::FunctionTests::test_range2, test/dynamo/test_functions.py::FunctionTests::test_range_iterator, test/dynamo/test_functions.py::FunctionTests::test_range_iterator_2, test/dynamo/test_functions.py::FunctionTests::test_range_iterator_graph_break, test/dynamo/test_functions.py::FunctionTests::test_range_iterator_graph_break_2, test/dynamo/test_functions.py::FunctionTests::test_range_length, test/dynamo/test_functions.py::FunctionTests::test_range_with_index, test/dynamo/test_functions.py::FunctionTests::test_range_with_slice_index, test/dynamo/test_functions.py::FunctionTests::test_reduce, test/dynamo/test_functions.py::FunctionTests::test_reduce_with_initial, test/dynamo/test_functions.py::FunctionTests::test_reduce_with_none_initial, test/dynamo/test_functions.py::FunctionTests::test_reduce_with_single, test/dynamo/test_functions.py::FunctionTests::test_reduce_with_single_with_initial, test/dynamo/test_functions.py::FunctionTests::test_return_dict, test/dynamo/test_functions.py::FunctionTests::test_return_dict2, test/dynamo/test_functions.py::FunctionTests::test_return_multiple_numpy_ndarray, test/dynamo/test_functions.py::FunctionTests::test_return_numpy_ndarray, test/dynamo/test_functions.py::FunctionTests::test_return_tuple1, test/dynamo/test_functions.py::FunctionTests::test_return_tuple2, 
test/dynamo/test_functions.py::FunctionTests::test_returning_recursive_func, test/dynamo/test_functions.py::FunctionTests::test_round, test/dynamo/test_functions.py::FunctionTests::test_set_add, test/dynamo/test_functions.py::FunctionTests::test_set_in_frozenset, test/dynamo/test_functions.py::FunctionTests::test_set_keys_view, test/dynamo/test_functions.py::FunctionTests::test_set_update_bytecode, test/dynamo/test_functions.py::FunctionTests::test_set_update_list_with_duplicated_items, test/dynamo/test_functions.py::FunctionTests::test_shape1, test/dynamo/test_functions.py::FunctionTests::test_shape2, test/dynamo/test_functions.py::FunctionTests::test_size_tuple_add, test/dynamo/test_functions.py::FunctionTests::test_slice1, test/dynamo/test_functions.py::FunctionTests::test_slice2, test/dynamo/test_functions.py::FunctionTests::test_slice3, test/dynamo/test_functions.py::FunctionTests::test_slice4, test/dynamo/test_functions.py::FunctionTests::test_slice5, test/dynamo/test_functions.py::FunctionTests::test_slice6, test/dynamo/test_functions.py::FunctionTests::test_slice_eq, test/dynamo/test_functions.py::FunctionTests::test_sliced_range, test/dynamo/test_functions.py::FunctionTests::test_sorted_const_key_non_const_items, test/dynamo/test_functions.py::FunctionTests::test_sourceless_build_method_type, test/dynamo/test_functions.py::FunctionTests::test_startswith, test/dynamo/test_functions.py::FunctionTests::test_sum, test/dynamo/test_functions.py::FunctionTests::test_sum_shortcut, test/dynamo/test_functions.py::FunctionTests::test_sum_shortcut_with_start_arg, test/dynamo/test_functions.py::FunctionTests::test_sum_shortcut_with_start_kwarg, test/dynamo/test_functions.py::FunctionTests::test_sum_with_start_arg, test/dynamo/test_functions.py::FunctionTests::test_sum_with_start_kwarg, test/dynamo/test_functions.py::FunctionTests::test_symbool_to_int, test/dynamo/test_functions.py::FunctionTests::test_tensor_dim, 
test/dynamo/test_functions.py::FunctionTests::test_tensor_element_size, test/dynamo/test_functions.py::FunctionTests::test_tensor_is_complex, test/dynamo/test_functions.py::FunctionTests::test_tensor_len, test/dynamo/test_functions.py::FunctionTests::test_tensor_new_with_shape, test/dynamo/test_functions.py::FunctionTests::test_tensor_new_with_size, test/dynamo/test_functions.py::FunctionTests::test_tensor_size, test/dynamo/test_functions.py::FunctionTests::test_tensor_size_indexed_by_symint, test/dynamo/test_functions.py::FunctionTests::test_tensor_type, test/dynamo/test_functions.py::FunctionTests::test_tensor_type2, test/dynamo/test_functions.py::FunctionTests::test_tensor_type3, test/dynamo/test_functions.py::FunctionTests::test_tensor_type4, test/dynamo/test_functions.py::FunctionTests::test_tensor_type5, test/dynamo/test_functions.py::FunctionTests::test_to, test/dynamo/test_functions.py::FunctionTests::test_torch_distributions_functions, test/dynamo/test_functions.py::FunctionTests::test_torch_from_numpy, test/dynamo/test_functions.py::FunctionTests::test_torch_get_device_module, test/dynamo/test_functions.py::FunctionTests::test_torch_size_as_dict_key, test/dynamo/test_functions.py::FunctionTests::test_torch_size_hasattr, test/dynamo/test_functions.py::FunctionTests::test_torch_source, test/dynamo/test_functions.py::FunctionTests::test_transpose_for_scores, test/dynamo/test_functions.py::FunctionTests::test_truth, test/dynamo/test_functions.py::FunctionTests::test_tuple1, test/dynamo/test_functions.py::FunctionTests::test_tuple2, test/dynamo/test_functions.py::FunctionTests::test_tuple_contains, test/dynamo/test_functions.py::FunctionTests::test_tuple_iadd, test/dynamo/test_functions.py::FunctionTests::test_tuple_map, test/dynamo/test_functions.py::FunctionTests::test_tuple_sorted, test/dynamo/test_functions.py::FunctionTests::test_two_point_iter, test/dynamo/test_functions.py::FunctionTests::test_unary_fold_op, 
test/dynamo/test_functions.py::FunctionTests::test_unary_fold_op_seq, test/dynamo/test_functions.py::FunctionTests::test_unpack1, test/dynamo/test_functions.py::FunctionTests::test_unpack2, test/dynamo/test_functions.py::FunctionTests::test_unpack3, test/dynamo/test_functions.py::FunctionTests::test_unpack_ex1, test/dynamo/test_functions.py::FunctionTests::test_unpack_ex2, test/dynamo/test_functions.py::FunctionTests::test_unpack_ex3, test/dynamo/test_functions.py::FunctionTests::test_unpack_mutable_map, test/dynamo/test_functions.py::FunctionTests::test_unsqueeze_inplace, test/dynamo/test_functions.py::FunctionTests::test_viamethod, test/dynamo/test_functions.py::FunctionTests::test_viatorch, test/dynamo/test_functions.py::FunctionTests::test_zip_longest, test/dynamo/test_functions.py::FunctionTests::test_zip_reconstruct, test/dynamo/test_functions.py::DefaultsTests::test_cast_tensor_single_elem, test/dynamo/test_functions.py::DefaultsTests::test_dataclass_factory, test/dynamo/test_functions.py::DefaultsTests::test_dataclass_nested, test/dynamo/test_functions.py::DefaultsTests::test_fn_with_attr, test/dynamo/test_functions.py::DefaultsTests::test_frozenset_construction, test/dynamo/test_functions.py::DefaultsTests::test_frozenset_illegal_call_method, test/dynamo/test_functions.py::DefaultsTests::test_frozenset_reconstruction, test/dynamo/test_functions.py::DefaultsTests::test_frozenset_return_type_method_name_copy, test/dynamo/test_functions.py::DefaultsTests::test_frozenset_return_type_method_name_difference, test/dynamo/test_functions.py::DefaultsTests::test_frozenset_return_type_method_name_intersection, test/dynamo/test_functions.py::DefaultsTests::test_frozenset_return_type_method_name_symmetric_difference, test/dynamo/test_functions.py::DefaultsTests::test_frozenset_return_type_method_name_union, test/dynamo/test_functions.py::DefaultsTests::test_func_attrs, test/dynamo/test_functions.py::DefaultsTests::test_func_default_tensor_args, 
test/dynamo/test_functions.py::DefaultsTests::test_func_default_torch_args, test/dynamo/test_functions.py::DefaultsTests::test_functional_compile, test/dynamo/test_functions.py::DefaultsTests::test_functools_partial_id, test/dynamo/test_functions.py::DefaultsTests::test_fx_immutable_list_mutation_not_allowed, test/dynamo/test_functions.py::DefaultsTests::test_fx_map_aggregate, test/dynamo/test_functions.py::DefaultsTests::test_gpu_current_device, test/dynamo/test_functions.py::DefaultsTests::test_in_set_inplace, test/dynamo/test_functions.py::DefaultsTests::test_in_set_would_fail_broadcast, test/dynamo/test_functions.py::DefaultsTests::test_inspect_method_source, test/dynamo/test_functions.py::DefaultsTests::test_is_init_in_compile_mutated_tensor_tensor, test/dynamo/test_functions.py::DefaultsTests::test_is_init_in_compile_vmapped_mutated_tensor_tensor, test/dynamo/test_functions.py::DefaultsTests::test_is_init_in_compile_vmapped_mutated_tensor_tensor_multi_arg, test/dynamo/test_functions.py::DefaultsTests::test_is_mutated_tensor_tensor, test/dynamo/test_functions.py::DefaultsTests::test_is_mutated_tensor_tensor_across_graph_break, test/dynamo/test_functions.py::DefaultsTests::test_is_not_tensor_tensor, test/dynamo/test_functions.py::DefaultsTests::test_is_tensor_tensor, test/dynamo/test_functions.py::DefaultsTests::test_is_vmapped_mutated_tensor_tensor, test/dynamo/test_functions.py::DefaultsTests::test_keyword, test/dynamo/test_functions.py::DefaultsTests::test_listlike_of_tensors_contains_constant, test/dynamo/test_functions.py::DefaultsTests::test_meth_default_tensor_args, test/dynamo/test_functions.py::DefaultsTests::test_pybind_object, test/dynamo/test_functions.py::DefaultsTests::test_reconstructed_name, test/dynamo/test_functions.py::DefaultsTests::test_set_call___init___frozenset, test/dynamo/test_functions.py::DefaultsTests::test_set_call___init___set, test/dynamo/test_functions.py::DefaultsTests::test_set_construction, 
test/dynamo/test_functions.py::DefaultsTests::test_skip_function_call_very_weird_value, test/dynamo/test_functions.py::DefaultsTests::test_str_handler_for_user_defined_object, test/dynamo/test_functions.py::DefaultsTests::test_sys_recursionlimit, test/dynamo/test_functions.py::DefaultsTests::test_tree_map, test/dynamo/test_functions.py::DefaultsTests::test_udf_list, test/dynamo/test_functions.py::DefaultsTests::test_udf_list_reconstruction, test/dynamo/test_functions.py::DefaultsTests::test_udf_list_slice, test/dynamo/test_functions.py::DefaultsTests::test_udf_namedtuple, test/dynamo/test_functions.py::DefaultsTests::test_udf_tuple, test/dynamo/test_functions.py::DefaultsTests::test_udf_tuple_construction, test/dynamo/test_functions.py::DefaultsTests::test_udf_tuple_construction_custom_new, test/dynamo/test_functions.py::DefaultsTests::test_udf_tuple_reconstruction, test/dynamo/test_functions.py::DefaultsTests::test_zip_strict 2025-10-10T01:55:35.3170536Z 2025-10-10T01:55:39.1369379Z Running inductor/test_quantization 1/1 ... [2025-10-10 01:55:39.136310] 2025-10-10T01:55:39.1370026Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:55:39.1371443Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_quantization.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:55:39.136721] 2025-10-10T01:55:42.1290147Z 2025-10-10T01:55:42.1291014Z inductor/test_benchmarking 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_benchmarking_1.1_77bdfa0c43db741c_.log 2025-10-10T01:55:42.1296296Z Running 12 items in this shard: test/inductor/test_benchmarking.py::TestBenchmarker::test_benchmark_cpu_smoke_benchmarker_cls0, test/inductor/test_benchmarking.py::TestBenchmarker::test_benchmark_cpu_smoke_benchmarker_cls1, test/inductor/test_benchmarking.py::TestBenchmarker::test_benchmark_gpu_smoke_benchmarker_cls0, test/inductor/test_benchmarking.py::TestBenchmarker::test_benchmark_gpu_smoke_benchmarker_cls1, test/inductor/test_benchmarking.py::TestBenchmarker::test_benchmark_safely_infers_device_many_devices_benchmarker_cls0, test/inductor/test_benchmarking.py::TestBenchmarker::test_benchmark_safely_infers_device_many_devices_benchmarker_cls1, test/inductor/test_benchmarking.py::TestBenchmarker::test_benchmark_safely_infers_device_no_devices_benchmarker_cls0, test/inductor/test_benchmarking.py::TestBenchmarker::test_benchmark_safely_infers_device_no_devices_benchmarker_cls1, test/inductor/test_benchmarking.py::TestBenchmarker::test_benchmark_smoke_benchmarker_cls0_device_cpu, test/inductor/test_benchmarking.py::TestBenchmarker::test_benchmark_smoke_benchmarker_cls0_device_cuda, test/inductor/test_benchmarking.py::TestBenchmarker::test_benchmark_smoke_benchmarker_cls1_device_cpu, test/inductor/test_benchmarking.py::TestBenchmarker::test_benchmark_smoke_benchmarker_cls1_device_cuda 2025-10-10T01:55:42.1302574Z 2025-10-10T01:55:45.9936339Z Running inductor/test_aot_inductor_custom_ops 1/1 ... 
[2025-10-10 01:55:45.992970] 2025-10-10T01:55:45.9937028Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:55:45.9938459Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_aot_inductor_custom_ops.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:55:45.993366] 2025-10-10T01:55:46.4663563Z 2025-10-10T01:55:46.4664651Z inductor/test_quantization 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_quantization_1.1_0ffd05ffbea88812_.log 2025-10-10T01:55:46.4666174Z Running 2 items in this shard: test/inductor/test_quantization.py::TestQuantization::test_activation_quantization_aten_with_scaling, test/inductor/test_quantization.py::TestQuantization::test_activation_quantization_aten_without_scaling 2025-10-10T01:55:46.4667119Z 2025-10-10T01:55:50.3216262Z Running inductor/test_scatter_optimization 1/1 ... [2025-10-10 01:55:50.321104] 2025-10-10T01:55:50.3216749Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:55:50.3218925Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_scatter_optimization.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:55:50.321542] 2025-10-10T01:55:53.7229774Z 2025-10-10T01:55:53.7231190Z inductor/test_aot_inductor_custom_ops 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_aot_inductor_custom_ops_1.1_53dbf80fa4b7d1b0_.log 2025-10-10T01:55:53.7258743Z Running 35 items in this shard: test/inductor/test_aot_inductor_custom_ops.py::AOTInductorLoggingTest::test_shape_env_reuse, test/inductor/test_aot_inductor_custom_ops.py::AOTInductorTestABICompatibleCpu::test_boxed_run_inputs_clearing_cpu, test/inductor/test_aot_inductor_custom_ops.py::AOTInductorTestABICompatibleCpu::test_custom_op_add_cpu, test/inductor/test_aot_inductor_custom_ops.py::AOTInductorTestABICompatibleCpu::test_custom_op_add_output_path_cpu, test/inductor/test_aot_inductor_custom_ops.py::AOTInductorTestABICompatibleCpu::test_custom_op_all_inputs_cpu, test/inductor/test_aot_inductor_custom_ops.py::AOTInductorTestABICompatibleCpu::test_custom_op_missing_arg_with_default_value_cpu, test/inductor/test_aot_inductor_custom_ops.py::AOTInductorTestABICompatibleCpu::test_custom_op_out_variant_without_return_cpu, test/inductor/test_aot_inductor_custom_ops.py::AOTInductorTestABICompatibleCpu::test_custom_op_return_list_of_single_tensor_cpu, test/inductor/test_aot_inductor_custom_ops.py::AOTInductorTestABICompatibleCpu::test_custom_op_return_single_tensor_cpu, test/inductor/test_aot_inductor_custom_ops.py::AOTInductorTestABICompatibleCpu::test_custom_op_square_cpu, test/inductor/test_aot_inductor_custom_ops.py::AOTInductorTestABICompatibleCpu::test_custom_op_with_concat_inputs_cpu, test/inductor/test_aot_inductor_custom_ops.py::AOTInductorTestABICompatibleCpu::test_custom_op_with_multiple_outputs_cpu, test/inductor/test_aot_inductor_custom_ops.py::AOTInductorTestABICompatibleCpu::test_custom_op_with_reinterpret_view_inputs_cpu, test/inductor/test_aot_inductor_custom_ops.py::AOTInductorTestABICompatibleCpu::test_fn_with_int_output_cpu, 
test/inductor/test_aot_inductor_custom_ops.py::AOTInductorTestABICompatibleCpu::test_fn_with_optional_tensor_nullopt_output_cpu, test/inductor/test_aot_inductor_custom_ops.py::AOTInductorTestABICompatibleCpu::test_fn_with_optional_tensor_output_2_cpu, test/inductor/test_aot_inductor_custom_ops.py::AOTInductorTestABICompatibleCpu::test_fn_with_optional_tensor_output_cpu, test/inductor/test_aot_inductor_custom_ops.py::AOTInductorTestABICompatibleCpu::test_incorrect_custom_op_schema_cpu, test/inductor/test_aot_inductor_custom_ops.py::AOTInductorTestABICompatibleCuda::test_boxed_run_inputs_clearing_cuda, test/inductor/test_aot_inductor_custom_ops.py::AOTInductorTestABICompatibleCuda::test_custom_op_add_cuda, test/inductor/test_aot_inductor_custom_ops.py::AOTInductorTestABICompatibleCuda::test_custom_op_add_output_path_cuda, test/inductor/test_aot_inductor_custom_ops.py::AOTInductorTestABICompatibleCuda::test_custom_op_all_inputs_cuda, test/inductor/test_aot_inductor_custom_ops.py::AOTInductorTestABICompatibleCuda::test_custom_op_missing_arg_with_default_value_cuda, test/inductor/test_aot_inductor_custom_ops.py::AOTInductorTestABICompatibleCuda::test_custom_op_out_variant_without_return_cuda, test/inductor/test_aot_inductor_custom_ops.py::AOTInductorTestABICompatibleCuda::test_custom_op_return_list_of_single_tensor_cuda, test/inductor/test_aot_inductor_custom_ops.py::AOTInductorTestABICompatibleCuda::test_custom_op_return_single_tensor_cuda, test/inductor/test_aot_inductor_custom_ops.py::AOTInductorTestABICompatibleCuda::test_custom_op_square_cuda, test/inductor/test_aot_inductor_custom_ops.py::AOTInductorTestABICompatibleCuda::test_custom_op_with_concat_inputs_cuda, test/inductor/test_aot_inductor_custom_ops.py::AOTInductorTestABICompatibleCuda::test_custom_op_with_multiple_outputs_cuda, test/inductor/test_aot_inductor_custom_ops.py::AOTInductorTestABICompatibleCuda::test_custom_op_with_reinterpret_view_inputs_cuda, 
test/inductor/test_aot_inductor_custom_ops.py::AOTInductorTestABICompatibleCuda::test_fn_with_int_output_cuda, test/inductor/test_aot_inductor_custom_ops.py::AOTInductorTestABICompatibleCuda::test_fn_with_optional_tensor_nullopt_output_cuda, test/inductor/test_aot_inductor_custom_ops.py::AOTInductorTestABICompatibleCuda::test_fn_with_optional_tensor_output_2_cuda, test/inductor/test_aot_inductor_custom_ops.py::AOTInductorTestABICompatibleCuda::test_fn_with_optional_tensor_output_cuda, test/inductor/test_aot_inductor_custom_ops.py::AOTInductorTestABICompatibleCuda::test_incorrect_custom_op_schema_cuda 2025-10-10T01:55:53.7284542Z 2025-10-10T01:55:57.5516173Z 2025-10-10T01:55:57.5517173Z inductor/test_scatter_optimization 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_scatter_optimization_1.1_04a138b69f0c8004_.log 2025-10-10T01:55:57.5520411Z Running 8 items in this shard: test/inductor/test_scatter_optimization.py::TestScatterOpt::test_3d_tensor, test/inductor/test_scatter_optimization.py::TestScatterOpt::test_can_not_optimize_due_to_dense, test/inductor/test_scatter_optimization.py::TestScatterOpt::test_can_not_optimize_due_to_non_const, test/inductor/test_scatter_optimization.py::TestScatterOpt::test_cross_entropy_loss, test/inductor/test_scatter_optimization.py::TestScatterOpt::test_neg_scatter_dim, test/inductor/test_scatter_optimization.py::TestScatterOpt::test_non_last_dim, test/inductor/test_scatter_optimization.py::TestScatterOpt::test_nonzero_const_tensor, test/inductor/test_scatter_optimization.py::TestScatterOpt::test_shorter_index_tensor 2025-10-10T01:55:57.5523448Z 2025-10-10T01:55:57.6590623Z Running inductor/test_group_batch_fusion 1/1 ... 
[2025-10-10 01:55:57.658537] 2025-10-10T01:55:57.6591107Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:55:57.6593301Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_group_batch_fusion.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:55:57.658934] 2025-10-10T01:56:01.3904315Z Running inductor/test_split_cat_fx_passes 1/1 ... [2025-10-10 01:56:01.389890] 2025-10-10T01:56:01.3904889Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:56:01.3906609Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_split_cat_fx_passes.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:56:01.390267] 2025-10-10T01:56:05.1398548Z 2025-10-10T01:56:05.1399919Z inductor/test_group_batch_fusion 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_group_batch_fusion_1.1_25811d4046ce28f8_.log 2025-10-10T01:56:05.1405533Z Running 13 items in this shard: test/inductor/test_group_batch_fusion.py::TestGroupBatchFusion::test_batch_dropout_pre_grad_fusion, test/inductor/test_group_batch_fusion.py::TestGroupBatchFusion::test_batch_layer_norm_fusion, test/inductor/test_group_batch_fusion.py::TestGroupBatchFusion::test_batch_linear_lhs_fusion, test/inductor/test_group_batch_fusion.py::TestGroupBatchFusion::test_batch_linear_pre_grad_fusion, test/inductor/test_group_batch_fusion.py::TestGroupBatchFusion::test_gate_fusion_post_grad, test/inductor/test_group_batch_fusion.py::TestGroupBatchFusion::test_group_linear_fusion, test/inductor/test_group_batch_fusion.py::TestGroupBatchFusion::test_group_linear_fusion_different_shapes, test/inductor/test_group_batch_fusion.py::TestGroupBatchFusion::test_math_op_fusion, 
test/inductor/test_group_batch_fusion.py::TestGroupBatchFusion::test_pointwise_op_fusion, test/inductor/test_group_batch_fusion.py::TestGroupBatchFusion::test_pointwise_op_fusion_post_grad, test/inductor/test_group_batch_fusion.py::TestPostGradBatchLinearFusion::test_batch_linear_post_grad_fusion, test/inductor/test_group_batch_fusion.py::TestFindIndependentSubsetGreedy::test_find_independent_subset_greedy, test/inductor/test_group_batch_fusion.py::TestFindIndependentSubsetGreedy::test_find_independent_subset_greedy_fuse 2025-10-10T01:56:05.1410658Z 2025-10-10T01:56:08.6198741Z 2025-10-10T01:56:08.6199852Z inductor/test_split_cat_fx_passes 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_split_cat_fx_passes_1.1_84695daa9b39dc00_.log 2025-10-10T01:56:08.6204632Z Running 11 items in this shard: test/inductor/test_split_cat_fx_passes.py::TestSplitCatFxPasses::test_cat_normalization, test/inductor/test_split_cat_fx_passes.py::TestSplitCatFxPasses::test_config_flag_is_respected, test/inductor/test_split_cat_fx_passes.py::TestSplitCatFxPasses::test_consecutive_split_merge, test/inductor/test_split_cat_fx_passes.py::TestSplitCatFxPasses::test_numpy_compat_normalization, test/inductor/test_split_cat_fx_passes.py::TestSplitCatFxPasses::test_split_cat_merge, test/inductor/test_split_cat_fx_passes.py::TestSplitCatFxPasses::test_split_cat_merge_mutation, test/inductor/test_split_cat_fx_passes.py::TestSplitCatFxPasses::test_split_cat_new_patterns, test/inductor/test_split_cat_fx_passes.py::TestSplitCatFxPasses::test_split_normalization, test/inductor/test_split_cat_fx_passes.py::TestSplitCatFxPasses::test_split_squeeze, test/inductor/test_split_cat_fx_passes.py::TestSplitCatFxPasses::test_stack_normalization_axis_kwarg, test/inductor/test_split_cat_fx_passes.py::TestSplitCatFxPasses::test_unbind_stack 2025-10-10T01:56:08.6208621Z 2025-10-10T01:56:09.1129277Z Running dynamo/test_view 1/1 ... 
[2025-10-10 01:56:09.112403] 2025-10-10T01:56:09.1129843Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:56:09.1132954Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_view.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:56:09.112870] 2025-10-10T01:56:12.5196153Z Running dynamo/test_fx_annotate 1/1 ... [2025-10-10 01:56:12.518968] 2025-10-10T01:56:12.5196609Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:56:12.5198535Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_fx_annotate.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:56:12.519400] 2025-10-10T01:56:13.0365458Z 2025-10-10T01:56:13.0366597Z dynamo/test_view 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_view_1.1_94455d928243c478_.log 2025-10-10T01:56:13.0369859Z Running 6 items in this shard: test/dynamo/test_view.py::ViewTests::test_tensor_view_with_tensor_args, test/dynamo/test_view.py::ViewTests::test_tensor_view_with_tensor_shape_params, test/dynamo/test_view.py::ViewTests::test_torch_reshape_with_tensor_shape_params, test/dynamo/test_view.py::ViewTests::test_view_to_1d, test/dynamo/test_view.py::ViewTests::test_view_to_2d, test/dynamo/test_view.py::ViewTests::test_view_with_tensor_shape_params 2025-10-10T01:56:13.0372093Z 2025-10-10T01:56:16.9161875Z Running inductor/test_control_deps 1/1 ... 
[2025-10-10 01:56:16.915611] 2025-10-10T01:56:16.9162351Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:56:16.9163425Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_control_deps.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:56:16.915997] 2025-10-10T01:56:19.8996882Z 2025-10-10T01:56:19.8998999Z dynamo/test_fx_annotate 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_fx_annotate_1.1_107d311c25f70a2b_.log 2025-10-10T01:56:19.9001968Z Running 4 items in this shard: test/dynamo/test_fx_annotate.py::AnnotateTests::test_ac_flex_attention, test/dynamo/test_fx_annotate.py::AnnotateTests::test_activation_checkpointing, test/dynamo/test_fx_annotate.py::AnnotateTests::test_activation_checkpointing_annotation_inside, test/dynamo/test_fx_annotate.py::AnnotateTests::test_annotations 2025-10-10T01:56:19.9003256Z 2025-10-10T01:56:23.7460428Z Running dynamo/test_pre_dispatch 1/1 ... [2025-10-10 01:56:23.745485] 2025-10-10T01:56:23.7460884Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:56:23.7462300Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_pre_dispatch.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:56:23.745861] 2025-10-10T01:56:24.0964212Z 2025-10-10T01:56:24.0965522Z inductor/test_control_deps 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_control_deps_1.1_d2340aff53f38c07_.log 2025-10-10T01:56:24.0967378Z Running 1 items in this shard: test/inductor/test_control_deps.py::TestControlDeps::test_control_deps_prevents_fusion 2025-10-10T01:56:24.0968167Z 2025-10-10T01:56:27.6681852Z 2025-10-10T01:56:27.6683015Z dynamo/test_pre_dispatch 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_pre_dispatch_1.1_5c1414542ae058c3_.log 2025-10-10T01:56:27.6684701Z Running 3 items in this shard: test/dynamo/test_pre_dispatch.py::PreDispatchTests::test_autocast_simple, test/dynamo/test_pre_dispatch.py::PreDispatchTests::test_enable_grad_and_no_grad, test/dynamo/test_pre_dispatch.py::PreDispatchTests::test_no_grad_simple 2025-10-10T01:56:27.6686094Z 2025-10-10T01:56:27.9649710Z Running dynamo/test_subgraphs 1/1 ... [2025-10-10 01:56:27.964437] 2025-10-10T01:56:27.9650346Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:56:27.9652372Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_subgraphs.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:56:27.964848] 2025-10-10T01:56:31.5608181Z Running inductor/test_mkldnn_pattern_matcher 1/1 ... [2025-10-10 01:56:31.560218] 2025-10-10T01:56:31.5608701Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:56:31.5609770Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_mkldnn_pattern_matcher.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:56:31.560599] 2025-10-10T01:56:32.0883661Z 2025-10-10T01:56:32.0885609Z dynamo/test_subgraphs 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_subgraphs_1.1_1b4c6992c9e6c62e_.log 2025-10-10T01:56:32.0905052Z Running 44 items in this shard: test/dynamo/test_subgraphs.py::SubGraphTests::test_capi_call1, test/dynamo/test_subgraphs.py::SubGraphTests::test_capi_call2, test/dynamo/test_subgraphs.py::SubGraphTests::test_capi_call3, test/dynamo/test_subgraphs.py::SubGraphTests::test_control_flow1, test/dynamo/test_subgraphs.py::SubGraphTests::test_control_flow2, test/dynamo/test_subgraphs.py::SubGraphTests::test_control_flow3, test/dynamo/test_subgraphs.py::SubGraphTests::test_control_flow4, test/dynamo/test_subgraphs.py::SubGraphTests::test_control_flow5, test/dynamo/test_subgraphs.py::SubGraphTests::test_dynamic_duck_size, test/dynamo/test_subgraphs.py::SubGraphTests::test_dynamic_getitem, test/dynamo/test_subgraphs.py::SubGraphTests::test_dynamic_kwarg, test/dynamo/test_subgraphs.py::SubGraphTests::test_dynamic_order_dependence, test/dynamo/test_subgraphs.py::SubGraphTests::test_dynamic_zero_inference, test/dynamo/test_subgraphs.py::SubGraphTests::test_enumerate_not_break_graph, test/dynamo/test_subgraphs.py::SubGraphTests::test_extended_args, test/dynamo/test_subgraphs.py::SubGraphTests::test_graph_break_on_item, test/dynamo/test_subgraphs.py::SubGraphTests::test_indirect_unsupported1, test/dynamo/test_subgraphs.py::SubGraphTests::test_indirect_unsupported2, test/dynamo/test_subgraphs.py::SubGraphTests::test_indirect_unsupported3, test/dynamo/test_subgraphs.py::SubGraphTests::test_multigraph, test/dynamo/test_subgraphs.py::SubGraphTests::test_no_graph_break_on_item, test/dynamo/test_subgraphs.py::SubGraphTests::test_pop_after_resume, test/dynamo/test_subgraphs.py::SubGraphTests::test_restore_range, test/dynamo/test_subgraphs.py::SubGraphTests::test_restore_range_iter, 
test/dynamo/test_subgraphs.py::SubGraphTests::test_restore_state, test/dynamo/test_subgraphs.py::SubGraphTests::test_resume1, test/dynamo/test_subgraphs.py::SubGraphTests::test_resume2, test/dynamo/test_subgraphs.py::SubGraphTests::test_resume3, test/dynamo/test_subgraphs.py::SubGraphTests::test_resume4, test/dynamo/test_subgraphs.py::SubGraphTests::test_resume5, test/dynamo/test_subgraphs.py::SubGraphTests::test_resume_freevars, test/dynamo/test_subgraphs.py::SubGraphTests::test_resume_paths_join, test/dynamo/test_subgraphs.py::SubGraphTests::test_resume_tuple_iterator, test/dynamo/test_subgraphs.py::SubGraphTests::test_resume_with_no_grad1, test/dynamo/test_subgraphs.py::SubGraphTests::test_resume_with_no_grad2, test/dynamo/test_subgraphs.py::SubGraphTests::test_resume_with_no_grad3, test/dynamo/test_subgraphs.py::SubGraphTests::test_stack_state1, test/dynamo/test_subgraphs.py::SubGraphTests::test_stack_state2, test/dynamo/test_subgraphs.py::SubGraphTests::test_start1, test/dynamo/test_subgraphs.py::SubGraphTests::test_start2, test/dynamo/test_subgraphs.py::SubGraphTests::test_start3, test/dynamo/test_subgraphs.py::SubGraphTests::test_start4, test/dynamo/test_subgraphs.py::SubGraphTests::test_tuple_iterator_mutate, test/dynamo/test_subgraphs.py::SubGraphTests::test_tuple_iterator_return 2025-10-10T01:56:32.0924321Z 2025-10-10T01:56:35.9923238Z Running dynamo/test_decorators 1/1 ... [2025-10-10 01:56:35.991687] 2025-10-10T01:56:35.9923677Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:56:35.9924725Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_decorators.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:56:35.992076] 2025-10-10T01:56:39.6410395Z 2025-10-10T01:56:39.6411797Z inductor/test_mkldnn_pattern_matcher 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_mkldnn_pattern_matcher_1.1_444bf6b07f3059e5_.log 2025-10-10T01:56:39.6649908Z Running 280 items in this shard: test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_conv2d_add_scalar, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_conv2d_binary_fusion_failed, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_conv2d_binary_inplace_fusion_failed_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_conv2d_binary_inplace_fusion_pass_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_False_reshape_a_False_M_1_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_False_reshape_a_False_M_1_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_False_reshape_a_False_M_1_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_False_reshape_a_False_M_1_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_False_reshape_a_False_M_32_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_False_reshape_a_False_M_32_inplace_add_False_expand_a_scale_True, 
test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_False_reshape_a_False_M_32_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_False_reshape_a_False_M_32_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_False_reshape_a_True_M_1_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_False_reshape_a_True_M_1_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_False_reshape_a_True_M_1_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_False_reshape_a_True_M_1_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_False_reshape_a_True_M_32_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_False_reshape_a_True_M_32_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_False_reshape_a_True_M_32_inplace_add_True_expand_a_scale_False, 
test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_False_reshape_a_True_M_32_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_True_reshape_a_False_M_1_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_True_reshape_a_False_M_1_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_True_reshape_a_False_M_1_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_True_reshape_a_False_M_1_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_True_reshape_a_False_M_32_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_True_reshape_a_False_M_32_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_True_reshape_a_False_M_32_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_True_reshape_a_False_M_32_inplace_add_True_expand_a_scale_True, 
test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_True_reshape_a_True_M_1_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_True_reshape_a_True_M_1_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_True_reshape_a_True_M_1_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_True_reshape_a_True_M_1_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_True_reshape_a_True_M_32_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_True_reshape_a_True_M_32_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_True_reshape_a_True_M_32_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_True_reshape_a_True_M_32_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_False_reshape_a_False_M_1_inplace_add_False_expand_a_scale_False, 
test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_False_reshape_a_False_M_1_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_False_reshape_a_False_M_1_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_False_reshape_a_False_M_1_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_False_reshape_a_False_M_32_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_False_reshape_a_False_M_32_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_False_reshape_a_False_M_32_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_False_reshape_a_False_M_32_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_False_reshape_a_True_M_1_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_False_reshape_a_True_M_1_inplace_add_False_expand_a_scale_True, 
test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_False_reshape_a_True_M_1_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_False_reshape_a_True_M_1_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_False_reshape_a_True_M_32_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_False_reshape_a_True_M_32_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_False_reshape_a_True_M_32_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_False_reshape_a_True_M_32_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_True_reshape_a_False_M_1_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_True_reshape_a_False_M_1_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_True_reshape_a_False_M_1_inplace_add_True_expand_a_scale_False, 
test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_True_reshape_a_False_M_1_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_True_reshape_a_False_M_32_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_True_reshape_a_False_M_32_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_True_reshape_a_False_M_32_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_True_reshape_a_False_M_32_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_True_reshape_a_True_M_1_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_True_reshape_a_True_M_1_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_True_reshape_a_True_M_1_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_True_reshape_a_True_M_1_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_True_reshape_a_True_M_32_inplace_add_False_expand_a_scale_False, 
test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_True_reshape_a_True_M_32_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_True_reshape_a_True_M_32_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_True_reshape_a_True_M_32_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_False_reshape_a_False_M_1_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_False_reshape_a_False_M_1_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_False_reshape_a_False_M_1_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_False_reshape_a_False_M_1_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_False_reshape_a_False_M_32_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_False_reshape_a_False_M_32_inplace_add_False_expand_a_scale_True, 
test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_False_reshape_a_False_M_32_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_False_reshape_a_False_M_32_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_False_reshape_a_True_M_1_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_False_reshape_a_True_M_1_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_False_reshape_a_True_M_1_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_False_reshape_a_True_M_1_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_False_reshape_a_True_M_32_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_False_reshape_a_True_M_32_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_False_reshape_a_True_M_32_inplace_add_True_expand_a_scale_False, 
test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_False_reshape_a_True_M_32_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_True_reshape_a_False_M_1_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_True_reshape_a_False_M_1_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_True_reshape_a_False_M_1_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_True_reshape_a_False_M_1_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_True_reshape_a_False_M_32_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_True_reshape_a_False_M_32_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_True_reshape_a_False_M_32_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_True_reshape_a_False_M_32_inplace_add_True_expand_a_scale_True, 
test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_True_reshape_a_True_M_1_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_True_reshape_a_True_M_1_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_True_reshape_a_True_M_1_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_True_reshape_a_True_M_1_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_True_reshape_a_True_M_32_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_True_reshape_a_True_M_32_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_True_reshape_a_True_M_32_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_True_reshape_a_True_M_32_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_False_reshape_a_False_M_1_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_False_reshape_a_False_M_1_inplace_add_False_expand_a_scale_True, 
test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_False_reshape_a_False_M_1_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_False_reshape_a_False_M_1_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_False_reshape_a_False_M_32_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_False_reshape_a_False_M_32_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_False_reshape_a_False_M_32_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_False_reshape_a_False_M_32_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_False_reshape_a_True_M_1_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_False_reshape_a_True_M_1_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_False_reshape_a_True_M_1_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_False_reshape_a_True_M_1_inplace_add_True_expand_a_scale_True, 
test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_False_reshape_a_True_M_32_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_False_reshape_a_True_M_32_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_False_reshape_a_True_M_32_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_False_reshape_a_True_M_32_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_True_reshape_a_False_M_1_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_True_reshape_a_False_M_1_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_True_reshape_a_False_M_1_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_True_reshape_a_False_M_1_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_True_reshape_a_False_M_32_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_True_reshape_a_False_M_32_inplace_add_False_expand_a_scale_True, 
test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_True_reshape_a_False_M_32_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_True_reshape_a_False_M_32_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_True_reshape_a_True_M_1_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_True_reshape_a_True_M_1_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_True_reshape_a_True_M_1_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_True_reshape_a_True_M_1_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_True_reshape_a_True_M_32_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_True_reshape_a_True_M_32_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_True_reshape_a_True_M_32_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_True_reshape_a_True_M_32_inplace_add_True_expand_a_scale_True, 
test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_dynamic_qlinear_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_dynamic_qlinear_input_dim_exceeds_2, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_dynamic_qlinear_qat_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_hardtanh_pattern_fallback, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_leaky_relu_pattern_fallback, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_linear_add_bias, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_linear_binary, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_linear_binary_broadcast_shapes, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_linear_dynamic_fp16, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_linear_fp32, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_linear_input_non_contiguous_3D_wo_bias, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_linear_relu_dynamic_fp16, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_linear_unary, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_multi_linear_share_same_input, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qat_qconv2d, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qat_qconv2d_add, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qat_qconv2d_add_relu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qat_qconv2d_hardswish, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qat_qconv2d_hardtanh, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qat_qconv2d_relu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qat_qconv2d_relu6, 
test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qat_qconv2d_silu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qcat, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv1d_relu_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_add_2, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_add_3, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_add_broadcast_shapes_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_add_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_add_int8_mixed_bf16, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_add_int8_mixed_bf16_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_add_relu_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_add_relu_int8_mixed_bf16, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_add_relu_int8_mixed_bf16_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_add_relu_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_add_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_dequant_promotion_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_dequant_promotion_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_hardswish_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_hardswish_int8_mixed_bf16_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_hardswish_int8_mixed_bf16_xpu, 
test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_hardswish_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_hardtanh_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_hardtanh_int8_mixed_bf16_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_hardtanh_int8_mixed_bf16_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_hardtanh_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_int8_mixed_bf16, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_int8_mixed_bf16_use_autocast, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_int8_mixed_bf16_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_relu6_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_relu6_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_relu_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_relu_int8_mixed_bf16_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_relu_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_silu_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_silu_int8_mixed_bf16_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_silu_int8_mixed_bf16_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_silu_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_with_concat_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qflatten, 
test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_add_cpu_use_relu_False_is_qat_False_is_dynamic_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_add_cpu_use_relu_False_is_qat_False_is_dynamic_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_add_cpu_use_relu_False_is_qat_True_is_dynamic_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_add_cpu_use_relu_False_is_qat_True_is_dynamic_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_add_cpu_use_relu_True_is_qat_False_is_dynamic_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_add_cpu_use_relu_True_is_qat_False_is_dynamic_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_add_cpu_use_relu_True_is_qat_True_is_dynamic_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_add_cpu_use_relu_True_is_qat_True_is_dynamic_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_add_fp8_inductor_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_add_int8_mixed_bf16_use_relu_False_is_qat_False_is_dynamic_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_add_int8_mixed_bf16_use_relu_False_is_qat_False_is_dynamic_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_add_int8_mixed_bf16_use_relu_False_is_qat_True_is_dynamic_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_add_int8_mixed_bf16_use_relu_False_is_qat_True_is_dynamic_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_add_int8_mixed_bf16_use_relu_True_is_qat_False_is_dynamic_False, 
test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_add_int8_mixed_bf16_use_relu_True_is_qat_False_is_dynamic_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_add_int8_mixed_bf16_use_relu_True_is_qat_True_is_dynamic_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_add_int8_mixed_bf16_use_relu_True_is_qat_True_is_dynamic_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_add_int8_mixed_bf16_xpu_use_relu_False_is_qat_False_is_dynamic_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_add_int8_mixed_bf16_xpu_use_relu_True_is_qat_False_is_dynamic_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_add_xpu_use_relu_True_is_qat_False_is_dynamic_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_dequant_promotion_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_dequant_promotion_cpu_input_dim_exceeds_2, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_dequant_promotion_dynamic_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_dequant_promotion_input_dim_exceeds_2_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_dequant_promotion_int8_mixed_bf16, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_dequant_promotion_int8_mixed_bf16_input_dim_exceeds_2, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_dequant_promotion_int8_mixed_bf16_input_dim_exceeds_2_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_dequant_promotion_int8_mixed_bf16_xpu, 
test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_dequant_promotion_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_fp8_inductor_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_gelu_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_gelu_int8_mixed_bf16, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_gelu_int8_mixed_bf16_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_gelu_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_input_dim_exceeds_2, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_input_dim_exceeds_2_and_not_contiguous, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_input_dim_exceeds_2_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_int8_mixed_bf16, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_int8_mixed_bf16_input_dim_exceeds_2, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_int8_mixed_bf16_input_dim_exceeds_2_and_not_contiguous, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_int8_mixed_bf16_input_dim_exceeds_2_and_not_contiguous_use_autocast, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_int8_mixed_bf16_input_dim_exceeds_2_and_not_contiguous_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_int8_mixed_bf16_input_dim_exceeds_2_use_autocast, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_int8_mixed_bf16_input_dim_exceeds_2_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_int8_mixed_bf16_use_autocast, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_int8_mixed_bf16_xpu, 
test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_mul, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_mul_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_mul_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_relu_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_relu_input_dim_exceeds_2, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_relu_input_dim_exceeds_2_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_relu_int8_mixed_bf16, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_relu_int8_mixed_bf16_input_dim_exceeds_2, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_relu_int8_mixed_bf16_input_dim_exceeds_2_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_relu_int8_mixed_bf16_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_relu_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qmaxpool2d, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_reproduce_113440_issue_1, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_reproduce_113440_issue_2, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_reproduce_121253_issue_addmm_fusion_check, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_reproduce_99842_issue, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_smooth_quant_with_int_mm_has_bias_False_bfloat16_per_channel_quant_False_dynamic_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_smooth_quant_with_int_mm_has_bias_False_bfloat16_per_channel_quant_False_dynamic_True, 
test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_smooth_quant_with_int_mm_has_bias_False_bfloat16_per_channel_quant_True_dynamic_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_smooth_quant_with_int_mm_has_bias_False_bfloat16_per_channel_quant_True_dynamic_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_smooth_quant_with_int_mm_has_bias_False_float32_per_channel_quant_False_dynamic_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_smooth_quant_with_int_mm_has_bias_False_float32_per_channel_quant_False_dynamic_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_smooth_quant_with_int_mm_has_bias_False_float32_per_channel_quant_True_dynamic_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_smooth_quant_with_int_mm_has_bias_False_float32_per_channel_quant_True_dynamic_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_smooth_quant_with_int_mm_has_bias_True_bfloat16_per_channel_quant_False_dynamic_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_smooth_quant_with_int_mm_has_bias_True_bfloat16_per_channel_quant_False_dynamic_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_smooth_quant_with_int_mm_has_bias_True_bfloat16_per_channel_quant_True_dynamic_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_smooth_quant_with_int_mm_has_bias_True_bfloat16_per_channel_quant_True_dynamic_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_smooth_quant_with_int_mm_has_bias_True_float32_per_channel_quant_False_dynamic_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_smooth_quant_with_int_mm_has_bias_True_float32_per_channel_quant_False_dynamic_True, 
test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_smooth_quant_with_int_mm_has_bias_True_float32_per_channel_quant_True_dynamic_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_smooth_quant_with_int_mm_has_bias_True_float32_per_channel_quant_True_dynamic_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_woq_int4_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_woq_int8, test/inductor/test_mkldnn_pattern_matcher.py::TestDynamicPatternMatcher::test_linear_input_non_contiguous_3D_wo_bias_dynamic_shapes, test/inductor/test_mkldnn_pattern_matcher.py::TestDynamicPatternMatcher::test_linear_unary_dynamic_shapes, test/inductor/test_mkldnn_pattern_matcher.py::TestDynamicPatternMatcher::test_q_attention_block, test/inductor/test_mkldnn_pattern_matcher.py::TestDynamicPatternMatcher::test_qat_bn_conv2d, test/inductor/test_mkldnn_pattern_matcher.py::TestDynamicPatternMatcher::test_qconv2d_maxpool2d_linear_dynamic_cpu 2025-10-10T01:56:39.6880727Z 2025-10-10T01:56:40.1660591Z 2025-10-10T01:56:40.1661776Z dynamo/test_decorators 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_decorators_1.1_b780ec96ddf1bf8a_.log 2025-10-10T01:56:40.1683919Z Running 69 items in this shard: test/dynamo/test_decorators.py::DecoratorTests::test_allow_in_graph, test/dynamo/test_decorators.py::DecoratorTests::test_allow_in_graph_no_id_reuse, test/dynamo/test_decorators.py::DecoratorTests::test_assume_constant_result_on_computation_with_graph_input, test/dynamo/test_decorators.py::DecoratorTests::test_assume_constant_result_on_user_defined_fn, test/dynamo/test_decorators.py::DecoratorTests::test_class_methods, test/dynamo/test_decorators.py::DecoratorTests::test_disable_for_custom_op, test/dynamo/test_decorators.py::DecoratorTests::test_disable_ignores_outer_wraps, test/dynamo/test_decorators.py::DecoratorTests::test_disable_nn_module_with_class_decorator, 
test/dynamo/test_decorators.py::DecoratorTests::test_disable_nn_modules_forward_hook, test/dynamo/test_decorators.py::DecoratorTests::test_disable_optimize, test/dynamo/test_decorators.py::DecoratorTests::test_disable_recursive_false, test/dynamo/test_decorators.py::DecoratorTests::test_disable_recursive_false_weird, test/dynamo/test_decorators.py::DecoratorTests::test_disallow_in_graph, test/dynamo/test_decorators.py::DecoratorTests::test_dont_skip_tracing, test/dynamo/test_decorators.py::DecoratorTests::test_error_on_graph_break, test/dynamo/test_decorators.py::DecoratorTests::test_error_on_graph_break_empty_graph, test/dynamo/test_decorators.py::DecoratorTests::test_error_on_graph_break_error, test/dynamo/test_decorators.py::DecoratorTests::test_error_on_graph_break_export, test/dynamo/test_decorators.py::DecoratorTests::test_error_on_graph_break_fullgraph, test/dynamo/test_decorators.py::DecoratorTests::test_error_on_graph_break_nested, test/dynamo/test_decorators.py::DecoratorTests::test_error_on_graph_break_nested_deep, test/dynamo/test_decorators.py::DecoratorTests::test_error_on_graph_break_nested_with_skip, test/dynamo/test_decorators.py::DecoratorTests::test_graph_break, test/dynamo/test_decorators.py::DecoratorTests::test_incorrect_usage_disallow_in_graph, test/dynamo/test_decorators.py::DecoratorTests::test_mark_static_address_guarded, test/dynamo/test_decorators.py::DecoratorTests::test_mark_static_address_unguarded, test/dynamo/test_decorators.py::DecoratorTests::test_mark_static_nn_module, test/dynamo/test_decorators.py::DecoratorTests::test_nested_compile_error_on_graph_break, test/dynamo/test_decorators.py::DecoratorTests::test_nested_compile_fullgraph, test/dynamo/test_decorators.py::DecoratorTests::test_nested_disable_decorator, test/dynamo/test_decorators.py::DecoratorTests::test_nonstrict_newly_constructed_trace_register_constant_type_error, test/dynamo/test_decorators.py::DecoratorTests::test_nonstrict_trace_captured_external_tensor, 
test/dynamo/test_decorators.py::DecoratorTests::test_nonstrict_trace_custom_class_error, test/dynamo/test_decorators.py::DecoratorTests::test_nonstrict_trace_custom_class_output_error, test/dynamo/test_decorators.py::DecoratorTests::test_nonstrict_trace_inside_compiled_function, test/dynamo/test_decorators.py::DecoratorTests::test_nonstrict_trace_inside_compiled_function_error, test/dynamo/test_decorators.py::DecoratorTests::test_nonstrict_trace_inside_compiled_function_kwarg, test/dynamo/test_decorators.py::DecoratorTests::test_nonstrict_trace_int_and_float_output, test/dynamo/test_decorators.py::DecoratorTests::test_nonstrict_trace_nested_custom_class, test/dynamo/test_decorators.py::DecoratorTests::test_nonstrict_trace_nested_custom_class_error, test/dynamo/test_decorators.py::DecoratorTests::test_nonstrict_trace_newly_constructed_custom_class_with_side_effects, test/dynamo/test_decorators.py::DecoratorTests::test_nonstrict_trace_newly_constructed_dict_with_side_effects, test/dynamo/test_decorators.py::DecoratorTests::test_nonstrict_trace_no_action_at_a_distance, test/dynamo/test_decorators.py::DecoratorTests::test_nonstrict_trace_object_in_context_error, test/dynamo/test_decorators.py::DecoratorTests::test_nonstrict_trace_on_method, test/dynamo/test_decorators.py::DecoratorTests::test_nonstrict_trace_pre_existing_custom_class, test/dynamo/test_decorators.py::DecoratorTests::test_nonstrict_trace_pre_existing_custom_class_with_side_effects, test/dynamo/test_decorators.py::DecoratorTests::test_nonstrict_trace_pre_existing_dict, test/dynamo/test_decorators.py::DecoratorTests::test_nonstrict_trace_pre_existing_dict_with_side_effects, test/dynamo/test_decorators.py::DecoratorTests::test_nonstrict_trace_pre_existing_register_constant_type_guard, test/dynamo/test_decorators.py::DecoratorTests::test_nonstrict_trace_tensor_args, test/dynamo/test_decorators.py::DecoratorTests::test_nonstrict_trace_tuple_and_sym_int_output, 
test/dynamo/test_decorators.py::DecoratorTests::test_patch_dynamo_config_errors, test/dynamo/test_decorators.py::DecoratorTests::test_set_stance_aot_eager_then_compile, test/dynamo/test_decorators.py::DecoratorTests::test_set_stance_eager_on_recompile, test/dynamo/test_decorators.py::DecoratorTests::test_set_stance_eager_then_compile, test/dynamo/test_decorators.py::DecoratorTests::test_set_stance_eager_then_compile_with_graph_break, test/dynamo/test_decorators.py::DecoratorTests::test_set_stance_fail_on_recompile, test/dynamo/test_decorators.py::DecoratorTests::test_set_stance_fail_on_recompile_with_disable, test/dynamo/test_decorators.py::DecoratorTests::test_set_stance_forbid_in_graph, test/dynamo/test_decorators.py::DecoratorTests::test_set_stance_force_backend, test/dynamo/test_decorators.py::DecoratorTests::test_set_stance_force_backend_with_disable, test/dynamo/test_decorators.py::DecoratorTests::test_set_stance_force_eager, test/dynamo/test_decorators.py::DecoratorTests::test_skip_frame, test/dynamo/test_decorators.py::DecoratorTests::test_step_unsupported, test/dynamo/test_decorators.py::DecoratorTests::test_step_unsupported_empty_checkpoint, test/dynamo/test_decorators.py::DecoratorTests::test_substitute_in_graph, test/dynamo/test_decorators.py::DecoratorTests::test_torch_guards_stack_frame_register_inlining_disable, test/dynamo/test_decorators.py::DecoratorTests::test_torch_guards_stack_frame_register_inlining_partially_disable 2025-10-10T01:56:40.1705439Z 2025-10-10T01:56:43.5451190Z Running dynamo/test_pgo 1/1 ... [2025-10-10 01:56:43.544559] 2025-10-10T01:56:43.5451819Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:56:43.5453600Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_pgo.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:56:43.544974] 2025-10-10T01:56:44.0259360Z Running inductor/test_cutlass_evt 1/1 ... [2025-10-10 01:56:44.025419] 2025-10-10T01:56:44.0259861Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:56:44.0262771Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_cutlass_evt.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:56:44.025855] 2025-10-10T01:56:47.6684761Z 2025-10-10T01:56:47.6685816Z dynamo/test_pgo 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_pgo_1.1_f31f1f6b2207dd56_.log 2025-10-10T01:56:47.6689070Z Running 11 items in this shard: test/dynamo/test_pgo.py::PgoTest::test_basic, test/dynamo/test_pgo.py::PgoTest::test_different_file_paths_local_pgo, test/dynamo/test_pgo.py::PgoTest::test_distinct_compile_id, test/dynamo/test_pgo.py::PgoTest::test_njt, test/dynamo/test_pgo.py::PgoTest::test_no_empty_graph_allowlist, test/dynamo/test_pgo.py::PgoTest::test_pgo_dynamic_false, test/dynamo/test_pgo.py::PgoTest::test_pgo_dynamic_params, test/dynamo/test_pgo.py::PgoTest::test_remote_basic, test/dynamo/test_pgo.py::PgoTest::test_sticky_pgo_read_write, test/dynamo/test_pgo.py::PgoTest::test_whitelist_ints_floats, test/dynamo/test_pgo.py::PgoTest::test_whitelist_suggestion 2025-10-10T01:56:51.2553946Z 2025-10-10T01:56:51.2553956Z 2025-10-10T01:56:51.2554753Z inductor/test_cutlass_evt 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_cutlass_evt_1.1_35b013441ff83e6b_.log 2025-10-10T01:56:51.2557756Z Running 8 items in this shard: test/inductor/test_cutlass_evt.py::TestCutlassEVT::test_evt_argument_codegen, test/inductor/test_cutlass_evt.py::TestCutlassEVT::test_evt_argument_codegen_return_accumulator, test/inductor/test_cutlass_evt.py::TestCutlassEVT::test_evt_codegen, 
test/inductor/test_cutlass_evt.py::TestCutlassEVT::test_example_tensor_creation, test/inductor/test_cutlass_evt.py::TestCutlassEVT::test_py_codegen, test/inductor/test_cutlass_evt.py::TestCutlassEVT::test_py_codegen_accumulator_return, test/inductor/test_cutlass_evt.py::TestCutlassEVT::test_py_codegen_broadcasting, test/inductor/test_cutlass_evt.py::TestCutlassEVT::test_py_codegen_disjoint_read_indexing 2025-10-10T01:56:51.2560201Z 2025-10-10T01:56:51.5936503Z Running dynamo/test_buffers_override 1/1 ... [2025-10-10 01:56:51.593100] 2025-10-10T01:56:51.5937164Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:56:51.5938824Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_buffers_override.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:56:51.593500] 2025-10-10T01:56:55.0777112Z Running inductor/test_online_softmax 1/1 ... [2025-10-10 01:56:55.077115] 2025-10-10T01:56:55.0777590Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:56:55.0778982Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_online_softmax.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:56:55.077548] 2025-10-10T01:56:55.4662383Z 2025-10-10T01:56:55.4663651Z dynamo/test_buffers_override 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_buffers_override_1.1_83626aa4367f16b7_.log 2025-10-10T01:56:55.4665702Z Running 2 items in this shard: test/dynamo/test_buffers_override.py::TestBuffersOverride::test_buffers_override, test/dynamo/test_buffers_override.py::TestBuffersOverride::test_named_buffers_override 2025-10-10T01:56:59.2885581Z 2025-10-10T01:56:59.2886463Z Running test_model_exports_to_core_aten 1/1 ... 
[2025-10-10 01:56:59.288095] 2025-10-10T01:56:59.2887129Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:56:59.2890023Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_model_exports_to_core_aten.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:56:59.288525] 2025-10-10T01:57:02.5085072Z 2025-10-10T01:57:02.5086297Z inductor/test_online_softmax 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_online_softmax_1.1_796aa6511e0a418c_.log 2025-10-10T01:57:02.5099146Z Running 31 items in this shard: test/inductor/test_online_softmax.py::TestOnlineSoftmax::test_3d_tiled_online_softmax, test/inductor/test_online_softmax.py::TestOnlineSoftmax::test_causal_mask, test/inductor/test_online_softmax.py::TestOnlineSoftmax::test_codegen_3pass_softmax_due_to_disable, test/inductor/test_online_softmax.py::TestOnlineSoftmax::test_codegen_online_softmax_V_2048_use_log_softmax_False, test/inductor/test_online_softmax.py::TestOnlineSoftmax::test_codegen_online_softmax_V_2048_use_log_softmax_True, test/inductor/test_online_softmax.py::TestOnlineSoftmax::test_codegen_online_softmax_V_50304_use_log_softmax_False, test/inductor/test_online_softmax.py::TestOnlineSoftmax::test_codegen_online_softmax_V_50304_use_log_softmax_True, test/inductor/test_online_softmax.py::TestOnlineSoftmax::test_codegen_softmax_persistent_reduction, test/inductor/test_online_softmax.py::TestOnlineSoftmax::test_log_softmax, test/inductor/test_online_softmax.py::TestOnlineSoftmax::test_no_online_softmax_for_cpu, test/inductor/test_online_softmax.py::TestOnlineSoftmax::test_prepare_softmax_acc_with_fp64_bfloat16, test/inductor/test_online_softmax.py::TestOnlineSoftmax::test_prepare_softmax_acc_with_fp64_float16, 
test/inductor/test_online_softmax.py::TestOnlineSoftmax::test_prepare_softmax_acc_with_fp64_float32, test/inductor/test_online_softmax.py::TestOnlineSoftmax::test_prepare_softmax_nrow_2048_dim_-1, test/inductor/test_online_softmax.py::TestOnlineSoftmax::test_prepare_softmax_nrow_2048_dim_0, test/inductor/test_online_softmax.py::TestOnlineSoftmax::test_prepare_softmax_nrow_2048_dim_1, test/inductor/test_online_softmax.py::TestOnlineSoftmax::test_prepare_softmax_nrow_2_dim_-1, test/inductor/test_online_softmax.py::TestOnlineSoftmax::test_prepare_softmax_nrow_2_dim_0, test/inductor/test_online_softmax.py::TestOnlineSoftmax::test_prepare_softmax_nrow_2_dim_1, test/inductor/test_online_softmax.py::TestOnlineSoftmax::test_prepare_softmax_perf, test/inductor/test_online_softmax.py::TestOnlineSoftmax::test_sdpa, test/inductor/test_online_softmax.py::TestOnlineSoftmax::test_softmax, test/inductor/test_online_softmax.py::TestOnlineSoftmax::test_softmax_acc_with_fp64_fn0_bfloat16, test/inductor/test_online_softmax.py::TestOnlineSoftmax::test_softmax_acc_with_fp64_fn0_float16, test/inductor/test_online_softmax.py::TestOnlineSoftmax::test_softmax_acc_with_fp64_fn0_float32, test/inductor/test_online_softmax.py::TestOnlineSoftmax::test_softmax_acc_with_fp64_fn1_bfloat16, test/inductor/test_online_softmax.py::TestOnlineSoftmax::test_softmax_acc_with_fp64_fn1_float16, test/inductor/test_online_softmax.py::TestOnlineSoftmax::test_softmax_acc_with_fp64_fn1_float32, test/inductor/test_online_softmax.py::TestOnlineSoftmax::test_softmin, test/inductor/test_online_softmax.py::TestOnlineSoftmax::test_split_reduction, test/inductor/test_online_softmax.py::TestOnlineSoftmax::test_tb_speech_transformer_attn 2025-10-10T01:57:02.5110780Z 2025-10-10T01:57:03.5115514Z 2025-10-10T01:57:03.5116383Z test_model_exports_to_core_aten 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_model_exports_to_core_aten_1.1_fc7755e53b31eadc_.log 2025-10-10T01:57:03.5117437Z 
Running 1 items in this shard: test/test_model_exports_to_core_aten.py::TestQuantizePT2EModels::test_vit_aten_export 2025-10-10T01:57:03.5117890Z 2025-10-10T01:57:06.3854482Z Running inductor/test_helion_kernels 1/1 ... [2025-10-10 01:57:06.384814] 2025-10-10T01:57:06.3855061Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:57:06.3856142Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_helion_kernels.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:57:06.385230] 2025-10-10T01:57:07.4331673Z Running inductor/test_aot_inductor_utils 1/1 ... [2025-10-10 01:57:07.432634] 2025-10-10T01:57:07.4332175Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:57:07.4333803Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_aot_inductor_utils.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:57:07.433018] 2025-10-10T01:57:13.5669338Z 2025-10-10T01:57:13.5670456Z inductor/test_helion_kernels 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_helion_kernels_1.1_8542007f207ba381_.log 2025-10-10T01:57:13.5671685Z Running 2 items in this shard: test/inductor/test_helion_kernels.py::HelionTests::test_add_kernel, test/inductor/test_helion_kernels.py::HelionTests::test_softmax_view_reshape 2025-10-10T01:57:13.5672342Z 2025-10-10T01:57:15.2637295Z 2025-10-10T01:57:15.2638565Z inductor/test_aot_inductor_utils 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_aot_inductor_utils_1.1_07176475e9629f9a_.log 2025-10-10T01:57:15.2639760Z Running 0 items in this shard: 2025-10-10T01:57:15.2640029Z 2025-10-10T01:57:17.4829993Z Running export/test_package 1/1 ... [2025-10-10 01:57:17.482420] 2025-10-10T01:57:17.4830584Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:57:17.4835029Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_package.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:57:17.482972] 2025-10-10T01:57:19.1063410Z Running dynamo/test_ctx_manager 1/1 ... [2025-10-10 01:57:19.105808] 2025-10-10T01:57:19.1064008Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:57:19.1065591Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_ctx_manager.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:57:19.106193] 2025-10-10T01:57:21.3561576Z 2025-10-10T01:57:21.3562859Z export/test_package 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_package_1.1_28253410963a8533_.log 2025-10-10T01:57:21.3564826Z Running 4 items in this shard: test/export/test_package.py::TestPackage::test_basic, test/export/test_package.py::TestPackage::test_error, test/export/test_package.py::TestPackage::test_more_than_once, test/export/test_package.py::TestPackage::test_overloads 2025-10-10T01:57:21.3566052Z 2025-10-10T01:57:25.2716628Z Running inductor/test_cudagraph_trees 1/1 ... [2025-10-10 01:57:25.270950] 2025-10-10T01:57:25.2717121Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:57:25.2718182Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_cudagraph_trees.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:57:25.271337] 2025-10-10T01:57:28.0894136Z 2025-10-10T01:57:28.0895189Z dynamo/test_ctx_manager 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_ctx_manager_1.1_d120947205c9e8cc_.log 2025-10-10T01:57:28.0929791Z Running 101 items in this shard: test/dynamo/test_ctx_manager.py::CtxManagerTests::test_autocast, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_autocast_arguments_binding, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_autocast_cpu, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_autocast_cpu_graph_break, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_autocast_cpu_graph_break_2, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_autocast_cpu_graph_break_inner_fn, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_autocast_decorator, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_autocast_device, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_autocast_float64, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_autocast_graph_break_method, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_autocast_sdpa, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_autograd_profiler, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_autograd_profiler_enabled, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_context_wrapping_grad_mode_decorator, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_context_wrapping_grad_mode_nested_function_decorator, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_context_wrapping_set_grad_enabled_nested_function, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_cuda_amp_autocast, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_cuda_device, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_cuda_event_across_graph_break, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_cuda_event_created_outside_of_graph, 
test/dynamo/test_ctx_manager.py::CtxManagerTests::test_cuda_event_method, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_cuda_event_method_create_stream_outside_of_compile, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_cuda_event_reconstruct, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_cuda_stream_across_graph_break, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_cuda_stream_compared_with_constant, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_cuda_stream_compared_with_stream, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_cuda_stream_context_manager1, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_cuda_stream_context_manager2, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_cuda_stream_method, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_disable_saved_tensors_hooks, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_disable_saved_tensors_hooks_graph_break, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_disable_saved_tensors_hooks_prev_disabled, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_disable_saved_tensors_hooks_prev_disabled_nested, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_generic_context_manager_CustomizedCtxManager, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_generic_context_manager_customized_ctx_manager, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_generic_context_manager_with_graph_break_CustomizedCtxManager, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_generic_context_manager_with_graph_break_customized_ctx_manager, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_generic_ctx_manager_with_graph_break_CustomizedCtxManagerWithGraphBreak, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_generic_ctx_manager_with_graph_break_customized_ctx_manager_with_graph_break, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_grad_mode_guard, 
test/dynamo/test_ctx_manager.py::CtxManagerTests::test_graph_break_inlining_autocast, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_graph_break_inlining_grad, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_inactive_context_graph_break_local, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_inactive_context_graph_break_local_nullctx, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_inactive_context_graph_break_local_nullctx2, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_inactive_context_graph_break_stack, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_inactive_context_graph_break_stack2, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_is_autocast_cpu_enabled, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_nested_generic_context_manager_CustomizedCtxManager, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_nested_generic_context_manager_customized_ctx_manager, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_nested_generic_context_manager_with_graph_break_CustomizedCtxManager, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_nested_generic_context_manager_with_graph_break_customized_ctx_manager, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_nested_grad_mode_graph_break, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_no_grad, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_return_context_manager, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_return_context_manager_with_graph_break, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_sdpa_kernel_ctx_manager1, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_sdpa_kernel_ctx_manager2, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_sdpa_kernel_ctx_manager3, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_sdpa_kernel_ctx_manager_as_decorator, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_sdpa_kernel_ctx_manager_kwargs, 
test/dynamo/test_ctx_manager.py::CtxManagerTests::test_sdpa_kernel_ctx_manager_set_priority, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_torch_profiler, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_torch_profiler_use_after_with_block, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_WITH_EXCEPT_START, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_advanced_contextmanager_as_argument, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_advanced_contextmanager_as_argument_error, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_change_parent_0, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_change_parent_1, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_change_parent_global_0, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_change_parent_global_1, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_change_parent_nonlocal_0, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_change_parent_nonlocal_1, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_contextlib_nullcontext, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_contextlib_suppress_name_stderr, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_contextlib_suppress_name_stdout, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_contextlib_suppress_name_suppress, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_contextmanager_as_argument, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_contextmanager_as_argument_only___enter__, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_contextmanager_as_argument_only___exit__, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_ctx_basic0, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_ctx_basic1, 
test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_disable___enter__, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_disable___exit__, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_disable_ctx_manager, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_disable_trace_contextmanager, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_dynamo_disable_ctx, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_globals_change_in_other_file, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_graph_break_after___enter__, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_graph_break_and_disable___enter__, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_graph_break_before___enter__, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_graph_break_before___enter___and_disable___exit__, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_graph_break_before_and_after___enter__, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_graph_break_in_finally, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_graph_break_inside___enter__, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_graph_break_inside_ctx, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_graph_break_inside_ctx_1, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_graph_break_inside_ctx_2, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_graph_break_inside_ctx_with_side_effects, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_return_advanced_contextmanager, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_return_new_contextmanager 2025-10-10T01:57:28.0963629Z 2025-10-10T01:57:32.0004867Z Running inductor/test_block_analysis 1/1 ... 
[2025-10-10 01:57:31.999924] 2025-10-10T01:57:32.0005346Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:57:32.0006658Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_block_analysis.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:57:32.000316] 2025-10-10T01:57:32.7513518Z 2025-10-10T01:57:32.7514827Z inductor/test_cudagraph_trees 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_cudagraph_trees_1.1_225a38602cd2e7c5_.log 2025-10-10T01:57:32.7594992Z Running 161 items in this shard: test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_accumulate_grad, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_accumulate_multiple_recordings, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_alias_of_parameter, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_aliased_output_checkpoint, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_aliased_static_parameter, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_aliased_storage_single_weakref, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_aliasing_static_ref, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_amp_cache_disabled, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_backward_gets_cached_cudagraphs, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_cache_hit_forward_miss_backward, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_cached_boxed_forward_device_index, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_cached_forward_backward, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_checkpoint_shared_output_storage_deallocation, 
test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_checkpointing_resets_persistent_refs, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_cleanup, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_compiled_autograd_static_input_params, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_constant_output, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_conv_benchmark, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_cpp_wrapper, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_cudagraph_capture_sizes, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_cudagraph_capture_sizes1, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_cudagraph_capture_sizes2, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_cudagraph_or_error, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_dynamic_backward, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_dynamic_warmup, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_empty_cpu_tensor, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_empty_storage, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_end_recording_early, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_error_on_dealloc_use, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_error_on_dealloc_use2, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_execution_into_recording, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_expanded_inputs, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_fallback_to_eager_if_recompiling_too_many_times, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_fallback_to_eager_if_recompiling_too_many_times_due_to_cudagraph_managed_tensor, 
test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_fallback_to_eager_if_recompiling_too_many_times_warn_only_once, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_forward_backward, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_forward_backward_not_called_backend_cudagraphs, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_forward_backward_not_called_backend_inductor, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_forward_generation, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_forward_with_skipped_cudagraphed_backward, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_frozen_fn, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_function_compiled_multiple_times, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_buffer_reuse, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_condition_op, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_cpu_only, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_cpu_op_and_dynamic_shapes, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_cpu_scalar1, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_cpu_scalar2, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_cpu_scalar3, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_cpu_scalar4, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_cpu_scalar_device_put, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_cpu_scalar_multiple, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_cpu_scalar_mutation, 
test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_cpu_tensor_symints, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_custom_op, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_custom_op_dynamoc_shapes, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_custom_op_mutation, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_custom_op_mutation_late_free, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_custom_op_no_split, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_custom_rule, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_dynamic_scalar_inputs, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_dynamic_shapes, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_foreach_op, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_forward_backward, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_forward_backward_not_called, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_forward_with_skipped_cudagraphed_backward, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_fused_scheduler_node, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_gc, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_item, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_log_message, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_multiple_devices_msg, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_reduce_overhead_mode_effectiveness, 
test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_reorder_cpu_and_gpu, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_reorder_cpu_and_gpu_interleave, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_reorder_custom_op_with_no_dependency, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_reorder_custom_op_with_no_dependency1, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_simple, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_symint, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_symint_cat_backward, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_symint_from_mutation_index, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_symint_from_nested_indirect_indexing, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_unbacked_symint, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_unbacked_symint_multi_output_layout, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_incompatible_cudagraph_ops_item, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_incompatible_cudagraph_ops_nonzero, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_incompatible_cudagraph_ops_nonzero_backend, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_incompatible_cudagraph_ops_nonzero_graph_breaks, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_index_put, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_live_outputs_multiple_graphs, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_manager_per_device, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_mark_step, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_meta_tensor, 
test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_multi_dispatch_child_node, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_multi_dispatch_custom_module, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_multi_dispatch_custom_module_buffer, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_multi_dispatch_parent_node, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_multi_dispatch_single_compile_builtin_module, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_multi_dispatch_single_compile_builtin_module_buffers, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_multi_dispatch_single_compile_param_inputs, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_multinomial, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_multiple_devices_msg_backend_cudagraphs, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_multiple_devices_msg_backend_inductor, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_multiple_insert_removal_caching, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_mutation_cudagraph_managed_tensor_warn_backend_cudagraphs, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_mutation_cudagraph_managed_tensor_warn_backend_inductor, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_mutation_cudagraph_managed_tensor_warn_only_once_backend_cudagraphs, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_mutation_cudagraph_managed_tensor_warn_only_once_backend_inductor, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_mutation_cudagraph_managed_tensors_backend_cudagraphs, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_mutation_cudagraph_managed_tensors_backend_inductor, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_mutation_cudagraph_managed_tensors_config_backend_cudagraphs, 
test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_mutation_cudagraph_managed_tensors_config_backend_inductor, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_mutation_on_inp_backend_cudagraphs, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_mutation_on_inp_backend_inductor, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_mutation_reinplaced, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_no_rerecord_with_mark_static_address, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_not_fallback_to_eager_if_have_not_recompiling_too_many_times, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_output_alias, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_peristed_output_livenes, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_remove_hooks_on_cached_tensors, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_rerecord_if_static_input_address_changed, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_rng_non_trees, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_rng_trees, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_run_simple, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_separate_recordings, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_side_stream_memory_allocation, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_single_stream_use, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_skip_cpp_wrapper, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_skip_cudagraph_unsafe_ops, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_skip_if_dynamic_shape_limit_reached1, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_skip_if_dynamic_shape_limit_reached2, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_skip_symbolic, 
test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_sparsity, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_static_inputs_address_mutation_log, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_storage_access_error, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_tensor_constant_mutation, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_tensor_dies_between_checkpoint, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_tensor_no_longer_in_pool, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_unaligned_static_input_no_cudagraphs, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_unaligned_static_input_non_trees, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_unaligned_static_input_trees, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_unaligned_static_parameter, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_unstable_ptr, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_warmup_stream_sync, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_warn_on_pending_backward, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_warn_once_if_dynamic_shape_limit_reached, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_workspace_allocation_error, test/inductor/test_cudagraph_trees.py::TestSAC::test_cpu_and_cuda_rng, test/inductor/test_cudagraph_trees.py::TestSAC::test_cudagraph_uneven_forward_backward, test/inductor/test_cudagraph_trees.py::TestSAC::test_cudagraphs_aot_eager_compat_equal, test/inductor/test_cudagraph_trees.py::TestSAC::test_cudagraphs_aot_eager_compat_equal_device_one, test/inductor/test_cudagraph_trees.py::TestSAC::test_graph_partition_cudagraphs_aot_eager_compat_equal, test/inductor/test_cudagraph_trees.py::TestSAC::test_multi_device, test/inductor/test_cudagraph_trees.py::TestSAC::test_retain_graph, 
test/inductor/test_cudagraph_trees.py::TestSAC::test_simple, test/inductor/test_cudagraph_trees.py::TestSAC::test_uneven_forward_backward_order0, test/inductor/test_cudagraph_trees.py::TestSAC::test_uneven_forward_backward_order1, test/inductor/test_cudagraph_trees.py::TestSAC::test_uneven_forward_backward_order2, test/inductor/test_cudagraph_trees.py::TestSAC::test_uneven_forward_backward_order3, test/inductor/test_cudagraph_trees.py::TestSAC::test_uneven_forward_backward_order4, test/inductor/test_cudagraph_trees.py::TestSAC::test_uneven_forward_backward_order5 2025-10-10T01:57:32.7672722Z 2025-10-10T01:57:36.6806703Z Running dynamo/test_autograd_function 1/1 ... [2025-10-10 01:57:36.679509] 2025-10-10T01:57:36.6807601Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:57:36.6809316Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_autograd_function.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:57:36.679989] 2025-10-10T01:57:39.3801928Z 2025-10-10T01:57:39.3802912Z inductor/test_block_analysis 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_block_analysis_1.1_9e68007ecbb52a8b_.log 2025-10-10T01:57:39.3807547Z Running 10 items in this shard: test/inductor/test_block_analysis.py::BlockAnalysisTest::test_affine_identity_stride_3_symbol2_expr2, test/inductor/test_block_analysis.py::BlockAnalysisTest::test_affine_identity_stride_4_symbol1_expr1, test/inductor/test_block_analysis.py::BlockAnalysisTest::test_affine_identity_stride_5_symbol0_expr0, test/inductor/test_block_analysis.py::BlockAnalysisTest::test_index_with_dynamic_shapes, test/inductor/test_block_analysis.py::BlockAnalysisTest::test_mod_div_identity_dims0_strides0_symbol0_expr0, test/inductor/test_block_analysis.py::BlockAnalysisTest::test_mod_div_identity_dims1_strides1_symbol1_expr1, test/inductor/test_block_analysis.py::BlockAnalysisTest::test_mod_div_identity_dims2_strides2_symbol2_expr2, test/inductor/test_block_analysis.py::BlockAnalysisTest::test_subexpr_identity_symbol0_expr0_subexpr0, test/inductor/test_block_analysis.py::BlockAnalysisTest::test_subexpr_identity_symbol1_expr1_subexpr1, test/inductor/test_block_analysis.py::BlockAnalysisTest::test_subexpr_identity_symbol2_expr2_subexpr2 2025-10-10T01:57:39.3811506Z 2025-10-10T01:57:43.2441360Z Running dynamo/test_nops 1/1 ... [2025-10-10 01:57:43.243516] 2025-10-10T01:57:43.2442431Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:57:43.2443689Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_nops.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:57:43.243902] 2025-10-10T01:57:44.0603896Z 2025-10-10T01:57:44.0605005Z dynamo/test_autograd_function 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_autograd_function_1.1_ec9e3b5f11242efb_.log 2025-10-10T01:57:44.0619541Z Running 40 items in this shard: test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_allow_in_graph, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_amp_custom_fwd_bwd, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_assert_is_contiguous_after_matmul, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_assert_is_contiguous_on_grad_output_directly, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_autograd_function_equivalence, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_autograd_function_has_graph_break, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_backward_returns_none_for_tensor_input, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_classmethod, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_data_in_bwd, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_default_values, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_enum_arg, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_forward_returns_constant, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_function_context_mark_and_save, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_function_context_save_and_mark, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_function_with_bound_free_variable, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_fwd_no_grad, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_fwd_propogation_correctness, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_linear_setup_context, 
test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_mark_multi_output_non_differentiable, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_mark_non_differentiable, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_materialize_grad, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_multi_output, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_multiple_different_non_tensor_inputs, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_needs_input_grad, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_once_differentiable, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_print_in_bwd, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_repeated_save_for_backward_calls, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_requires_grad_in_bwd, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_save_for_bwd, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_set_materialize_grads_no_graph_break, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_smoke_from_test_autograd, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_smuggle_symint_issue_111031, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_smuggle_tensor_and_complex_structures, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_stride_in_bwd, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_tensor_list_as_input, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_tensor_subclass_intermediary_input, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_triton_kernel_basic, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_triton_kernel_multiple_out, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_tuple_arg, 
test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_user_defined_object_as_input 2025-10-10T01:57:44.0633156Z 2025-10-10T01:57:47.3671433Z 2025-10-10T01:57:47.3672562Z dynamo/test_nops 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_nops_1.1_81834f8cf3f4fbc6_.log 2025-10-10T01:57:47.3673845Z Running 4 items in this shard: test/dynamo/test_nops.py::NopTests::test1, test/dynamo/test_nops.py::NopTests::test2, test/dynamo/test_nops.py::NopTests::test3, test/dynamo/test_nops.py::NopTests::test_extended_args 2025-10-10T01:57:47.3674609Z 2025-10-10T01:57:47.9394398Z Running dynamo/test_config 1/1 ... [2025-10-10 01:57:47.938945] 2025-10-10T01:57:47.9394958Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:57:47.9398058Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_config.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:57:47.939330] 2025-10-10T01:57:51.2280872Z Running inductor/test_control_flow 1/1 ... [2025-10-10 01:57:51.227529] 2025-10-10T01:57:51.2281328Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:57:51.2283201Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_control_flow.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:57:51.227917] 2025-10-10T01:57:52.0122172Z 2025-10-10T01:57:52.0122961Z dynamo/test_config 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_config_1.1_c2c4080c336cde5e_.log 2025-10-10T01:57:52.0124782Z Running 5 items in this shard: test/dynamo/test_config.py::ConfigTests::test_automatic_dynamic, test/dynamo/test_config.py::ConfigTests::test_config_compile_ignored, test/dynamo/test_config.py::ConfigTests::test_config_hash, test/dynamo/test_config.py::ConfigTests::test_no_assume_static_by_default, test/dynamo/test_config.py::ConfigTests::test_no_automatic_dynamic 2025-10-10T01:57:52.0126112Z 2025-10-10T01:57:55.8368802Z Running export/test_db 1/1 ... [2025-10-10 01:57:55.836303] 2025-10-10T01:57:55.8369308Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:57:55.8371541Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_db.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:57:55.836691] 2025-10-10T01:57:59.7589478Z 2025-10-10T01:57:59.7590482Z inductor/test_control_flow 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_control_flow_1.1_e42998d1f616f85f_.log 2025-10-10T01:57:59.7932220Z Running 739 items in this shard: test/inductor/test_control_flow.py::CondTests::test_cond_advanced_dynamic_shapes_device_cpu, test/inductor/test_control_flow.py::CondTests::test_cond_advanced_dynamic_shapes_device_cuda, test/inductor/test_control_flow.py::CondTests::test_cond_aliasing_outputs, test/inductor/test_control_flow.py::CondTests::test_cond_control_flow_with_precomputed_size, test/inductor/test_control_flow.py::CondTests::test_cond_decompose_ops_in_subgraph_device_cpu, test/inductor/test_control_flow.py::CondTests::test_cond_decompose_ops_in_subgraph_device_cuda, test/inductor/test_control_flow.py::CondTests::test_cond_decompose_ops_in_subgraph_recursive_device_cpu, test/inductor/test_control_flow.py::CondTests::test_cond_decompose_ops_in_subgraph_recursive_device_cuda, test/inductor/test_control_flow.py::CondTests::test_cond_functional_call_device_cpu_dynamic_False, test/inductor/test_control_flow.py::CondTests::test_cond_functional_call_device_cpu_dynamic_True, test/inductor/test_control_flow.py::CondTests::test_cond_functional_call_device_cuda_dynamic_False, test/inductor/test_control_flow.py::CondTests::test_cond_functional_call_device_cuda_dynamic_True, test/inductor/test_control_flow.py::CondTests::test_cond_inductor_fx_passes_recursively_applied, test/inductor/test_control_flow.py::CondTests::test_cond_mismatched_branch_output_size_device_cpu_dynamic_False, test/inductor/test_control_flow.py::CondTests::test_cond_mismatched_branch_output_size_device_cpu_dynamic_True, test/inductor/test_control_flow.py::CondTests::test_cond_mismatched_branch_output_size_device_cuda_dynamic_False, 
test/inductor/test_control_flow.py::CondTests::test_cond_mismatched_branch_output_size_device_cuda_dynamic_True, test/inductor/test_control_flow.py::CondTests::test_cond_multiple_outputs_device_cpu_dynamic_False, test/inductor/test_control_flow.py::CondTests::test_cond_multiple_outputs_device_cpu_dynamic_True, test/inductor/test_control_flow.py::CondTests::test_cond_multiple_outputs_device_cuda_dynamic_False, test/inductor/test_control_flow.py::CondTests::test_cond_multiple_outputs_device_cuda_dynamic_True, test/inductor/test_control_flow.py::CondTests::test_cond_nested_control_flow_device_cpu_dynamic_False, test/inductor/test_control_flow.py::CondTests::test_cond_nested_control_flow_device_cpu_dynamic_True, test/inductor/test_control_flow.py::CondTests::test_cond_nested_control_flow_device_cuda_dynamic_False, test/inductor/test_control_flow.py::CondTests::test_cond_nested_control_flow_device_cuda_dynamic_True, test/inductor/test_control_flow.py::CondTests::test_cond_non_tensor_predicates_device_cpu_dynamic_False, test/inductor/test_control_flow.py::CondTests::test_cond_non_tensor_predicates_device_cpu_dynamic_True, test/inductor/test_control_flow.py::CondTests::test_cond_non_tensor_predicates_device_cuda_dynamic_False, test/inductor/test_control_flow.py::CondTests::test_cond_non_tensor_predicates_device_cuda_dynamic_True, test/inductor/test_control_flow.py::CondTests::test_cond_outer_code_before_after_device_cpu_dynamic_False, test/inductor/test_control_flow.py::CondTests::test_cond_outer_code_before_after_device_cpu_dynamic_True, test/inductor/test_control_flow.py::CondTests::test_cond_outer_code_before_after_device_cuda_dynamic_False, test/inductor/test_control_flow.py::CondTests::test_cond_outer_code_before_after_device_cuda_dynamic_True, test/inductor/test_control_flow.py::CondTests::test_cond_reintepret_view_inputs_outputs, test/inductor/test_control_flow.py::CondTests::test_cond_select_with_input_idx_device_cpu_dynamic_False, 
test/inductor/test_control_flow.py::CondTests::test_cond_select_with_input_idx_device_cpu_dynamic_True, test/inductor/test_control_flow.py::CondTests::test_cond_select_with_input_idx_device_cuda_dynamic_False, test/inductor/test_control_flow.py::CondTests::test_cond_select_with_input_idx_device_cuda_dynamic_True, test/inductor/test_control_flow.py::CondTests::test_cond_simple_control_flow_device_cpu_dynamic_False, test/inductor/test_control_flow.py::CondTests::test_cond_simple_control_flow_device_cpu_dynamic_True, test/inductor/test_control_flow.py::CondTests::test_cond_simple_control_flow_device_cuda_dynamic_False, test/inductor/test_control_flow.py::CondTests::test_cond_simple_control_flow_device_cuda_dynamic_True, test/inductor/test_control_flow.py::CondTests::test_cond_simple_with_int_closure_device_cpu, test/inductor/test_control_flow.py::CondTests::test_cond_simple_with_int_closure_device_cuda, test/inductor/test_control_flow.py::CondTests::test_cond_subgraphs_with_parameters_device_cpu_dynamic_False, test/inductor/test_control_flow.py::CondTests::test_cond_subgraphs_with_parameters_device_cpu_dynamic_True, test/inductor/test_control_flow.py::CondTests::test_cond_subgraphs_with_parameters_device_cuda_dynamic_False, test/inductor/test_control_flow.py::CondTests::test_cond_subgraphs_with_parameters_device_cuda_dynamic_True, test/inductor/test_control_flow.py::CondTests::test_cond_unbacked_symint_closure_device_cpu_dynamic_False, test/inductor/test_control_flow.py::CondTests::test_cond_unbacked_symint_closure_device_cpu_dynamic_True, test/inductor/test_control_flow.py::CondTests::test_cond_unbacked_symint_closure_device_cuda_dynamic_False, test/inductor/test_control_flow.py::CondTests::test_cond_unbacked_symint_closure_device_cuda_dynamic_True, test/inductor/test_control_flow.py::CondTests::test_cond_unbacked_symint_inner_device_cpu, test/inductor/test_control_flow.py::CondTests::test_cond_unbacked_symint_inner_device_cuda, 
test/inductor/test_control_flow.py::CondTests::test_cond_unbacked_symint_inner_to_outer_device_cpu, test/inductor/test_control_flow.py::CondTests::test_cond_unbacked_symint_inner_to_outer_device_cuda, test/inductor/test_control_flow.py::CondTests::test_cond_unbacked_symint_outer_to_inner_device_cpu, test/inductor/test_control_flow.py::CondTests::test_cond_unbacked_symint_outer_to_inner_device_cuda, test/inductor/test_control_flow.py::CondTests::test_cond_use_buffers_from_outer_scope, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_infinite_loop_error, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_models_with_mixed_device_device_cuda, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_nested_control_flow_device_cpu_dynamic_False_autograd_False, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_nested_control_flow_device_cpu_dynamic_False_autograd_True, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_nested_control_flow_device_cpu_dynamic_True_autograd_False, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_nested_control_flow_device_cpu_dynamic_True_autograd_True, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_nested_control_flow_device_cuda_dynamic_False_autograd_False, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_nested_control_flow_device_cuda_dynamic_False_autograd_True, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_nested_control_flow_device_cuda_dynamic_True_autograd_False, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_nested_control_flow_device_cuda_dynamic_True_autograd_True, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_simple_control_flow_device_cpu_dynamic_False_autograd_False, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_simple_control_flow_device_cpu_dynamic_False_autograd_True, 
test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_simple_control_flow_device_cpu_dynamic_True_autograd_False, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_simple_control_flow_device_cpu_dynamic_True_autograd_True, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_simple_control_flow_device_cuda_dynamic_False_autograd_False, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_simple_control_flow_device_cuda_dynamic_False_autograd_True, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_simple_control_flow_device_cuda_dynamic_True_autograd_False, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_simple_control_flow_device_cuda_dynamic_True_autograd_True, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_stack_output_simple_device_cpu_dynamic_False, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_stack_output_simple_device_cpu_dynamic_True, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_stack_output_simple_device_cuda_dynamic_False, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_stack_output_simple_device_cuda_dynamic_True, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_conv_device_cpu_dynamic_False_autograd_False, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_conv_device_cpu_dynamic_False_autograd_True, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_conv_device_cpu_dynamic_True_autograd_False, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_conv_device_cpu_dynamic_True_autograd_True, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_conv_device_cuda_dynamic_False_autograd_False, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_conv_device_cuda_dynamic_False_autograd_True, 
test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_conv_device_cuda_dynamic_True_autograd_False, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_conv_device_cuda_dynamic_True_autograd_True, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_data_dependent_in_out_device_cpu_dynamic_False_autograd_False, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_data_dependent_in_out_device_cpu_dynamic_False_autograd_True, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_data_dependent_in_out_device_cpu_dynamic_True_autograd_False, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_data_dependent_in_out_device_cpu_dynamic_True_autograd_True, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_data_dependent_in_out_device_cuda_dynamic_False_autograd_False, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_data_dependent_in_out_device_cuda_dynamic_False_autograd_True, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_data_dependent_in_out_device_cuda_dynamic_True_autograd_False, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_data_dependent_in_out_device_cuda_dynamic_True_autograd_True, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_data_dependent_in_out_mismatch_dynamic_False, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_data_dependent_in_out_mismatch_dynamic_True, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_data_dependent_ops_device_cpu_dynamic_False_autograd_False, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_data_dependent_ops_device_cpu_dynamic_False_autograd_True, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_data_dependent_ops_device_cpu_dynamic_True_autograd_False, 
test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_data_dependent_ops_device_cpu_dynamic_True_autograd_True, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_data_dependent_ops_device_cuda_dynamic_False_autograd_False, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_data_dependent_ops_device_cuda_dynamic_False_autograd_True, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_data_dependent_ops_device_cuda_dynamic_True_autograd_False, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_data_dependent_ops_device_cuda_dynamic_True_autograd_True, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_outer_buffers_device_cpu_dynamic_False_autograd_False, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_outer_buffers_device_cpu_dynamic_False_autograd_True, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_outer_buffers_device_cuda_dynamic_False_autograd_False, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_outer_buffers_device_cuda_dynamic_False_autograd_True, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_outer_code_device_cpu_dynamic_False_autograd_False, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_outer_code_device_cpu_dynamic_False_autograd_True, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_outer_code_device_cpu_dynamic_True_autograd_False, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_outer_code_device_cpu_dynamic_True_autograd_True, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_outer_code_device_cuda_dynamic_False_autograd_False, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_outer_code_device_cuda_dynamic_False_autograd_True, 
test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_outer_code_device_cuda_dynamic_True_autograd_False, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_outer_code_device_cuda_dynamic_True_autograd_True, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_parameters_device_cpu_dynamic_False_autograd_False, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_parameters_device_cpu_dynamic_False_autograd_True, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_parameters_device_cpu_dynamic_True_autograd_False, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_parameters_device_cpu_dynamic_True_autograd_True, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_parameters_device_cuda_dynamic_False_autograd_False, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_parameters_device_cuda_dynamic_False_autograd_True, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_parameters_device_cuda_dynamic_True_autograd_False, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_parameters_device_cuda_dynamic_True_autograd_True, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_pytree_inputs_device_cpu_dynamic_False_autograd_False, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_pytree_inputs_device_cpu_dynamic_False_autograd_True, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_pytree_inputs_device_cpu_dynamic_True_autograd_False, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_pytree_inputs_device_cpu_dynamic_True_autograd_True, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_pytree_inputs_device_cuda_dynamic_False_autograd_False, 
test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_pytree_inputs_device_cuda_dynamic_False_autograd_True, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_pytree_inputs_device_cuda_dynamic_True_autograd_False, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_pytree_inputs_device_cuda_dynamic_True_autograd_True, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_sym_expr_cond_device_cpu_dynamic_False_autograd_False, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_sym_expr_cond_device_cpu_dynamic_False_autograd_True, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_sym_expr_cond_device_cpu_dynamic_True_autograd_False, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_sym_expr_cond_device_cpu_dynamic_True_autograd_True, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_sym_expr_cond_device_cuda_dynamic_False_autograd_False, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_sym_expr_cond_device_cuda_dynamic_False_autograd_True, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_sym_expr_cond_device_cuda_dynamic_True_autograd_False, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_sym_expr_cond_device_cuda_dynamic_True_autograd_True, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_unbacked_symint_closure_device_cpu_dynamic_False_autograd_False, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_unbacked_symint_closure_device_cpu_dynamic_False_autograd_True, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_unbacked_symint_closure_device_cpu_dynamic_True_autograd_False, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_unbacked_symint_closure_device_cpu_dynamic_True_autograd_True, 
test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_unbacked_symint_closure_device_cuda_dynamic_False_autograd_False, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_unbacked_symint_closure_device_cuda_dynamic_False_autograd_True, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_unbacked_symint_closure_device_cuda_dynamic_True_autograd_False, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_with_unbacked_symint_closure_device_cuda_dynamic_True_autograd_True, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_zero_loop_device_cpu_dynamic_False, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_zero_loop_device_cpu_dynamic_True, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_zero_loop_device_cuda_dynamic_False, test/inductor/test_control_flow.py::WhileLoopTests::test_while_loop_zero_loop_device_cuda_dynamic_True, test/inductor/test_control_flow.py::AssociativeScanTests::test_associative_scan_CUDA_flip_combine_mode_generic_backend_inductor_cpu, test/inductor/test_control_flow.py::AssociativeScanTests::test_associative_scan_CUDA_flip_combine_mode_generic_backend_inductor_device_cuda, test/inductor/test_control_flow.py::AssociativeScanTests::test_associative_scan_CUDA_flip_combine_mode_pointwise_backend_inductor_cpu, test/inductor/test_control_flow.py::AssociativeScanTests::test_associative_scan_CUDA_flip_combine_mode_pointwise_backend_inductor_device_cuda, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cpu_dynamic_False_reverse_False_dim_0_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cpu_dynamic_False_reverse_False_dim_0_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cpu_dynamic_False_reverse_False_dim_0_scan_length_5_autograd_False, 
test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cpu_dynamic_False_reverse_False_dim_0_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cpu_dynamic_False_reverse_False_dim_1_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cpu_dynamic_False_reverse_False_dim_1_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cpu_dynamic_False_reverse_False_dim_1_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cpu_dynamic_False_reverse_False_dim_1_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cpu_dynamic_False_reverse_False_dim_3_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cpu_dynamic_False_reverse_False_dim_3_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cpu_dynamic_False_reverse_False_dim_3_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cpu_dynamic_False_reverse_False_dim_3_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cpu_dynamic_False_reverse_True_dim_0_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cpu_dynamic_False_reverse_True_dim_0_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cpu_dynamic_False_reverse_True_dim_0_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cpu_dynamic_False_reverse_True_dim_0_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cpu_dynamic_False_reverse_True_dim_1_scan_length_1_autograd_False, 
test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cpu_dynamic_False_reverse_True_dim_1_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cpu_dynamic_False_reverse_True_dim_1_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cpu_dynamic_False_reverse_True_dim_1_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cpu_dynamic_False_reverse_True_dim_3_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cpu_dynamic_False_reverse_True_dim_3_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cpu_dynamic_False_reverse_True_dim_3_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cpu_dynamic_False_reverse_True_dim_3_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cpu_dynamic_True_reverse_False_dim_0_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cpu_dynamic_True_reverse_False_dim_0_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cpu_dynamic_True_reverse_False_dim_0_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cpu_dynamic_True_reverse_False_dim_0_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cpu_dynamic_True_reverse_False_dim_1_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cpu_dynamic_True_reverse_False_dim_1_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cpu_dynamic_True_reverse_False_dim_1_scan_length_5_autograd_False, 
test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cpu_dynamic_True_reverse_False_dim_1_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cpu_dynamic_True_reverse_False_dim_3_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cpu_dynamic_True_reverse_False_dim_3_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cpu_dynamic_True_reverse_False_dim_3_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cpu_dynamic_True_reverse_False_dim_3_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cpu_dynamic_True_reverse_True_dim_0_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cpu_dynamic_True_reverse_True_dim_0_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cpu_dynamic_True_reverse_True_dim_0_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cpu_dynamic_True_reverse_True_dim_0_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cpu_dynamic_True_reverse_True_dim_1_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cpu_dynamic_True_reverse_True_dim_1_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cpu_dynamic_True_reverse_True_dim_1_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cpu_dynamic_True_reverse_True_dim_1_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cpu_dynamic_True_reverse_True_dim_3_scan_length_1_autograd_False, 
test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cpu_dynamic_True_reverse_True_dim_3_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cpu_dynamic_True_reverse_True_dim_3_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cpu_dynamic_True_reverse_True_dim_3_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cuda_dynamic_False_reverse_False_dim_0_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cuda_dynamic_False_reverse_False_dim_0_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cuda_dynamic_False_reverse_False_dim_0_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cuda_dynamic_False_reverse_False_dim_0_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cuda_dynamic_False_reverse_False_dim_1_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cuda_dynamic_False_reverse_False_dim_1_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cuda_dynamic_False_reverse_False_dim_1_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cuda_dynamic_False_reverse_False_dim_1_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cuda_dynamic_False_reverse_False_dim_3_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cuda_dynamic_False_reverse_False_dim_3_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cuda_dynamic_False_reverse_False_dim_3_scan_length_5_autograd_False, 
test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cuda_dynamic_False_reverse_False_dim_3_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cuda_dynamic_False_reverse_True_dim_0_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cuda_dynamic_False_reverse_True_dim_0_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cuda_dynamic_False_reverse_True_dim_0_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cuda_dynamic_False_reverse_True_dim_0_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cuda_dynamic_False_reverse_True_dim_1_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cuda_dynamic_False_reverse_True_dim_1_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cuda_dynamic_False_reverse_True_dim_1_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cuda_dynamic_False_reverse_True_dim_1_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cuda_dynamic_False_reverse_True_dim_3_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cuda_dynamic_False_reverse_True_dim_3_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cuda_dynamic_False_reverse_True_dim_3_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cuda_dynamic_False_reverse_True_dim_3_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cuda_dynamic_True_reverse_False_dim_0_scan_length_1_autograd_False, 
test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cuda_dynamic_True_reverse_False_dim_0_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cuda_dynamic_True_reverse_False_dim_0_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cuda_dynamic_True_reverse_False_dim_0_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cuda_dynamic_True_reverse_False_dim_1_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cuda_dynamic_True_reverse_False_dim_1_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cuda_dynamic_True_reverse_False_dim_1_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cuda_dynamic_True_reverse_False_dim_1_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cuda_dynamic_True_reverse_False_dim_3_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cuda_dynamic_True_reverse_False_dim_3_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cuda_dynamic_True_reverse_False_dim_3_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cuda_dynamic_True_reverse_False_dim_3_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cuda_dynamic_True_reverse_True_dim_0_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cuda_dynamic_True_reverse_True_dim_0_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cuda_dynamic_True_reverse_True_dim_0_scan_length_5_autograd_False, 
test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cuda_dynamic_True_reverse_True_dim_0_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cuda_dynamic_True_reverse_True_dim_1_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cuda_dynamic_True_reverse_True_dim_1_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cuda_dynamic_True_reverse_True_dim_1_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cuda_dynamic_True_reverse_True_dim_1_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cuda_dynamic_True_reverse_True_dim_3_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cuda_dynamic_True_reverse_True_dim_3_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cuda_dynamic_True_reverse_True_dim_3_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_cond_in_scan_device_cuda_dynamic_True_reverse_True_dim_3_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_chunked_ce_device_cpu_dynamic_False_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_chunked_ce_device_cpu_dynamic_False_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_chunked_ce_device_cpu_dynamic_True_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_chunked_ce_device_cpu_dynamic_True_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_chunked_ce_device_cuda_dynamic_False_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_chunked_ce_device_cuda_dynamic_False_autograd_True, 
test/inductor/test_control_flow.py::ScanTests::test_scan_chunked_ce_device_cuda_dynamic_True_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_chunked_ce_device_cuda_dynamic_True_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_compare_chunked_ce_with_no_scan_device_cpu_dynamic_False, test/inductor/test_control_flow.py::ScanTests::test_scan_compare_chunked_ce_with_no_scan_device_cpu_dynamic_True, test/inductor/test_control_flow.py::ScanTests::test_scan_compare_chunked_ce_with_no_scan_device_cuda_dynamic_False, test/inductor/test_control_flow.py::ScanTests::test_scan_compare_chunked_ce_with_no_scan_device_cuda_dynamic_True, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cpu_dynamic_False_reverse_False_dim_0_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cpu_dynamic_False_reverse_False_dim_0_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cpu_dynamic_False_reverse_False_dim_0_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cpu_dynamic_False_reverse_False_dim_0_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cpu_dynamic_False_reverse_False_dim_1_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cpu_dynamic_False_reverse_False_dim_1_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cpu_dynamic_False_reverse_False_dim_1_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cpu_dynamic_False_reverse_False_dim_1_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cpu_dynamic_False_reverse_False_dim_3_scan_length_1_autograd_False, 
test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cpu_dynamic_False_reverse_False_dim_3_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cpu_dynamic_False_reverse_False_dim_3_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cpu_dynamic_False_reverse_False_dim_3_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cpu_dynamic_False_reverse_True_dim_0_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cpu_dynamic_False_reverse_True_dim_0_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cpu_dynamic_False_reverse_True_dim_0_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cpu_dynamic_False_reverse_True_dim_0_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cpu_dynamic_False_reverse_True_dim_1_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cpu_dynamic_False_reverse_True_dim_1_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cpu_dynamic_False_reverse_True_dim_1_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cpu_dynamic_False_reverse_True_dim_1_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cpu_dynamic_False_reverse_True_dim_3_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cpu_dynamic_False_reverse_True_dim_3_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cpu_dynamic_False_reverse_True_dim_3_scan_length_5_autograd_False, 
test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cpu_dynamic_False_reverse_True_dim_3_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cpu_dynamic_True_reverse_False_dim_0_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cpu_dynamic_True_reverse_False_dim_0_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cpu_dynamic_True_reverse_False_dim_0_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cpu_dynamic_True_reverse_False_dim_0_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cpu_dynamic_True_reverse_False_dim_1_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cpu_dynamic_True_reverse_False_dim_1_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cpu_dynamic_True_reverse_False_dim_1_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cpu_dynamic_True_reverse_False_dim_1_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cpu_dynamic_True_reverse_False_dim_3_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cpu_dynamic_True_reverse_False_dim_3_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cpu_dynamic_True_reverse_False_dim_3_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cpu_dynamic_True_reverse_False_dim_3_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cpu_dynamic_True_reverse_True_dim_0_scan_length_1_autograd_False, 
test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cpu_dynamic_True_reverse_True_dim_0_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cpu_dynamic_True_reverse_True_dim_0_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cpu_dynamic_True_reverse_True_dim_0_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cpu_dynamic_True_reverse_True_dim_1_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cpu_dynamic_True_reverse_True_dim_1_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cpu_dynamic_True_reverse_True_dim_1_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cpu_dynamic_True_reverse_True_dim_1_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cpu_dynamic_True_reverse_True_dim_3_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cpu_dynamic_True_reverse_True_dim_3_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cpu_dynamic_True_reverse_True_dim_3_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cpu_dynamic_True_reverse_True_dim_3_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cuda_dynamic_False_reverse_False_dim_0_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cuda_dynamic_False_reverse_False_dim_0_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cuda_dynamic_False_reverse_False_dim_0_scan_length_5_autograd_False, 
test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cuda_dynamic_False_reverse_False_dim_0_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cuda_dynamic_False_reverse_False_dim_1_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cuda_dynamic_False_reverse_False_dim_1_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cuda_dynamic_False_reverse_False_dim_1_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cuda_dynamic_False_reverse_False_dim_1_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cuda_dynamic_False_reverse_False_dim_3_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cuda_dynamic_False_reverse_False_dim_3_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cuda_dynamic_False_reverse_False_dim_3_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cuda_dynamic_False_reverse_False_dim_3_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cuda_dynamic_False_reverse_True_dim_0_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cuda_dynamic_False_reverse_True_dim_0_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cuda_dynamic_False_reverse_True_dim_0_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cuda_dynamic_False_reverse_True_dim_0_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cuda_dynamic_False_reverse_True_dim_1_scan_length_1_autograd_False, 
test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cuda_dynamic_False_reverse_True_dim_1_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cuda_dynamic_False_reverse_True_dim_1_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cuda_dynamic_False_reverse_True_dim_1_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cuda_dynamic_False_reverse_True_dim_3_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cuda_dynamic_False_reverse_True_dim_3_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cuda_dynamic_False_reverse_True_dim_3_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cuda_dynamic_False_reverse_True_dim_3_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cuda_dynamic_True_reverse_False_dim_0_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cuda_dynamic_True_reverse_False_dim_0_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cuda_dynamic_True_reverse_False_dim_0_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cuda_dynamic_True_reverse_False_dim_0_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cuda_dynamic_True_reverse_False_dim_1_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cuda_dynamic_True_reverse_False_dim_1_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cuda_dynamic_True_reverse_False_dim_1_scan_length_5_autograd_False, 
test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cuda_dynamic_True_reverse_False_dim_1_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cuda_dynamic_True_reverse_False_dim_3_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cuda_dynamic_True_reverse_False_dim_3_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cuda_dynamic_True_reverse_False_dim_3_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cuda_dynamic_True_reverse_False_dim_3_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cuda_dynamic_True_reverse_True_dim_0_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cuda_dynamic_True_reverse_True_dim_0_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cuda_dynamic_True_reverse_True_dim_0_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cuda_dynamic_True_reverse_True_dim_0_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cuda_dynamic_True_reverse_True_dim_1_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cuda_dynamic_True_reverse_True_dim_1_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cuda_dynamic_True_reverse_True_dim_1_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cuda_dynamic_True_reverse_True_dim_1_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cuda_dynamic_True_reverse_True_dim_3_scan_length_1_autograd_False, 
test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cuda_dynamic_True_reverse_True_dim_3_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cuda_dynamic_True_reverse_True_dim_3_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_conv_device_cuda_dynamic_True_reverse_True_dim_3_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_False_reverse_False_dim_0_pred_False_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_False_reverse_False_dim_0_pred_False_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_False_reverse_False_dim_0_pred_False_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_False_reverse_False_dim_0_pred_False_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_False_reverse_False_dim_0_pred_True_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_False_reverse_False_dim_0_pred_True_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_False_reverse_False_dim_0_pred_True_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_False_reverse_False_dim_0_pred_True_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_False_reverse_False_dim_1_pred_False_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_False_reverse_False_dim_1_pred_False_scan_length_1_autograd_True, 
test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_False_reverse_False_dim_1_pred_False_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_False_reverse_False_dim_1_pred_False_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_False_reverse_False_dim_1_pred_True_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_False_reverse_False_dim_1_pred_True_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_False_reverse_False_dim_1_pred_True_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_False_reverse_False_dim_1_pred_True_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_False_reverse_False_dim_3_pred_False_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_False_reverse_False_dim_3_pred_False_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_False_reverse_False_dim_3_pred_False_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_False_reverse_False_dim_3_pred_False_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_False_reverse_False_dim_3_pred_True_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_False_reverse_False_dim_3_pred_True_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_False_reverse_False_dim_3_pred_True_scan_length_5_autograd_False, 
test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_False_reverse_False_dim_3_pred_True_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_False_reverse_True_dim_0_pred_False_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_False_reverse_True_dim_0_pred_False_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_False_reverse_True_dim_0_pred_False_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_False_reverse_True_dim_0_pred_False_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_False_reverse_True_dim_0_pred_True_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_False_reverse_True_dim_0_pred_True_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_False_reverse_True_dim_0_pred_True_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_False_reverse_True_dim_0_pred_True_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_False_reverse_True_dim_1_pred_False_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_False_reverse_True_dim_1_pred_False_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_False_reverse_True_dim_1_pred_False_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_False_reverse_True_dim_1_pred_False_scan_length_5_autograd_True, 
test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_False_reverse_True_dim_1_pred_True_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_False_reverse_True_dim_1_pred_True_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_False_reverse_True_dim_1_pred_True_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_False_reverse_True_dim_1_pred_True_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_False_reverse_True_dim_3_pred_False_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_False_reverse_True_dim_3_pred_False_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_False_reverse_True_dim_3_pred_False_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_False_reverse_True_dim_3_pred_False_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_False_reverse_True_dim_3_pred_True_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_False_reverse_True_dim_3_pred_True_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_False_reverse_True_dim_3_pred_True_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_False_reverse_True_dim_3_pred_True_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_True_reverse_False_dim_0_pred_False_scan_length_1_autograd_False, 
test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_True_reverse_False_dim_0_pred_False_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_True_reverse_False_dim_0_pred_False_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_True_reverse_False_dim_0_pred_False_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_True_reverse_False_dim_0_pred_True_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_True_reverse_False_dim_0_pred_True_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_True_reverse_False_dim_0_pred_True_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_True_reverse_False_dim_0_pred_True_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_True_reverse_False_dim_1_pred_False_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_True_reverse_False_dim_1_pred_False_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_True_reverse_False_dim_1_pred_False_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_True_reverse_False_dim_1_pred_False_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_True_reverse_False_dim_1_pred_True_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_True_reverse_False_dim_1_pred_True_scan_length_1_autograd_True, 
test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_True_reverse_False_dim_1_pred_True_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_True_reverse_False_dim_1_pred_True_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_True_reverse_False_dim_3_pred_False_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_True_reverse_False_dim_3_pred_False_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_True_reverse_False_dim_3_pred_False_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_True_reverse_False_dim_3_pred_False_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_True_reverse_False_dim_3_pred_True_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_True_reverse_False_dim_3_pred_True_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_True_reverse_False_dim_3_pred_True_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_True_reverse_False_dim_3_pred_True_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_True_reverse_True_dim_0_pred_False_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_True_reverse_True_dim_0_pred_False_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_True_reverse_True_dim_0_pred_False_scan_length_5_autograd_False, 
test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_True_reverse_True_dim_0_pred_False_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_True_reverse_True_dim_0_pred_True_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_True_reverse_True_dim_0_pred_True_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_True_reverse_True_dim_0_pred_True_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_True_reverse_True_dim_0_pred_True_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_True_reverse_True_dim_1_pred_False_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_True_reverse_True_dim_1_pred_False_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_True_reverse_True_dim_1_pred_False_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_True_reverse_True_dim_1_pred_False_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_True_reverse_True_dim_1_pred_True_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_True_reverse_True_dim_1_pred_True_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_True_reverse_True_dim_1_pred_True_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_True_reverse_True_dim_1_pred_True_scan_length_5_autograd_True, 
test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_True_reverse_True_dim_3_pred_False_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_True_reverse_True_dim_3_pred_False_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_True_reverse_True_dim_3_pred_False_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_True_reverse_True_dim_3_pred_False_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_True_reverse_True_dim_3_pred_True_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_True_reverse_True_dim_3_pred_True_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_True_reverse_True_dim_3_pred_True_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cpu_dynamic_True_reverse_True_dim_3_pred_True_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_False_reverse_False_dim_0_pred_False_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_False_reverse_False_dim_0_pred_False_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_False_reverse_False_dim_0_pred_False_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_False_reverse_False_dim_0_pred_False_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_False_reverse_False_dim_0_pred_True_scan_length_1_autograd_False, 
test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_False_reverse_False_dim_0_pred_True_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_False_reverse_False_dim_0_pred_True_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_False_reverse_False_dim_0_pred_True_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_False_reverse_False_dim_1_pred_False_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_False_reverse_False_dim_1_pred_False_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_False_reverse_False_dim_1_pred_False_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_False_reverse_False_dim_1_pred_False_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_False_reverse_False_dim_1_pred_True_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_False_reverse_False_dim_1_pred_True_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_False_reverse_False_dim_1_pred_True_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_False_reverse_False_dim_1_pred_True_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_False_reverse_False_dim_3_pred_False_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_False_reverse_False_dim_3_pred_False_scan_length_1_autograd_True, 
test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_False_reverse_False_dim_3_pred_False_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_False_reverse_False_dim_3_pred_False_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_False_reverse_False_dim_3_pred_True_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_False_reverse_False_dim_3_pred_True_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_False_reverse_False_dim_3_pred_True_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_False_reverse_False_dim_3_pred_True_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_False_reverse_True_dim_0_pred_False_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_False_reverse_True_dim_0_pred_False_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_False_reverse_True_dim_0_pred_False_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_False_reverse_True_dim_0_pred_False_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_False_reverse_True_dim_0_pred_True_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_False_reverse_True_dim_0_pred_True_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_False_reverse_True_dim_0_pred_True_scan_length_5_autograd_False, 
test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_False_reverse_True_dim_0_pred_True_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_False_reverse_True_dim_1_pred_False_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_False_reverse_True_dim_1_pred_False_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_False_reverse_True_dim_1_pred_False_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_False_reverse_True_dim_1_pred_False_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_False_reverse_True_dim_1_pred_True_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_False_reverse_True_dim_1_pred_True_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_False_reverse_True_dim_1_pred_True_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_False_reverse_True_dim_1_pred_True_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_False_reverse_True_dim_3_pred_False_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_False_reverse_True_dim_3_pred_False_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_False_reverse_True_dim_3_pred_False_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_False_reverse_True_dim_3_pred_False_scan_length_5_autograd_True, 
test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_False_reverse_True_dim_3_pred_True_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_False_reverse_True_dim_3_pred_True_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_False_reverse_True_dim_3_pred_True_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_False_reverse_True_dim_3_pred_True_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_True_reverse_False_dim_0_pred_False_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_True_reverse_False_dim_0_pred_False_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_True_reverse_False_dim_0_pred_False_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_True_reverse_False_dim_0_pred_False_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_True_reverse_False_dim_0_pred_True_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_True_reverse_False_dim_0_pred_True_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_True_reverse_False_dim_0_pred_True_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_True_reverse_False_dim_0_pred_True_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_True_reverse_False_dim_1_pred_False_scan_length_1_autograd_False, 
test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_True_reverse_False_dim_1_pred_False_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_True_reverse_False_dim_1_pred_False_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_True_reverse_False_dim_1_pred_False_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_True_reverse_False_dim_1_pred_True_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_True_reverse_False_dim_1_pred_True_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_True_reverse_False_dim_1_pred_True_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_True_reverse_False_dim_1_pred_True_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_True_reverse_False_dim_3_pred_False_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_True_reverse_False_dim_3_pred_False_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_True_reverse_False_dim_3_pred_False_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_True_reverse_False_dim_3_pred_False_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_True_reverse_False_dim_3_pred_True_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_True_reverse_False_dim_3_pred_True_scan_length_1_autograd_True, 
test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_True_reverse_False_dim_3_pred_True_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_True_reverse_False_dim_3_pred_True_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_True_reverse_True_dim_0_pred_False_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_True_reverse_True_dim_0_pred_False_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_True_reverse_True_dim_0_pred_False_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_True_reverse_True_dim_0_pred_False_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_True_reverse_True_dim_0_pred_True_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_True_reverse_True_dim_0_pred_True_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_True_reverse_True_dim_0_pred_True_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_True_reverse_True_dim_0_pred_True_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_True_reverse_True_dim_1_pred_False_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_True_reverse_True_dim_1_pred_False_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_True_reverse_True_dim_1_pred_False_scan_length_5_autograd_False, 
test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_True_reverse_True_dim_1_pred_False_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_True_reverse_True_dim_1_pred_True_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_True_reverse_True_dim_1_pred_True_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_True_reverse_True_dim_1_pred_True_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_True_reverse_True_dim_1_pred_True_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_True_reverse_True_dim_3_pred_False_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_True_reverse_True_dim_3_pred_False_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_True_reverse_True_dim_3_pred_False_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_True_reverse_True_dim_3_pred_False_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_True_reverse_True_dim_3_pred_True_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_True_reverse_True_dim_3_pred_True_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_True_reverse_True_dim_3_pred_True_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_in_cond_device_cuda_dynamic_True_reverse_True_dim_3_pred_True_scan_length_5_autograd_True, 
test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cpu_dynamic_False_reverse_False_dim_0_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cpu_dynamic_False_reverse_False_dim_0_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cpu_dynamic_False_reverse_False_dim_0_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cpu_dynamic_False_reverse_False_dim_0_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cpu_dynamic_False_reverse_False_dim_1_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cpu_dynamic_False_reverse_False_dim_1_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cpu_dynamic_False_reverse_False_dim_1_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cpu_dynamic_False_reverse_False_dim_1_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cpu_dynamic_False_reverse_False_dim_3_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cpu_dynamic_False_reverse_False_dim_3_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cpu_dynamic_False_reverse_False_dim_3_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cpu_dynamic_False_reverse_False_dim_3_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cpu_dynamic_False_reverse_True_dim_0_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cpu_dynamic_False_reverse_True_dim_0_scan_length_1_autograd_True, 
test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cpu_dynamic_False_reverse_True_dim_0_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cpu_dynamic_False_reverse_True_dim_0_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cpu_dynamic_False_reverse_True_dim_1_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cpu_dynamic_False_reverse_True_dim_1_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cpu_dynamic_False_reverse_True_dim_1_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cpu_dynamic_False_reverse_True_dim_1_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cpu_dynamic_False_reverse_True_dim_3_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cpu_dynamic_False_reverse_True_dim_3_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cpu_dynamic_False_reverse_True_dim_3_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cpu_dynamic_False_reverse_True_dim_3_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cpu_dynamic_True_reverse_False_dim_0_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cpu_dynamic_True_reverse_False_dim_0_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cpu_dynamic_True_reverse_False_dim_0_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cpu_dynamic_True_reverse_False_dim_0_scan_length_5_autograd_True, 
test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cpu_dynamic_True_reverse_False_dim_1_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cpu_dynamic_True_reverse_False_dim_1_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cpu_dynamic_True_reverse_False_dim_1_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cpu_dynamic_True_reverse_False_dim_1_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cpu_dynamic_True_reverse_False_dim_3_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cpu_dynamic_True_reverse_False_dim_3_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cpu_dynamic_True_reverse_False_dim_3_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cpu_dynamic_True_reverse_False_dim_3_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cpu_dynamic_True_reverse_True_dim_0_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cpu_dynamic_True_reverse_True_dim_0_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cpu_dynamic_True_reverse_True_dim_0_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cpu_dynamic_True_reverse_True_dim_0_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cpu_dynamic_True_reverse_True_dim_1_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cpu_dynamic_True_reverse_True_dim_1_scan_length_1_autograd_True, 
test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cpu_dynamic_True_reverse_True_dim_1_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cpu_dynamic_True_reverse_True_dim_1_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cpu_dynamic_True_reverse_True_dim_3_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cpu_dynamic_True_reverse_True_dim_3_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cpu_dynamic_True_reverse_True_dim_3_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cpu_dynamic_True_reverse_True_dim_3_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cuda_dynamic_False_reverse_False_dim_0_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cuda_dynamic_False_reverse_False_dim_0_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cuda_dynamic_False_reverse_False_dim_0_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cuda_dynamic_False_reverse_False_dim_0_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cuda_dynamic_False_reverse_False_dim_1_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cuda_dynamic_False_reverse_False_dim_1_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cuda_dynamic_False_reverse_False_dim_1_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cuda_dynamic_False_reverse_False_dim_1_scan_length_5_autograd_True, 
test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cuda_dynamic_False_reverse_False_dim_3_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cuda_dynamic_False_reverse_False_dim_3_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cuda_dynamic_False_reverse_False_dim_3_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cuda_dynamic_False_reverse_False_dim_3_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cuda_dynamic_False_reverse_True_dim_0_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cuda_dynamic_False_reverse_True_dim_0_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cuda_dynamic_False_reverse_True_dim_0_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cuda_dynamic_False_reverse_True_dim_0_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cuda_dynamic_False_reverse_True_dim_1_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cuda_dynamic_False_reverse_True_dim_1_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cuda_dynamic_False_reverse_True_dim_1_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cuda_dynamic_False_reverse_True_dim_1_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cuda_dynamic_False_reverse_True_dim_3_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cuda_dynamic_False_reverse_True_dim_3_scan_length_1_autograd_True, 
test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cuda_dynamic_False_reverse_True_dim_3_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cuda_dynamic_False_reverse_True_dim_3_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cuda_dynamic_True_reverse_False_dim_0_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cuda_dynamic_True_reverse_False_dim_0_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cuda_dynamic_True_reverse_False_dim_0_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cuda_dynamic_True_reverse_False_dim_0_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cuda_dynamic_True_reverse_False_dim_1_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cuda_dynamic_True_reverse_False_dim_1_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cuda_dynamic_True_reverse_False_dim_1_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cuda_dynamic_True_reverse_False_dim_1_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cuda_dynamic_True_reverse_False_dim_3_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cuda_dynamic_True_reverse_False_dim_3_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cuda_dynamic_True_reverse_False_dim_3_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cuda_dynamic_True_reverse_False_dim_3_scan_length_5_autograd_True, 
test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cuda_dynamic_True_reverse_True_dim_0_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cuda_dynamic_True_reverse_True_dim_0_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cuda_dynamic_True_reverse_True_dim_0_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cuda_dynamic_True_reverse_True_dim_0_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cuda_dynamic_True_reverse_True_dim_1_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cuda_dynamic_True_reverse_True_dim_1_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cuda_dynamic_True_reverse_True_dim_1_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cuda_dynamic_True_reverse_True_dim_1_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cuda_dynamic_True_reverse_True_dim_3_scan_length_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cuda_dynamic_True_reverse_True_dim_3_scan_length_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cuda_dynamic_True_reverse_True_dim_3_scan_length_5_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_nn_modules_device_cuda_dynamic_True_reverse_True_dim_3_scan_length_5_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_pytree_in_out_device_cpu_dynamic_False_reverse_False_dim_0_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_pytree_in_out_device_cpu_dynamic_False_reverse_False_dim_0_autograd_True, 
test/inductor/test_control_flow.py::ScanTests::test_scan_pytree_in_out_device_cpu_dynamic_False_reverse_False_dim_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_pytree_in_out_device_cpu_dynamic_False_reverse_False_dim_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_pytree_in_out_device_cpu_dynamic_False_reverse_False_dim_2_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_pytree_in_out_device_cpu_dynamic_False_reverse_False_dim_2_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_pytree_in_out_device_cpu_dynamic_False_reverse_True_dim_0_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_pytree_in_out_device_cpu_dynamic_False_reverse_True_dim_0_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_pytree_in_out_device_cpu_dynamic_False_reverse_True_dim_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_pytree_in_out_device_cpu_dynamic_False_reverse_True_dim_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_pytree_in_out_device_cpu_dynamic_False_reverse_True_dim_2_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_pytree_in_out_device_cpu_dynamic_False_reverse_True_dim_2_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_pytree_in_out_device_cpu_dynamic_True_reverse_False_dim_0_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_pytree_in_out_device_cpu_dynamic_True_reverse_False_dim_0_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_pytree_in_out_device_cpu_dynamic_True_reverse_False_dim_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_pytree_in_out_device_cpu_dynamic_True_reverse_False_dim_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_pytree_in_out_device_cpu_dynamic_True_reverse_False_dim_2_autograd_False, 
test/inductor/test_control_flow.py::ScanTests::test_scan_pytree_in_out_device_cpu_dynamic_True_reverse_False_dim_2_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_pytree_in_out_device_cpu_dynamic_True_reverse_True_dim_0_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_pytree_in_out_device_cpu_dynamic_True_reverse_True_dim_0_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_pytree_in_out_device_cpu_dynamic_True_reverse_True_dim_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_pytree_in_out_device_cpu_dynamic_True_reverse_True_dim_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_pytree_in_out_device_cpu_dynamic_True_reverse_True_dim_2_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_pytree_in_out_device_cpu_dynamic_True_reverse_True_dim_2_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_pytree_in_out_device_cuda_dynamic_False_reverse_False_dim_0_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_pytree_in_out_device_cuda_dynamic_False_reverse_False_dim_0_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_pytree_in_out_device_cuda_dynamic_False_reverse_False_dim_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_pytree_in_out_device_cuda_dynamic_False_reverse_False_dim_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_pytree_in_out_device_cuda_dynamic_False_reverse_False_dim_2_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_pytree_in_out_device_cuda_dynamic_False_reverse_False_dim_2_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_pytree_in_out_device_cuda_dynamic_False_reverse_True_dim_0_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_pytree_in_out_device_cuda_dynamic_False_reverse_True_dim_0_autograd_True, 
test/inductor/test_control_flow.py::ScanTests::test_scan_pytree_in_out_device_cuda_dynamic_False_reverse_True_dim_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_pytree_in_out_device_cuda_dynamic_False_reverse_True_dim_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_pytree_in_out_device_cuda_dynamic_False_reverse_True_dim_2_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_pytree_in_out_device_cuda_dynamic_False_reverse_True_dim_2_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_pytree_in_out_device_cuda_dynamic_True_reverse_False_dim_0_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_pytree_in_out_device_cuda_dynamic_True_reverse_False_dim_0_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_pytree_in_out_device_cuda_dynamic_True_reverse_False_dim_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_pytree_in_out_device_cuda_dynamic_True_reverse_False_dim_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_pytree_in_out_device_cuda_dynamic_True_reverse_False_dim_2_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_pytree_in_out_device_cuda_dynamic_True_reverse_False_dim_2_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_pytree_in_out_device_cuda_dynamic_True_reverse_True_dim_0_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_pytree_in_out_device_cuda_dynamic_True_reverse_True_dim_0_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_pytree_in_out_device_cuda_dynamic_True_reverse_True_dim_1_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_pytree_in_out_device_cuda_dynamic_True_reverse_True_dim_1_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_pytree_in_out_device_cuda_dynamic_True_reverse_True_dim_2_autograd_False, 
test/inductor/test_control_flow.py::ScanTests::test_scan_pytree_in_out_device_cuda_dynamic_True_reverse_True_dim_2_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_with_clamp_device_cpu_dynamic_False_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_with_clamp_device_cpu_dynamic_False_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_with_clamp_device_cpu_dynamic_True_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_with_clamp_device_cpu_dynamic_True_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_with_clamp_device_cuda_dynamic_False_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_with_clamp_device_cuda_dynamic_False_autograd_True, test/inductor/test_control_flow.py::ScanTests::test_scan_with_clamp_device_cuda_dynamic_True_autograd_False, test/inductor/test_control_flow.py::ScanTests::test_scan_with_clamp_device_cuda_dynamic_True_autograd_True, test/inductor/test_control_flow.py::MapTests::test_map_nested_with_cond_device_cpu_dynamic_False_autograd_False, test/inductor/test_control_flow.py::MapTests::test_map_nested_with_cond_device_cpu_dynamic_False_autograd_True, test/inductor/test_control_flow.py::MapTests::test_map_nested_with_cond_device_cpu_dynamic_True_autograd_False, test/inductor/test_control_flow.py::MapTests::test_map_nested_with_cond_device_cpu_dynamic_True_autograd_True, test/inductor/test_control_flow.py::MapTests::test_map_nested_with_cond_device_cuda_dynamic_False_autograd_False, test/inductor/test_control_flow.py::MapTests::test_map_nested_with_cond_device_cuda_dynamic_False_autograd_True, test/inductor/test_control_flow.py::MapTests::test_map_nested_with_cond_device_cuda_dynamic_True_autograd_False, test/inductor/test_control_flow.py::MapTests::test_map_nested_with_cond_device_cuda_dynamic_True_autograd_True, 
test/inductor/test_control_flow.py::MapTests::test_map_pytree_in_out_device_cpu_dynamic_False_autograd_False, test/inductor/test_control_flow.py::MapTests::test_map_pytree_in_out_device_cpu_dynamic_False_autograd_True, test/inductor/test_control_flow.py::MapTests::test_map_pytree_in_out_device_cpu_dynamic_True_autograd_False, test/inductor/test_control_flow.py::MapTests::test_map_pytree_in_out_device_cpu_dynamic_True_autograd_True, test/inductor/test_control_flow.py::MapTests::test_map_pytree_in_out_device_cuda_dynamic_False_autograd_False, test/inductor/test_control_flow.py::MapTests::test_map_pytree_in_out_device_cuda_dynamic_False_autograd_True, test/inductor/test_control_flow.py::MapTests::test_map_pytree_in_out_device_cuda_dynamic_True_autograd_False, test/inductor/test_control_flow.py::MapTests::test_map_pytree_in_out_device_cuda_dynamic_True_autograd_True, test/inductor/test_control_flow.py::MapTests::test_map_simple_device_cpu_dynamic_False_autograd_False, test/inductor/test_control_flow.py::MapTests::test_map_simple_device_cpu_dynamic_False_autograd_True, test/inductor/test_control_flow.py::MapTests::test_map_simple_device_cpu_dynamic_True_autograd_False, test/inductor/test_control_flow.py::MapTests::test_map_simple_device_cpu_dynamic_True_autograd_True, test/inductor/test_control_flow.py::MapTests::test_map_simple_device_cuda_dynamic_False_autograd_False, test/inductor/test_control_flow.py::MapTests::test_map_simple_device_cuda_dynamic_False_autograd_True, test/inductor/test_control_flow.py::MapTests::test_map_simple_device_cuda_dynamic_True_autograd_False, test/inductor/test_control_flow.py::MapTests::test_map_simple_device_cuda_dynamic_True_autograd_True, test/inductor/test_control_flow.py::MapTests::test_map_simple_linear_with_view_device_cpu_dynamic_False_autograd_False, test/inductor/test_control_flow.py::MapTests::test_map_simple_linear_with_view_device_cpu_dynamic_False_autograd_True, 
test/inductor/test_control_flow.py::MapTests::test_map_simple_linear_with_view_device_cpu_dynamic_True_autograd_False, test/inductor/test_control_flow.py::MapTests::test_map_simple_linear_with_view_device_cpu_dynamic_True_autograd_True, test/inductor/test_control_flow.py::MapTests::test_map_simple_linear_with_view_device_cuda_dynamic_False_autograd_False, test/inductor/test_control_flow.py::MapTests::test_map_simple_linear_with_view_device_cuda_dynamic_False_autograd_True, test/inductor/test_control_flow.py::MapTests::test_map_simple_linear_with_view_device_cuda_dynamic_True_autograd_False, test/inductor/test_control_flow.py::MapTests::test_map_simple_linear_with_view_device_cuda_dynamic_True_autograd_True 2025-10-10T01:57:59.8253830Z 2025-10-10T01:57:59.8253843Z 2025-10-10T01:57:59.8254313Z export/test_db 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_db_1.1_392d050764ebdc4b_.log 2025-10-10T01:57:59.8265940Z Running 36 items in this shard: test/export/test_db.py::ExampleTests::test_exportdb_not_supported_case_dynamic_shape_round, test/export/test_db.py::ExampleTests::test_exportdb_not_supported_case_model_attr_mutation, test/export/test_db.py::ExampleTests::test_exportdb_not_supported_case_unsupported_operator, test/export/test_db.py::ExampleTests::test_exportdb_supported_case_assume_constant_result, test/export/test_db.py::ExampleTests::test_exportdb_supported_case_autograd_function, test/export/test_db.py::ExampleTests::test_exportdb_supported_case_class_method, test/export/test_db.py::ExampleTests::test_exportdb_supported_case_cond_branch_class_method, test/export/test_db.py::ExampleTests::test_exportdb_supported_case_cond_branch_nested_function, test/export/test_db.py::ExampleTests::test_exportdb_supported_case_cond_branch_nonlocal_variables, test/export/test_db.py::ExampleTests::test_exportdb_supported_case_cond_closed_over_variable, 
test/export/test_db.py::ExampleTests::test_exportdb_supported_case_cond_operands, test/export/test_db.py::ExampleTests::test_exportdb_supported_case_cond_predicate, test/export/test_db.py::ExampleTests::test_exportdb_supported_case_constrain_as_size_example, test/export/test_db.py::ExampleTests::test_exportdb_supported_case_constrain_as_value_example, test/export/test_db.py::ExampleTests::test_exportdb_supported_case_decorator, test/export/test_db.py::ExampleTests::test_exportdb_supported_case_dictionary, test/export/test_db.py::ExampleTests::test_exportdb_supported_case_dynamic_shape_assert, test/export/test_db.py::ExampleTests::test_exportdb_supported_case_dynamic_shape_constructor, test/export/test_db.py::ExampleTests::test_exportdb_supported_case_dynamic_shape_if_guard, test/export/test_db.py::ExampleTests::test_exportdb_supported_case_dynamic_shape_map, test/export/test_db.py::ExampleTests::test_exportdb_supported_case_dynamic_shape_slicing, test/export/test_db.py::ExampleTests::test_exportdb_supported_case_dynamic_shape_view, test/export/test_db.py::ExampleTests::test_exportdb_supported_case_fn_with_kwargs, test/export/test_db.py::ExampleTests::test_exportdb_supported_case_list_contains, test/export/test_db.py::ExampleTests::test_exportdb_supported_case_list_unpack, test/export/test_db.py::ExampleTests::test_exportdb_supported_case_nested_function, test/export/test_db.py::ExampleTests::test_exportdb_supported_case_null_context_manager, test/export/test_db.py::ExampleTests::test_exportdb_supported_case_optional_input, test/export/test_db.py::ExampleTests::test_exportdb_supported_case_pytree_flatten, test/export/test_db.py::ExampleTests::test_exportdb_supported_case_scalar_output, test/export/test_db.py::ExampleTests::test_exportdb_supported_case_specialized_attribute, test/export/test_db.py::ExampleTests::test_exportdb_supported_case_static_for_loop, test/export/test_db.py::ExampleTests::test_exportdb_supported_case_static_if, 
test/export/test_db.py::ExampleTests::test_exportdb_supported_case_tensor_setattr, test/export/test_db.py::ExampleTests::test_exportdb_supported_case_type_reflection_method, test/export/test_db.py::ExampleTests::test_exportdb_supported_case_user_input_mutation 2025-10-10T01:57:59.8276858Z 2025-10-10T01:58:03.6474743Z Running inductor/test_unbacked_symints 1/1 ... [2025-10-10 01:58:03.646896] 2025-10-10T01:58:03.6475393Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:58:03.6476820Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_unbacked_symints.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:58:03.647313] 2025-10-10T01:58:03.6888102Z Running inductor/test_fused_attention 1/1 ... [2025-10-10 01:58:03.688268] 2025-10-10T01:58:03.6888714Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:58:03.6890106Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_fused_attention.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:58:03.688666] 2025-10-10T01:58:10.8764058Z 2025-10-10T01:58:10.8765220Z inductor/test_unbacked_symints 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_unbacked_symints_1.1_3854c4a618a0b316_.log 2025-10-10T01:58:10.8777383Z Running 31 items in this shard: test/inductor/test_unbacked_symints.py::TestUnbackedSymintsCUDA::test_autotuning_cuda, test/inductor/test_unbacked_symints.py::TestUnbackedSymintsCUDA::test_broadcast_tensors_cuda, test/inductor/test_unbacked_symints.py::TestUnbackedSymintsCUDA::test_combo_kernel_size_hint_failure_cuda, test/inductor/test_unbacked_symints.py::TestUnbackedSymintsCUDA::test_einsum_cuda, test/inductor/test_unbacked_symints.py::TestUnbackedSymintsCUDA::test_equivalent_backed_unbacked_cuda, test/inductor/test_unbacked_symints.py::TestUnbackedSymintsCUDA::test_expand_cuda, test/inductor/test_unbacked_symints.py::TestUnbackedSymintsCUDA::test_expand_ok_with_runtime_assert_cuda, test/inductor/test_unbacked_symints.py::TestUnbackedSymintsCUDA::test_issue_143498_cuda, test/inductor/test_unbacked_symints.py::TestUnbackedSymintsCUDA::test_mm_and_friends_addmm_False_cuda, test/inductor/test_unbacked_symints.py::TestUnbackedSymintsCUDA::test_mm_and_friends_addmm_True_cuda, test/inductor/test_unbacked_symints.py::TestUnbackedSymintsCUDA::test_mm_and_friends_bmm_False_cuda, test/inductor/test_unbacked_symints.py::TestUnbackedSymintsCUDA::test_mm_and_friends_bmm_True_cuda, test/inductor/test_unbacked_symints.py::TestUnbackedSymintsCUDA::test_mm_and_friends_mm_False_cuda, test/inductor/test_unbacked_symints.py::TestUnbackedSymintsCUDA::test_mm_and_friends_mm_True_cuda, test/inductor/test_unbacked_symints.py::TestUnbackedSymintsCUDA::test_nonzero_in_inference_mode_cuda, test/inductor/test_unbacked_symints.py::TestUnbackedSymintsCUDA::test_sdfpa_unbacked_strides_cuda, test/inductor/test_unbacked_symints.py::TestUnbackedSymintsCUDA::test_sdpfa_cuda, 
test/inductor/test_unbacked_symints.py::TestUnbackedSymintsCUDA::test_softmax_cuda, test/inductor/test_unbacked_symints.py::TestUnbackedSymintsCUDA::test_split_with_sizes_cuda, test/inductor/test_unbacked_symints.py::TestUnbackedSymintsCUDA::test_to_int_with_unbacked_size_cuda, test/inductor/test_unbacked_symints.py::TestUnbackedSymintsCUDA::test_triton_kernel_grid_cuda, test/inductor/test_unbacked_symints.py::TestUnbackedSymintsCUDA::test_triton_kernel_with_unbacked_symint_fallback_cuda, test/inductor/test_unbacked_symints.py::TestUnbackedSymintsCUDA::test_unbacked_linear_layer_norm_input_cuda, test/inductor/test_unbacked_symints.py::TestUnbackedSymintsCUDA::test_unbacked_masked_scatter_cuda, test/inductor/test_unbacked_symints.py::TestUnbackedSymintsCUDA::test_unbacked_range_tree_divisor_cuda, test/inductor/test_unbacked_symints.py::TestUnbackedSymintsCUDA::test_unbacked_repeat_cuda, test/inductor/test_unbacked_symints.py::TestUnbackedSymintsCUDA::test_unbacked_slice_on_subclass_dynamic2_cuda, test/inductor/test_unbacked_symints.py::TestUnbackedSymintsCUDA::test_unbacked_slice_on_subclass_dynamic_False_cuda, test/inductor/test_unbacked_symints.py::TestUnbackedSymintsCUDA::test_unbacked_slice_on_subclass_dynamic_True_cuda, test/inductor/test_unbacked_symints.py::TestUnbackedSymintsCUDA::test_vertical_pointwise_reduction_fusion_cuda, test/inductor/test_unbacked_symints.py::TestUnbackedSymintsCUDA::test_view_of_slice_cuda 2025-10-10T01:58:10.8788773Z 2025-10-10T01:58:12.7216030Z 2025-10-10T01:58:12.7217362Z inductor/test_fused_attention 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_fused_attention_1.1_0bcb9e0021e051e2_.log 2025-10-10T01:58:12.7258599Z Running 108 items in this shard: test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuTests::test_insignificant_strides, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuTests::test_pattern_fails_with_reuse_gpu, 
test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuTests::test_pattern_fails_with_tensor_factor_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuTests::test_pattern_fails_with_unsupported_mask_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuTests::test_sdpa_prev_13_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuTests::test_sdpa_prev_14_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuTests::test_sdpa_prev_15_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuTests::test_sdpa_rewriter_10_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuTests::test_sdpa_rewriter_11_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuTests::test_sdpa_rewriter_12_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuTests::test_sdpa_rewriter_13_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuTests::test_sdpa_rewriter_14_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuTests::test_sdpa_rewriter_15_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuTests::test_sdpa_rewriter_17_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuTests::test_sdpa_rewriter_19_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuTests::test_sdpa_rewriter_1_freezing, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuTests::test_sdpa_rewriter_1_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuTests::test_sdpa_rewriter_20_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuTests::test_sdpa_rewriter_21_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuTests::test_sdpa_rewriter_22_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuTests::test_sdpa_rewriter_23_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuTests::test_sdpa_rewriter_24_gpu, 
test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuTests::test_sdpa_rewriter_2_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuTests::test_sdpa_rewriter_3_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuTests::test_sdpa_rewriter_4_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuTests::test_sdpa_rewriter_5_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuTests::test_sdpa_rewriter_6_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuTests::test_sdpa_rewriter_7_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuTests::test_sdpa_rewriter_8_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuTests::test_sdpa_rewriter_9_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuDynamicTests::test_insignificant_strides, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuDynamicTests::test_pattern_fails_with_reuse_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuDynamicTests::test_pattern_fails_with_tensor_factor_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuDynamicTests::test_pattern_fails_with_unsupported_mask_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuDynamicTests::test_sdpa_prev_13_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuDynamicTests::test_sdpa_prev_14_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuDynamicTests::test_sdpa_prev_15_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuDynamicTests::test_sdpa_rewriter_10_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuDynamicTests::test_sdpa_rewriter_11_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuDynamicTests::test_sdpa_rewriter_12_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuDynamicTests::test_sdpa_rewriter_13_gpu, 
test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuDynamicTests::test_sdpa_rewriter_14_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuDynamicTests::test_sdpa_rewriter_15_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuDynamicTests::test_sdpa_rewriter_17_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuDynamicTests::test_sdpa_rewriter_19_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuDynamicTests::test_sdpa_rewriter_1_freezing, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuDynamicTests::test_sdpa_rewriter_1_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuDynamicTests::test_sdpa_rewriter_20_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuDynamicTests::test_sdpa_rewriter_21_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuDynamicTests::test_sdpa_rewriter_22_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuDynamicTests::test_sdpa_rewriter_23_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuDynamicTests::test_sdpa_rewriter_24_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuDynamicTests::test_sdpa_rewriter_2_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuDynamicTests::test_sdpa_rewriter_3_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuDynamicTests::test_sdpa_rewriter_4_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuDynamicTests::test_sdpa_rewriter_5_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuDynamicTests::test_sdpa_rewriter_6_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuDynamicTests::test_sdpa_rewriter_7_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuDynamicTests::test_sdpa_rewriter_8_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuDynamicTests::test_sdpa_rewriter_9_gpu, 
test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuTests::test_pattern_fails_with_reuse_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuTests::test_pattern_fails_with_tensor_factor_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuTests::test_pattern_fails_with_unsupported_mask_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuTests::test_sdpa_prev_13_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuTests::test_sdpa_prev_14_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuTests::test_sdpa_prev_15_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuTests::test_sdpa_rewriter_11_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuTests::test_sdpa_rewriter_12_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuTests::test_sdpa_rewriter_13_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuTests::test_sdpa_rewriter_14_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuTests::test_sdpa_rewriter_15_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuTests::test_sdpa_rewriter_16_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuTests::test_sdpa_rewriter_16_fp32_mask_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuTests::test_sdpa_rewriter_17_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuTests::test_sdpa_rewriter_18_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuTests::test_sdpa_rewriter_19_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuTests::test_sdpa_rewriter_1_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuTests::test_sdpa_rewriter_20_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuTests::test_sdpa_rewriter_21_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuTests::test_sdpa_rewriter_22_cpu, 
test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuTests::test_sdpa_rewriter_23_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuTests::test_sdpa_rewriter_24_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuTests::test_sdpa_rewriter_2_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuTests::test_sdpa_rewriter_5_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuDynamicTests::test_pattern_fails_with_reuse_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuDynamicTests::test_pattern_fails_with_tensor_factor_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuDynamicTests::test_pattern_fails_with_unsupported_mask_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuDynamicTests::test_sdpa_prev_13_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuDynamicTests::test_sdpa_prev_14_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuDynamicTests::test_sdpa_prev_15_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuDynamicTests::test_sdpa_rewriter_11_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuDynamicTests::test_sdpa_rewriter_12_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuDynamicTests::test_sdpa_rewriter_13_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuDynamicTests::test_sdpa_rewriter_14_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuDynamicTests::test_sdpa_rewriter_15_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuDynamicTests::test_sdpa_rewriter_16_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuDynamicTests::test_sdpa_rewriter_16_fp32_mask_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuDynamicTests::test_sdpa_rewriter_17_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuDynamicTests::test_sdpa_rewriter_18_cpu, 
test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuDynamicTests::test_sdpa_rewriter_19_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuDynamicTests::test_sdpa_rewriter_1_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuDynamicTests::test_sdpa_rewriter_20_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuDynamicTests::test_sdpa_rewriter_21_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuDynamicTests::test_sdpa_rewriter_22_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuDynamicTests::test_sdpa_rewriter_23_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuDynamicTests::test_sdpa_rewriter_24_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuDynamicTests::test_sdpa_rewriter_2_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuDynamicTests::test_sdpa_rewriter_5_cpu 2025-10-10T01:58:12.7298709Z 2025-10-10T01:58:14.8391330Z Running dynamo/test_export_mutations 1/1 ... [2025-10-10 01:58:14.838372] 2025-10-10T01:58:14.8391794Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:58:14.8392837Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_export_mutations.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:58:14.838746] 2025-10-10T01:58:16.6040573Z Running inductor/test_config 1/1 ... [2025-10-10 01:58:16.603485] 2025-10-10T01:58:16.6041016Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:58:16.6042330Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_config.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:58:16.603912] 2025-10-10T01:58:18.9622943Z 2025-10-10T01:58:18.9624055Z dynamo/test_export_mutations 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_export_mutations_1.1_dc71709ddd4ec781_.log 2025-10-10T01:58:18.9626882Z Running 5 items in this shard: test/dynamo/test_export_mutations.py::MutationExportTests::test_module_attribute_mutation_violation_negative_1, test/dynamo/test_export_mutations.py::MutationExportTests::test_module_attribute_mutation_violation_negative_2, test/dynamo/test_export_mutations.py::MutationExportTests::test_module_attribute_mutation_violation_negative_3, test/dynamo/test_export_mutations.py::MutationExportTests::test_module_attribute_mutation_violation_negative_4, test/dynamo/test_export_mutations.py::MutationExportTests::test_module_attribute_mutation_violation_positive_1 2025-10-10T01:58:18.9629378Z 2025-10-10T01:58:22.8793178Z Running dynamo/test_guard_serialization 1/1 ... [2025-10-10 01:58:22.878747] 2025-10-10T01:58:22.8793668Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:58:22.8795989Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_guard_serialization.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:58:22.879188] 2025-10-10T01:58:23.7841583Z 2025-10-10T01:58:23.7842702Z inductor/test_config 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_config_1.1_948d16f61fbab907_.log 2025-10-10T01:58:23.7847375Z Running 14 items in this shard: test/inductor/test_config.py::TestInductorConfig::test_api_options, test/inductor/test_config.py::TestInductorConfig::test_codegen_skips_custom_passes, test/inductor/test_config.py::TestInductorConfig::test_compile_api, test/inductor/test_config.py::TestInductorConfig::test_compile_api_passes_config, test/inductor/test_config.py::TestInductorConfig::test_get_compiler_config, test/inductor/test_config.py::TestInductorConfig::test_hasattr, test/inductor/test_config.py::TestInductorConfig::test_invalid_backend, test/inductor/test_config.py::TestInductorConfig::test_invalid_names, test/inductor/test_config.py::TestInductorConfig::test_non_inductor_backend, test/inductor/test_config.py::TestInductorConfig::test_options_do_something, test/inductor/test_config.py::TestInductorConfig::test_patch, test/inductor/test_config.py::TestInductorConfig::test_save_load, test/inductor/test_config.py::TestInductorConfig::test_select_decomp_table_fallback_embedding_bag_byte_unpack, test/inductor/test_config.py::TestInductorConfig::test_set 2025-10-10T01:58:23.7851417Z 2025-10-10T01:58:27.6829704Z Running inductor/test_graph_transform_observer 1/1 ... [2025-10-10 01:58:27.682411] 2025-10-10T01:58:27.6830298Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:58:27.6833146Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_graph_transform_observer.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:58:27.682795] 2025-10-10T01:58:30.2084439Z 2025-10-10T01:58:30.2085722Z dynamo/test_guard_serialization 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_guard_serialization_1.1_735fa62e05e2bf95_.log 2025-10-10T01:58:30.2104789Z Running 54 items in this shard: test/dynamo/test_guard_serialization.py::TestGuardSerialization::test_bool_match, test/dynamo/test_guard_serialization.py::TestGuardSerialization::test_bound_method_input, test/dynamo/test_guard_serialization.py::TestGuardSerialization::test_bound_method_patched_forward, test/dynamo/test_guard_serialization.py::TestGuardSerialization::test_bound_methods_empty, test/dynamo/test_guard_serialization.py::TestGuardSerialization::test_bound_methods_missing, test/dynamo/test_guard_serialization.py::TestGuardSerialization::test_builtin_match, test/dynamo/test_guard_serialization.py::TestGuardSerialization::test_c10d_work, test/dynamo/test_guard_serialization.py::TestGuardSerialization::test_closure_match, test/dynamo/test_guard_serialization.py::TestGuardSerialization::test_closure_var_missing, test/dynamo/test_guard_serialization.py::TestGuardSerialization::test_constant_match, test/dynamo/test_guard_serialization.py::TestGuardSerialization::test_ddp_module, test/dynamo/test_guard_serialization.py::TestGuardSerialization::test_default_device, test/dynamo/test_guard_serialization.py::TestGuardSerialization::test_deterministic_algorithms, test/dynamo/test_guard_serialization.py::TestGuardSerialization::test_dict_contains, test/dynamo/test_guard_serialization.py::TestGuardSerialization::test_dict_keys_match, test/dynamo/test_guard_serialization.py::TestGuardSerialization::test_dict_keys_serialization, test/dynamo/test_guard_serialization.py::TestGuardSerialization::test_dict_version, test/dynamo/test_guard_serialization.py::TestGuardSerialization::test_dispatch_key_set_match, test/dynamo/test_guard_serialization.py::TestGuardSerialization::test_dual_level, 
test/dynamo/test_guard_serialization.py::TestGuardSerialization::test_duplicate_input, test/dynamo/test_guard_serialization.py::TestGuardSerialization::test_empty_nn_module_hooks_dict, test/dynamo/test_guard_serialization.py::TestGuardSerialization::test_equals_match, test/dynamo/test_guard_serialization.py::TestGuardSerialization::test_fsdp_training_state, test/dynamo/test_guard_serialization.py::TestGuardSerialization::test_function_locals, test/dynamo/test_guard_serialization.py::TestGuardSerialization::test_function_match, test/dynamo/test_guard_serialization.py::TestGuardSerialization::test_function_with_wrong_fqn, test/dynamo/test_guard_serialization.py::TestGuardSerialization::test_functorch_stack_match, test/dynamo/test_guard_serialization.py::TestGuardSerialization::test_grad_mode, test/dynamo/test_guard_serialization.py::TestGuardSerialization::test_grad_mode_loading, test/dynamo/test_guard_serialization.py::TestGuardSerialization::test_guard_on_key_order_with_cache, test/dynamo/test_guard_serialization.py::TestGuardSerialization::test_hasattr_serialization, test/dynamo/test_guard_serialization.py::TestGuardSerialization::test_id_match, test/dynamo/test_guard_serialization.py::TestGuardSerialization::test_id_match_with_config, test/dynamo/test_guard_serialization.py::TestGuardSerialization::test_mapping_keys_check, test/dynamo/test_guard_serialization.py::TestGuardSerialization::test_name_match, test/dynamo/test_guard_serialization.py::TestGuardSerialization::test_nn_module, test/dynamo/test_guard_serialization.py::TestGuardSerialization::test_none_match, test/dynamo/test_guard_serialization.py::TestGuardSerialization::test_not_present_in_generic_dict, test/dynamo/test_guard_serialization.py::TestGuardSerialization::test_range_iterator_match, test/dynamo/test_guard_serialization.py::TestGuardSerialization::test_sequence_length, test/dynamo/test_guard_serialization.py::TestGuardSerialization::test_shape_env, 
test/dynamo/test_guard_serialization.py::TestGuardSerialization::test_skipped_objects, test/dynamo/test_guard_serialization.py::TestGuardSerialization::test_tensor_match, test/dynamo/test_guard_serialization.py::TestGuardSerialization::test_tensor_subclass_metadata_match, test/dynamo/test_guard_serialization.py::TestGuardSerialization::test_torch_function_state, test/dynamo/test_guard_serialization.py::TestGuardSerialization::test_tuple_iterator_len, test/dynamo/test_guard_serialization.py::TestGuardSerialization::test_type_match, test/dynamo/test_guard_serialization.py::TestGuardSerialization::test_unserializable_sharded_tensor, test/dynamo/test_guard_serialization.py::TestGuardSerialization::test_unserializable_submodule, test/dynamo/test_guard_serialization.py::TestGuardSerialization::test_unused_process_group, test/dynamo/test_guard_serialization.py::TestGuardSerialization::test_unused_stream, test/dynamo/test_guard_serialization.py::TestGuardSerialization::test_unused_weakref, test/dynamo/test_guard_serialization.py::TestGuardSerialization::test_weakref_alive, test/dynamo/test_guard_serialization.py::TestGuardSerializationFSDP::test_guard_serialization_fsdp_module 2025-10-10T01:58:30.2122629Z 2025-10-10T01:58:34.1352227Z Running dynamo/test_unittest 1/1 ... [2025-10-10 01:58:34.134643] 2025-10-10T01:58:34.1352671Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:58:34.1355614Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_unittest.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:58:34.135128] 2025-10-10T01:58:35.0626067Z 2025-10-10T01:58:35.0627098Z inductor/test_graph_transform_observer 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_graph_transform_observer_1.1_4599af1e98fa3111_.log 2025-10-10T01:58:35.0628258Z Running 1 items in this shard: test/inductor/test_graph_transform_observer.py::TestGraphTransformObserver::test_sdpa_rewriter 2025-10-10T01:58:35.0628784Z 2025-10-10T01:58:38.2085072Z 2025-10-10T01:58:38.2085859Z dynamo/test_unittest 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_unittest_1.1_f48ae0c7ef7bfb97_.log 2025-10-10T01:58:38.2086744Z Running 1 items in this shard: test/dynamo/test_unittest.py::TestUnittest::test_SkipTest 2025-10-10T01:58:38.2087237Z 2025-10-10T01:58:38.9235609Z Running inductor/test_cache 1/1 ... [2025-10-10 01:58:38.923066] 2025-10-10T01:58:38.9236364Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:58:38.9238069Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_cache.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:58:38.923450] 2025-10-10T01:58:42.0651650Z Running dynamo/test_after_aot 1/1 ... [2025-10-10 01:58:42.064639] 2025-10-10T01:58:42.0652126Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:58:42.0654428Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_after_aot.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:58:42.065046] 2025-10-10T01:58:44.0990506Z 2025-10-10T01:58:44.0991593Z inductor/test_cache 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_cache_1.1_6802b59f2684bb57_.log 2025-10-10T01:58:44.1267199Z Running 725 items in this shard: test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type0_key_type0_value_type0_get_first_False, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type0_key_type0_value_type0_get_first_True, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type0_key_type0_value_type1_get_first_False, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type0_key_type0_value_type1_get_first_True, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type0_key_type0_value_type2_get_first_False, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type0_key_type0_value_type2_get_first_True, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type0_key_type0_value_type3_get_first_False, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type0_key_type0_value_type3_get_first_True, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type0_key_type0_value_type4_get_first_False, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type0_key_type0_value_type4_get_first_True, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type0_key_type0_value_type5_get_first_False, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type0_key_type0_value_type5_get_first_True, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type0_key_type1_value_type0_get_first_False, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type0_key_type1_value_type0_get_first_True, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type0_key_type1_value_type1_get_first_False, 
test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type0_key_type1_value_type1_get_first_True, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type0_key_type1_value_type2_get_first_False, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type0_key_type1_value_type2_get_first_True, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type0_key_type1_value_type3_get_first_False, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type0_key_type1_value_type3_get_first_True, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type0_key_type1_value_type4_get_first_False, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type0_key_type1_value_type4_get_first_True, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type0_key_type1_value_type5_get_first_False, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type0_key_type1_value_type5_get_first_True, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type0_key_type2_value_type0_get_first_False, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type0_key_type2_value_type0_get_first_True, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type0_key_type2_value_type1_get_first_False, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type0_key_type2_value_type1_get_first_True, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type0_key_type2_value_type2_get_first_False, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type0_key_type2_value_type2_get_first_True, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type0_key_type2_value_type3_get_first_False, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type0_key_type2_value_type3_get_first_True, 
test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type0_key_type2_value_type4_get_first_False, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type0_key_type2_value_type4_get_first_True, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type0_key_type2_value_type5_get_first_False, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type0_key_type2_value_type5_get_first_True, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type1_key_type0_value_type0_get_first_False, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type1_key_type0_value_type0_get_first_True, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type1_key_type0_value_type1_get_first_False, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type1_key_type0_value_type1_get_first_True, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type1_key_type0_value_type2_get_first_False, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type1_key_type0_value_type2_get_first_True, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type1_key_type0_value_type3_get_first_False, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type1_key_type0_value_type3_get_first_True, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type1_key_type0_value_type4_get_first_False, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type1_key_type0_value_type4_get_first_True, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type1_key_type0_value_type5_get_first_False, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type1_key_type0_value_type5_get_first_True, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type1_key_type1_value_type0_get_first_False, 
test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type1_key_type1_value_type0_get_first_True, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type1_key_type1_value_type1_get_first_False, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type1_key_type1_value_type1_get_first_True, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type1_key_type1_value_type2_get_first_False, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type1_key_type1_value_type2_get_first_True, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type1_key_type1_value_type3_get_first_False, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type1_key_type1_value_type3_get_first_True, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type1_key_type1_value_type4_get_first_False, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type1_key_type1_value_type4_get_first_True, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type1_key_type1_value_type5_get_first_False, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type1_key_type1_value_type5_get_first_True, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type1_key_type2_value_type0_get_first_False, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type1_key_type2_value_type0_get_first_True, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type1_key_type2_value_type1_get_first_False, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type1_key_type2_value_type1_get_first_True, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type1_key_type2_value_type2_get_first_False, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type1_key_type2_value_type2_get_first_True, 
test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type1_key_type2_value_type3_get_first_False, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type1_key_type2_value_type3_get_first_True, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type1_key_type2_value_type4_get_first_False, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type1_key_type2_value_type4_get_first_True, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type1_key_type2_value_type5_get_first_False, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type1_key_type2_value_type5_get_first_True, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type2_key_type0_value_type0_get_first_False, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type2_key_type0_value_type0_get_first_True, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type2_key_type0_value_type1_get_first_False, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type2_key_type0_value_type1_get_first_True, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type2_key_type0_value_type2_get_first_False, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type2_key_type0_value_type2_get_first_True, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type2_key_type0_value_type3_get_first_False, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type2_key_type0_value_type3_get_first_True, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type2_key_type0_value_type4_get_first_False, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type2_key_type0_value_type4_get_first_True, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type2_key_type0_value_type5_get_first_False, 
test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type2_key_type0_value_type5_get_first_True, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type2_key_type1_value_type0_get_first_False, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type2_key_type1_value_type0_get_first_True, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type2_key_type1_value_type1_get_first_False, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type2_key_type1_value_type1_get_first_True, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type2_key_type1_value_type2_get_first_False, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type2_key_type1_value_type2_get_first_True, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type2_key_type1_value_type3_get_first_False, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type2_key_type1_value_type3_get_first_True, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type2_key_type1_value_type4_get_first_False, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type2_key_type1_value_type4_get_first_True, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type2_key_type1_value_type5_get_first_False, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type2_key_type1_value_type5_get_first_True, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type2_key_type2_value_type0_get_first_False, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type2_key_type2_value_type0_get_first_True, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type2_key_type2_value_type1_get_first_False, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type2_key_type2_value_type1_get_first_True, 
test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type2_key_type2_value_type2_get_first_False, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type2_key_type2_value_type2_get_first_True, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type2_key_type2_value_type3_get_first_False, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type2_key_type2_value_type3_get_first_True, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type2_key_type2_value_type4_get_first_False, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type2_key_type2_value_type4_get_first_True, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type2_key_type2_value_type5_get_first_False, test/inductor/test_cache.py::CacheTest::test_combo_concurrent_cache_type2_key_type2_value_type5_get_first_True, test/inductor/test_cache.py::CacheTest::test_get_cache_type0_key_type0_value_type0, test/inductor/test_cache.py::CacheTest::test_get_cache_type0_key_type0_value_type1, test/inductor/test_cache.py::CacheTest::test_get_cache_type0_key_type0_value_type2, test/inductor/test_cache.py::CacheTest::test_get_cache_type0_key_type0_value_type3, test/inductor/test_cache.py::CacheTest::test_get_cache_type0_key_type0_value_type4, test/inductor/test_cache.py::CacheTest::test_get_cache_type0_key_type0_value_type5, test/inductor/test_cache.py::CacheTest::test_get_cache_type0_key_type1_value_type0, test/inductor/test_cache.py::CacheTest::test_get_cache_type0_key_type1_value_type1, test/inductor/test_cache.py::CacheTest::test_get_cache_type0_key_type1_value_type2, test/inductor/test_cache.py::CacheTest::test_get_cache_type0_key_type1_value_type3, test/inductor/test_cache.py::CacheTest::test_get_cache_type0_key_type1_value_type4, test/inductor/test_cache.py::CacheTest::test_get_cache_type0_key_type1_value_type5, test/inductor/test_cache.py::CacheTest::test_get_cache_type0_key_type2_value_type0, 
test/inductor/test_cache.py::CacheTest::test_get_cache_type0_key_type2_value_type1, test/inductor/test_cache.py::CacheTest::test_get_cache_type0_key_type2_value_type2, test/inductor/test_cache.py::CacheTest::test_get_cache_type0_key_type2_value_type3, test/inductor/test_cache.py::CacheTest::test_get_cache_type0_key_type2_value_type4, test/inductor/test_cache.py::CacheTest::test_get_cache_type0_key_type2_value_type5, test/inductor/test_cache.py::CacheTest::test_get_cache_type1_key_type0_value_type0, test/inductor/test_cache.py::CacheTest::test_get_cache_type1_key_type0_value_type1, test/inductor/test_cache.py::CacheTest::test_get_cache_type1_key_type0_value_type2, test/inductor/test_cache.py::CacheTest::test_get_cache_type1_key_type0_value_type3, test/inductor/test_cache.py::CacheTest::test_get_cache_type1_key_type0_value_type4, test/inductor/test_cache.py::CacheTest::test_get_cache_type1_key_type0_value_type5, test/inductor/test_cache.py::CacheTest::test_get_cache_type1_key_type1_value_type0, test/inductor/test_cache.py::CacheTest::test_get_cache_type1_key_type1_value_type1, test/inductor/test_cache.py::CacheTest::test_get_cache_type1_key_type1_value_type2, test/inductor/test_cache.py::CacheTest::test_get_cache_type1_key_type1_value_type3, test/inductor/test_cache.py::CacheTest::test_get_cache_type1_key_type1_value_type4, test/inductor/test_cache.py::CacheTest::test_get_cache_type1_key_type1_value_type5, test/inductor/test_cache.py::CacheTest::test_get_cache_type1_key_type2_value_type0, test/inductor/test_cache.py::CacheTest::test_get_cache_type1_key_type2_value_type1, test/inductor/test_cache.py::CacheTest::test_get_cache_type1_key_type2_value_type2, test/inductor/test_cache.py::CacheTest::test_get_cache_type1_key_type2_value_type3, test/inductor/test_cache.py::CacheTest::test_get_cache_type1_key_type2_value_type4, test/inductor/test_cache.py::CacheTest::test_get_cache_type1_key_type2_value_type5, 
test/inductor/test_cache.py::CacheTest::test_get_cache_type2_key_type0_value_type0, test/inductor/test_cache.py::CacheTest::test_get_cache_type2_key_type0_value_type1, test/inductor/test_cache.py::CacheTest::test_get_cache_type2_key_type0_value_type2, test/inductor/test_cache.py::CacheTest::test_get_cache_type2_key_type0_value_type3, test/inductor/test_cache.py::CacheTest::test_get_cache_type2_key_type0_value_type4, test/inductor/test_cache.py::CacheTest::test_get_cache_type2_key_type0_value_type5, test/inductor/test_cache.py::CacheTest::test_get_cache_type2_key_type1_value_type0, test/inductor/test_cache.py::CacheTest::test_get_cache_type2_key_type1_value_type1, test/inductor/test_cache.py::CacheTest::test_get_cache_type2_key_type1_value_type2, test/inductor/test_cache.py::CacheTest::test_get_cache_type2_key_type1_value_type3, test/inductor/test_cache.py::CacheTest::test_get_cache_type2_key_type1_value_type4, test/inductor/test_cache.py::CacheTest::test_get_cache_type2_key_type1_value_type5, test/inductor/test_cache.py::CacheTest::test_get_cache_type2_key_type2_value_type0, test/inductor/test_cache.py::CacheTest::test_get_cache_type2_key_type2_value_type1, test/inductor/test_cache.py::CacheTest::test_get_cache_type2_key_type2_value_type2, test/inductor/test_cache.py::CacheTest::test_get_cache_type2_key_type2_value_type3, test/inductor/test_cache.py::CacheTest::test_get_cache_type2_key_type2_value_type4, test/inductor/test_cache.py::CacheTest::test_get_cache_type2_key_type2_value_type5, test/inductor/test_cache.py::CacheTest::test_get_concurrent_cache_type0_key_type0_value_type0, test/inductor/test_cache.py::CacheTest::test_get_concurrent_cache_type0_key_type0_value_type1, test/inductor/test_cache.py::CacheTest::test_get_concurrent_cache_type0_key_type0_value_type2, test/inductor/test_cache.py::CacheTest::test_get_concurrent_cache_type0_key_type0_value_type3, test/inductor/test_cache.py::CacheTest::test_get_concurrent_cache_type0_key_type0_value_type4, 
test/inductor/test_cache.py::CacheTest::test_get_concurrent_cache_type0_key_type0_value_type5, test/inductor/test_cache.py::CacheTest::test_get_concurrent_cache_type0_key_type1_value_type0, test/inductor/test_cache.py::CacheTest::test_get_concurrent_cache_type0_key_type1_value_type1, test/inductor/test_cache.py::CacheTest::test_get_concurrent_cache_type0_key_type1_value_type2, test/inductor/test_cache.py::CacheTest::test_get_concurrent_cache_type0_key_type1_value_type3, test/inductor/test_cache.py::CacheTest::test_get_concurrent_cache_type0_key_type1_value_type4, test/inductor/test_cache.py::CacheTest::test_get_concurrent_cache_type0_key_type1_value_type5, test/inductor/test_cache.py::CacheTest::test_get_concurrent_cache_type0_key_type2_value_type0, test/inductor/test_cache.py::CacheTest::test_get_concurrent_cache_type0_key_type2_value_type1, test/inductor/test_cache.py::CacheTest::test_get_concurrent_cache_type0_key_type2_value_type2, test/inductor/test_cache.py::CacheTest::test_get_concurrent_cache_type0_key_type2_value_type3, test/inductor/test_cache.py::CacheTest::test_get_concurrent_cache_type0_key_type2_value_type4, test/inductor/test_cache.py::CacheTest::test_get_concurrent_cache_type0_key_type2_value_type5, test/inductor/test_cache.py::CacheTest::test_get_concurrent_cache_type1_key_type0_value_type0, test/inductor/test_cache.py::CacheTest::test_get_concurrent_cache_type1_key_type0_value_type1, test/inductor/test_cache.py::CacheTest::test_get_concurrent_cache_type1_key_type0_value_type2, test/inductor/test_cache.py::CacheTest::test_get_concurrent_cache_type1_key_type0_value_type3, test/inductor/test_cache.py::CacheTest::test_get_concurrent_cache_type1_key_type0_value_type4, test/inductor/test_cache.py::CacheTest::test_get_concurrent_cache_type1_key_type0_value_type5, test/inductor/test_cache.py::CacheTest::test_get_concurrent_cache_type1_key_type1_value_type0, test/inductor/test_cache.py::CacheTest::test_get_concurrent_cache_type1_key_type1_value_type1, 
test/inductor/test_cache.py::CacheTest::test_get_concurrent_cache_type1_key_type1_value_type2, test/inductor/test_cache.py::CacheTest::test_get_concurrent_cache_type1_key_type1_value_type3, test/inductor/test_cache.py::CacheTest::test_get_concurrent_cache_type1_key_type1_value_type4, test/inductor/test_cache.py::CacheTest::test_get_concurrent_cache_type1_key_type1_value_type5, test/inductor/test_cache.py::CacheTest::test_get_concurrent_cache_type1_key_type2_value_type0, test/inductor/test_cache.py::CacheTest::test_get_concurrent_cache_type1_key_type2_value_type1, test/inductor/test_cache.py::CacheTest::test_get_concurrent_cache_type1_key_type2_value_type2, test/inductor/test_cache.py::CacheTest::test_get_concurrent_cache_type1_key_type2_value_type3, test/inductor/test_cache.py::CacheTest::test_get_concurrent_cache_type1_key_type2_value_type4, test/inductor/test_cache.py::CacheTest::test_get_concurrent_cache_type1_key_type2_value_type5, test/inductor/test_cache.py::CacheTest::test_get_concurrent_cache_type2_key_type0_value_type0, test/inductor/test_cache.py::CacheTest::test_get_concurrent_cache_type2_key_type0_value_type1, test/inductor/test_cache.py::CacheTest::test_get_concurrent_cache_type2_key_type0_value_type2, test/inductor/test_cache.py::CacheTest::test_get_concurrent_cache_type2_key_type0_value_type3, test/inductor/test_cache.py::CacheTest::test_get_concurrent_cache_type2_key_type0_value_type4, test/inductor/test_cache.py::CacheTest::test_get_concurrent_cache_type2_key_type0_value_type5, test/inductor/test_cache.py::CacheTest::test_get_concurrent_cache_type2_key_type1_value_type0, test/inductor/test_cache.py::CacheTest::test_get_concurrent_cache_type2_key_type1_value_type1, test/inductor/test_cache.py::CacheTest::test_get_concurrent_cache_type2_key_type1_value_type2, test/inductor/test_cache.py::CacheTest::test_get_concurrent_cache_type2_key_type1_value_type3, test/inductor/test_cache.py::CacheTest::test_get_concurrent_cache_type2_key_type1_value_type4, 
test/inductor/test_cache.py::CacheTest::test_get_concurrent_cache_type2_key_type1_value_type5, test/inductor/test_cache.py::CacheTest::test_get_concurrent_cache_type2_key_type2_value_type0, test/inductor/test_cache.py::CacheTest::test_get_concurrent_cache_type2_key_type2_value_type1, test/inductor/test_cache.py::CacheTest::test_get_concurrent_cache_type2_key_type2_value_type2, test/inductor/test_cache.py::CacheTest::test_get_concurrent_cache_type2_key_type2_value_type3, test/inductor/test_cache.py::CacheTest::test_get_concurrent_cache_type2_key_type2_value_type4, test/inductor/test_cache.py::CacheTest::test_get_concurrent_cache_type2_key_type2_value_type5, test/inductor/test_cache.py::CacheTest::test_insert_cache_type0_key_type0_value_type0, test/inductor/test_cache.py::CacheTest::test_insert_cache_type0_key_type0_value_type1, test/inductor/test_cache.py::CacheTest::test_insert_cache_type0_key_type0_value_type2, test/inductor/test_cache.py::CacheTest::test_insert_cache_type0_key_type0_value_type3, test/inductor/test_cache.py::CacheTest::test_insert_cache_type0_key_type0_value_type4, test/inductor/test_cache.py::CacheTest::test_insert_cache_type0_key_type0_value_type5, test/inductor/test_cache.py::CacheTest::test_insert_cache_type0_key_type1_value_type0, test/inductor/test_cache.py::CacheTest::test_insert_cache_type0_key_type1_value_type1, test/inductor/test_cache.py::CacheTest::test_insert_cache_type0_key_type1_value_type2, test/inductor/test_cache.py::CacheTest::test_insert_cache_type0_key_type1_value_type3, test/inductor/test_cache.py::CacheTest::test_insert_cache_type0_key_type1_value_type4, test/inductor/test_cache.py::CacheTest::test_insert_cache_type0_key_type1_value_type5, test/inductor/test_cache.py::CacheTest::test_insert_cache_type0_key_type2_value_type0, test/inductor/test_cache.py::CacheTest::test_insert_cache_type0_key_type2_value_type1, test/inductor/test_cache.py::CacheTest::test_insert_cache_type0_key_type2_value_type2, 
test/inductor/test_cache.py::CacheTest::test_insert_cache_type0_key_type2_value_type3, test/inductor/test_cache.py::CacheTest::test_insert_cache_type0_key_type2_value_type4, test/inductor/test_cache.py::CacheTest::test_insert_cache_type0_key_type2_value_type5, test/inductor/test_cache.py::CacheTest::test_insert_cache_type1_key_type0_value_type0, test/inductor/test_cache.py::CacheTest::test_insert_cache_type1_key_type0_value_type1, test/inductor/test_cache.py::CacheTest::test_insert_cache_type1_key_type0_value_type2, test/inductor/test_cache.py::CacheTest::test_insert_cache_type1_key_type0_value_type3, test/inductor/test_cache.py::CacheTest::test_insert_cache_type1_key_type0_value_type4, test/inductor/test_cache.py::CacheTest::test_insert_cache_type1_key_type0_value_type5, test/inductor/test_cache.py::CacheTest::test_insert_cache_type1_key_type1_value_type0, test/inductor/test_cache.py::CacheTest::test_insert_cache_type1_key_type1_value_type1, test/inductor/test_cache.py::CacheTest::test_insert_cache_type1_key_type1_value_type2, test/inductor/test_cache.py::CacheTest::test_insert_cache_type1_key_type1_value_type3, test/inductor/test_cache.py::CacheTest::test_insert_cache_type1_key_type1_value_type4, test/inductor/test_cache.py::CacheTest::test_insert_cache_type1_key_type1_value_type5, test/inductor/test_cache.py::CacheTest::test_insert_cache_type1_key_type2_value_type0, test/inductor/test_cache.py::CacheTest::test_insert_cache_type1_key_type2_value_type1, test/inductor/test_cache.py::CacheTest::test_insert_cache_type1_key_type2_value_type2, test/inductor/test_cache.py::CacheTest::test_insert_cache_type1_key_type2_value_type3, test/inductor/test_cache.py::CacheTest::test_insert_cache_type1_key_type2_value_type4, test/inductor/test_cache.py::CacheTest::test_insert_cache_type1_key_type2_value_type5, test/inductor/test_cache.py::CacheTest::test_insert_cache_type2_key_type0_value_type0, 
test/inductor/test_cache.py::CacheTest::test_insert_cache_type2_key_type0_value_type1, test/inductor/test_cache.py::CacheTest::test_insert_cache_type2_key_type0_value_type2, test/inductor/test_cache.py::CacheTest::test_insert_cache_type2_key_type0_value_type3, test/inductor/test_cache.py::CacheTest::test_insert_cache_type2_key_type0_value_type4, test/inductor/test_cache.py::CacheTest::test_insert_cache_type2_key_type0_value_type5, test/inductor/test_cache.py::CacheTest::test_insert_cache_type2_key_type1_value_type0, test/inductor/test_cache.py::CacheTest::test_insert_cache_type2_key_type1_value_type1, test/inductor/test_cache.py::CacheTest::test_insert_cache_type2_key_type1_value_type2, test/inductor/test_cache.py::CacheTest::test_insert_cache_type2_key_type1_value_type3, test/inductor/test_cache.py::CacheTest::test_insert_cache_type2_key_type1_value_type4, test/inductor/test_cache.py::CacheTest::test_insert_cache_type2_key_type1_value_type5, test/inductor/test_cache.py::CacheTest::test_insert_cache_type2_key_type2_value_type0, test/inductor/test_cache.py::CacheTest::test_insert_cache_type2_key_type2_value_type1, test/inductor/test_cache.py::CacheTest::test_insert_cache_type2_key_type2_value_type2, test/inductor/test_cache.py::CacheTest::test_insert_cache_type2_key_type2_value_type3, test/inductor/test_cache.py::CacheTest::test_insert_cache_type2_key_type2_value_type4, test/inductor/test_cache.py::CacheTest::test_insert_cache_type2_key_type2_value_type5, test/inductor/test_cache.py::CacheTest::test_insert_concurrent_cache_type0_key_type0_value_type0, test/inductor/test_cache.py::CacheTest::test_insert_concurrent_cache_type0_key_type0_value_type1, test/inductor/test_cache.py::CacheTest::test_insert_concurrent_cache_type0_key_type0_value_type2, test/inductor/test_cache.py::CacheTest::test_insert_concurrent_cache_type0_key_type0_value_type3, test/inductor/test_cache.py::CacheTest::test_insert_concurrent_cache_type0_key_type0_value_type4, 
test/inductor/test_cache.py::CacheTest::test_insert_concurrent_cache_type0_key_type0_value_type5, test/inductor/test_cache.py::CacheTest::test_insert_concurrent_cache_type0_key_type1_value_type0, test/inductor/test_cache.py::CacheTest::test_insert_concurrent_cache_type0_key_type1_value_type1, test/inductor/test_cache.py::CacheTest::test_insert_concurrent_cache_type0_key_type1_value_type2, test/inductor/test_cache.py::CacheTest::test_insert_concurrent_cache_type0_key_type1_value_type3, test/inductor/test_cache.py::CacheTest::test_insert_concurrent_cache_type0_key_type1_value_type4, test/inductor/test_cache.py::CacheTest::test_insert_concurrent_cache_type0_key_type1_value_type5, test/inductor/test_cache.py::CacheTest::test_insert_concurrent_cache_type0_key_type2_value_type0, test/inductor/test_cache.py::CacheTest::test_insert_concurrent_cache_type0_key_type2_value_type1, test/inductor/test_cache.py::CacheTest::test_insert_concurrent_cache_type0_key_type2_value_type2, test/inductor/test_cache.py::CacheTest::test_insert_concurrent_cache_type0_key_type2_value_type3, test/inductor/test_cache.py::CacheTest::test_insert_concurrent_cache_type0_key_type2_value_type4, test/inductor/test_cache.py::CacheTest::test_insert_concurrent_cache_type0_key_type2_value_type5, test/inductor/test_cache.py::CacheTest::test_insert_concurrent_cache_type1_key_type0_value_type0, test/inductor/test_cache.py::CacheTest::test_insert_concurrent_cache_type1_key_type0_value_type1, test/inductor/test_cache.py::CacheTest::test_insert_concurrent_cache_type1_key_type0_value_type2, test/inductor/test_cache.py::CacheTest::test_insert_concurrent_cache_type1_key_type0_value_type3, test/inductor/test_cache.py::CacheTest::test_insert_concurrent_cache_type1_key_type0_value_type4, test/inductor/test_cache.py::CacheTest::test_insert_concurrent_cache_type1_key_type0_value_type5, test/inductor/test_cache.py::CacheTest::test_insert_concurrent_cache_type1_key_type1_value_type0, 
test/inductor/test_cache.py::CacheTest::test_insert_concurrent_cache_type1_key_type1_value_type1, test/inductor/test_cache.py::CacheTest::test_insert_concurrent_cache_type1_key_type1_value_type2, test/inductor/test_cache.py::CacheTest::test_insert_concurrent_cache_type1_key_type1_value_type3, test/inductor/test_cache.py::CacheTest::test_insert_concurrent_cache_type1_key_type1_value_type4, test/inductor/test_cache.py::CacheTest::test_insert_concurrent_cache_type1_key_type1_value_type5, test/inductor/test_cache.py::CacheTest::test_insert_concurrent_cache_type1_key_type2_value_type0, test/inductor/test_cache.py::CacheTest::test_insert_concurrent_cache_type1_key_type2_value_type1, test/inductor/test_cache.py::CacheTest::test_insert_concurrent_cache_type1_key_type2_value_type2, test/inductor/test_cache.py::CacheTest::test_insert_concurrent_cache_type1_key_type2_value_type3, test/inductor/test_cache.py::CacheTest::test_insert_concurrent_cache_type1_key_type2_value_type4, test/inductor/test_cache.py::CacheTest::test_insert_concurrent_cache_type1_key_type2_value_type5, test/inductor/test_cache.py::CacheTest::test_insert_concurrent_cache_type2_key_type0_value_type0, test/inductor/test_cache.py::CacheTest::test_insert_concurrent_cache_type2_key_type0_value_type1, test/inductor/test_cache.py::CacheTest::test_insert_concurrent_cache_type2_key_type0_value_type2, test/inductor/test_cache.py::CacheTest::test_insert_concurrent_cache_type2_key_type0_value_type3, test/inductor/test_cache.py::CacheTest::test_insert_concurrent_cache_type2_key_type0_value_type4, test/inductor/test_cache.py::CacheTest::test_insert_concurrent_cache_type2_key_type0_value_type5, test/inductor/test_cache.py::CacheTest::test_insert_concurrent_cache_type2_key_type1_value_type0, test/inductor/test_cache.py::CacheTest::test_insert_concurrent_cache_type2_key_type1_value_type1, test/inductor/test_cache.py::CacheTest::test_insert_concurrent_cache_type2_key_type1_value_type2, 
test/inductor/test_cache.py::CacheTest::test_insert_concurrent_cache_type2_key_type1_value_type3, test/inductor/test_cache.py::CacheTest::test_insert_concurrent_cache_type2_key_type1_value_type4, test/inductor/test_cache.py::CacheTest::test_insert_concurrent_cache_type2_key_type1_value_type5, test/inductor/test_cache.py::CacheTest::test_insert_concurrent_cache_type2_key_type2_value_type0, test/inductor/test_cache.py::CacheTest::test_insert_concurrent_cache_type2_key_type2_value_type1, test/inductor/test_cache.py::CacheTest::test_insert_concurrent_cache_type2_key_type2_value_type2, test/inductor/test_cache.py::CacheTest::test_insert_concurrent_cache_type2_key_type2_value_type3, test/inductor/test_cache.py::CacheTest::test_insert_concurrent_cache_type2_key_type2_value_type4, test/inductor/test_cache.py::CacheTest::test_insert_concurrent_cache_type2_key_type2_value_type5, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type0_key_type0_value_type0_get_first_False, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type0_key_type0_value_type0_get_first_True, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type0_key_type0_value_type1_get_first_False, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type0_key_type0_value_type1_get_first_True, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type0_key_type0_value_type2_get_first_False, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type0_key_type0_value_type2_get_first_True, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type0_key_type0_value_type3_get_first_False, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type0_key_type0_value_type3_get_first_True, 
test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type0_key_type0_value_type4_get_first_False, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type0_key_type0_value_type4_get_first_True, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type0_key_type0_value_type5_get_first_False, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type0_key_type0_value_type5_get_first_True, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type0_key_type1_value_type0_get_first_False, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type0_key_type1_value_type0_get_first_True, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type0_key_type1_value_type1_get_first_False, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type0_key_type1_value_type1_get_first_True, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type0_key_type1_value_type2_get_first_False, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type0_key_type1_value_type2_get_first_True, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type0_key_type1_value_type3_get_first_False, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type0_key_type1_value_type3_get_first_True, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type0_key_type1_value_type4_get_first_False, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type0_key_type1_value_type4_get_first_True, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type0_key_type1_value_type5_get_first_False, 
test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type0_key_type1_value_type5_get_first_True, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type0_key_type2_value_type0_get_first_False, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type0_key_type2_value_type0_get_first_True, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type0_key_type2_value_type1_get_first_False, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type0_key_type2_value_type1_get_first_True, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type0_key_type2_value_type2_get_first_False, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type0_key_type2_value_type2_get_first_True, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type0_key_type2_value_type3_get_first_False, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type0_key_type2_value_type3_get_first_True, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type0_key_type2_value_type4_get_first_False, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type0_key_type2_value_type4_get_first_True, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type0_key_type2_value_type5_get_first_False, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type0_key_type2_value_type5_get_first_True, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type1_key_type0_value_type0_get_first_False, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type1_key_type0_value_type0_get_first_True, 
test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type1_key_type0_value_type1_get_first_False, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type1_key_type0_value_type1_get_first_True, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type1_key_type0_value_type2_get_first_False, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type1_key_type0_value_type2_get_first_True, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type1_key_type0_value_type3_get_first_False, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type1_key_type0_value_type3_get_first_True, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type1_key_type0_value_type4_get_first_False, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type1_key_type0_value_type4_get_first_True, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type1_key_type0_value_type5_get_first_False, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type1_key_type0_value_type5_get_first_True, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type1_key_type1_value_type0_get_first_False, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type1_key_type1_value_type0_get_first_True, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type1_key_type1_value_type1_get_first_False, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type1_key_type1_value_type1_get_first_True, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type1_key_type1_value_type2_get_first_False, 
test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type1_key_type1_value_type2_get_first_True, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type1_key_type1_value_type3_get_first_False, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type1_key_type1_value_type3_get_first_True, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type1_key_type1_value_type4_get_first_False, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type1_key_type1_value_type4_get_first_True, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type1_key_type1_value_type5_get_first_False, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type1_key_type1_value_type5_get_first_True, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type1_key_type2_value_type0_get_first_False, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type1_key_type2_value_type0_get_first_True, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type1_key_type2_value_type1_get_first_False, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type1_key_type2_value_type1_get_first_True, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type1_key_type2_value_type2_get_first_False, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type1_key_type2_value_type2_get_first_True, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type1_key_type2_value_type3_get_first_False, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type1_key_type2_value_type3_get_first_True, 
test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type1_key_type2_value_type4_get_first_False, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type1_key_type2_value_type4_get_first_True, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type1_key_type2_value_type5_get_first_False, test/inductor/test_cache.py::AsyncCacheTest::test_combo_async_concurrent_async_cache_type1_key_type2_value_type5_get_first_True, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_async_cache_type0_key_type0_value_type0, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_async_cache_type0_key_type0_value_type1, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_async_cache_type0_key_type0_value_type2, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_async_cache_type0_key_type0_value_type3, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_async_cache_type0_key_type0_value_type4, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_async_cache_type0_key_type0_value_type5, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_async_cache_type0_key_type1_value_type0, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_async_cache_type0_key_type1_value_type1, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_async_cache_type0_key_type1_value_type2, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_async_cache_type0_key_type1_value_type3, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_async_cache_type0_key_type1_value_type4, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_async_cache_type0_key_type1_value_type5, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_async_cache_type0_key_type2_value_type0, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_async_cache_type0_key_type2_value_type1, 
test/inductor/test_cache.py::AsyncCacheTest::test_get_async_async_cache_type0_key_type2_value_type2, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_async_cache_type0_key_type2_value_type3, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_async_cache_type0_key_type2_value_type4, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_async_cache_type0_key_type2_value_type5, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_async_cache_type1_key_type0_value_type0, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_async_cache_type1_key_type0_value_type1, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_async_cache_type1_key_type0_value_type2, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_async_cache_type1_key_type0_value_type3, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_async_cache_type1_key_type0_value_type4, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_async_cache_type1_key_type0_value_type5, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_async_cache_type1_key_type1_value_type0, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_async_cache_type1_key_type1_value_type1, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_async_cache_type1_key_type1_value_type2, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_async_cache_type1_key_type1_value_type3, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_async_cache_type1_key_type1_value_type4, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_async_cache_type1_key_type1_value_type5, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_async_cache_type1_key_type2_value_type0, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_async_cache_type1_key_type2_value_type1, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_async_cache_type1_key_type2_value_type2, 
test/inductor/test_cache.py::AsyncCacheTest::test_get_async_async_cache_type1_key_type2_value_type3, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_async_cache_type1_key_type2_value_type4, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_async_cache_type1_key_type2_value_type5, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_concurrent_async_cache_type0_key_type0_value_type0, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_concurrent_async_cache_type0_key_type0_value_type1, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_concurrent_async_cache_type0_key_type0_value_type2, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_concurrent_async_cache_type0_key_type0_value_type3, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_concurrent_async_cache_type0_key_type0_value_type4, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_concurrent_async_cache_type0_key_type0_value_type5, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_concurrent_async_cache_type0_key_type1_value_type0, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_concurrent_async_cache_type0_key_type1_value_type1, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_concurrent_async_cache_type0_key_type1_value_type2, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_concurrent_async_cache_type0_key_type1_value_type3, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_concurrent_async_cache_type0_key_type1_value_type4, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_concurrent_async_cache_type0_key_type1_value_type5, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_concurrent_async_cache_type0_key_type2_value_type0, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_concurrent_async_cache_type0_key_type2_value_type1, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_concurrent_async_cache_type0_key_type2_value_type2, 
test/inductor/test_cache.py::AsyncCacheTest::test_get_async_concurrent_async_cache_type0_key_type2_value_type3, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_concurrent_async_cache_type0_key_type2_value_type4, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_concurrent_async_cache_type0_key_type2_value_type5, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_concurrent_async_cache_type1_key_type0_value_type0, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_concurrent_async_cache_type1_key_type0_value_type1, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_concurrent_async_cache_type1_key_type0_value_type2, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_concurrent_async_cache_type1_key_type0_value_type3, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_concurrent_async_cache_type1_key_type0_value_type4, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_concurrent_async_cache_type1_key_type0_value_type5, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_concurrent_async_cache_type1_key_type1_value_type0, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_concurrent_async_cache_type1_key_type1_value_type1, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_concurrent_async_cache_type1_key_type1_value_type2, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_concurrent_async_cache_type1_key_type1_value_type3, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_concurrent_async_cache_type1_key_type1_value_type4, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_concurrent_async_cache_type1_key_type1_value_type5, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_concurrent_async_cache_type1_key_type2_value_type0, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_concurrent_async_cache_type1_key_type2_value_type1, 
test/inductor/test_cache.py::AsyncCacheTest::test_get_async_concurrent_async_cache_type1_key_type2_value_type2, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_concurrent_async_cache_type1_key_type2_value_type3, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_concurrent_async_cache_type1_key_type2_value_type4, test/inductor/test_cache.py::AsyncCacheTest::test_get_async_concurrent_async_cache_type1_key_type2_value_type5, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_async_cache_type0_key_type0_value_type0, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_async_cache_type0_key_type0_value_type1, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_async_cache_type0_key_type0_value_type2, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_async_cache_type0_key_type0_value_type3, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_async_cache_type0_key_type0_value_type4, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_async_cache_type0_key_type0_value_type5, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_async_cache_type0_key_type1_value_type0, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_async_cache_type0_key_type1_value_type1, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_async_cache_type0_key_type1_value_type2, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_async_cache_type0_key_type1_value_type3, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_async_cache_type0_key_type1_value_type4, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_async_cache_type0_key_type1_value_type5, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_async_cache_type0_key_type2_value_type0, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_async_cache_type0_key_type2_value_type1, 
test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_async_cache_type0_key_type2_value_type2, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_async_cache_type0_key_type2_value_type3, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_async_cache_type0_key_type2_value_type4, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_async_cache_type0_key_type2_value_type5, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_async_cache_type1_key_type0_value_type0, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_async_cache_type1_key_type0_value_type1, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_async_cache_type1_key_type0_value_type2, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_async_cache_type1_key_type0_value_type3, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_async_cache_type1_key_type0_value_type4, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_async_cache_type1_key_type0_value_type5, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_async_cache_type1_key_type1_value_type0, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_async_cache_type1_key_type1_value_type1, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_async_cache_type1_key_type1_value_type2, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_async_cache_type1_key_type1_value_type3, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_async_cache_type1_key_type1_value_type4, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_async_cache_type1_key_type1_value_type5, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_async_cache_type1_key_type2_value_type0, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_async_cache_type1_key_type2_value_type1, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_async_cache_type1_key_type2_value_type2, 
test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_async_cache_type1_key_type2_value_type3, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_async_cache_type1_key_type2_value_type4, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_async_cache_type1_key_type2_value_type5, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_concurrent_async_cache_type0_key_type0_value_type0, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_concurrent_async_cache_type0_key_type0_value_type1, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_concurrent_async_cache_type0_key_type0_value_type2, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_concurrent_async_cache_type0_key_type0_value_type3, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_concurrent_async_cache_type0_key_type0_value_type4, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_concurrent_async_cache_type0_key_type0_value_type5, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_concurrent_async_cache_type0_key_type1_value_type0, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_concurrent_async_cache_type0_key_type1_value_type1, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_concurrent_async_cache_type0_key_type1_value_type2, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_concurrent_async_cache_type0_key_type1_value_type3, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_concurrent_async_cache_type0_key_type1_value_type4, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_concurrent_async_cache_type0_key_type1_value_type5, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_concurrent_async_cache_type0_key_type2_value_type0, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_concurrent_async_cache_type0_key_type2_value_type1, 
test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_concurrent_async_cache_type0_key_type2_value_type2, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_concurrent_async_cache_type0_key_type2_value_type3, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_concurrent_async_cache_type0_key_type2_value_type4, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_concurrent_async_cache_type0_key_type2_value_type5, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_concurrent_async_cache_type1_key_type0_value_type0, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_concurrent_async_cache_type1_key_type0_value_type1, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_concurrent_async_cache_type1_key_type0_value_type2, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_concurrent_async_cache_type1_key_type0_value_type3, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_concurrent_async_cache_type1_key_type0_value_type4, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_concurrent_async_cache_type1_key_type0_value_type5, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_concurrent_async_cache_type1_key_type1_value_type0, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_concurrent_async_cache_type1_key_type1_value_type1, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_concurrent_async_cache_type1_key_type1_value_type2, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_concurrent_async_cache_type1_key_type1_value_type3, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_concurrent_async_cache_type1_key_type1_value_type4, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_concurrent_async_cache_type1_key_type1_value_type5, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_concurrent_async_cache_type1_key_type2_value_type0, 
test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_concurrent_async_cache_type1_key_type2_value_type1, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_concurrent_async_cache_type1_key_type2_value_type2, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_concurrent_async_cache_type1_key_type2_value_type3, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_concurrent_async_cache_type1_key_type2_value_type4, test/inductor/test_cache.py::AsyncCacheTest::test_insert_async_concurrent_async_cache_type1_key_type2_value_type5, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_bad_encoding_key_type0_value_type0, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_bad_encoding_key_type0_value_type1, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_bad_encoding_key_type0_value_type2, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_bad_encoding_key_type0_value_type3, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_bad_encoding_key_type0_value_type4, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_bad_encoding_key_type0_value_type5, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_bad_encoding_key_type1_value_type0, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_bad_encoding_key_type1_value_type1, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_bad_encoding_key_type1_value_type2, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_bad_encoding_key_type1_value_type3, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_bad_encoding_key_type1_value_type4, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_bad_encoding_key_type1_value_type5, 
test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_bad_encoding_key_type2_value_type0, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_bad_encoding_key_type2_value_type1, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_bad_encoding_key_type2_value_type2, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_bad_encoding_key_type2_value_type3, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_bad_encoding_key_type2_value_type4, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_bad_encoding_key_type2_value_type5, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_duplicated_entries_key_type0_value_type0, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_duplicated_entries_key_type0_value_type1, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_duplicated_entries_key_type0_value_type2, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_duplicated_entries_key_type0_value_type3, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_duplicated_entries_key_type0_value_type4, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_duplicated_entries_key_type0_value_type5, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_duplicated_entries_key_type1_value_type0, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_duplicated_entries_key_type1_value_type1, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_duplicated_entries_key_type1_value_type2, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_duplicated_entries_key_type1_value_type3, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_duplicated_entries_key_type1_value_type4, 
test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_duplicated_entries_key_type1_value_type5, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_duplicated_entries_key_type2_value_type0, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_duplicated_entries_key_type2_value_type1, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_duplicated_entries_key_type2_value_type2, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_duplicated_entries_key_type2_value_type3, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_duplicated_entries_key_type2_value_type4, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_duplicated_entries_key_type2_value_type5, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type0_value_type0_with_whitespace_False_with_semicolon_suffix_False, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type0_value_type0_with_whitespace_False_with_semicolon_suffix_True, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type0_value_type0_with_whitespace_True_with_semicolon_suffix_False, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type0_value_type0_with_whitespace_True_with_semicolon_suffix_True, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type0_value_type1_with_whitespace_False_with_semicolon_suffix_False, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type0_value_type1_with_whitespace_False_with_semicolon_suffix_True, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type0_value_type1_with_whitespace_True_with_semicolon_suffix_False, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type0_value_type1_with_whitespace_True_with_semicolon_suffix_True, 
test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type0_value_type2_with_whitespace_False_with_semicolon_suffix_False, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type0_value_type2_with_whitespace_False_with_semicolon_suffix_True, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type0_value_type2_with_whitespace_True_with_semicolon_suffix_False, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type0_value_type2_with_whitespace_True_with_semicolon_suffix_True, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type0_value_type3_with_whitespace_False_with_semicolon_suffix_False, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type0_value_type3_with_whitespace_False_with_semicolon_suffix_True, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type0_value_type3_with_whitespace_True_with_semicolon_suffix_False, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type0_value_type3_with_whitespace_True_with_semicolon_suffix_True, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type0_value_type4_with_whitespace_False_with_semicolon_suffix_False, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type0_value_type4_with_whitespace_False_with_semicolon_suffix_True, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type0_value_type4_with_whitespace_True_with_semicolon_suffix_False, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type0_value_type4_with_whitespace_True_with_semicolon_suffix_True, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type0_value_type5_with_whitespace_False_with_semicolon_suffix_False, 
test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type0_value_type5_with_whitespace_False_with_semicolon_suffix_True, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type0_value_type5_with_whitespace_True_with_semicolon_suffix_False, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type0_value_type5_with_whitespace_True_with_semicolon_suffix_True, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type1_value_type0_with_whitespace_False_with_semicolon_suffix_False, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type1_value_type0_with_whitespace_False_with_semicolon_suffix_True, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type1_value_type0_with_whitespace_True_with_semicolon_suffix_False, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type1_value_type0_with_whitespace_True_with_semicolon_suffix_True, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type1_value_type1_with_whitespace_False_with_semicolon_suffix_False, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type1_value_type1_with_whitespace_False_with_semicolon_suffix_True, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type1_value_type1_with_whitespace_True_with_semicolon_suffix_False, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type1_value_type1_with_whitespace_True_with_semicolon_suffix_True, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type1_value_type2_with_whitespace_False_with_semicolon_suffix_False, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type1_value_type2_with_whitespace_False_with_semicolon_suffix_True, 
test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type1_value_type2_with_whitespace_True_with_semicolon_suffix_False, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type1_value_type2_with_whitespace_True_with_semicolon_suffix_True, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type1_value_type3_with_whitespace_False_with_semicolon_suffix_False, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type1_value_type3_with_whitespace_False_with_semicolon_suffix_True, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type1_value_type3_with_whitespace_True_with_semicolon_suffix_False, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type1_value_type3_with_whitespace_True_with_semicolon_suffix_True, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type1_value_type4_with_whitespace_False_with_semicolon_suffix_False, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type1_value_type4_with_whitespace_False_with_semicolon_suffix_True, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type1_value_type4_with_whitespace_True_with_semicolon_suffix_False, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type1_value_type4_with_whitespace_True_with_semicolon_suffix_True, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type1_value_type5_with_whitespace_False_with_semicolon_suffix_False, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type1_value_type5_with_whitespace_False_with_semicolon_suffix_True, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type1_value_type5_with_whitespace_True_with_semicolon_suffix_False, 
test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type1_value_type5_with_whitespace_True_with_semicolon_suffix_True, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type2_value_type0_with_whitespace_False_with_semicolon_suffix_False, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type2_value_type0_with_whitespace_False_with_semicolon_suffix_True, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type2_value_type0_with_whitespace_True_with_semicolon_suffix_False, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type2_value_type0_with_whitespace_True_with_semicolon_suffix_True, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type2_value_type1_with_whitespace_False_with_semicolon_suffix_False, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type2_value_type1_with_whitespace_False_with_semicolon_suffix_True, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type2_value_type1_with_whitespace_True_with_semicolon_suffix_False, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type2_value_type1_with_whitespace_True_with_semicolon_suffix_True, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type2_value_type2_with_whitespace_False_with_semicolon_suffix_False, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type2_value_type2_with_whitespace_False_with_semicolon_suffix_True, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type2_value_type2_with_whitespace_True_with_semicolon_suffix_False, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type2_value_type2_with_whitespace_True_with_semicolon_suffix_True, 
test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type2_value_type3_with_whitespace_False_with_semicolon_suffix_False, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type2_value_type3_with_whitespace_False_with_semicolon_suffix_True, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type2_value_type3_with_whitespace_True_with_semicolon_suffix_False, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type2_value_type3_with_whitespace_True_with_semicolon_suffix_True, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type2_value_type4_with_whitespace_False_with_semicolon_suffix_False, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type2_value_type4_with_whitespace_False_with_semicolon_suffix_True, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type2_value_type4_with_whitespace_True_with_semicolon_suffix_False, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type2_value_type4_with_whitespace_True_with_semicolon_suffix_True, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type2_value_type5_with_whitespace_False_with_semicolon_suffix_False, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type2_value_type5_with_whitespace_False_with_semicolon_suffix_True, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type2_value_type5_with_whitespace_True_with_semicolon_suffix_False, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_key_type2_value_type5_with_whitespace_True_with_semicolon_suffix_True, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_missing_comma_separator_key_type0_value_type0, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_missing_comma_separator_key_type0_value_type1, 
test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_missing_comma_separator_key_type0_value_type2, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_missing_comma_separator_key_type0_value_type3, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_missing_comma_separator_key_type0_value_type4, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_missing_comma_separator_key_type0_value_type5, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_missing_comma_separator_key_type1_value_type0, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_missing_comma_separator_key_type1_value_type1, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_missing_comma_separator_key_type1_value_type2, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_missing_comma_separator_key_type1_value_type3, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_missing_comma_separator_key_type1_value_type4, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_missing_comma_separator_key_type1_value_type5, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_missing_comma_separator_key_type2_value_type0, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_missing_comma_separator_key_type2_value_type1, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_missing_comma_separator_key_type2_value_type2, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_missing_comma_separator_key_type2_value_type3, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_missing_comma_separator_key_type2_value_type4, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_missing_comma_separator_key_type2_value_type5, 
test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_not_un_pickle_able_key_type0_value_type0, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_not_un_pickle_able_key_type0_value_type1, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_not_un_pickle_able_key_type0_value_type2, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_not_un_pickle_able_key_type0_value_type3, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_not_un_pickle_able_key_type0_value_type4, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_not_un_pickle_able_key_type0_value_type5, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_not_un_pickle_able_key_type1_value_type0, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_not_un_pickle_able_key_type1_value_type1, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_not_un_pickle_able_key_type1_value_type2, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_not_un_pickle_able_key_type1_value_type3, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_not_un_pickle_able_key_type1_value_type4, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_not_un_pickle_able_key_type1_value_type5, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_not_un_pickle_able_key_type2_value_type0, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_not_un_pickle_able_key_type2_value_type1, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_not_un_pickle_able_key_type2_value_type2, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_not_un_pickle_able_key_type2_value_type3, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_not_un_pickle_able_key_type2_value_type4, 
test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_env_var_not_un_pickle_able_key_type2_value_type5, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_file_path_key_type0_value_type0, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_file_path_key_type0_value_type1, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_file_path_key_type0_value_type2, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_file_path_key_type0_value_type3, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_file_path_key_type0_value_type4, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_file_path_key_type0_value_type5, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_file_path_key_type1_value_type0, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_file_path_key_type1_value_type1, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_file_path_key_type1_value_type2, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_file_path_key_type1_value_type3, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_file_path_key_type1_value_type4, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_file_path_key_type1_value_type5, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_file_path_key_type2_value_type0, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_file_path_key_type2_value_type1, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_file_path_key_type2_value_type2, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_file_path_key_type2_value_type3, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_file_path_key_type2_value_type4, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_file_path_key_type2_value_type5, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_file_path_not_dict, 
test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_file_path_not_un_pickle_able_key_type0_value_type0, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_file_path_not_un_pickle_able_key_type0_value_type1, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_file_path_not_un_pickle_able_key_type0_value_type2, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_file_path_not_un_pickle_able_key_type0_value_type3, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_file_path_not_un_pickle_able_key_type0_value_type4, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_file_path_not_un_pickle_able_key_type0_value_type5, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_file_path_not_un_pickle_able_key_type1_value_type0, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_file_path_not_un_pickle_able_key_type1_value_type1, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_file_path_not_un_pickle_able_key_type1_value_type2, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_file_path_not_un_pickle_able_key_type1_value_type3, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_file_path_not_un_pickle_able_key_type1_value_type4, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_file_path_not_un_pickle_able_key_type1_value_type5, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_file_path_not_un_pickle_able_key_type2_value_type0, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_file_path_not_un_pickle_able_key_type2_value_type1, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_file_path_not_un_pickle_able_key_type2_value_type2, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_file_path_not_un_pickle_able_key_type2_value_type3, 
test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_file_path_not_un_pickle_able_key_type2_value_type4, test/inductor/test_cache.py::OtherTest::test_in_memory_cache_from_file_path_not_un_pickle_able_key_type2_value_type5, test/inductor/test_cache.py::OtherTest::test_on_disk_cache_fpath_from_key_un_pickle_able_on_disk_cache_type0, test/inductor/test_cache.py::OtherTest::test_on_disk_cache_fpath_from_key_un_pickle_able_on_disk_cache_type1, test/inductor/test_cache.py::OtherTest::test_on_disk_cache_version_bump_on_disk_cache_type0, test/inductor/test_cache.py::OtherTest::test_on_disk_cache_version_bump_on_disk_cache_type1 2025-10-10T01:58:44.1530005Z 2025-10-10T01:58:46.2384315Z 2025-10-10T01:58:46.2385142Z dynamo/test_after_aot 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_after_aot_1.1_4aea24d445cc5326_.log 2025-10-10T01:58:46.2386452Z Running 2 items in this shard: test/dynamo/test_after_aot.py::TestAfterAot::test_dump_tensor, test/dynamo/test_after_aot.py::TestAfterAot::test_save_graph_repro 2025-10-10T01:58:46.2387149Z 2025-10-10T01:58:48.0162609Z Running inductor/test_compile 1/1 ... [2025-10-10 01:58:48.015734] 2025-10-10T01:58:48.0163060Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:58:48.0165776Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_compile.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:58:48.016164] 2025-10-10T01:58:50.0984569Z Running export/test_export_opinfo 1/1 ... 
[2025-10-10 01:58:50.097864] 2025-10-10T01:58:50.0985027Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:58:50.0987659Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_export_opinfo.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:58:50.098310] 2025-10-10T01:58:55.0238024Z 2025-10-10T01:58:55.0238838Z export/test_export_opinfo 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_export_opinfo_1.1_6357e7fe4ed14a50_.log 2025-10-10T01:58:55.0243262Z Running 9 items in this shard: test/export/test_export_opinfo.py::TestExportOnFakeCudaCUDA::test_fake_export___getitem___cuda_float32, test/export/test_export_opinfo.py::TestExportOnFakeCudaCUDA::test_fake_export_nn_functional_batch_norm_cuda_float32, test/export/test_export_opinfo.py::TestExportOnFakeCudaCUDA::test_fake_export_nn_functional_batch_norm_without_cudnn_cuda_float32, test/export/test_export_opinfo.py::TestExportOnFakeCudaCUDA::test_fake_export_nn_functional_conv2d_cuda_float32, test/export/test_export_opinfo.py::TestExportOnFakeCudaCUDA::test_fake_export_nn_functional_instance_norm_cuda_float32, test/export/test_export_opinfo.py::TestExportOnFakeCudaCUDA::test_fake_export_nn_functional_multi_margin_loss_cuda_float32, test/export/test_export_opinfo.py::TestExportOnFakeCudaCUDA::test_fake_export_nn_functional_scaled_dot_product_attention_cuda_float32, test/export/test_export_opinfo.py::TestExportOnFakeCudaCUDA::test_fake_export_nonzero_cuda_float32, test/export/test_export_opinfo.py::TestExportOnFakeCudaCUDA::test_preserve_original_behavior_cuda 2025-10-10T01:58:55.0247469Z 2025-10-10T01:58:55.1461029Z 2025-10-10T01:58:55.1461884Z inductor/test_compile 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_compile_1.1_fa8e60346a4a10c2_.log 
2025-10-10T01:58:55.1466205Z Running 9 items in this shard: test/inductor/test_compile.py::TestStandaloneInductor::test_inductor_generate_debug_symbol, test/inductor/test_compile.py::TestStandaloneInductor::test_inductor_via_bare_module, test/inductor/test_compile.py::TestStandaloneInductor::test_inductor_via_export1, test/inductor/test_compile.py::TestStandaloneInductor::test_inductor_via_export2, test/inductor/test_compile.py::TestStandaloneInductor::test_inductor_via_fx, test/inductor/test_compile.py::TestStandaloneInductor::test_inductor_via_fx_dict_input, test/inductor/test_compile.py::TestStandaloneInductor::test_inductor_via_fx_tensor_return, test/inductor/test_compile.py::TestStandaloneInductor::test_inductor_via_make_fx, test/inductor/test_compile.py::TestStandaloneInductor::test_inductor_via_op_with_multiple_outputs 2025-10-10T01:58:55.1469484Z 2025-10-10T01:58:58.8615771Z Running inductor/test_custom_lowering 1/1 ... [2025-10-10 01:58:58.860991] 2025-10-10T01:58:58.8616433Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:58:58.8617623Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_custom_lowering.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:58:58.861369] 2025-10-10T01:58:59.1170841Z Running dynamo/test_graph_region_tracker 1/1 ... [2025-10-10 01:58:59.116523] 2025-10-10T01:58:59.1171376Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:58:59.1172953Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_graph_region_tracker.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:58:59.116940] 2025-10-10T01:59:03.2402534Z 2025-10-10T01:59:03.2404093Z dynamo/test_graph_region_tracker 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_graph_region_tracker_1.1_7d8fde614f127a2e_.log 2025-10-10T01:59:03.2412738Z Running 13 items in this shard: test/dynamo/test_graph_region_tracker.py::GraphRegionTrackerTests::test_get_regions_multiple_region_groups, test/dynamo/test_graph_region_tracker.py::GraphRegionTrackerTests::test_get_regions_single_region_group, test/dynamo/test_graph_region_tracker.py::GraphRegionTrackerTests::test_mismatched_arg_shapes, test/dynamo/test_graph_region_tracker.py::GraphRegionTrackerTests::test_mismatched_dtypes, test/dynamo/test_graph_region_tracker.py::GraphRegionTrackerTests::test_mismatched_global_state, test/dynamo/test_graph_region_tracker.py::GraphRegionTrackerTests::test_mutation_tracking_allow_in_graph, test/dynamo/test_graph_region_tracker.py::GraphRegionTrackerTests::test_mutation_tracking_setitem, test/dynamo/test_graph_region_tracker.py::GraphRegionTrackerTests::test_mutation_tracking_simple, test/dynamo/test_graph_region_tracker.py::GraphRegionTrackerTests::test_nested_args, test/dynamo/test_graph_region_tracker.py::GraphRegionTrackerTests::test_no_duplicate_tracking, test/dynamo/test_graph_region_tracker.py::GraphRegionTrackerTests::test_no_single_node_regions, test/dynamo/test_graph_region_tracker.py::GraphRegionTrackerTests::test_non_tensor_arg_hashing, test/dynamo/test_graph_region_tracker.py::GraphRegionTrackerTests::test_region_sorting 2025-10-10T01:59:03.2420706Z 2025-10-10T01:59:06.1925383Z 2025-10-10T01:59:06.1926295Z inductor/test_custom_lowering 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_custom_lowering_1.1_5a6704fc4dde7e87_.log 2025-10-10T01:59:06.1928729Z Running 5 items in this shard: test/inductor/test_custom_lowering.py::TestCustomLowering::test_jagged_to_padded_dense_sanity_cuda, 
test/inductor/test_custom_lowering.py::TestCustomLowering::test_jagged_to_padded_dense_zero_size, test/inductor/test_custom_lowering.py::TestCustomLowering::test_multi_inp_asm, test/inductor/test_custom_lowering.py::TestCustomLowering::test_register_lowering_custom_dict, test/inductor/test_custom_lowering.py::TestCustomLowering::test_tanh_approx 2025-10-10T01:59:06.1930530Z 2025-10-10T01:59:07.1371476Z Running dynamo/test_dicts 1/1 ... [2025-10-10 01:59:07.136518] 2025-10-10T01:59:07.1371916Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:59:07.1372974Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_dicts.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:59:07.136954] 2025-10-10T01:59:10.0649321Z Running inductor/test_fuzzer 1/1 ... [2025-10-10 01:59:10.064409] 2025-10-10T01:59:10.0649772Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:59:10.0652045Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_fuzzer.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:59:10.064841] 2025-10-10T01:59:11.5110152Z 2025-10-10T01:59:11.5111298Z dynamo/test_dicts 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_dicts_1.1_4854fe43cf57b271_.log 2025-10-10T01:59:11.5141043Z Running 126 items in this shard: test/dynamo/test_dicts.py::DictTests::test_builtin_ior_, test/dynamo/test_dicts.py::DictTests::test_builtin_or_with_diff_keys, test/dynamo/test_dicts.py::DictTests::test_builtin_or_with_invalid_types, test/dynamo/test_dicts.py::DictTests::test_builtin_or_with_same_keys, test/dynamo/test_dicts.py::DictTests::test_construct_user_dict_and_return, test/dynamo/test_dicts.py::DictTests::test_contains_dunder_dict, test/dynamo/test_dicts.py::DictTests::test_contains_module_dunder_dict, test/dynamo/test_dicts.py::DictTests::test_custom_iter_dict, test/dynamo/test_dicts.py::DictTests::test_custom_keys_iter_dict, test/dynamo/test_dicts.py::DictTests::test_dict_construction_from_mapping_proxy, test/dynamo/test_dicts.py::DictTests::test_dict_contains, test/dynamo/test_dicts.py::DictTests::test_dict_copy_alias, test/dynamo/test_dicts.py::DictTests::test_dict_guard_on_keys_order, test/dynamo/test_dicts.py::DictTests::test_dict_guard_on_keys_order2, test/dynamo/test_dicts.py::DictTests::test_dict_iter, test/dynamo/test_dicts.py::DictTests::test_dict_keys_binop_op_and_, test/dynamo/test_dicts.py::DictTests::test_dict_keys_binop_op_or_, test/dynamo/test_dicts.py::DictTests::test_dict_keys_binop_op_sub, test/dynamo/test_dicts.py::DictTests::test_dict_keys_binop_op_xor, test/dynamo/test_dicts.py::DictTests::test_dict_keys_inplace_binop_op_iand, test/dynamo/test_dicts.py::DictTests::test_dict_keys_inplace_binop_op_ior, test/dynamo/test_dicts.py::DictTests::test_dict_keys_inplace_binop_op_isub, test/dynamo/test_dicts.py::DictTests::test_dict_keys_inplace_binop_op_ixor, test/dynamo/test_dicts.py::DictTests::test_dict_list_values, test/dynamo/test_dicts.py::DictTests::test_dict_mutation_side_effect, 
test/dynamo/test_dicts.py::DictTests::test_dict_namedtuple, test/dynamo/test_dicts.py::DictTests::test_dict_order_keys, test/dynamo/test_dicts.py::DictTests::test_dict_order_keys_modules, test/dynamo/test_dicts.py::DictTests::test_dict_order_keys_tensors, test/dynamo/test_dicts.py::DictTests::test_dict_reconstruct_keeps_original_order, test/dynamo/test_dicts.py::DictTests::test_dict_subclass_contains, test/dynamo/test_dicts.py::DictTests::test_dict_subclass_get_method, test/dynamo/test_dicts.py::DictTests::test_dict_subclass_initialization_in_graph, test/dynamo/test_dicts.py::DictTests::test_dict_subclass_instantiation, test/dynamo/test_dicts.py::DictTests::test_dict_subclass_instantiation_return, test/dynamo/test_dicts.py::DictTests::test_dict_subclass_local_mutation, test/dynamo/test_dicts.py::DictTests::test_dict_subclass_local_with_non_dict_method, test/dynamo/test_dicts.py::DictTests::test_dict_subclass_methods_fallback_mutation, test/dynamo/test_dicts.py::DictTests::test_dict_subclass_methods_fallback_readonly, test/dynamo/test_dicts.py::DictTests::test_dict_subclass_setitem, test/dynamo/test_dicts.py::DictTests::test_dict_tag_guard, test/dynamo/test_dicts.py::DictTests::test_empty_dict_recompilation, test/dynamo/test_dicts.py::DictTests::test_fn_id, test/dynamo/test_dicts.py::DictTests::test_items_type, test/dynamo/test_dicts.py::DictTests::test_lazy_key_guarding, test/dynamo/test_dicts.py::DictTests::test_lazy_key_non_const_guarding, test/dynamo/test_dicts.py::DictTests::test_mapping_proxy_ban_muation_on_dict_realization, test/dynamo/test_dicts.py::DictTests::test_mapping_proxy_existing, test/dynamo/test_dicts.py::DictTests::test_mapping_proxy_existing_local_mutation, test/dynamo/test_dicts.py::DictTests::test_mapping_proxy_existing_mutation, test/dynamo/test_dicts.py::DictTests::test_mapping_proxy_for_local, test/dynamo/test_dicts.py::DictTests::test_mapping_proxy_for_nonlocal, test/dynamo/test_dicts.py::DictTests::test_move_to_end, 
test/dynamo/test_dicts.py::DictTests::test_newly_constructed_default_dict, test/dynamo/test_dicts.py::DictTests::test_ordered_dict_reordered_keys, test/dynamo/test_dicts.py::DictTests::test_ordered_dict_subclass_reordered_keys, test/dynamo/test_dicts.py::DictTests::test_overridden_get_item, test/dynamo/test_dicts.py::DictTests::test_udf_dict_reconstruction, test/dynamo/test_dicts.py::DictTests::test_update_dunder_dict, test/dynamo/test_dicts.py::DictTests::test_update_module_dunder_dict, test/dynamo/test_dicts.py::DictTests::test_weakref_dict, test/dynamo/test_dicts.py::DictGuardTests::test_cmp_eq, test/dynamo/test_dicts.py::DictGuardTests::test_cmp_ior, test/dynamo/test_dicts.py::DictGuardTests::test_cmp_ne, test/dynamo/test_dicts.py::DictGuardTests::test_cmp_or, test/dynamo/test_dicts.py::DictGuardTests::test_popitem, test/dynamo/test_dicts.py::DictMethodsTests::test_binop_ior, test/dynamo/test_dicts.py::DictMethodsTests::test_binop_ior_iterable, test/dynamo/test_dicts.py::DictMethodsTests::test_binop_or, test/dynamo/test_dicts.py::DictMethodsTests::test_clear, test/dynamo/test_dicts.py::DictMethodsTests::test_cmp_eq, test/dynamo/test_dicts.py::DictMethodsTests::test_cmp_ne, test/dynamo/test_dicts.py::DictMethodsTests::test_copy, test/dynamo/test_dicts.py::DictMethodsTests::test_dict_type_comparison, test/dynamo/test_dicts.py::DictMethodsTests::test_fromkeys, test/dynamo/test_dicts.py::DictMethodsTests::test_get, test/dynamo/test_dicts.py::DictMethodsTests::test_items, test/dynamo/test_dicts.py::DictMethodsTests::test_keys, test/dynamo/test_dicts.py::DictMethodsTests::test_pop, test/dynamo/test_dicts.py::DictMethodsTests::test_popitem, test/dynamo/test_dicts.py::DictMethodsTests::test_setdefault, test/dynamo/test_dicts.py::DictMethodsTests::test_type, test/dynamo/test_dicts.py::DictMethodsTests::test_update, test/dynamo/test_dicts.py::DictMethodsTests::test_values, test/dynamo/test_dicts.py::DictSubclassMethodsTests::test_binop_ior, 
test/dynamo/test_dicts.py::DictSubclassMethodsTests::test_binop_ior_iterable, test/dynamo/test_dicts.py::DictSubclassMethodsTests::test_binop_or, test/dynamo/test_dicts.py::DictSubclassMethodsTests::test_clear, test/dynamo/test_dicts.py::DictSubclassMethodsTests::test_cmp_eq, test/dynamo/test_dicts.py::DictSubclassMethodsTests::test_cmp_ne, test/dynamo/test_dicts.py::DictSubclassMethodsTests::test_copy, test/dynamo/test_dicts.py::DictSubclassMethodsTests::test_dict_type_comparison, test/dynamo/test_dicts.py::DictSubclassMethodsTests::test_fromkeys, test/dynamo/test_dicts.py::DictSubclassMethodsTests::test_get, test/dynamo/test_dicts.py::DictSubclassMethodsTests::test_items, test/dynamo/test_dicts.py::DictSubclassMethodsTests::test_keys, test/dynamo/test_dicts.py::DictSubclassMethodsTests::test_pop, test/dynamo/test_dicts.py::DictSubclassMethodsTests::test_popitem, test/dynamo/test_dicts.py::DictSubclassMethodsTests::test_setdefault, test/dynamo/test_dicts.py::DictSubclassMethodsTests::test_type, test/dynamo/test_dicts.py::DictSubclassMethodsTests::test_update, test/dynamo/test_dicts.py::DictSubclassMethodsTests::test_values, test/dynamo/test_dicts.py::OrderedDictMethodsTests::test_binop_ior, test/dynamo/test_dicts.py::OrderedDictMethodsTests::test_binop_ior_iterable, test/dynamo/test_dicts.py::OrderedDictMethodsTests::test_binop_ior_return_type, test/dynamo/test_dicts.py::OrderedDictMethodsTests::test_binop_or, test/dynamo/test_dicts.py::OrderedDictMethodsTests::test_binop_or_return_type, test/dynamo/test_dicts.py::OrderedDictMethodsTests::test_clear, test/dynamo/test_dicts.py::OrderedDictMethodsTests::test_cmp_eq, test/dynamo/test_dicts.py::OrderedDictMethodsTests::test_cmp_eq_order, test/dynamo/test_dicts.py::OrderedDictMethodsTests::test_cmp_ne, test/dynamo/test_dicts.py::OrderedDictMethodsTests::test_copy, test/dynamo/test_dicts.py::OrderedDictMethodsTests::test_dict_type_comparison, test/dynamo/test_dicts.py::OrderedDictMethodsTests::test_fromkeys, 
test/dynamo/test_dicts.py::OrderedDictMethodsTests::test_get, test/dynamo/test_dicts.py::OrderedDictMethodsTests::test_items, test/dynamo/test_dicts.py::OrderedDictMethodsTests::test_keys, test/dynamo/test_dicts.py::OrderedDictMethodsTests::test_move_to_end, test/dynamo/test_dicts.py::OrderedDictMethodsTests::test_pop, test/dynamo/test_dicts.py::OrderedDictMethodsTests::test_popitem, test/dynamo/test_dicts.py::OrderedDictMethodsTests::test_popitem_kwarg, test/dynamo/test_dicts.py::OrderedDictMethodsTests::test_setdefault, test/dynamo/test_dicts.py::OrderedDictMethodsTests::test_type, test/dynamo/test_dicts.py::OrderedDictMethodsTests::test_update, test/dynamo/test_dicts.py::OrderedDictMethodsTests::test_values, test/dynamo/test_dicts.py::OrderedDictSubclassOverload::test_move_to_end 2025-10-10T01:59:11.5169960Z 2025-10-10T01:59:15.4561473Z Running dynamo/test_modules 1/1 ... [2025-10-10 01:59:15.455537] 2025-10-10T01:59:15.4562055Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:59:15.4563714Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_modules.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:59:15.455953] 2025-10-10T01:59:17.1936559Z 2025-10-10T01:59:17.1937476Z inductor/test_fuzzer 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_fuzzer_1.1_2aab71feeee138a2_.log 2025-10-10T01:59:17.1941607Z Running 11 items in this shard: test/inductor/test_fuzzer.py::TestConfigFuzzer::test_config_fuzzer_bisector_boolean, test/inductor/test_fuzzer.py::TestConfigFuzzer::test_config_fuzzer_bisector_exception, test/inductor/test_fuzzer.py::TestConfigFuzzer::test_config_fuzzer_dynamo_bisect, test/inductor/test_fuzzer.py::TestConfigFuzzer::test_config_fuzzer_inductor_bisect, test/inductor/test_fuzzer.py::TestConfigFuzzer::test_config_fuzzer_inductor_cpu, test/inductor/test_fuzzer.py::TestConfigFuzzer::test_config_fuzzer_inductor_gpu, test/inductor/test_fuzzer.py::TestConfigFuzzer::test_config_fuzzer_n_tuple, test/inductor/test_fuzzer.py::TestConfigFuzzer::test_fuzzer_inductor_calling_compile, test/inductor/test_fuzzer.py::TestConfigFuzzer::test_fuzzer_running_test, test/inductor/test_fuzzer.py::TestConfigFuzzer::test_sampling_method_random, test/inductor/test_fuzzer.py::TestConfigFuzzer::test_sampling_method_toggle 2025-10-10T01:59:17.1944918Z 2025-10-10T01:59:21.0786553Z Running dynamo/test_metrics_context 1/1 ... [2025-10-10 01:59:21.078105] 2025-10-10T01:59:21.0787030Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:59:21.0788350Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_metrics_context.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:59:21.078490] 2025-10-10T01:59:23.0848892Z 2025-10-10T01:59:23.0849842Z dynamo/test_modules 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_modules_1.1_de1820bf1c4fdb49_.log 2025-10-10T01:59:23.0884506Z Running 134 items in this shard: test/dynamo/test_modules.py::NNModuleTests::test_access_by_keys, test/dynamo/test_modules.py::NNModuleTests::test_basicmodule1, test/dynamo/test_modules.py::NNModuleTests::test_basicmodule2, test/dynamo/test_modules.py::NNModuleTests::test_call_fn_with_non_const_inputs_safe, test/dynamo/test_modules.py::NNModuleTests::test_cfgmod, test/dynamo/test_modules.py::NNModuleTests::test_children, test/dynamo/test_modules.py::NNModuleTests::test_constloop, test/dynamo/test_modules.py::NNModuleTests::test_conv_call_forward_directly, test/dynamo/test_modules.py::NNModuleTests::test_conv_call_super_forward_directly, test/dynamo/test_modules.py::NNModuleTests::test_conv_transpose_call_forward_directly, test/dynamo/test_modules.py::NNModuleTests::test_conv_transpose_call_super_forward_directly, test/dynamo/test_modules.py::NNModuleTests::test_densenet, test/dynamo/test_modules.py::NNModuleTests::test_enumvalues, test/dynamo/test_modules.py::NNModuleTests::test_fnmember, test/dynamo/test_modules.py::NNModuleTests::test_fnmembercmp1, test/dynamo/test_modules.py::NNModuleTests::test_fnmembercmp2, test/dynamo/test_modules.py::NNModuleTests::test_forward_directly, test/dynamo/test_modules.py::NNModuleTests::test_generation_tag, test/dynamo/test_modules.py::NNModuleTests::test_hasattr, test/dynamo/test_modules.py::NNModuleTests::test_inject_module_parameters, test/dynamo/test_modules.py::NNModuleTests::test_intarg, test/dynamo/test_modules.py::NNModuleTests::test_iseval1, test/dynamo/test_modules.py::NNModuleTests::test_iseval2, test/dynamo/test_modules.py::NNModuleTests::test_isnonelayer, test/dynamo/test_modules.py::NNModuleTests::test_istraining1, 
test/dynamo/test_modules.py::NNModuleTests::test_istraining2, test/dynamo/test_modules.py::NNModuleTests::test_layerlist, test/dynamo/test_modules.py::NNModuleTests::test_lazy_module1, test/dynamo/test_modules.py::NNModuleTests::test_lazy_module2, test/dynamo/test_modules.py::NNModuleTests::test_lazy_module4, test/dynamo/test_modules.py::NNModuleTests::test_lazy_module5, test/dynamo/test_modules.py::NNModuleTests::test_lazy_module6, test/dynamo/test_modules.py::NNModuleTests::test_lazy_module7, test/dynamo/test_modules.py::NNModuleTests::test_lazy_module_bad_params, test/dynamo/test_modules.py::NNModuleTests::test_lazy_module_bad_params_call_function, test/dynamo/test_modules.py::NNModuleTests::test_lazy_module_kwargs, test/dynamo/test_modules.py::NNModuleTests::test_lazy_module_no_cls_to_become, test/dynamo/test_modules.py::NNModuleTests::test_lazy_module_speculation_log_divergence, test/dynamo/test_modules.py::NNModuleTests::test_module_attribute_precedence, test/dynamo/test_modules.py::NNModuleTests::test_module_call_module_with_static_forward, test/dynamo/test_modules.py::NNModuleTests::test_module_class_method, test/dynamo/test_modules.py::NNModuleTests::test_module_comparison, test/dynamo/test_modules.py::NNModuleTests::test_module_forward_has_graph_break, test/dynamo/test_modules.py::NNModuleTests::test_module_guard_name_is_valid, test/dynamo/test_modules.py::NNModuleTests::test_module_name_string, test/dynamo/test_modules.py::NNModuleTests::test_module_property, test/dynamo/test_modules.py::NNModuleTests::test_module_static_method, test/dynamo/test_modules.py::NNModuleTests::test_moduledict, test/dynamo/test_modules.py::NNModuleTests::test_moduledict_custom, test/dynamo/test_modules.py::NNModuleTests::test_modulelist, test/dynamo/test_modules.py::NNModuleTests::test_modulelist_custom, test/dynamo/test_modules.py::NNModuleTests::test_modulelist_nested, test/dynamo/test_modules.py::NNModuleTests::test_modulemethod1, 
test/dynamo/test_modules.py::NNModuleTests::test_modulemethod2, test/dynamo/test_modules.py::NNModuleTests::test_named_children, test/dynamo/test_modules.py::NNModuleTests::test_nn_module_setattr, test/dynamo/test_modules.py::NNModuleTests::test_nn_module_unspec_int_attr, test/dynamo/test_modules.py::NNModuleTests::test_nn_moduledict_contains, test/dynamo/test_modules.py::NNModuleTests::test_parameterdict, test/dynamo/test_modules.py::NNModuleTests::test_parameterdict_custom, test/dynamo/test_modules.py::NNModuleTests::test_parameters1, test/dynamo/test_modules.py::NNModuleTests::test_parameters2, test/dynamo/test_modules.py::NNModuleTests::test_parameters3, test/dynamo/test_modules.py::NNModuleTests::test_parameters4, test/dynamo/test_modules.py::NNModuleTests::test_parameters5, test/dynamo/test_modules.py::NNModuleTests::test_self_mutating1, test/dynamo/test_modules.py::NNModuleTests::test_seq, test/dynamo/test_modules.py::NNModuleTests::test_sequential_with_duplicated_module, test/dynamo/test_modules.py::NNModuleTests::test_sequential_with_duplicated_module2, test/dynamo/test_modules.py::NNModuleTests::test_simple_torch_function, test/dynamo/test_modules.py::NNModuleTests::test_stringmember, test/dynamo/test_modules.py::NNModuleTests::test_submodules1, test/dynamo/test_modules.py::NNModuleTests::test_submodules2, test/dynamo/test_modules.py::NNModuleTests::test_super1, test/dynamo/test_modules.py::NNModuleTests::test_super2, test/dynamo/test_modules.py::NNModuleTests::test_super_class_method, test/dynamo/test_modules.py::NNModuleTests::test_tensorlist, test/dynamo/test_modules.py::NNModuleTests::test_torch_function_with_closure, test/dynamo/test_modules.py::NNModuleTests::test_torch_mangled_class_name, test/dynamo/test_modules.py::NNModuleTests::test_unsupportedmethod, test/dynamo/test_modules.py::NNModuleTests::test_unsupportedmodule, test/dynamo/test_modules.py::NNModuleTests::test_viamodulecall, 
test/dynamo/test_modules.py::OptimizedModuleTest::test_assign_does_not_exist, test/dynamo/test_modules.py::OptimizedModuleTest::test_attr, test/dynamo/test_modules.py::OptimizedModuleTest::test_attr_precedence, test/dynamo/test_modules.py::OptimizedModuleTest::test_backward_hooks, test/dynamo/test_modules.py::OptimizedModuleTest::test_branch_on_nn_module_custom_bool, test/dynamo/test_modules.py::OptimizedModuleTest::test_branch_on_nn_module_custom_len, test/dynamo/test_modules.py::OptimizedModuleTest::test_buffer_order, test/dynamo/test_modules.py::OptimizedModuleTest::test_composition, test/dynamo/test_modules.py::OptimizedModuleTest::test_composition_with_opt_mod, test/dynamo/test_modules.py::OptimizedModuleTest::test_delattr_on_compiled_module, test/dynamo/test_modules.py::OptimizedModuleTest::test_dir, test/dynamo/test_modules.py::OptimizedModuleTest::test_dunder_call_explicitly, test/dynamo/test_modules.py::OptimizedModuleTest::test_globals_change_in_other_file, test/dynamo/test_modules.py::OptimizedModuleTest::test_guard_on_torch_nn_modules, test/dynamo/test_modules.py::OptimizedModuleTest::test_hooks_allowed_modules, test/dynamo/test_modules.py::OptimizedModuleTest::test_hooks_allowed_modules_compiles, test/dynamo/test_modules.py::OptimizedModuleTest::test_hooks_allowed_modules_compiles_self_contained, test/dynamo/test_modules.py::OptimizedModuleTest::test_hooks_inner, test/dynamo/test_modules.py::OptimizedModuleTest::test_hooks_outer, test/dynamo/test_modules.py::OptimizedModuleTest::test_hooks_skip_guards, test/dynamo/test_modules.py::OptimizedModuleTest::test_inline_inbuilt_nn_modules, test/dynamo/test_modules.py::OptimizedModuleTest::test_mark_static_nn_module_tensor, test/dynamo/test_modules.py::OptimizedModuleTest::test_mark_static_previously_seen_tensor, test/dynamo/test_modules.py::OptimizedModuleTest::test_mark_static_with_freezing, test/dynamo/test_modules.py::OptimizedModuleTest::test_module_dict_iter_keys, 
test/dynamo/test_modules.py::OptimizedModuleTest::test_module_dict_iter_name, test/dynamo/test_modules.py::OptimizedModuleTest::test_module_dict_iter_values, test/dynamo/test_modules.py::OptimizedModuleTest::test_module_order, test/dynamo/test_modules.py::OptimizedModuleTest::test_module_patch, test/dynamo/test_modules.py::OptimizedModuleTest::test_module_setattr, test/dynamo/test_modules.py::OptimizedModuleTest::test_monkeypatching_forward, test/dynamo/test_modules.py::OptimizedModuleTest::test_nn_module, test/dynamo/test_modules.py::OptimizedModuleTest::test_no_op_assignment, test/dynamo/test_modules.py::OptimizedModuleTest::test_no_recompile_on_nn_guarded_modules, test/dynamo/test_modules.py::OptimizedModuleTest::test_overridden_call, test/dynamo/test_modules.py::OptimizedModuleTest::test_param_order, test/dynamo/test_modules.py::OptimizedModuleTest::test_param_requires_grad, test/dynamo/test_modules.py::OptimizedModuleTest::test_patch_module, test/dynamo/test_modules.py::OptimizedModuleTest::test_recompile_limit_on_freed_module, test/dynamo/test_modules.py::OptimizedModuleTest::test_recompile_limit_on_guarded_nn_modules, test/dynamo/test_modules.py::OptimizedModuleTest::test_recursion, test/dynamo/test_modules.py::OptimizedModuleTest::test_save_and_load_all_backends, test/dynamo/test_modules.py::OptimizedModuleTest::test_save_and_load_inductor, test/dynamo/test_modules.py::OptimizedModuleTest::test_setattr_on_compiled_module, test/dynamo/test_modules.py::OptimizedModuleTest::test_to, test/dynamo/test_modules.py::OptimizedModuleTest::test_trace_delattr, test/dynamo/test_modules.py::OptimizedModuleTest::test_udo_instance_method_as_hook, test/dynamo/test_modules.py::OptimizedModuleTest::test_unhashable_nn_submodule, test/dynamo/test_modules.py::OptimizedModuleTest::test_unspec_non_inlinable_module, test/dynamo/test_modules.py::OptimizedModuleTest::test_unspecialized_seq, test/dynamo/test_modules.py::OptimizedModuleTest::test_user_defined_nn_module_dynamic, 
test/dynamo/test_modules.py::NNModuleTestsDeviceCUDA::test_lazy_module3_cuda 2025-10-10T01:59:23.0919165Z 2025-10-10T01:59:25.0016234Z 2025-10-10T01:59:25.0017446Z dynamo/test_metrics_context 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_metrics_context_1.1_a00681eaf9860da7_.log 2025-10-10T01:59:25.0020860Z Running 9 items in this shard: test/dynamo/test_metrics_context.py::TestMetricsContext::test_add_to_set, test/dynamo/test_metrics_context.py::TestMetricsContext::test_context_exists, test/dynamo/test_metrics_context.py::TestMetricsContext::test_nested_context, test/dynamo/test_metrics_context.py::TestMetricsContext::test_set, test/dynamo/test_metrics_context.py::TestMetricsContext::test_set_disallow_overwrite, test/dynamo/test_metrics_context.py::TestMetricsContext::test_set_key_value, test/dynamo/test_metrics_context.py::TestMetricsContext::test_top_n, test/dynamo/test_metrics_context.py::TestMetricsContext::test_update_allow_overwrite, test/dynamo/test_metrics_context.py::TestMetricsContext::test_update_disallow_overwrite 2025-10-10T01:59:25.0023535Z 2025-10-10T01:59:27.1035127Z Running dynamo/test_install_free_tensors 1/1 ... [2025-10-10 01:59:27.102952] 2025-10-10T01:59:27.1035780Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:59:27.1037485Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_install_free_tensors.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:59:27.103417] 2025-10-10T01:59:28.9156406Z Running inductor/test_memory_planning 1/1 ... 
[2025-10-10 01:59:28.914879] 2025-10-10T01:59:28.9157299Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:59:28.9158330Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_memory_planning.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:59:28.915253] 2025-10-10T01:59:31.2779128Z 2025-10-10T01:59:31.2780581Z dynamo/test_install_free_tensors 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_install_free_tensors_1.1_66d267faa80ec138_.log 2025-10-10T01:59:31.2790328Z Running 25 items in this shard: test/dynamo/test_install_free_tensors.py::InstallParamsAsGraphAttrTests::test_breadth_linear, test/dynamo/test_install_free_tensors.py::InstallParamsAsGraphAttrTests::test_nested_linear, test/dynamo/test_install_free_tensors.py::InstallParamsAsGraphAttrTests::test_nets_as_input, test/dynamo/test_install_free_tensors.py::InstallParamsAsGraphAttrTests::test_optimizing_buffer_and_param_in_input, test/dynamo/test_install_free_tensors.py::InstallParamsAsGraphAttrTests::test_optimizing_buffer_in_input, test/dynamo/test_install_free_tensors.py::InstallParamsAsGraphAttrTests::test_optimizing_linear, test/dynamo/test_install_free_tensors.py::InstallParamsAsGraphAttrTests::test_optimizing_params_in_input, test/dynamo/test_install_free_tensors.py::InstallParamsAsGraphAttrTests::test_resnet_structure, test/dynamo/test_install_free_tensors.py::InstallParamsAsGraphAttrTests::test_simple_batchnorm, test/dynamo/test_install_free_tensors.py::InstallParamsAsGraphAttrTests::test_transformer, test/dynamo/test_install_free_tensors.py::InstallParamsWhenExport::test_dict_of_tensor, test/dynamo/test_install_free_tensors.py::InstallParamsWhenExport::test_global_tensor_export, test/dynamo/test_install_free_tensors.py::InstallParamsWhenExport::test_list_of_tensor, 
test/dynamo/test_install_free_tensors.py::InstallParamsWhenExport::test_modify_net_state, test/dynamo/test_install_free_tensors.py::InstallParamsWhenExport::test_nested_list_of_tensor, test/dynamo/test_install_free_tensors.py::InstallParamsWhenExport::test_nonlocal_closure, test/dynamo/test_install_free_tensors.py::InstallParamsWhenExport::test_optimizing_buffer_and_param_in_input, test/dynamo/test_install_free_tensors.py::InstallParamsWhenExport::test_optimizing_buffer_in_input, test/dynamo/test_install_free_tensors.py::InstallParamsWhenExport::test_optimizing_params_in_input, test/dynamo/test_install_free_tensors.py::InstallParamsWhenExport::test_resnet_structure, test/dynamo/test_install_free_tensors.py::InstallParamsWhenExport::test_simple_batchnorm, test/dynamo/test_install_free_tensors.py::InstallParamsWhenExport::test_simple_linear, test/dynamo/test_install_free_tensors.py::InstallParamsWhenExport::test_tensors_as_nn_attr, test/dynamo/test_install_free_tensors.py::InstallParamsWhenExport::test_transformer, test/dynamo/test_install_free_tensors.py::InstallParamsWhenExport::test_user_defined_object 2025-10-10T01:59:31.2799725Z 2025-10-10T01:59:35.1457033Z Running inductor/test_ordered_set 1/1 ... [2025-10-10 01:59:35.145072] 2025-10-10T01:59:35.1457492Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:59:35.1460359Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_ordered_set.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 01:59:35.145455] 2025-10-10T01:59:36.6468019Z 2025-10-10T01:59:36.6469110Z inductor/test_memory_planning 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_memory_planning_1.1_09ecd4d1a24e9a08_.log 2025-10-10T01:59:36.6471643Z Running 4 items in this shard: test/inductor/test_memory_planning.py::TestMemoryPlanning::test_aoti, test/inductor/test_memory_planning.py::TestMemoryPlanning::test_cpp_wrapper, test/inductor/test_memory_planning.py::TestMemoryPlanning::test_python_wrapper, test/inductor/test_memory_planning.py::TestMemoryPlanning::test_unbacked_symint 2025-10-10T01:59:36.6472949Z 2025-10-10T01:59:40.3212019Z 2025-10-10T01:59:40.3212764Z inductor/test_ordered_set 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_ordered_set_1.1_381dadbf74517f24_.log 2025-10-10T01:59:40.3327129Z Running 401 items in this shard: test/inductor/test_ordered_set.py::TestJointOps::test_and, test/inductor/test_ordered_set.py::TestJointOps::test_badcmp, test/inductor/test_ordered_set.py::TestJointOps::test_container_iterator, test/inductor/test_ordered_set.py::TestJointOps::test_contains, test/inductor/test_ordered_set.py::TestJointOps::test_cyclical_repr, test/inductor/test_ordered_set.py::TestJointOps::test_deepcopy, test/inductor/test_ordered_set.py::TestJointOps::test_difference, test/inductor/test_ordered_set.py::TestJointOps::test_do_not_rehash_dict_keys, test/inductor/test_ordered_set.py::TestJointOps::test_equality, test/inductor/test_ordered_set.py::TestJointOps::test_free_after_iterating, test/inductor/test_ordered_set.py::TestJointOps::test_gc, test/inductor/test_ordered_set.py::TestJointOps::test_intersection, test/inductor/test_ordered_set.py::TestJointOps::test_isdisjoint, test/inductor/test_ordered_set.py::TestJointOps::test_iterator_pickling, test/inductor/test_ordered_set.py::TestJointOps::test_len, 
test/inductor/test_ordered_set.py::TestJointOps::test_new_or_init, test/inductor/test_ordered_set.py::TestJointOps::test_or, test/inductor/test_ordered_set.py::TestJointOps::test_pickling, test/inductor/test_ordered_set.py::TestJointOps::test_setOfFrozensets, test/inductor/test_ordered_set.py::TestJointOps::test_sub, test/inductor/test_ordered_set.py::TestJointOps::test_sub_and_super, test/inductor/test_ordered_set.py::TestJointOps::test_subclass_with_custom_hash, test/inductor/test_ordered_set.py::TestJointOps::test_symmetric_difference, test/inductor/test_ordered_set.py::TestJointOps::test_union, test/inductor/test_ordered_set.py::TestJointOps::test_uniquification, test/inductor/test_ordered_set.py::TestJointOps::test_xor, test/inductor/test_ordered_set.py::TestSet::test_add, test/inductor/test_ordered_set.py::TestSet::test_and, test/inductor/test_ordered_set.py::TestSet::test_badcmp, test/inductor/test_ordered_set.py::TestSet::test_clear, test/inductor/test_ordered_set.py::TestSet::test_constructor_identity, test/inductor/test_ordered_set.py::TestSet::test_container_iterator, test/inductor/test_ordered_set.py::TestSet::test_contains, test/inductor/test_ordered_set.py::TestSet::test_copy, test/inductor/test_ordered_set.py::TestSet::test_cyclical_repr, test/inductor/test_ordered_set.py::TestSet::test_deepcopy, test/inductor/test_ordered_set.py::TestSet::test_difference, test/inductor/test_ordered_set.py::TestSet::test_difference_update, test/inductor/test_ordered_set.py::TestSet::test_discard, test/inductor/test_ordered_set.py::TestSet::test_do_not_rehash_dict_keys, test/inductor/test_ordered_set.py::TestSet::test_equality, test/inductor/test_ordered_set.py::TestSet::test_free_after_iterating, test/inductor/test_ordered_set.py::TestSet::test_gc, test/inductor/test_ordered_set.py::TestSet::test_hash, test/inductor/test_ordered_set.py::TestSet::test_iand, test/inductor/test_ordered_set.py::TestSet::test_init, 
test/inductor/test_ordered_set.py::TestSet::test_inplace_on_self, test/inductor/test_ordered_set.py::TestSet::test_intersection, test/inductor/test_ordered_set.py::TestSet::test_intersection_update, test/inductor/test_ordered_set.py::TestSet::test_ior, test/inductor/test_ordered_set.py::TestSet::test_isdisjoint, test/inductor/test_ordered_set.py::TestSet::test_isub, test/inductor/test_ordered_set.py::TestSet::test_iterator_pickling, test/inductor/test_ordered_set.py::TestSet::test_ixor, test/inductor/test_ordered_set.py::TestSet::test_len, test/inductor/test_ordered_set.py::TestSet::test_new_or_init, test/inductor/test_ordered_set.py::TestSet::test_or, test/inductor/test_ordered_set.py::TestSet::test_pickling, test/inductor/test_ordered_set.py::TestSet::test_pop, test/inductor/test_ordered_set.py::TestSet::test_remove, test/inductor/test_ordered_set.py::TestSet::test_remove_keyerror_set, test/inductor/test_ordered_set.py::TestSet::test_remove_keyerror_unpacking, test/inductor/test_ordered_set.py::TestSet::test_rich_compare, test/inductor/test_ordered_set.py::TestSet::test_setOfFrozensets, test/inductor/test_ordered_set.py::TestSet::test_set_literal, test/inductor/test_ordered_set.py::TestSet::test_set_literal_evaluation_order, test/inductor/test_ordered_set.py::TestSet::test_set_literal_insertion_order, test/inductor/test_ordered_set.py::TestSet::test_sub, test/inductor/test_ordered_set.py::TestSet::test_sub_and_super, test/inductor/test_ordered_set.py::TestSet::test_subclass_with_custom_hash, test/inductor/test_ordered_set.py::TestSet::test_symmetric_difference, test/inductor/test_ordered_set.py::TestSet::test_symmetric_difference_update, test/inductor/test_ordered_set.py::TestSet::test_union, test/inductor/test_ordered_set.py::TestSet::test_uniquification, test/inductor/test_ordered_set.py::TestSet::test_update, test/inductor/test_ordered_set.py::TestSet::test_weakref, test/inductor/test_ordered_set.py::TestSet::test_xor, 
test/inductor/test_ordered_set.py::TestBasicOpsEmpty::test_copy, test/inductor/test_ordered_set.py::TestBasicOpsEmpty::test_empty_difference, test/inductor/test_ordered_set.py::TestBasicOpsEmpty::test_empty_difference_rev, test/inductor/test_ordered_set.py::TestBasicOpsEmpty::test_empty_intersection, test/inductor/test_ordered_set.py::TestBasicOpsEmpty::test_empty_isdisjoint, test/inductor/test_ordered_set.py::TestBasicOpsEmpty::test_empty_symmetric_difference, test/inductor/test_ordered_set.py::TestBasicOpsEmpty::test_empty_union, test/inductor/test_ordered_set.py::TestBasicOpsEmpty::test_equivalent_equality, test/inductor/test_ordered_set.py::TestBasicOpsEmpty::test_intersection_empty, test/inductor/test_ordered_set.py::TestBasicOpsEmpty::test_isdisjoint_empty, test/inductor/test_ordered_set.py::TestBasicOpsEmpty::test_issue_37219, test/inductor/test_ordered_set.py::TestBasicOpsEmpty::test_iteration, test/inductor/test_ordered_set.py::TestBasicOpsEmpty::test_length, test/inductor/test_ordered_set.py::TestBasicOpsEmpty::test_pickling, test/inductor/test_ordered_set.py::TestBasicOpsEmpty::test_repr, test/inductor/test_ordered_set.py::TestBasicOpsEmpty::test_self_difference, test/inductor/test_ordered_set.py::TestBasicOpsEmpty::test_self_equality, test/inductor/test_ordered_set.py::TestBasicOpsEmpty::test_self_intersection, test/inductor/test_ordered_set.py::TestBasicOpsEmpty::test_self_isdisjoint, test/inductor/test_ordered_set.py::TestBasicOpsEmpty::test_self_symmetric_difference, test/inductor/test_ordered_set.py::TestBasicOpsEmpty::test_self_union, test/inductor/test_ordered_set.py::TestBasicOpsEmpty::test_union_empty, test/inductor/test_ordered_set.py::TestBasicOpsSingleton::test_copy, test/inductor/test_ordered_set.py::TestBasicOpsSingleton::test_empty_difference, test/inductor/test_ordered_set.py::TestBasicOpsSingleton::test_empty_difference_rev, test/inductor/test_ordered_set.py::TestBasicOpsSingleton::test_empty_intersection, 
test/inductor/test_ordered_set.py::TestBasicOpsSingleton::test_empty_isdisjoint, test/inductor/test_ordered_set.py::TestBasicOpsSingleton::test_empty_symmetric_difference, test/inductor/test_ordered_set.py::TestBasicOpsSingleton::test_empty_union, test/inductor/test_ordered_set.py::TestBasicOpsSingleton::test_equivalent_equality, test/inductor/test_ordered_set.py::TestBasicOpsSingleton::test_in, test/inductor/test_ordered_set.py::TestBasicOpsSingleton::test_intersection_empty, test/inductor/test_ordered_set.py::TestBasicOpsSingleton::test_isdisjoint_empty, test/inductor/test_ordered_set.py::TestBasicOpsSingleton::test_issue_37219, test/inductor/test_ordered_set.py::TestBasicOpsSingleton::test_iteration, test/inductor/test_ordered_set.py::TestBasicOpsSingleton::test_length, test/inductor/test_ordered_set.py::TestBasicOpsSingleton::test_not_in, test/inductor/test_ordered_set.py::TestBasicOpsSingleton::test_pickling, test/inductor/test_ordered_set.py::TestBasicOpsSingleton::test_repr, test/inductor/test_ordered_set.py::TestBasicOpsSingleton::test_self_difference, test/inductor/test_ordered_set.py::TestBasicOpsSingleton::test_self_equality, test/inductor/test_ordered_set.py::TestBasicOpsSingleton::test_self_intersection, test/inductor/test_ordered_set.py::TestBasicOpsSingleton::test_self_isdisjoint, test/inductor/test_ordered_set.py::TestBasicOpsSingleton::test_self_symmetric_difference, test/inductor/test_ordered_set.py::TestBasicOpsSingleton::test_self_union, test/inductor/test_ordered_set.py::TestBasicOpsSingleton::test_union_empty, test/inductor/test_ordered_set.py::TestBasicOpsTuple::test_copy, test/inductor/test_ordered_set.py::TestBasicOpsTuple::test_empty_difference, test/inductor/test_ordered_set.py::TestBasicOpsTuple::test_empty_difference_rev, test/inductor/test_ordered_set.py::TestBasicOpsTuple::test_empty_intersection, test/inductor/test_ordered_set.py::TestBasicOpsTuple::test_empty_isdisjoint, 
test/inductor/test_ordered_set.py::TestBasicOpsTuple::test_empty_symmetric_difference, test/inductor/test_ordered_set.py::TestBasicOpsTuple::test_empty_union, test/inductor/test_ordered_set.py::TestBasicOpsTuple::test_equivalent_equality, test/inductor/test_ordered_set.py::TestBasicOpsTuple::test_in, test/inductor/test_ordered_set.py::TestBasicOpsTuple::test_intersection_empty, test/inductor/test_ordered_set.py::TestBasicOpsTuple::test_isdisjoint_empty, test/inductor/test_ordered_set.py::TestBasicOpsTuple::test_issue_37219, test/inductor/test_ordered_set.py::TestBasicOpsTuple::test_iteration, test/inductor/test_ordered_set.py::TestBasicOpsTuple::test_length, test/inductor/test_ordered_set.py::TestBasicOpsTuple::test_not_in, test/inductor/test_ordered_set.py::TestBasicOpsTuple::test_pickling, test/inductor/test_ordered_set.py::TestBasicOpsTuple::test_repr, test/inductor/test_ordered_set.py::TestBasicOpsTuple::test_self_difference, test/inductor/test_ordered_set.py::TestBasicOpsTuple::test_self_equality, test/inductor/test_ordered_set.py::TestBasicOpsTuple::test_self_intersection, test/inductor/test_ordered_set.py::TestBasicOpsTuple::test_self_isdisjoint, test/inductor/test_ordered_set.py::TestBasicOpsTuple::test_self_symmetric_difference, test/inductor/test_ordered_set.py::TestBasicOpsTuple::test_self_union, test/inductor/test_ordered_set.py::TestBasicOpsTuple::test_union_empty, test/inductor/test_ordered_set.py::TestBasicOpsTriple::test_copy, test/inductor/test_ordered_set.py::TestBasicOpsTriple::test_empty_difference, test/inductor/test_ordered_set.py::TestBasicOpsTriple::test_empty_difference_rev, test/inductor/test_ordered_set.py::TestBasicOpsTriple::test_empty_intersection, test/inductor/test_ordered_set.py::TestBasicOpsTriple::test_empty_isdisjoint, test/inductor/test_ordered_set.py::TestBasicOpsTriple::test_empty_symmetric_difference, test/inductor/test_ordered_set.py::TestBasicOpsTriple::test_empty_union, 
test/inductor/test_ordered_set.py::TestBasicOpsTriple::test_equivalent_equality, test/inductor/test_ordered_set.py::TestBasicOpsTriple::test_intersection_empty, test/inductor/test_ordered_set.py::TestBasicOpsTriple::test_isdisjoint_empty, test/inductor/test_ordered_set.py::TestBasicOpsTriple::test_issue_37219, test/inductor/test_ordered_set.py::TestBasicOpsTriple::test_iteration, test/inductor/test_ordered_set.py::TestBasicOpsTriple::test_length, test/inductor/test_ordered_set.py::TestBasicOpsTriple::test_pickling, test/inductor/test_ordered_set.py::TestBasicOpsTriple::test_repr, test/inductor/test_ordered_set.py::TestBasicOpsTriple::test_self_difference, test/inductor/test_ordered_set.py::TestBasicOpsTriple::test_self_equality, test/inductor/test_ordered_set.py::TestBasicOpsTriple::test_self_intersection, test/inductor/test_ordered_set.py::TestBasicOpsTriple::test_self_isdisjoint, test/inductor/test_ordered_set.py::TestBasicOpsTriple::test_self_symmetric_difference, test/inductor/test_ordered_set.py::TestBasicOpsTriple::test_self_union, test/inductor/test_ordered_set.py::TestBasicOpsTriple::test_union_empty, test/inductor/test_ordered_set.py::TestBasicOpsString::test_copy, test/inductor/test_ordered_set.py::TestBasicOpsString::test_empty_difference, test/inductor/test_ordered_set.py::TestBasicOpsString::test_empty_difference_rev, test/inductor/test_ordered_set.py::TestBasicOpsString::test_empty_intersection, test/inductor/test_ordered_set.py::TestBasicOpsString::test_empty_isdisjoint, test/inductor/test_ordered_set.py::TestBasicOpsString::test_empty_symmetric_difference, test/inductor/test_ordered_set.py::TestBasicOpsString::test_empty_union, test/inductor/test_ordered_set.py::TestBasicOpsString::test_equivalent_equality, test/inductor/test_ordered_set.py::TestBasicOpsString::test_intersection_empty, test/inductor/test_ordered_set.py::TestBasicOpsString::test_isdisjoint_empty, test/inductor/test_ordered_set.py::TestBasicOpsString::test_issue_37219, 
test/inductor/test_ordered_set.py::TestBasicOpsString::test_iteration, test/inductor/test_ordered_set.py::TestBasicOpsString::test_length, test/inductor/test_ordered_set.py::TestBasicOpsString::test_pickling, test/inductor/test_ordered_set.py::TestBasicOpsString::test_repr, test/inductor/test_ordered_set.py::TestBasicOpsString::test_self_difference, test/inductor/test_ordered_set.py::TestBasicOpsString::test_self_equality, test/inductor/test_ordered_set.py::TestBasicOpsString::test_self_intersection, test/inductor/test_ordered_set.py::TestBasicOpsString::test_self_isdisjoint, test/inductor/test_ordered_set.py::TestBasicOpsString::test_self_symmetric_difference, test/inductor/test_ordered_set.py::TestBasicOpsString::test_self_union, test/inductor/test_ordered_set.py::TestBasicOpsString::test_union_empty, test/inductor/test_ordered_set.py::TestBasicOpsBytes::test_copy, test/inductor/test_ordered_set.py::TestBasicOpsBytes::test_empty_difference, test/inductor/test_ordered_set.py::TestBasicOpsBytes::test_empty_difference_rev, test/inductor/test_ordered_set.py::TestBasicOpsBytes::test_empty_intersection, test/inductor/test_ordered_set.py::TestBasicOpsBytes::test_empty_isdisjoint, test/inductor/test_ordered_set.py::TestBasicOpsBytes::test_empty_symmetric_difference, test/inductor/test_ordered_set.py::TestBasicOpsBytes::test_empty_union, test/inductor/test_ordered_set.py::TestBasicOpsBytes::test_equivalent_equality, test/inductor/test_ordered_set.py::TestBasicOpsBytes::test_intersection_empty, test/inductor/test_ordered_set.py::TestBasicOpsBytes::test_isdisjoint_empty, test/inductor/test_ordered_set.py::TestBasicOpsBytes::test_issue_37219, test/inductor/test_ordered_set.py::TestBasicOpsBytes::test_iteration, test/inductor/test_ordered_set.py::TestBasicOpsBytes::test_length, test/inductor/test_ordered_set.py::TestBasicOpsBytes::test_pickling, test/inductor/test_ordered_set.py::TestBasicOpsBytes::test_repr, 
test/inductor/test_ordered_set.py::TestBasicOpsBytes::test_self_difference, test/inductor/test_ordered_set.py::TestBasicOpsBytes::test_self_equality, test/inductor/test_ordered_set.py::TestBasicOpsBytes::test_self_intersection, test/inductor/test_ordered_set.py::TestBasicOpsBytes::test_self_isdisjoint, test/inductor/test_ordered_set.py::TestBasicOpsBytes::test_self_symmetric_difference, test/inductor/test_ordered_set.py::TestBasicOpsBytes::test_self_union, test/inductor/test_ordered_set.py::TestBasicOpsBytes::test_union_empty, test/inductor/test_ordered_set.py::TestBasicOpsMixedStringBytes::test_copy, test/inductor/test_ordered_set.py::TestBasicOpsMixedStringBytes::test_empty_difference, test/inductor/test_ordered_set.py::TestBasicOpsMixedStringBytes::test_empty_difference_rev, test/inductor/test_ordered_set.py::TestBasicOpsMixedStringBytes::test_empty_intersection, test/inductor/test_ordered_set.py::TestBasicOpsMixedStringBytes::test_empty_isdisjoint, test/inductor/test_ordered_set.py::TestBasicOpsMixedStringBytes::test_empty_symmetric_difference, test/inductor/test_ordered_set.py::TestBasicOpsMixedStringBytes::test_empty_union, test/inductor/test_ordered_set.py::TestBasicOpsMixedStringBytes::test_equivalent_equality, test/inductor/test_ordered_set.py::TestBasicOpsMixedStringBytes::test_intersection_empty, test/inductor/test_ordered_set.py::TestBasicOpsMixedStringBytes::test_isdisjoint_empty, test/inductor/test_ordered_set.py::TestBasicOpsMixedStringBytes::test_issue_37219, test/inductor/test_ordered_set.py::TestBasicOpsMixedStringBytes::test_iteration, test/inductor/test_ordered_set.py::TestBasicOpsMixedStringBytes::test_length, test/inductor/test_ordered_set.py::TestBasicOpsMixedStringBytes::test_pickling, test/inductor/test_ordered_set.py::TestBasicOpsMixedStringBytes::test_repr, test/inductor/test_ordered_set.py::TestBasicOpsMixedStringBytes::test_self_difference, test/inductor/test_ordered_set.py::TestBasicOpsMixedStringBytes::test_self_equality, 
test/inductor/test_ordered_set.py::TestBasicOpsMixedStringBytes::test_self_intersection, test/inductor/test_ordered_set.py::TestBasicOpsMixedStringBytes::test_self_isdisjoint, test/inductor/test_ordered_set.py::TestBasicOpsMixedStringBytes::test_self_symmetric_difference, test/inductor/test_ordered_set.py::TestBasicOpsMixedStringBytes::test_self_union, test/inductor/test_ordered_set.py::TestBasicOpsMixedStringBytes::test_union_empty, test/inductor/test_ordered_set.py::TestExceptionPropagation::test_changingSizeWhileIterating, test/inductor/test_ordered_set.py::TestExceptionPropagation::test_instanceWithException, test/inductor/test_ordered_set.py::TestExceptionPropagation::test_instancesWithoutException, test/inductor/test_ordered_set.py::TestSetOfSets::test_constructor, test/inductor/test_ordered_set.py::TestBinaryOps::test_eq, test/inductor/test_ordered_set.py::TestBinaryOps::test_intersection_non_overlap, test/inductor/test_ordered_set.py::TestBinaryOps::test_intersection_overlap, test/inductor/test_ordered_set.py::TestBinaryOps::test_intersection_subset, test/inductor/test_ordered_set.py::TestBinaryOps::test_intersection_superset, test/inductor/test_ordered_set.py::TestBinaryOps::test_isdisjoint_non_overlap, test/inductor/test_ordered_set.py::TestBinaryOps::test_isdisjoint_overlap, test/inductor/test_ordered_set.py::TestBinaryOps::test_isdisjoint_subset, test/inductor/test_ordered_set.py::TestBinaryOps::test_isdisjoint_superset, test/inductor/test_ordered_set.py::TestBinaryOps::test_sym_difference_non_overlap, test/inductor/test_ordered_set.py::TestBinaryOps::test_sym_difference_overlap, test/inductor/test_ordered_set.py::TestBinaryOps::test_sym_difference_subset, test/inductor/test_ordered_set.py::TestBinaryOps::test_sym_difference_superset, test/inductor/test_ordered_set.py::TestBinaryOps::test_union_non_overlap, test/inductor/test_ordered_set.py::TestBinaryOps::test_union_overlap, test/inductor/test_ordered_set.py::TestBinaryOps::test_union_subset, 
test/inductor/test_ordered_set.py::TestBinaryOps::test_union_superset, test/inductor/test_ordered_set.py::TestUpdateOps::test_difference_method_call, test/inductor/test_ordered_set.py::TestUpdateOps::test_difference_non_overlap, test/inductor/test_ordered_set.py::TestUpdateOps::test_difference_overlap, test/inductor/test_ordered_set.py::TestUpdateOps::test_difference_subset, test/inductor/test_ordered_set.py::TestUpdateOps::test_difference_superset, test/inductor/test_ordered_set.py::TestUpdateOps::test_intersection_method_call, test/inductor/test_ordered_set.py::TestUpdateOps::test_intersection_non_overlap, test/inductor/test_ordered_set.py::TestUpdateOps::test_intersection_overlap, test/inductor/test_ordered_set.py::TestUpdateOps::test_intersection_subset, test/inductor/test_ordered_set.py::TestUpdateOps::test_intersection_superset, test/inductor/test_ordered_set.py::TestUpdateOps::test_sym_difference_method_call, test/inductor/test_ordered_set.py::TestUpdateOps::test_sym_difference_non_overlap, test/inductor/test_ordered_set.py::TestUpdateOps::test_sym_difference_overlap, test/inductor/test_ordered_set.py::TestUpdateOps::test_sym_difference_subset, test/inductor/test_ordered_set.py::TestUpdateOps::test_sym_difference_superset, test/inductor/test_ordered_set.py::TestUpdateOps::test_union_method_call, test/inductor/test_ordered_set.py::TestUpdateOps::test_union_non_overlap, test/inductor/test_ordered_set.py::TestUpdateOps::test_union_overlap, test/inductor/test_ordered_set.py::TestUpdateOps::test_union_subset, test/inductor/test_ordered_set.py::TestUpdateOps::test_union_superset, test/inductor/test_ordered_set.py::TestMutate::test_add_absent, test/inductor/test_ordered_set.py::TestMutate::test_add_present, test/inductor/test_ordered_set.py::TestMutate::test_add_until_full, test/inductor/test_ordered_set.py::TestMutate::test_clear, test/inductor/test_ordered_set.py::TestMutate::test_discard_absent, 
test/inductor/test_ordered_set.py::TestMutate::test_discard_present, test/inductor/test_ordered_set.py::TestMutate::test_pop, test/inductor/test_ordered_set.py::TestMutate::test_remove_absent, test/inductor/test_ordered_set.py::TestMutate::test_remove_present, test/inductor/test_ordered_set.py::TestMutate::test_remove_until_empty, test/inductor/test_ordered_set.py::TestMutate::test_update_empty_tuple, test/inductor/test_ordered_set.py::TestMutate::test_update_unit_tuple_non_overlap, test/inductor/test_ordered_set.py::TestMutate::test_update_unit_tuple_overlap, test/inductor/test_ordered_set.py::TestSubsets::test_issubset, test/inductor/test_ordered_set.py::TestSubsetEqualEmpty::test_issubset, test/inductor/test_ordered_set.py::TestSubsetEqualNonEmpty::test_issubset, test/inductor/test_ordered_set.py::TestSubsetEmptyNonEmpty::test_issubset, test/inductor/test_ordered_set.py::TestSubsetPartial::test_issubset, test/inductor/test_ordered_set.py::TestSubsetNonOverlap::test_issubset, test/inductor/test_ordered_set.py::TestOnlySetsNumeric::test_difference, test/inductor/test_ordered_set.py::TestOnlySetsNumeric::test_difference_update, test/inductor/test_ordered_set.py::TestOnlySetsNumeric::test_difference_update_operator, test/inductor/test_ordered_set.py::TestOnlySetsNumeric::test_eq_ne, test/inductor/test_ordered_set.py::TestOnlySetsNumeric::test_ge_gt_le_lt, test/inductor/test_ordered_set.py::TestOnlySetsNumeric::test_intersection, test/inductor/test_ordered_set.py::TestOnlySetsNumeric::test_intersection_update, test/inductor/test_ordered_set.py::TestOnlySetsNumeric::test_intersection_update_operator, test/inductor/test_ordered_set.py::TestOnlySetsNumeric::test_sym_difference, test/inductor/test_ordered_set.py::TestOnlySetsNumeric::test_sym_difference_update, test/inductor/test_ordered_set.py::TestOnlySetsNumeric::test_sym_difference_update_operator, test/inductor/test_ordered_set.py::TestOnlySetsNumeric::test_union, 
test/inductor/test_ordered_set.py::TestOnlySetsNumeric::test_update, test/inductor/test_ordered_set.py::TestOnlySetsNumeric::test_update_operator, test/inductor/test_ordered_set.py::TestOnlySetsDict::test_difference, test/inductor/test_ordered_set.py::TestOnlySetsDict::test_difference_update, test/inductor/test_ordered_set.py::TestOnlySetsDict::test_difference_update_operator, test/inductor/test_ordered_set.py::TestOnlySetsDict::test_eq_ne, test/inductor/test_ordered_set.py::TestOnlySetsDict::test_ge_gt_le_lt, test/inductor/test_ordered_set.py::TestOnlySetsDict::test_intersection, test/inductor/test_ordered_set.py::TestOnlySetsDict::test_intersection_update, test/inductor/test_ordered_set.py::TestOnlySetsDict::test_intersection_update_operator, test/inductor/test_ordered_set.py::TestOnlySetsDict::test_sym_difference, test/inductor/test_ordered_set.py::TestOnlySetsDict::test_sym_difference_update, test/inductor/test_ordered_set.py::TestOnlySetsDict::test_sym_difference_update_operator, test/inductor/test_ordered_set.py::TestOnlySetsDict::test_union, test/inductor/test_ordered_set.py::TestOnlySetsDict::test_update, test/inductor/test_ordered_set.py::TestOnlySetsDict::test_update_operator, test/inductor/test_ordered_set.py::TestOnlySetsOperator::test_difference, test/inductor/test_ordered_set.py::TestOnlySetsOperator::test_difference_update, test/inductor/test_ordered_set.py::TestOnlySetsOperator::test_difference_update_operator, test/inductor/test_ordered_set.py::TestOnlySetsOperator::test_eq_ne, test/inductor/test_ordered_set.py::TestOnlySetsOperator::test_ge_gt_le_lt, test/inductor/test_ordered_set.py::TestOnlySetsOperator::test_intersection, test/inductor/test_ordered_set.py::TestOnlySetsOperator::test_intersection_update, test/inductor/test_ordered_set.py::TestOnlySetsOperator::test_intersection_update_operator, test/inductor/test_ordered_set.py::TestOnlySetsOperator::test_sym_difference, 
test/inductor/test_ordered_set.py::TestOnlySetsOperator::test_sym_difference_update, test/inductor/test_ordered_set.py::TestOnlySetsOperator::test_sym_difference_update_operator, test/inductor/test_ordered_set.py::TestOnlySetsOperator::test_union, test/inductor/test_ordered_set.py::TestOnlySetsOperator::test_update, test/inductor/test_ordered_set.py::TestOnlySetsOperator::test_update_operator, test/inductor/test_ordered_set.py::TestOnlySetsTuple::test_difference, test/inductor/test_ordered_set.py::TestOnlySetsTuple::test_difference_update, test/inductor/test_ordered_set.py::TestOnlySetsTuple::test_difference_update_operator, test/inductor/test_ordered_set.py::TestOnlySetsTuple::test_eq_ne, test/inductor/test_ordered_set.py::TestOnlySetsTuple::test_ge_gt_le_lt, test/inductor/test_ordered_set.py::TestOnlySetsTuple::test_intersection, test/inductor/test_ordered_set.py::TestOnlySetsTuple::test_intersection_update, test/inductor/test_ordered_set.py::TestOnlySetsTuple::test_intersection_update_operator, test/inductor/test_ordered_set.py::TestOnlySetsTuple::test_sym_difference, test/inductor/test_ordered_set.py::TestOnlySetsTuple::test_sym_difference_update, test/inductor/test_ordered_set.py::TestOnlySetsTuple::test_sym_difference_update_operator, test/inductor/test_ordered_set.py::TestOnlySetsTuple::test_union, test/inductor/test_ordered_set.py::TestOnlySetsTuple::test_update, test/inductor/test_ordered_set.py::TestOnlySetsTuple::test_update_operator, test/inductor/test_ordered_set.py::TestOnlySetsString::test_difference, test/inductor/test_ordered_set.py::TestOnlySetsString::test_difference_update, test/inductor/test_ordered_set.py::TestOnlySetsString::test_difference_update_operator, test/inductor/test_ordered_set.py::TestOnlySetsString::test_eq_ne, test/inductor/test_ordered_set.py::TestOnlySetsString::test_ge_gt_le_lt, test/inductor/test_ordered_set.py::TestOnlySetsString::test_intersection, 
test/inductor/test_ordered_set.py::TestOnlySetsString::test_intersection_update, test/inductor/test_ordered_set.py::TestOnlySetsString::test_intersection_update_operator, test/inductor/test_ordered_set.py::TestOnlySetsString::test_sym_difference, test/inductor/test_ordered_set.py::TestOnlySetsString::test_sym_difference_update, test/inductor/test_ordered_set.py::TestOnlySetsString::test_sym_difference_update_operator, test/inductor/test_ordered_set.py::TestOnlySetsString::test_union, test/inductor/test_ordered_set.py::TestOnlySetsString::test_update, test/inductor/test_ordered_set.py::TestOnlySetsString::test_update_operator, test/inductor/test_ordered_set.py::TestOnlySetsGenerator::test_difference, test/inductor/test_ordered_set.py::TestOnlySetsGenerator::test_difference_update, test/inductor/test_ordered_set.py::TestOnlySetsGenerator::test_difference_update_operator, test/inductor/test_ordered_set.py::TestOnlySetsGenerator::test_eq_ne, test/inductor/test_ordered_set.py::TestOnlySetsGenerator::test_ge_gt_le_lt, test/inductor/test_ordered_set.py::TestOnlySetsGenerator::test_intersection, test/inductor/test_ordered_set.py::TestOnlySetsGenerator::test_intersection_update, test/inductor/test_ordered_set.py::TestOnlySetsGenerator::test_intersection_update_operator, test/inductor/test_ordered_set.py::TestOnlySetsGenerator::test_sym_difference, test/inductor/test_ordered_set.py::TestOnlySetsGenerator::test_sym_difference_update, test/inductor/test_ordered_set.py::TestOnlySetsGenerator::test_sym_difference_update_operator, test/inductor/test_ordered_set.py::TestOnlySetsGenerator::test_union, test/inductor/test_ordered_set.py::TestOnlySetsGenerator::test_update, test/inductor/test_ordered_set.py::TestOnlySetsGenerator::test_update_operator, test/inductor/test_ordered_set.py::TestCopyingEmpty::test_copy, test/inductor/test_ordered_set.py::TestCopyingEmpty::test_deep_copy, test/inductor/test_ordered_set.py::TestCopyingSingleton::test_copy, 
test/inductor/test_ordered_set.py::TestCopyingSingleton::test_deep_copy, test/inductor/test_ordered_set.py::TestCopyingTriple::test_copy, test/inductor/test_ordered_set.py::TestCopyingTriple::test_deep_copy, test/inductor/test_ordered_set.py::TestCopyingTuple::test_copy, test/inductor/test_ordered_set.py::TestCopyingTuple::test_deep_copy, test/inductor/test_ordered_set.py::TestCopyingNested::test_copy, test/inductor/test_ordered_set.py::TestCopyingNested::test_deep_copy, test/inductor/test_ordered_set.py::TestIdentities::test_binopsVsSubsets, test/inductor/test_ordered_set.py::TestIdentities::test_commutativity, test/inductor/test_ordered_set.py::TestIdentities::test_exclusion, test/inductor/test_ordered_set.py::TestIdentities::test_summations, test/inductor/test_ordered_set.py::TestVariousIteratorArgs::test_constructor, test/inductor/test_ordered_set.py::TestVariousIteratorArgs::test_inline_methods, test/inductor/test_ordered_set.py::TestVariousIteratorArgs::test_inplace_methods, test/inductor/test_ordered_set.py::TestWeirdBugs::test_8420_set_merge, test/inductor/test_ordered_set.py::TestWeirdBugs::test_iter_and_mutate, test/inductor/test_ordered_set.py::TestWeirdBugs::test_merge_and_mutate, test/inductor/test_ordered_set.py::TestGraphs::test_cube, test/inductor/test_ordered_set.py::TestGraphs::test_cuboctahedron 2025-10-10T01:59:40.3432672Z 2025-10-10T01:59:40.5360625Z Running inductor/test_split_cat_fx_aten_passes 1/1 ... [2025-10-10 01:59:40.535440] 2025-10-10T01:59:40.5361170Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:59:40.5362314Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_split_cat_fx_aten_passes.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:59:40.535814] 2025-10-10T01:59:44.2485776Z Running dynamo/test_activation_checkpointing 1/1 ... 
[2025-10-10 01:59:44.247981] 2025-10-10T01:59:44.2486274Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:59:44.2488674Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_activation_checkpointing.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:59:44.248429] 2025-10-10T01:59:47.8164240Z 2025-10-10T01:59:47.8165343Z inductor/test_split_cat_fx_aten_passes 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_split_cat_fx_aten_passes_1.1_a8aba974cbdd3456_.log 2025-10-10T01:59:47.8168002Z Running 5 items in this shard: test/inductor/test_split_cat_fx_aten_passes.py::TestSplitCatAten::test_move_view_after_cat_aten, test/inductor/test_split_cat_fx_aten_passes.py::TestSplitCatAten::test_select_cat_post_grad, test/inductor/test_split_cat_fx_aten_passes.py::TestSplitCatAten::test_split_cat_post_grad, test/inductor/test_split_cat_fx_aten_passes.py::TestSplitCatAten::test_split_cat_post_grad_singular, test/inductor/test_split_cat_fx_aten_passes.py::TestSplitCatAtenNormalizationPasses::test_split_aten_normalization 2025-10-10T01:59:47.8169948Z 2025-10-10T01:59:51.6285871Z 2025-10-10T01:59:51.6287359Z dynamo/test_activation_checkpointing 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_activation_checkpointing_1.1_20f97b959dffeba5_.log 2025-10-10T01:59:51.6314299Z Running 33 items in this shard: test/dynamo/test_activation_checkpointing.py::ActivationCheckpointingViaTagsTestsCUDA::test_autocast_flash_attention_cuda, test/dynamo/test_activation_checkpointing.py::ActivationCheckpointingViaTagsTestsCUDA::test_compile_selective_checkpoint_custom_rule_cuda, test/dynamo/test_activation_checkpointing.py::ActivationCheckpointingViaTagsTestsCUDA::test_compile_selective_checkpoint_inplace_op_cuda, 
test/dynamo/test_activation_checkpointing.py::ActivationCheckpointingViaTagsTestsCUDA::test_compile_selective_checkpoint_invalid_context_cuda, test/dynamo/test_activation_checkpointing.py::ActivationCheckpointingViaTagsTestsCUDA::test_compile_selective_checkpoint_list_ops_cuda, test/dynamo/test_activation_checkpointing.py::ActivationCheckpointingViaTagsTestsCUDA::test_compile_selective_checkpoint_must_not_recompute_gemm_cuda, test/dynamo/test_activation_checkpointing.py::ActivationCheckpointingViaTagsTestsCUDA::test_compile_selective_checkpoint_must_recompute_cuda, test/dynamo/test_activation_checkpointing.py::ActivationCheckpointingViaTagsTestsCUDA::test_compile_selective_checkpoint_outplace_op_cuda, test/dynamo/test_activation_checkpointing.py::ActivationCheckpointingViaTagsTestsCUDA::test_compile_selective_checkpoint_parametrization_cuda, test/dynamo/test_activation_checkpointing.py::ActivationCheckpointingViaTagsTestsCUDA::test_compile_selective_checkpoint_partial_ctx_fn_cuda, test/dynamo/test_activation_checkpointing.py::ActivationCheckpointingViaTagsTestsCUDA::test_compile_selective_checkpoint_random_op_cuda, test/dynamo/test_activation_checkpointing.py::ActivationCheckpointingViaTagsTestsCUDA::test_compile_selective_checkpoint_tensor_subclass_cuda, test/dynamo/test_activation_checkpointing.py::ActivationCheckpointingViaTagsTestsCUDA::test_compile_selective_checkpoint_triton_kernel_cuda, test/dynamo/test_activation_checkpointing.py::ActivationCheckpointingViaTagsTestsCUDA::test_distributed_utils_checkpoint_wrapper_cuda, test/dynamo/test_activation_checkpointing.py::ActivationCheckpointingViaTagsTestsCUDA::test_dynamo_does_not_trace_getattr_as_top_frame_cuda, test/dynamo/test_activation_checkpointing.py::ActivationCheckpointingViaTagsTestsCUDA::test_error_msg_cuda, test/dynamo/test_activation_checkpointing.py::ActivationCheckpointingViaTagsTestsCUDA::test_fallback_cuda, 
test/dynamo/test_activation_checkpointing.py::ActivationCheckpointingViaTagsTestsCUDA::test_kwargs_cuda, test/dynamo/test_activation_checkpointing.py::ActivationCheckpointingViaTagsTestsCUDA::test_list_inputs_cuda, test/dynamo/test_activation_checkpointing.py::ActivationCheckpointingViaTagsTestsCUDA::test_pattern_matcher_cuda, test/dynamo/test_activation_checkpointing.py::ActivationCheckpointingViaTagsTestsCUDA::test_sac_with_partial_context_fn_cuda, test/dynamo/test_activation_checkpointing.py::ActivationCheckpointingViaTagsTestsCUDA::test_symints_location_cuda, test/dynamo/test_activation_checkpointing.py::ActivationCheckpointingViaTagsTestsCUDA::test_tags_decomps_cuda, test/dynamo/test_activation_checkpointing.py::ActivationCheckpointingViaTagsTestsCUDA::test_tags_dropout_cuda, test/dynamo/test_activation_checkpointing.py::ActivationCheckpointingViaTagsTestsCUDA::test_tags_function_cuda, test/dynamo/test_activation_checkpointing.py::ActivationCheckpointingViaTagsTestsCUDA::test_tags_function_via_global_checkpoint_cuda, test/dynamo/test_activation_checkpointing.py::ActivationCheckpointingViaTagsTestsCUDA::test_tags_function_with_kwargs_cuda, test/dynamo/test_activation_checkpointing.py::ActivationCheckpointingViaTagsTestsCUDA::test_tags_module_cuda, test/dynamo/test_activation_checkpointing.py::ActivationCheckpointingViaTagsTestsCUDA::test_tags_multiple_checkpoints_cuda, test/dynamo/test_activation_checkpointing.py::ActivationCheckpointingViaTagsTestsCUDA::test_tags_must_save_tensor_that_has_backward_hook_cuda, test/dynamo/test_activation_checkpointing.py::ActivationCheckpointingViaTagsTestsCUDA::test_tags_rand_cuda, test/dynamo/test_activation_checkpointing.py::ActivationCheckpointingViaTagsTestsCUDA::test_tags_recomputed_rand_cuda, test/dynamo/test_activation_checkpointing.py::ActivationCheckpointingViaTagsTestsCUDA::test_tags_sequential_layers_cuda 2025-10-10T01:59:51.6339842Z 2025-10-10T01:59:51.6861791Z Running dynamo/test_compiler_bisector 1/1 ... 
[2025-10-10 01:59:51.685541] 2025-10-10T01:59:51.6862472Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:59:51.6863900Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_compiler_bisector.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:59:51.685917] 2025-10-10T01:59:55.5791299Z Running dynamo/test_aot_compile 1/1 ... [2025-10-10 01:59:55.578561] 2025-10-10T01:59:55.5792036Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T01:59:55.5793746Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_aot_compile.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 01:59:55.578944] 2025-10-10T01:59:59.1164036Z 2025-10-10T01:59:59.1165541Z dynamo/test_compiler_bisector 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_compiler_bisector_1.1_2b248deb7f14704d_.log 2025-10-10T01:59:59.1169343Z Running 8 items in this shard: test/dynamo/test_compiler_bisector.py::TestCompilerBisector::test_bad_decomp, test/dynamo/test_compiler_bisector.py::TestCompilerBisector::test_bad_lowering, test/dynamo/test_compiler_bisector.py::TestCompilerBisector::test_crossref, test/dynamo/test_compiler_bisector.py::TestCompilerBisector::test_eager_backend, test/dynamo/test_compiler_bisector.py::TestCompilerBisector::test_emulate_precision_casts, test/dynamo/test_compiler_bisector.py::TestCompilerBisector::test_joint_graph, test/dynamo/test_compiler_bisector.py::TestCompilerBisector::test_pre_grad, test/dynamo/test_compiler_bisector.py::TestCompilerBisector::test_rng 2025-10-10T01:59:59.1172099Z 2025-10-10T01:59:59.7023506Z 2025-10-10T01:59:59.7024517Z dynamo/test_aot_compile 1/1 was successful, full logs can be found in 
artifacts with path test/test-reports/dynamo.test_aot_compile_1.1_724050367795ef40_.log 2025-10-10T01:59:59.7029505Z Running 12 items in this shard: test/dynamo/test_aot_compile.py::TestAOTCompile::test_aot_compile_basic_fn, test/dynamo/test_aot_compile.py::TestAOTCompile::test_aot_compile_basic_fn_inductor, test/dynamo/test_aot_compile.py::TestAOTCompile::test_aot_compile_basic_forward, test/dynamo/test_aot_compile.py::TestAOTCompile::test_aot_compile_disable_guard_check, test/dynamo/test_aot_compile.py::TestAOTCompile::test_aot_compile_graph_break_error_fmt, test/dynamo/test_aot_compile.py::TestAOTCompile::test_aot_compile_module, test/dynamo/test_aot_compile.py::TestAOTCompile::test_aot_compile_repeat_interleave, test/dynamo/test_aot_compile.py::TestAOTCompile::test_aot_compile_source_info, test/dynamo/test_aot_compile.py::TestAOTCompile::test_aot_module_simplified_serializable_autograd, test/dynamo/test_aot_compile.py::TestAOTCompile::test_aot_module_simplified_serializable_inference, test/dynamo/test_aot_compile.py::TestAOTCompile::test_decorated_function_aot, test/dynamo/test_aot_compile.py::TestAOTCompile::test_guard_filter_override_aot 2025-10-10T01:59:59.7033320Z 2025-10-10T02:00:03.0894011Z Running dynamo/test_modes 1/1 ... [2025-10-10 02:00:03.088771] 2025-10-10T02:00:03.0894616Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:00:03.0896179Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_modes.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:00:03.089212] 2025-10-10T02:00:03.5756859Z Running inductor/test_auto_functionalize 1/1 ... 
[2025-10-10 02:00:03.575141] 2025-10-10T02:00:03.5757517Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:00:03.5759358Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_auto_functionalize.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:00:03.575515] 2025-10-10T02:00:07.6986354Z 2025-10-10T02:00:07.6987627Z inductor/test_auto_functionalize 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_auto_functionalize_1.1_d3b7a61204f59239_.log 2025-10-10T02:00:07.7002325Z Running 39 items in this shard: test/inductor/test_auto_functionalize.py::AutoFunctionalizeTests::test_alias, test/inductor/test_auto_functionalize.py::AutoFunctionalizeTests::test_alias2, test/inductor/test_auto_functionalize.py::AutoFunctionalizeTests::test_alias2_dynamic, test/inductor/test_auto_functionalize.py::AutoFunctionalizeTests::test_alias_id_input_to_custom_op, test/inductor/test_auto_functionalize.py::AutoFunctionalizeTests::test_alias_id_output, test/inductor/test_auto_functionalize.py::AutoFunctionalizeTests::test_auto_functionalize_can_with_default, test/inductor/test_auto_functionalize.py::AutoFunctionalizeTests::test_auto_functionalize_can_with_none_return, test/inductor/test_auto_functionalize.py::AutoFunctionalizeTests::test_auto_functionalize_extra1, test/inductor/test_auto_functionalize.py::AutoFunctionalizeTests::test_auto_functionalize_extra2, test/inductor/test_auto_functionalize.py::AutoFunctionalizeTests::test_auto_functionalize_extra3, test/inductor/test_auto_functionalize.py::AutoFunctionalizeTests::test_auto_functionalize_extra4, test/inductor/test_auto_functionalize.py::AutoFunctionalizeTests::test_auto_functionalize_extra5, test/inductor/test_auto_functionalize.py::AutoFunctionalizeTests::test_auto_functionalize_old, 
test/inductor/test_auto_functionalize.py::AutoFunctionalizeTests::test_auto_functionalize_on_view, test/inductor/test_auto_functionalize.py::AutoFunctionalizeTests::test_auto_functionalize_optional_old, test/inductor/test_auto_functionalize.py::AutoFunctionalizeTests::test_auto_functionalize_optional_v2, test/inductor/test_auto_functionalize.py::AutoFunctionalizeTests::test_auto_functionalize_self_as_mutate_arg, test/inductor/test_auto_functionalize.py::AutoFunctionalizeTests::test_auto_functionalize_tensorlist, test/inductor/test_auto_functionalize.py::AutoFunctionalizeTests::test_auto_functionalize_v2, test/inductor/test_auto_functionalize.py::AutoFunctionalizeTests::test_auto_functionalize_with_returns_old, test/inductor/test_auto_functionalize.py::AutoFunctionalizeTests::test_auto_functionalize_with_returns_v2, test/inductor/test_auto_functionalize.py::AutoFunctionalizeTests::test_can_auto_functionalize, test/inductor/test_auto_functionalize.py::AutoFunctionalizeTests::test_dynamic2_v2, test/inductor/test_auto_functionalize.py::AutoFunctionalizeTests::test_dynamic3_v2, test/inductor/test_auto_functionalize.py::AutoFunctionalizeTests::test_dynamic_v2, test/inductor/test_auto_functionalize.py::AutoFunctionalizeTests::test_graph_input_is_view, test/inductor/test_auto_functionalize.py::AutoFunctionalizeTests::test_inference_mode1_v2, test/inductor/test_auto_functionalize.py::AutoFunctionalizeTests::test_inference_mode2_v2, test/inductor/test_auto_functionalize.py::AutoFunctionalizeTests::test_inference_mode3_v2, test/inductor/test_auto_functionalize.py::AutoFunctionalizeTests::test_inference_mode4_v2, test/inductor/test_auto_functionalize.py::AutoFunctionalizeTests::test_inference_mode_view, test/inductor/test_auto_functionalize.py::AutoFunctionalizeTests::test_recompile, test/inductor/test_auto_functionalize.py::AutoFunctionalizeTests::test_scheduling_with_multiple_mutates, test/inductor/test_auto_functionalize.py::AutoFunctionalizeTests::test_slice, 
test/inductor/test_auto_functionalize.py::AutoFunctionalizeTests::test_slice_dynamic, test/inductor/test_auto_functionalize.py::AutoFunctionalizeTests::test_split, test/inductor/test_auto_functionalize.py::AutoFunctionalizeTests::test_split_dynamic, test/inductor/test_auto_functionalize.py::AutoFunctionalizeTests::test_try_use_slice, test/inductor/test_auto_functionalize.py::AutoFunctionalizeTests::test_unbacked_auto_functionalize_op 2025-10-10T02:00:07.7016197Z 2025-10-10T02:00:11.5894622Z Running inductor/test_torchinductor_codegen_config_overrides 1/1 ... [2025-10-10 02:00:11.588894] 2025-10-10T02:00:11.5895401Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:00:11.5897571Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_codegen_config_overrides.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:00:11.589330] 2025-10-10T02:00:18.9205486Z 2025-10-10T02:00:18.9206755Z inductor/test_torchinductor_codegen_config_overrides 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_codegen_config_overrides_1.1_ed505f213deffd89_.log 2025-10-10T02:00:18.9209137Z Running 3 items in this shard: test/inductor/test_torchinductor_codegen_config_overrides.py::CodegenInductorTest::test_force_pointwise_cat_force_pointwise_cat_False, test/inductor/test_torchinductor_codegen_config_overrides.py::CodegenInductorTest::test_force_pointwise_cat_force_pointwise_cat_True, test/inductor/test_torchinductor_codegen_config_overrides.py::CodegenInductorTest::test_kernel_fusion_thresholds 2025-10-10T02:00:18.9210892Z 2025-10-10T02:00:22.8328125Z Running dynamo/test_profiler 1/1 ... 
[2025-10-10 02:00:22.832170] 2025-10-10T02:00:22.8328838Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:00:22.8331023Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_profiler.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:00:22.832587] 2025-10-10T02:00:24.2490395Z 2025-10-10T02:00:24.2491676Z dynamo/test_modes 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_modes_1.1_4dd8be0f57b8efb5_.log 2025-10-10T02:00:24.2501615Z Running 28 items in this shard: test/dynamo/test_modes.py::TorchDispatchModeTests::test_skip_torch_dispatch_modes, test/dynamo/test_modes.py::TorchDispatchModeTests::test_torch_dispatch_ignore_compile_internals, test/dynamo/test_modes.py::TorchFunctionModeTests::test_builtin_equivalent_funcs, test/dynamo/test_modes.py::TorchFunctionModeTests::test_error_empty_stack_pop_torch_function_mode, test/dynamo/test_modes.py::TorchFunctionModeTests::test_expand, test/dynamo/test_modes.py::TorchFunctionModeTests::test_flex_attention, test/dynamo/test_modes.py::TorchFunctionModeTests::test_hop, test/dynamo/test_modes.py::TorchFunctionModeTests::test_hop_eager, test/dynamo/test_modes.py::TorchFunctionModeTests::test_intermedate_torch_function_mode_construction_mutation, test/dynamo/test_modes.py::TorchFunctionModeTests::test_is_torch_function_all_disabled, test/dynamo/test_modes.py::TorchFunctionModeTests::test_len_torch_function_mode, test/dynamo/test_modes.py::TorchFunctionModeTests::test_nested_torch_function_mode, test/dynamo/test_modes.py::TorchFunctionModeTests::test_pop_torch_function_mode, test/dynamo/test_modes.py::TorchFunctionModeTests::test_push_torch_function_mode, test/dynamo/test_modes.py::TorchFunctionModeTests::test_register_hook, test/dynamo/test_modes.py::TorchFunctionModeTests::test_stack_state_clear_default_device, 
test/dynamo/test_modes.py::TorchFunctionModeTests::test_stack_state_mutation_default_device, test/dynamo/test_modes.py::TorchFunctionModeTests::test_torch_function_mode_and_pop_graph_break, test/dynamo/test_modes.py::TorchFunctionModeTests::test_torch_function_mode_and_pop_graph_break_mutation, test/dynamo/test_modes.py::TorchFunctionModeTests::test_torch_function_mode_disable, test/dynamo/test_modes.py::TorchFunctionModeTests::test_torch_function_mode_enabled_guard, test/dynamo/test_modes.py::TorchFunctionModeTests::test_torch_function_mode_enter_exit, test/dynamo/test_modes.py::TorchFunctionModeTests::test_torch_function_mode_graph_break, test/dynamo/test_modes.py::TorchFunctionModeTests::test_torch_function_mode_guards_cpp, test/dynamo/test_modes.py::TorchFunctionModeTests::test_torch_function_mode_guards_py, test/dynamo/test_modes.py::TorchFunctionModeTests::test_torch_function_mode_highest_priority, test/dynamo/test_modes.py::TorchFunctionModeTests::test_torch_function_mode_preserves_cuda_rng_state, test/dynamo/test_modes.py::TorchFunctionModeTests::test_torch_function_mode_restore_on_exc 2025-10-10T02:00:24.2510586Z 2025-10-10T02:00:26.9052190Z 2025-10-10T02:00:26.9053155Z dynamo/test_profiler 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_profiler_1.1_cb36058f70d351c9_.log 2025-10-10T02:00:26.9057168Z Running 10 items in this shard: test/dynamo/test_profiler.py::DynamoProfilerTests::test_dynamo_timed_profiling_backend_compile, test/dynamo/test_profiler.py::DynamoProfilerTests::test_dynamo_timed_profiling_isolated, test/dynamo/test_profiler.py::DynamoProfilerTests::test_execution_trace_dynamic_shapes, test/dynamo/test_profiler.py::DynamoProfilerTests::test_profile_dynamic_shapes_compilation, test/dynamo/test_profiler.py::DynamoProfilerTests::test_profile_dynamic_shapes_list_compilation, test/dynamo/test_profiler.py::DynamoProfilerTests::test_profile_dynamic_shapes_runtime, 
test/dynamo/test_profiler.py::DynamoProfilerTests::test_profiler_cache_lookup, test/dynamo/test_profiler.py::DynamoProfilerTests::test_profiler_cache_lookup_profiler_step, test/dynamo/test_profiler.py::DynamoProfilerTests::test_profiler_dynamo_compiled_region, test/dynamo/test_profiler.py::DynamoProfilerTests::test_profiler_enabled_export 2025-10-10T02:00:26.9060527Z 2025-10-10T02:00:28.1697714Z Running dynamo/test_global 1/1 ... [2025-10-10 02:00:28.169242] 2025-10-10T02:00:28.1699995Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:00:28.1702025Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_global.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:00:28.169823] 2025-10-10T02:00:30.8183569Z Running inductor/test_inductor_freezing 1/1 ... [2025-10-10 02:00:30.817803] 2025-10-10T02:00:30.8184384Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:00:30.8185775Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_inductor_freezing.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:00:30.818185] 2025-10-10T02:00:32.2126171Z 2025-10-10T02:00:32.2127203Z dynamo/test_global 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_global_1.1_43e046d083585564_.log 2025-10-10T02:00:32.2130729Z Running 12 items in this shard: test/dynamo/test_global.py::TestGlobals::test_store_global_1, test/dynamo/test_global.py::TestGlobals::test_store_global_2, test/dynamo/test_global.py::TestGlobals::test_store_global_cross_file, test/dynamo/test_global.py::TestGlobals::test_store_global_crossfile_inline, test/dynamo/test_global.py::TestGlobals::test_store_global_dict, test/dynamo/test_global.py::TestGlobals::test_store_global_dict_2, test/dynamo/test_global.py::TestGlobals::test_store_global_inline_1, test/dynamo/test_global.py::TestGlobals::test_store_global_inline_2, test/dynamo/test_global.py::TestGlobals::test_store_global_list, test/dynamo/test_global.py::TestGlobals::test_store_global_list_2, test/dynamo/test_global.py::TestGlobals::test_store_global_new, test/dynamo/test_global.py::TestGlobals::test_store_global_object 2025-10-10T02:00:32.2133567Z 2025-10-10T02:00:36.0809087Z Running dynamo/test_model_output 1/1 ... [2025-10-10 02:00:36.080220] 2025-10-10T02:00:36.0809646Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:00:36.0810963Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_model_output.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:00:36.080598] 2025-10-10T02:00:38.4488283Z 2025-10-10T02:00:38.4489643Z inductor/test_inductor_freezing 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_inductor_freezing_1.1_74a06e8cebe9e633_.log 2025-10-10T02:00:38.4507024Z Running 48 items in this shard: test/inductor/test_inductor_freezing.py::FreezingCpuTests::test_aliased_param_return_cpu, test/inductor/test_inductor_freezing.py::FreezingCpuTests::test_autocast_cpu, test/inductor/test_inductor_freezing.py::FreezingCpuTests::test_conv_bn_with_multi_bn_share_conv_cpu, test/inductor/test_inductor_freezing.py::FreezingCpuTests::test_conv_functional_bn_with_multi_bn_share_conv_cpu, test/inductor/test_inductor_freezing.py::FreezingCpuTests::test_conv_layout_convert_with_view_cpu, test/inductor/test_inductor_freezing.py::FreezingCpuTests::test_conv_multiple_uses_cpu, test/inductor/test_inductor_freezing.py::FreezingCpuTests::test_conv_weight_layout_convert_cpu, test/inductor/test_inductor_freezing.py::FreezingCpuTests::test_conv_with_as_strided_cpu, test/inductor/test_inductor_freezing.py::FreezingCpuTests::test_cpp_wrapper_cpu, test/inductor/test_inductor_freezing.py::FreezingCpuTests::test_dont_change_dtype_folding_cpu, test/inductor/test_inductor_freezing.py::FreezingCpuTests::test_error_on_eager_cpu, test/inductor/test_inductor_freezing.py::FreezingCpuTests::test_folded_conv_bn_cpu, test/inductor/test_inductor_freezing.py::FreezingCpuTests::test_folded_conv_bn_hardswish_cpu, test/inductor/test_inductor_freezing.py::FreezingCpuTests::test_folded_conv_bn_with_module_sharing_cpu, test/inductor/test_inductor_freezing.py::FreezingCpuTests::test_folded_conv_functional_bn_with_module_sharing_cpu, test/inductor/test_inductor_freezing.py::FreezingCpuTests::test_mm_concat_cpu, test/inductor/test_inductor_freezing.py::FreezingCpuTests::test_mutation_cpu, test/inductor/test_inductor_freezing.py::FreezingCpuTests::test_param_deallocated_cpu, 
test/inductor/test_inductor_freezing.py::FreezingCpuTests::test_redundant_clone_for_layout_convert_cpu, test/inductor/test_inductor_freezing.py::FreezingCpuTests::test_rng_op_cpu, test/inductor/test_inductor_freezing.py::FreezingCpuTests::test_static_indices_cudagraph_cpu, test/inductor/test_inductor_freezing.py::FreezingCpuTests::test_symint_not_folded_cpu, test/inductor/test_inductor_freezing.py::FreezingCpuTests::test_unequal_bias_horizontal_addmm_fusion_cpu, test/inductor/test_inductor_freezing.py::FreezingCpuTests::test_unfolded_bn_cpu, test/inductor/test_inductor_freezing.py::FreezingGpuTests::test_aliased_param_return_cuda, test/inductor/test_inductor_freezing.py::FreezingGpuTests::test_autocast_cuda, test/inductor/test_inductor_freezing.py::FreezingGpuTests::test_conv_bn_with_multi_bn_share_conv_cuda, test/inductor/test_inductor_freezing.py::FreezingGpuTests::test_conv_functional_bn_with_multi_bn_share_conv_cuda, test/inductor/test_inductor_freezing.py::FreezingGpuTests::test_conv_layout_convert_with_view_cuda, test/inductor/test_inductor_freezing.py::FreezingGpuTests::test_conv_multiple_uses_cuda, test/inductor/test_inductor_freezing.py::FreezingGpuTests::test_conv_weight_layout_convert_cuda, test/inductor/test_inductor_freezing.py::FreezingGpuTests::test_conv_with_as_strided_cuda, test/inductor/test_inductor_freezing.py::FreezingGpuTests::test_cpp_wrapper_cuda, test/inductor/test_inductor_freezing.py::FreezingGpuTests::test_dont_change_dtype_folding_cuda, test/inductor/test_inductor_freezing.py::FreezingGpuTests::test_error_on_eager_cuda, test/inductor/test_inductor_freezing.py::FreezingGpuTests::test_folded_conv_bn_cuda, test/inductor/test_inductor_freezing.py::FreezingGpuTests::test_folded_conv_bn_hardswish_cuda, test/inductor/test_inductor_freezing.py::FreezingGpuTests::test_folded_conv_bn_with_module_sharing_cuda, test/inductor/test_inductor_freezing.py::FreezingGpuTests::test_folded_conv_functional_bn_with_module_sharing_cuda, 
test/inductor/test_inductor_freezing.py::FreezingGpuTests::test_mm_concat_cuda, test/inductor/test_inductor_freezing.py::FreezingGpuTests::test_mutation_cuda, test/inductor/test_inductor_freezing.py::FreezingGpuTests::test_param_deallocated_cuda, test/inductor/test_inductor_freezing.py::FreezingGpuTests::test_redundant_clone_for_layout_convert_cuda, test/inductor/test_inductor_freezing.py::FreezingGpuTests::test_rng_op_cuda, test/inductor/test_inductor_freezing.py::FreezingGpuTests::test_static_indices_cudagraph_cuda, test/inductor/test_inductor_freezing.py::FreezingGpuTests::test_symint_not_folded_cuda, test/inductor/test_inductor_freezing.py::FreezingGpuTests::test_unequal_bias_horizontal_addmm_fusion_cuda, test/inductor/test_inductor_freezing.py::FreezingGpuTests::test_unfolded_bn_cuda 2025-10-10T02:00:38.4522936Z 2025-10-10T02:00:40.2050006Z 2025-10-10T02:00:40.2051483Z dynamo/test_model_output 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_model_output_1.1_ed271d65aed54ae0_.log 2025-10-10T02:00:40.2060031Z Running 18 items in this shard: test/dynamo/test_model_output.py::TestHFPretrained::test_pretrained, test/dynamo/test_model_output.py::TestHFPretrained::test_pretrained_non_const_attr, test/dynamo/test_model_output.py::TestModelOutput::test_mo_assign, test/dynamo/test_model_output.py::TestModelOutput::test_mo_create, test/dynamo/test_model_output.py::TestModelOutput::test_mo_from_outside, test/dynamo/test_model_output.py::TestModelOutput::test_mo_getattr, test/dynamo/test_model_output.py::TestModelOutput::test_mo_getattr_missing, test/dynamo/test_model_output.py::TestModelOutput::test_mo_getitem, test/dynamo/test_model_output.py::TestModelOutput::test_mo_index, test/dynamo/test_model_output.py::TestModelOutput::test_mo_init, test/dynamo/test_model_output.py::TestModelOutput::test_mo_init2, test/dynamo/test_model_output.py::TestModelOutput::test_mo_init_with_disable, 
test/dynamo/test_model_output.py::TestModelOutput::test_mo_newkey, test/dynamo/test_model_output.py::TestModelOutput::test_mo_reconstruct_bytecode, test/dynamo/test_model_output.py::TestModelOutput::test_mo_tuple, test/dynamo/test_model_output.py::TestModelOutput::test_none, test/dynamo/test_model_output.py::TestModelOutput::test_reconstruction, test/dynamo/test_model_output.py::TestModelOutputBertCUDA::test_HF_bert_model_output_cuda 2025-10-10T02:00:40.2068273Z 2025-10-10T02:00:42.4209438Z Running export/test_torchbind 1/1 ... [2025-10-10 02:00:42.420370] 2025-10-10T02:00:42.4209873Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:00:42.4211466Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_torchbind.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:00:42.420761] 2025-10-10T02:00:44.1302132Z Running dynamo/test_nested_graph_breaks 1/1 ... [2025-10-10 02:00:44.129517] 2025-10-10T02:00:44.1302608Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:00:44.1303657Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_nested_graph_breaks.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:00:44.129968] 2025-10-10T02:00:48.2041670Z 2025-10-10T02:00:48.2042879Z dynamo/test_nested_graph_breaks 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_nested_graph_breaks_1.1_df6a649fed81d0dc_.log 2025-10-10T02:00:48.2050580Z Running 16 items in this shard: test/dynamo/test_nested_graph_breaks.py::NestedGraphBreakTests::test_cells, test/dynamo/test_nested_graph_breaks.py::NestedGraphBreakTests::test_cells_double_graph_break, test/dynamo/test_nested_graph_breaks.py::NestedGraphBreakTests::test_differing_arg_nums, test/dynamo/test_nested_graph_breaks.py::NestedGraphBreakTests::test_differing_locals_nums, test/dynamo/test_nested_graph_breaks.py::NestedGraphBreakTests::test_doubly_nested_graph_break, test/dynamo/test_nested_graph_breaks.py::NestedGraphBreakTests::test_inactive_ctx_manager, test/dynamo/test_nested_graph_breaks.py::NestedGraphBreakTests::test_nested_graph_break_in_loop, test/dynamo/test_nested_graph_breaks.py::NestedGraphBreakTests::test_nested_graph_break_in_try_block, test/dynamo/test_nested_graph_breaks.py::NestedGraphBreakTests::test_nested_step_unsupported, test/dynamo/test_nested_graph_breaks.py::NestedGraphBreakTests::test_no_recompiles, test/dynamo/test_nested_graph_breaks.py::NestedGraphBreakTests::test_side_effects_cells, test/dynamo/test_nested_graph_breaks.py::NestedGraphBreakTests::test_side_effects_globals, test/dynamo/test_nested_graph_breaks.py::NestedGraphBreakTests::test_side_effects_globals_different_module, test/dynamo/test_nested_graph_breaks.py::NestedGraphBreakTests::test_single_graph_break, test/dynamo/test_nested_graph_breaks.py::NestedGraphBreakTests::test_single_graph_break_repeat, test/dynamo/test_nested_graph_breaks.py::NestedGraphBreakTests::test_supported_ctx_manager 2025-10-10T02:00:48.2056137Z 2025-10-10T02:00:52.1680308Z Running dynamo/test_backward_higher_order_ops 1/1 ... 
[2025-10-10 02:00:52.167376] 2025-10-10T02:00:52.1680826Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:00:52.1682049Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_backward_higher_order_ops.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:00:52.167839] 2025-10-10T02:00:56.3926801Z 2025-10-10T02:00:56.3928412Z dynamo/test_backward_higher_order_ops 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_backward_higher_order_ops_1.1_c7ef207536f78c55_.log 2025-10-10T02:00:56.3932018Z Running 7 items in this shard: test/dynamo/test_backward_higher_order_ops.py::BackwardHigherOrderOpTests::test_invoke_in_eager, test/dynamo/test_backward_higher_order_ops.py::BackwardHigherOrderOpTests::test_invoke_in_pt2, test/dynamo/test_backward_higher_order_ops.py::BackwardHigherOrderOpTests::test_invoke_in_pt2_compiled_autograd, test/dynamo/test_backward_higher_order_ops.py::BackwardHigherOrderOpTests::test_invoke_in_pt2_compiled_autograd_graph_breaks, test/dynamo/test_backward_higher_order_ops.py::BackwardHigherOrderOpTests::test_invoke_in_pt2_compiled_autograd_side_effect, test/dynamo/test_backward_higher_order_ops.py::BackwardHigherOrderOpTests::test_invoke_make_bw, test/dynamo/test_backward_higher_order_ops.py::BackwardHigherOrderOpTests::test_invoke_make_fx_forward_contrived 2025-10-10T02:00:56.3934972Z 2025-10-10T02:01:00.3464722Z Running export/test_passes 1/1 ... [2025-10-10 02:01:00.345764] 2025-10-10T02:01:00.3465345Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:01:00.3466575Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_passes.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:01:00.346249] 2025-10-10T02:01:05.8740310Z 2025-10-10T02:01:05.8741244Z export/test_passes 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_passes_1.1_f2eef6b4e8045550_.log 2025-10-10T02:01:05.8749416Z Running 28 items in this shard: test/export/test_passes.py::TestPasses::test_constant_folding_pass, test/export/test_passes.py::TestPasses::test_custom_obj_tuple_out, test/export/test_passes.py::TestPasses::test_fakify_script_objects, test/export/test_passes.py::TestPasses::test_fakify_script_objects_properly_handle_containers, test/export/test_passes.py::TestPasses::test_functionalization_with_view_copy, test/export/test_passes.py::TestPasses::test_inline_, test/export/test_passes.py::TestPasses::test_math_ops, test/export/test_passes.py::TestPasses::test_move_device_example_inputs, test/export/test_passes.py::TestPasses::test_move_device_submod, test/export/test_passes.py::TestPasses::test_move_device_to, test/export/test_passes.py::TestPasses::test_move_to_device_pass, test/export/test_passes.py::TestPasses::test_predispatch_autocast, test/export/test_passes.py::TestPasses::test_predispatch_autocast_and_set_grad, test/export/test_passes.py::TestPasses::test_predispatch_set_grad, test/export/test_passes.py::TestPasses::test_remove_auto_functionalized_pass, test/export/test_passes.py::TestPasses::test_remove_auto_functionalized_pass_tuple, test/export/test_passes.py::TestPasses::test_remove_effect_token_kwargs, test/export/test_passes.py::TestPasses::test_runtime_assert_inline_constraints_for_cond, test/export/test_passes.py::TestPasses::test_runtime_assert_inline_constraints_for_item, test/export/test_passes.py::TestPasses::test_runtime_assert_inline_constraints_for_nonzero, test/export/test_passes.py::TestPasses::test_runtime_assert_multiple_dims, test/export/test_passes.py::TestPasses::test_runtime_assert_one_dim, test/export/test_passes.py::TestPasses::test_runtime_assert_some_dims_not_specified, 
test/export/test_passes.py::TestPasses::test_runtime_assert_some_inps_not_used, test/export/test_passes.py::TestPasses::test_sequential_split, test/export/test_passes.py::TestPasses::test_sequential_split_graph, test/export/test_passes.py::TestPasses::test_view_to_view_copy, test/export/test_passes.py::TestPasses::test_views_op_having_view_copy 2025-10-10T02:01:05.8756685Z 2025-10-10T02:01:06.5376363Z 2025-10-10T02:01:06.5377632Z inductor/test_compiled_autograd 1/2 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_compiled_autograd_1.2_4ebc3f903fd9dde5_.log 2025-10-10T02:01:06.5566489Z Running 429 items in this shard: test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_accumulate_grad_polyfill_case_1_1, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_accumulate_grad_polyfill_case_1_2, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_accumulate_grad_polyfill_case_1_3, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_accumulate_grad_polyfill_case_1_5_2, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_accumulate_grad_polyfill_case_3_1, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_accumulate_grad_polyfill_case_3_2, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_anomaly_mode_already_nan, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_anomaly_mode_backward, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_anomaly_mode_grad, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_autograd_cpp_node_basic_is_traceable_True, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_autograd_cpp_node_data_dependent_is_traceable_True, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_autograd_cpp_node_id_is_traceable_True, 
test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_autograd_cpp_node_non_traceable, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_autograd_cpp_node_saved_dynamic_is_traceable_True, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_autograd_cpp_node_saved_float_is_traceable_True, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_autograd_cpp_node_saved_int_is_traceable_False, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_autograd_cpp_node_saved_int_is_traceable_True, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_backward_hook_relative_ordering_partial, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_cache_hit, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_checkpointing_sac, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_checkpointing_simple_reentrant_False, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_checkpointing_simple_reentrant_True, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_compile_api_api_compile_backend_aot_eager, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_compile_api_api_compile_backend_eager, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_compile_api_api_compile_backend_inductor, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_compile_api_api_optimize_backend_aot_eager, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_compile_api_disable_api_compile_backend_eager, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_compile_api_disable_api_compile_backend_inductor, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_compiled_autograd_does_not_specialize_on_bw_symints, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_cpu_offloading, 
test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_cudagraphs_cpu_graph, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_cudagraphs_cpu_scalar_used_in_cpp_custom_op, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_cudagraphs_cpu_scalar_used_in_python_custom_op, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_cudagraphs_sdpa, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_custom_fn_bw_graph_break, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_custom_fn_compiled_fw_bw_graph_break, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_custom_fn_dynamically_defined_class, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_custom_fn_multiple_grads, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_custom_fn_saved_attr, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_custom_fn_saved_multiple_tensors, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_custom_fn_saved_tensors, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_ddp_cpp_reducer_error, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_ddp_python_reducer, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_disk_offloading, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_dynamic_shapes_annotations, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_dynamic_shapes_eager_node, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_dynamo_boxed, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_flex_attention, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_free_activation_memory_subclass, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_higher_order_gradients, 
test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_hipify_not_loaded_with_import_cpp_extension, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_hipify_not_loaded_with_import_torch, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_inplace_grad_update, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_inputs_aliasing_bytecode_stack_restore, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_issue106555, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_keep_graph_usage_after_compiled, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_logging_tensor_flaky, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_optimize_assert_backend_aot_eager, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_optimize_assert_backend_eager, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_optimize_assert_backend_inductor, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_output_nodes_all_leaves, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_reorder_multi_pre_hooks, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_reorder_multi_tensor_pre_hooks, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_reset, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_saved_tensor_unpack_hook_ordering, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_tensor_grad_hook1, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_tensor_grad_hook2, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_torch_compile_only_backward_call, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_torch_function_mode, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_trace_run_with_rng_state, 
test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_verbose_logs_aot_dispatcher_nodes, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_verbose_logs_aot_dispatcher_nodes_hop, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_verbose_logs_cpp, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_verbose_logs_dynamic_shapes, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_verbose_logs_snapshot, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_access_saved_tensor_twice_without_recomputation_works, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_accumulate_grad, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_accumulate_grad_posthooks_can_observe_tensor_prehook, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_accumulate_grad_posthooks_should_not_execute, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_accumulate_grad_with_zero_numel_grad, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_anomaly_assign_parent_cleanup, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_anomaly_detect_nan, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_anomaly_mode_no_check_nan, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_autograd_inplace_view_of_view, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_autograd_inplace_views_creation_meta, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_autograd_inplace_views_cross_dtype, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_autograd_multiple_views_python, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_autograd_simple_views_python, 
test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_autograd_views_codegen, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_backward_badcalls, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_backward_copy, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_backward_create_graph_warns, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_backward_hook_relative_ordering, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_backward_no_grad, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_backward_twice_retained_graph_with_saved_values, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_backward_twice_with_saved_values, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_backward_with_inputs, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_calculate_shape_util, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_callback_adds_callback, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_cant_create_saved_tensors, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_checkpoint_detects_non_determinism, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_checkpoint_valid_reset_on_error, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_checkpointing, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_checkpointing_without_reentrant_correct_grad, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_checkpointing_without_reentrant_custom_function_works, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_checkpointing_without_reentrant_dataparallel, 
test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_checkpointing_without_reentrant_detached_tensor_use_reentrant_False, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_checkpointing_without_reentrant_input_requires_grad_False, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_checkpointing_without_reentrant_input_requires_grad_True, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_checkpointing_without_reentrant_memory_savings, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_create_graph_and_full_backward_hook_cycle, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_current_graph_task_execution_order, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_custom_autograd_ac_early_stop, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_custom_autograd_no_early_free, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_custom_autograd_repeated_grad_grad, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_custom_function_cycle, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_custom_function_error, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_custom_function_exception, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_custom_function_forward_mode_non_differentiable, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_custom_function_forward_mode_non_tensor_before_tensor_args, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_custom_function_forward_mode_wrong_formula, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_custom_function_mark_dirty_not_differentiable, 
test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_custom_function_preserve_torch_function_when_return_as_is, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_custom_function_saved_tensors, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_custom_function_saving_mutated_view_no_leak, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_custom_function_setup_context_simple, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_custom_function_vmap_defaults, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_deep_reentrant, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_dep_nograd, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_dependent_backward, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_detach_base, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_detach_then_inplace_raises_in_autograd, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_disabling_saved_tensor_hooks, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_disabling_saved_tensor_hooks_nested, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_duplicate_backward_root, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_enable_grad_decorator_no_paren, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_first_grad_fn_access_in_no_grad_mode, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_free_deep_graph_complicated, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_free_deep_graph_pyfunction, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_function, 
test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_get_data_and_hooks_from_raw_saved_variable, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_grad, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_grad_batched_grad, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_grad_empty_inputs, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_grad_fn_badcalls, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_grad_fn_input_metadata, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_grad_fn_prehooks, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_grad_fn_prehooks_multiple_outputs, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_grad_nonleaf, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_grad_nonleaf_register_hook, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_grad_to_node, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_grad_to_node_inplace, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_grad_to_node_materialize, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_grad_unreachable_discovery, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_gradcheck_check_batched_grad, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_gradcheck_check_forward_or_backward_only, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_gradcheck_complex_non_complex_outputs, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_gradcheck_custom_error, 
test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_gradcheck_dense_and_sparse_inputs, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_gradcheck_forward_ad, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_gradcheck_forward_ad_respects_requires_grad, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_gradcheck_forward_ad_runs_with_no_requires_grad, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_gradcheck_input_layout2, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_gradcheck_input_layout4, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_gradcheck_output_shape_or_dtype_depend_on_values, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_gradcheck_test_outputs, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_gradcheck_validates_inputs, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_graph_save_on_cpu, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_hook_edge_case_when_called_with_grad, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_hook_none, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_hooks_cpp, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_indexing, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_inplace_not_requires_grad, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_inplace_on_view_backward, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_inplace_on_view_leaf_errors, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_inplace_on_view_weak_grad_fn, 
test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_integer_outputs, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_legacy_function_deprecation_exception, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_lobpcg, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_mark_non_differentiable, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_materialize_grads, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_multi_backward, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_multi_backward_no_grad, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_named_tensor_for_complex_views, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_naughty_anomaly_access, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_naughty_autograd_function_stashing_ctx, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_nested_anomaly_printstack_cleanup, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_next_functions, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_no_grad_python_function, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_no_requires_grad_inplace, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_no_unnecessary_save, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_not_implemented_fwad, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_pickle, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_post_accumulate_grad_hook_gets_cleaned_up, 
test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_post_accumulate_grad_hook_returns_not_None, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_pow_zero_tensor_gradient, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_power_function, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_prehook_ordering, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_profiler, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_profiler_aggregation_table, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_profiler_function_event_avg, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_profiler_seq_nr, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_profiler_shapes, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_record_function, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_reentrant_child_error, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_reentrant_with_callbacks_depth_0, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_reentrant_with_leaf_variable_hook, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_requires_grad_, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_retain_grad, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_retain_grad_cycle, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_retains_grad_inplace_multiple_outputs, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_return_duplicate, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_return_duplicate_inplace, 
test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_return_leaf, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_save_none_for_backward, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_save_on_cpu_and_checkpoint, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_save_output_nr, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_saved_tensor_hooks_custom_function_intermediates, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_saved_tensor_hooks_extra_enter_during_bw_no_leak, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_saved_variable_packing_unpacking_did_not_save_original_with_hooks, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_saved_variable_packing_unpacking_saved_original_with_default_hooks, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_saved_variable_version_counter, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_scalar_grad_mixed_device, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_select_expanded_v, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_set_data_tensorimpl_type, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_set_grad_coroutines_benign_exceptions, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_set_grad_enabled_wraps, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_set_grad_generator_functions, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_set_materialize_non_diff_grads, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_shape, 
test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_sharded_grad, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_sparse_gather_both_scalar, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_sparse_gather_dim_neg, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_sparse_gather_ind_scalar, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_tensor_grad_warnings, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_tensor_hooks_inplace_multiple_outputs, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_thread_shutdown, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_too_many_grads, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_unrelated_inputs, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_unused_output, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_var_mean_differentiable, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_version_counter, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_view_func_replay_with_modified_state, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_volatile_deprecated, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_will_engine_execute_node, test/inductor/test_compiled_autograd.py::TestNestedCheckpointWithCompiledAutograd::test_nested_checkpoint_early_stop_False, test/inductor/test_compiled_autograd.py::TestNestedCheckpointWithCompiledAutograd::test_nested_checkpoint_early_stop_True, test/inductor/test_compiled_autograd.py::TestNestedCheckpointWithCompiledAutograd::test_nested_checkpoint_kwargs_early_stop_True, 
test/inductor/test_compiled_autograd.py::TestNestedCheckpointWithCompiledAutograd::test_nested_checkpoint_non_tensor_inputs_and_outputs_early_stop_True, test/inductor/test_compiled_autograd.py::TestNestedCheckpointWithCompiledAutograd::test_nested_checkpoint_reentrant_backwards_early_stop_False, test/inductor/test_compiled_autograd.py::TestNestedCheckpointWithCompiledAutograd::test_nested_checkpoint_reentrant_backwards_early_stop_True, test/inductor/test_compiled_autograd.py::TestNestedCheckpointWithCompiledAutograd::test_nested_checkpoint_same_graph_early_stop_True, test/inductor/test_compiled_autograd.py::TestNestedCheckpointWithCompiledAutograd::test_nested_checkpoint_two_children_early_stop_False, test/inductor/test_compiled_autograd.py::TestNestedCheckpointWithCompiledAutograd::test_nested_checkpoint_two_children_early_stop_True, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_abstract_impl_on_existing_op, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_abstract_impl_on_existing_op_with_CompositeExplicitAutograd, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_backward_dict_grad_for_nontensor, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_backward_impl_on_existing_op_incorrect_schema_mutable, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_backward_impl_on_existing_op_incorrect_schema_no_output, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_backward_impl_on_existing_op_with_key_key_AutogradCUDA, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_backward_output_differentiability_tensorlist, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_backward_tensorlist_input_requires_list_grads_with_same_numel, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_basic_make_fx, 
test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_data_dependent_basic, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_data_dependent_nms_dynamic_compile, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_defined_in_python, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_duplicate_impl, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_impl_abstract_overload, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_impl_device_cpu, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_impl_invalid_devices, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_impl_multiple, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_impl_on_existing_op_with_cpu_registration_key_CPU, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_impl_on_existing_op_with_cpu_registration_key_CompositeImplicitAutograd, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_impl_separate, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_infer_schema_supported, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_infer_schema_unsupported, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_invalid_qualname, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_invalid_schemas, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_is_functional_schema, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_is_tensorlist_like_type, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_legacy_define, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_legacy_impl, 
test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_meta_for_data_dependent_shape_operation, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_name_must_match, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_new_data_dependent_symint, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_override_impl, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_override_meta, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_private_ctor, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_supported_param_types, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_symints, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_unsupported_schemas, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_allow_python_side_effects_utility, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_capture_constants, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_capture_input_num, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_capture_numpy_number, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_capture_tracked, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_capture_untracked_global_nested, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_cond_branches_no_arguments, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_cond_free_variable_in_both_branches, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_cond_graph_break_in_one_branch, 
test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_cond_pytree_operands, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_cond_side_effect_in_one_branches, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_cond_source_fn_stack, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_cond_with_constant_pred, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_fallback_on_graph_break_simple, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_freevars_as_inputs_to_wrap, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_grad_source_fn_stack, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_hints_wrapper_no_hints, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_hopify_generic_wrap, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_internal_nonlocal, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_lift_tensors_with_compound_expressions, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_map_kwargs, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_map_lowers_to_graph, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_map_multi_return, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_map_pytree_return, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_map_source_fn_stack, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_map_subgraph_name_is_valid, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_nested_tuple_output, 
test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_nested_wrap, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_no_freevars, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_output_with_dict, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_register_subclass, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_return_captured_var, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_return_captured_var_used_multiple_times, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_return_captured_vars, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_side_effect_del_existing_attr_global_obj, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_side_effect_del_existing_attr_nonlocal_obj, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_side_effect_local_list_append_no_graph_break, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_side_effect_mutate_global_list, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_side_effect_mutate_global_num, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_side_effect_mutate_global_num_builtin, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_side_effect_mutate_global_tensor, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_side_effect_mutate_nonlocal_num, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_side_effect_mutate_nonlocal_num_builtin, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_side_effect_mutate_nonlocal_tensor_builtin, 
test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_side_effect_nested_nonlocal_list_append_graph_break, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_side_effect_nonlocal_list_append_graph_break, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_side_effect_set_existing_attr_global_module, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_side_effect_set_existing_attr_global_obj, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_side_effect_set_existing_attr_nonlocal_module, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_side_effect_set_new_attr_global_module, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_symint_in_slice, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_unbacked_symbol_closure, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_vmap_multiply_scalar, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_vmap_source_fn_stack, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_wrap_allow_local_assign_in_body_fn, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_wrap_kwarg, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_wrap_kwarg_default_else_branch, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_wrap_kwarg_only, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_wrap_kwarg_recompile, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_wrap_pytree_kwargs, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_wrap_source_fn_stack, 
test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_functional_call_sequential_params_and_buffers, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_grad_call_compiled_backward_fn, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_grad_call_torch_compile_fn, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_grad_fn_with_kwargs, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_grad_freevar_python_scalar, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_grad_freevar_tensor, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_grad_has_aux, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_grad_pytree, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_grad_recompile, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_grad_with_graph_break, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_grad_with_side_effect, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_hessian, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_hessian_argnums, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_jacfwd, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_jacfwd_has_aux, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_jacrev_has_aux, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_jacrev_two_tensors_argnums, 
test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_jvp_call_torch_compile_fn, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_jvp_freevar_tensor, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_jvp_has_aux, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_jvp_simple, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_jvp_two_tensors_has_aux, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_vjp_call_compiled_backward_fn, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_vjp_multiple_outputs, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_vjp_multiple_outputs_python_struct, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_vmap_call_torch_compile_fn, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_vmap_free_const, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_vmap_multiple_invocation_in_dims, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_vmap_multiple_invocation_out_dims, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_vmap_multiple_outputs_diff_dims, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_vmap_over_vmap_captured, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_vmap_pytree_inputs, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_vmap_recompile, 
test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_vmap_recompile_different_config, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_vmap_recompile_same_config, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_vmap_side_effects, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_vmap_side_effects_append_input, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_vmap_two_inputs, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_vmap_two_inputs_tuple_in_dims, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_vmap_with_conditional_graph_break, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_vmap_with_graph_break, test/inductor/test_compiled_autograd.py::ActivationCheckpointingTestsWithCompiledAutograd::test_cond_with_invalid_kwargs, test/inductor/test_compiled_autograd.py::ActivationCheckpointingTestsWithCompiledAutograd::test_dropout_inductor, test/inductor/test_compiled_autograd.py::ActivationCheckpointingTestsWithCompiledAutograd::test_flop_counter_for_cond, test/inductor/test_compiled_autograd.py::ActivationCheckpointingTestsWithCompiledAutograd::test_flop_counter_for_cond_unbalanced_branches, test/inductor/test_compiled_autograd.py::ActivationCheckpointingTestsWithCompiledAutograd::test_function, test/inductor/test_compiled_autograd.py::ActivationCheckpointingTestsWithCompiledAutograd::test_module, test/inductor/test_compiled_autograd.py::ActivationCheckpointingTestsWithCompiledAutograd::test_non_aliasing_util, test/inductor/test_compiled_autograd.py::TestDTensorCompileWithCompiledAutograd::test_device_mesh_compile, 
test/inductor/test_compiled_autograd.py::TestDTensorCompileWithCompiledAutograd::test_dtensor_basic_export, test/inductor/test_compiled_autograd.py::TestDTensorCompileWithCompiledAutograd::test_dtensor_constructor_w_dynamo_disable, test/inductor/test_compiled_autograd.py::TestDTensorCompileWithCompiledAutograd::test_dtensor_constructor_w_graph_break, test/inductor/test_compiled_autograd.py::TestDTensorCompileWithCompiledAutograd::test_dtensor_different_gradient_placement, test/inductor/test_compiled_autograd.py::TestDTensorCompileWithCompiledAutograd::test_dtensor_dont_recompile_on_same_placement_devicemesh, test/inductor/test_compiled_autograd.py::TestDTensorCompileWithCompiledAutograd::test_dtensor_dynamic, test/inductor/test_compiled_autograd.py::TestDTensorCompileWithCompiledAutograd::test_dtensor_dynamic_loss_parallel_log_softmax, test/inductor/test_compiled_autograd.py::TestDTensorCompileWithCompiledAutograd::test_dtensor_dynamic_slice, test/inductor/test_compiled_autograd.py::TestDTensorCompileWithCompiledAutograd::test_dtensor_dynamo_device_mesh_attrs, test/inductor/test_compiled_autograd.py::TestDTensorCompileWithCompiledAutograd::test_dtensor_partial_placement_graph_output, test/inductor/test_compiled_autograd.py::TestDTensorCompileWithCompiledAutograd::test_dtensor_partial_placement_redistribute_unbalanced_correct_strides, test/inductor/test_compiled_autograd.py::TestDTensorCompileWithCompiledAutograd::test_dynamo_dtensor, test/inductor/test_compiled_autograd.py::TestDTensorCompileWithCompiledAutograd::test_dynamo_dtensor_from_local_dynamic_shapes, test/inductor/test_compiled_autograd.py::TestDTensorCompileWithCompiledAutograd::test_dynamo_dtensor_from_local_redistribute, test/inductor/test_compiled_autograd.py::TestDTensorCompileWithCompiledAutograd::test_dynamo_dtensor_from_local_redistribute_async, test/inductor/test_compiled_autograd.py::TestDTensorCompileWithCompiledAutograd::test_dynamo_dtensor_recompile, 
test/inductor/test_compiled_autograd.py::TestDTensorCompileWithCompiledAutograd::test_dynamo_to_local_kwargs, test/inductor/test_compiled_autograd.py::TestDTensorCompileWithCompiledAutograd::test_dynamo_to_local_kwargs_forward_hook, test/inductor/test_compiled_autograd.py::TestDTensorCompileWithCompiledAutograd::test_fakify_dtensor, test/inductor/test_compiled_autograd.py::TestDTensorCompileWithCompiledAutograd::test_graph_input_is_async, test/inductor/test_compiled_autograd.py::TestDTensorCompileWithCompiledAutograd::test_placement_compile, test/inductor/test_compiled_autograd.py::TestDTensorCompileWithCompiledAutograd::test_unwrap_async_collective_tensor_tangent, test/inductor/test_compiled_autograd.py::TestCompiledAutogradOpInfoCUDA::test_hops_in_bwd_cond_simple_cuda_float32, test/inductor/test_compiled_autograd.py::TestCompiledAutogradOpInfoCUDA::test_hops_in_bwd_invoke_quant_packed_simple_cuda_float32, test/inductor/test_compiled_autograd.py::TestCompiledAutogradOpInfoCUDA::test_hops_in_bwd_invoke_subgraph_simple_cuda_float32, test/inductor/test_compiled_autograd.py::TestCompiledAutogradOpInfoCUDA::test_hops_in_bwd_map_nested_cuda_float32, test/inductor/test_compiled_autograd.py::TestCompiledAutogradOpInfoCUDA::test_hops_in_bwd_map_simple_cuda_float32, test/inductor/test_compiled_autograd.py::TestCompiledAutogradOpInfoCUDA::test_hops_in_bwd_while_loop_simple_cuda_float32 2025-10-10T02:01:06.5752740Z 2025-10-10T02:01:09.8826871Z Running inductor/test_torchbind 1/1 ... [2025-10-10 02:01:09.882093] 2025-10-10T02:01:09.8827334Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:01:09.8828565Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchbind.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:01:09.882520] 2025-10-10T02:01:10.4894981Z Running inductor/test_custom_partitioner_fn 1/1 ... 
[2025-10-10 02:01:10.488882] 2025-10-10T02:01:10.4895497Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:01:10.4896578Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_custom_partitioner_fn.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:01:10.489266] 2025-10-10T02:01:11.0981647Z 2025-10-10T02:01:11.0982675Z export/test_torchbind 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_torchbind_1.1_85b5c57de673f6fb_.log 2025-10-10T02:01:11.1021308Z Running 90 items in this shard: test/export/test_torchbind.py::TestExportTorchbind::test_aot_export_tensor_queue_operators, test/export/test_torchbind.py::TestExportTorchbind::test_attribute_as_custom_op_argument_pre_dispatch_False, test/export/test_torchbind.py::TestExportTorchbind::test_attribute_as_custom_op_argument_pre_dispatch_True, test/export/test_torchbind.py::TestExportTorchbind::test_attribute_pre_dispatch_False, test/export/test_torchbind.py::TestExportTorchbind::test_attribute_pre_dispatch_True, test/export/test_torchbind.py::TestExportTorchbind::test_custom_obj_list_out_pre_dispatch_False, test/export/test_torchbind.py::TestExportTorchbind::test_custom_obj_list_out_pre_dispatch_True, test/export/test_torchbind.py::TestExportTorchbind::test_custom_obj_tuple_out_pre_dispatch_False, test/export/test_torchbind.py::TestExportTorchbind::test_custom_obj_tuple_out_pre_dispatch_True, test/export/test_torchbind.py::TestExportTorchbind::test_custom_obj_unbacked_symint_pre_dispatch_False, test/export/test_torchbind.py::TestExportTorchbind::test_custom_obj_unbacked_symint_pre_dispatch_True, test/export/test_torchbind.py::TestExportTorchbind::test_deepcopy, test/export/test_torchbind.py::TestExportTorchbind::test_export_inplace_custom_op, 
test/export/test_torchbind.py::TestExportTorchbind::test_identifying_torchbind_ops, test/export/test_torchbind.py::TestExportTorchbind::test_input_as_custom_op_argument_pre_dispatch_False, test/export/test_torchbind.py::TestExportTorchbind::test_input_as_custom_op_argument_pre_dispatch_True, test/export/test_torchbind.py::TestExportTorchbind::test_input_pre_dispatch_False, test/export/test_torchbind.py::TestExportTorchbind::test_input_pre_dispatch_True, test/export/test_torchbind.py::TestExportTorchbind::test_make_fx_schema_checking_script_object, test/export/test_torchbind.py::TestExportTorchbind::test_make_fx_tensor_queue_methods_fakify_internal_states_make_fx_tracing_mode_fake, test/export/test_torchbind.py::TestExportTorchbind::test_make_fx_tensor_queue_methods_fakify_internal_states_make_fx_tracing_mode_symbolic, test/export/test_torchbind.py::TestExportTorchbind::test_make_fx_tensor_queue_methods_make_fx_tracing_mode_fake, test/export/test_torchbind.py::TestExportTorchbind::test_make_fx_tensor_queue_methods_make_fx_tracing_mode_symbolic, test/export/test_torchbind.py::TestExportTorchbind::test_make_fx_tensor_queue_operators_fallthrough_via_lib_impl, test/export/test_torchbind.py::TestExportTorchbind::test_make_fx_tensor_queue_operators_fallthrough_via_py_impl, test/export/test_torchbind.py::TestExportTorchbind::test_method_schema, test/export/test_torchbind.py::TestExportTorchbind::test_non_strict_export_methods, test/export/test_torchbind.py::TestExportTorchbind::test_none_pre_dispatch_False, test/export/test_torchbind.py::TestExportTorchbind::test_none_pre_dispatch_True, test/export/test_torchbind.py::TestExportTorchbind::test_safe_to_trace_with_real, test/export/test_torchbind.py::TestExportTorchbind::test_torchbind_alias_pre_dispatch_False, test/export/test_torchbind.py::TestExportTorchbind::test_torchbind_alias_pre_dispatch_True, test/export/test_torchbind.py::TestExportTorchbind::test_torchbind_input_and_alias_pre_dispatch_False, 
test/export/test_torchbind.py::TestExportTorchbind::test_torchbind_input_and_alias_pre_dispatch_True, test/export/test_torchbind.py::TestExportTorchbind::test_torchbind_op_fallthrough_keys_respects_lib_impl, test/export/test_torchbind.py::TestExportTorchbind::test_torchbind_op_register_fallthrough, test/export/test_torchbind.py::TestExportTorchbind::test_torchbind_register_attr_at_runtime_get_restored, test/export/test_torchbind.py::TestExportTorchbind::test_unlift_custom_obj_pre_dispatch_False, test/export/test_torchbind.py::TestExportTorchbind::test_unlift_custom_obj_pre_dispatch_True, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_body_aliasing_contents_backend_aot_eager, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_body_aliasing_contents_backend_eager, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_body_aliasing_contents_backend_inductor, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_error_on_input_aliasing_contents_backend_aot_eager, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_error_on_input_aliasing_contents_backend_eager, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_error_on_input_aliasing_contents_backend_inductor, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_error_on_non_fakified_method_backend_aot_eager, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_error_on_non_fakified_method_backend_eager, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_error_on_non_fakified_method_backend_inductor, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_error_on_script_obj_missing_attr_backend_aot_eager, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_error_on_script_obj_missing_attr_backend_eager, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_error_on_script_obj_setattr_backend_aot_eager, 
test/export/test_torchbind.py::TestCompileTorchbind::test_compile_error_on_script_obj_setattr_backend_eager, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_global_obj_backend_aot_eager, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_global_obj_backend_eager, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_global_obj_backend_inductor, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_obj_as_hop_input_backend_aot_eager, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_obj_as_hop_input_backend_eager, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_obj_as_hop_input_backend_inductor, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_obj_attributes_backend_aot_eager, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_obj_attributes_backend_eager, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_obj_attributes_backend_inductor, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_obj_closure_backend_aot_eager, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_obj_closure_backend_eager, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_obj_closure_backend_inductor, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_obj_graph_breaks, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_obj_torchbind_op_backend_aot_eager, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_obj_torchbind_op_backend_eager, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_obj_torchbind_op_backend_inductor, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_obj_torchbind_op_with_autocast_device_cpu_backend_aot_eager, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_obj_torchbind_op_with_autocast_device_cpu_backend_eager, 
test/export/test_torchbind.py::TestCompileTorchbind::test_compile_obj_torchbind_op_with_autocast_device_cpu_backend_inductor, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_obj_torchbind_op_with_autocast_device_cuda_backend_aot_eager, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_obj_torchbind_op_with_autocast_device_cuda_backend_eager, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_obj_torchbind_op_with_autocast_device_cuda_backend_inductor, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_script_object_input_automatic_dynamic_shape, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_script_object_input_backend_aot_eager, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_script_object_input_backend_eager, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_script_object_input_backend_inductor, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_script_object_input_guards_backend_aot_eager, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_script_object_input_guards_backend_eager, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_script_object_input_guards_backend_inductor, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_tensor_op_in_tensor_flatten_backend_aot_eager, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_tensor_op_in_tensor_flatten_backend_eager, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_tensor_op_in_tensor_flatten_backend_inductor, test/export/test_torchbind.py::TestCompileTorchbind::test_export_obj_torchbind_op_with_autocast_device_cpu, test/export/test_torchbind.py::TestCompileTorchbind::test_export_obj_torchbind_op_with_autocast_device_cuda, test/export/test_torchbind.py::TestRegisterFakeClass::test_register_fake_class_from_real_not_classmethod, 
test/export/test_torchbind.py::TestRegisterFakeClass::test_register_fake_class_no_from_real, test/export/test_torchbind.py::TestRegisterFakeClass::test_register_fake_class_no_torch_bind_class, test/export/test_torchbind.py::TestRegisterFakeClass::test_register_fake_class_valid 2025-10-10T02:01:11.1054294Z 2025-10-10T02:01:14.9824669Z Running inductor/test_alignment 1/1 ... [2025-10-10 02:01:14.981952] 2025-10-10T02:01:14.9825125Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:01:14.9826873Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_alignment.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:01:14.982358] 2025-10-10T02:01:17.3131706Z 2025-10-10T02:01:17.3133288Z inductor/test_torchbind 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchbind_1.1_ec4c357d6bd3d8e1_.log 2025-10-10T02:01:17.3139005Z Running 16 items in this shard: test/inductor/test_torchbind.py::TestTorchbind::test_aoti_torchbind_name_collision, test/inductor/test_torchbind.py::TestTorchbind::test_torchbind_aot_compile, test/inductor/test_torchbind.py::TestTorchbind::test_torchbind_aot_compile_constant_folding, test/inductor/test_torchbind.py::TestTorchbind::test_torchbind_aoti, test/inductor/test_torchbind.py::TestTorchbind::test_torchbind_compile, test/inductor/test_torchbind.py::TestTorchbind::test_torchbind_compile_gpu_op_symint_graph_partition, test/inductor/test_torchbind.py::TestTorchbind::test_torchbind_compile_symint, test/inductor/test_torchbind.py::TestTorchbind::test_torchbind_config_not_generated, test/inductor/test_torchbind.py::TestTorchbind::test_torchbind_get_buf_bytes, test/inductor/test_torchbind.py::TestTorchbind::test_torchbind_hop_schema, test/inductor/test_torchbind.py::TestTorchbind::test_torchbind_hop_schema_no_input, 
test/inductor/test_torchbind.py::TestTorchbind::test_torchbind_hop_schema_no_output, test/inductor/test_torchbind.py::TestTorchbind::test_torchbind_inductor, test/inductor/test_torchbind.py::TestTorchbind::test_torchbind_input_aot_compile, test/inductor/test_torchbind.py::TestTorchbind::test_torchbind_list_return_aot_compile, test/inductor/test_torchbind.py::TestTorchbind::test_torchbind_queue 2025-10-10T02:01:17.3143890Z 2025-10-10T02:01:19.0717842Z 2025-10-10T02:01:19.0719391Z inductor/test_custom_partitioner_fn 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_custom_partitioner_fn_1.1_5695bb31f732dee1_.log 2025-10-10T02:01:19.0721173Z Running 1 items in this shard: test/inductor/test_custom_partitioner_fn.py::TestCustomPartitionerFn::test_custom_partitioner_fn 2025-10-10T02:01:19.0721970Z 2025-10-10T02:01:21.2886088Z Running dynamo/test_sources 1/1 ... [2025-10-10 02:01:21.288021] 2025-10-10T02:01:21.2887352Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:01:21.2889568Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_sources.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:01:21.288448] 2025-10-10T02:01:22.7129275Z 2025-10-10T02:01:22.7130417Z inductor/test_alignment 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_alignment_1.1_470237ed92fcad0c_.log 2025-10-10T02:01:22.7134925Z Running 12 items in this shard: test/inductor/test_alignment.py::GPUTests::test_Q4_K_dequantization_cuda, test/inductor/test_alignment.py::GPUTests::test_alignment_without_custom_op_cuda, test/inductor/test_alignment.py::GPUTests::test_incorrect_meta_for_custom_op_2d_cuda, test/inductor/test_alignment.py::GPUTests::test_no_align_for_custom_op_2d_cuda, test/inductor/test_alignment.py::GPUTests::test_no_align_for_custom_op_cuda, test/inductor/test_alignment.py::GPUTests::test_slice_cuda, test/inductor/test_alignment.py::GPUTests::test_slice_view_dtype_size_1024_cuda, test/inductor/test_alignment.py::GPUTests::test_slice_view_dtype_size_1048576_cuda, test/inductor/test_alignment.py::GPUTests::test_slice_view_dtype_size_128_cuda, test/inductor/test_alignment.py::GPUTests::test_unaligned_input_2d_cuda, test/inductor/test_alignment.py::GPUTests::test_unaligned_input_cuda, test/inductor/test_alignment.py::GPUTests::test_view_dtype_slice_cuda 2025-10-10T02:01:22.7149498Z 2025-10-10T02:01:22.9626259Z Running dynamo/test_resume 1/1 ... [2025-10-10 02:01:22.962146] 2025-10-10T02:01:22.9626761Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:01:22.9629117Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_resume.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:01:22.962529] 2025-10-10T02:01:25.2622020Z 2025-10-10T02:01:25.2622971Z dynamo/test_sources 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_sources_1.1_0c3a88f8f06b429e_.log 2025-10-10T02:01:25.2624308Z Running 3 items in this shard: test/dynamo/test_sources.py::SourceTests::test_is_local, test/dynamo/test_sources.py::SourceTests::test_property_closure, test/dynamo/test_sources.py::SourceTests::test_supported_nodes 2025-10-10T02:01:25.2625136Z 2025-10-10T02:01:26.6499501Z Running dynamo/test_debug_utils 1/1 ... [2025-10-10 02:01:26.649365] 2025-10-10T02:01:26.6500235Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:01:26.6502144Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_debug_utils.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:01:26.649763] 2025-10-10T02:01:26.8854606Z 2025-10-10T02:01:26.8856052Z dynamo/test_resume 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_resume_1.1_a7f6314d2933b47a_.log 2025-10-10T02:01:26.8857415Z Running 1 items in this shard: test/dynamo/test_resume.py::ResumeFunctionTests::test_freevars 2025-10-10T02:01:26.8858033Z 2025-10-10T02:01:29.1872388Z Running export/test_swap 1/1 ... [2025-10-10 02:01:29.186650] 2025-10-10T02:01:29.1872874Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:01:29.1874032Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_swap.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:01:29.187029] 2025-10-10T02:01:30.7779107Z Running dynamo/test_aot_autograd_cache 1/1 ... 
[2025-10-10 02:01:30.777346] 2025-10-10T02:01:30.7779595Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:01:30.7781205Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_aot_autograd_cache.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:01:30.777721] 2025-10-10T02:01:30.8230895Z 2025-10-10T02:01:30.8232103Z dynamo/test_debug_utils 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_debug_utils_1.1_3a6d2a9374af50e4_.log 2025-10-10T02:01:30.8235394Z Running 4 items in this shard: test/dynamo/test_debug_utils.py::TestDebugUtilsCUDA::test_cast_model_to_fp64_dtype_args_cuda, test/dynamo/test_debug_utils.py::TestDebugUtilsCUDA::test_generate_env_vars_string_cuda, test/dynamo/test_debug_utils.py::TestDebugUtilsDeviceCUDA::test_aot_graph_parser_cuda, test/dynamo/test_debug_utils.py::TestDebugUtilsDeviceCUDA::test_sym_aot_graph_parser_cuda 2025-10-10T02:01:30.8236921Z 2025-10-10T02:01:33.2101014Z 2025-10-10T02:01:33.2102097Z export/test_swap 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_swap_1.1_70b480f71e377bb0_.log 2025-10-10T02:01:33.2111814Z Running 20 items in this shard: test/export/test_swap.py::TestSwap_nonstrict::test_custom_input_args, test/export/test_swap.py::TestSwap_nonstrict::test_custom_input_kwargs, test/export/test_swap.py::TestSwap_nonstrict::test_custom_input_kwargs_use_private, test/export/test_swap.py::TestSwap_nonstrict::test_custom_output, test/export/test_swap.py::TestSwap_nonstrict::test_dedup_sym_size, test/export/test_swap.py::TestSwap_nonstrict::test_nested_leaf, test/export/test_swap.py::TestSwap_nonstrict::test_remove_duplicate_pytree_different_order, test/export/test_swap.py::TestSwap_nonstrict::test_remove_duplicate_pytree_simple, 
test/export/test_swap.py::TestSwap_nonstrict::test_unflatten_preserve_signature, test/export/test_swap.py::TestSwap_nonstrict::test_unflatten_preserve_with_unused_input, test/export/test_swap.py::TestSwap_strict::test_custom_input_args, test/export/test_swap.py::TestSwap_strict::test_custom_input_kwargs, test/export/test_swap.py::TestSwap_strict::test_custom_input_kwargs_use_private, test/export/test_swap.py::TestSwap_strict::test_custom_output, test/export/test_swap.py::TestSwap_strict::test_dedup_sym_size, test/export/test_swap.py::TestSwap_strict::test_nested_leaf, test/export/test_swap.py::TestSwap_strict::test_remove_duplicate_pytree_different_order, test/export/test_swap.py::TestSwap_strict::test_remove_duplicate_pytree_simple, test/export/test_swap.py::TestSwap_strict::test_unflatten_preserve_signature, test/export/test_swap.py::TestSwap_strict::test_unflatten_preserve_with_unused_input 2025-10-10T02:01:33.2121166Z 2025-10-10T02:01:34.7780592Z Running inductor/test_binary_folding 1/1 ... [2025-10-10 02:01:34.777541] 2025-10-10T02:01:34.7781089Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:01:34.7782859Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_binary_folding.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:01:34.777922] 2025-10-10T02:01:37.0694888Z Running dynamo/test_base_hop 1/1 ... [2025-10-10 02:01:37.068893] 2025-10-10T02:01:37.0695511Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:01:37.0696713Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_base_hop.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:01:37.069271] 2025-10-10T02:01:38.4586548Z 2025-10-10T02:01:38.4587671Z dynamo/test_aot_autograd_cache 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_aot_autograd_cache_1.1_4e168b7c37066d68_.log 2025-10-10T02:01:38.4628227Z Running 102 items in this shard: test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheTests::test_aot_runtime_trace_joint, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheTests::test_autograd_function, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheTests::test_autograd_guard_single_entry_device_cuda_bfloat16, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheTests::test_autograd_guard_single_entry_device_cuda_float16, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheTests::test_autograd_inductor_guards_device_cuda_bfloat16_requires_grad_False, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheTests::test_autograd_inductor_guards_device_cuda_bfloat16_requires_grad_True, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheTests::test_autograd_inductor_guards_device_cuda_float16_requires_grad_False, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheTests::test_autograd_inductor_guards_device_cuda_float16_requires_grad_True, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheTests::test_autograd_lazy_backward, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheTests::test_autograd_no_dynamo_trace_backward, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheTests::test_basic, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheTests::test_cache_hot_load_device_cpu_bfloat16_dynamic_False, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheTests::test_cache_hot_load_device_cpu_bfloat16_dynamic_True, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheTests::test_cache_hot_load_device_cpu_float32_dynamic_False, 
test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheTests::test_cache_hot_load_device_cpu_float32_dynamic_True, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheTests::test_cache_hot_load_device_cuda_bfloat16_dynamic_False, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheTests::test_cache_hot_load_device_cuda_bfloat16_dynamic_True, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheTests::test_cache_hot_load_device_cuda_float32_dynamic_False, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheTests::test_cache_hot_load_device_cuda_float32_dynamic_True, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheTests::test_cache_lazy_backward_for_compiled_autograd, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheTests::test_clear_fx_graph_cache, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheTests::test_compiled_autograd_bypass, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheTests::test_constant_tensor_device_guards, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheTests::test_custom_autograd_function, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheTests::test_custom_autograd_function_miss, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheTests::test_custom_autograd_function_with_custom_triton_kernel, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheTests::test_custom_autograd_function_with_custom_triton_kernel_cache_invalidation, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheTests::test_dynamic_shapes_different_sizes, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheTests::test_fx_graph_cache_off, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheTests::test_inference_graph_cache_hit_with_compiled_autograd_enabled, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheTests::test_invoke_subgraph, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheTests::test_multi_graph_specialization, 
test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheTests::test_multiple_compile_triton_kernels, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheTests::test_nn_module_with_params_global_constant, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheTests::test_non_bundled_to_bundled_config_change, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheTests::test_saved_tensors_hooks_autograd_cache, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheTests::test_saved_tensors_hooks_autograd_cache_symbolic, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheTests::test_symbol_specialization, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheTests::test_triton_op_cache_invalidation, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheTests::test_triton_op_cache_multiple_ops_invalidation, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheTests::test_unsafe_mark_cacheable_fn_select_allow_in_graph, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheTests::test_unsafe_mark_cacheable_fn_select_tag_activation_checkpoint, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheTests::test_view_replay, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheTests::test_vmap, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheBundledTests::test_aot_runtime_trace_joint, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheBundledTests::test_autograd_function, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheBundledTests::test_autograd_guard_single_entry_device_cuda_bfloat16, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheBundledTests::test_autograd_guard_single_entry_device_cuda_float16, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheBundledTests::test_autograd_inductor_guards_device_cuda_bfloat16_requires_grad_False, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheBundledTests::test_autograd_inductor_guards_device_cuda_bfloat16_requires_grad_True, 
test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheBundledTests::test_autograd_inductor_guards_device_cuda_float16_requires_grad_False, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheBundledTests::test_autograd_inductor_guards_device_cuda_float16_requires_grad_True, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheBundledTests::test_autograd_lazy_backward, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheBundledTests::test_autograd_no_dynamo_trace_backward, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheBundledTests::test_basic, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheBundledTests::test_cache_hot_load_device_cpu_bfloat16_dynamic_False, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheBundledTests::test_cache_hot_load_device_cpu_bfloat16_dynamic_True, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheBundledTests::test_cache_hot_load_device_cpu_float32_dynamic_False, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheBundledTests::test_cache_hot_load_device_cpu_float32_dynamic_True, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheBundledTests::test_cache_hot_load_device_cuda_bfloat16_dynamic_False, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheBundledTests::test_cache_hot_load_device_cuda_bfloat16_dynamic_True, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheBundledTests::test_cache_hot_load_device_cuda_float32_dynamic_False, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheBundledTests::test_cache_hot_load_device_cuda_float32_dynamic_True, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheBundledTests::test_cache_lazy_backward_for_compiled_autograd, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheBundledTests::test_clear_fx_graph_cache, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheBundledTests::test_compiled_autograd_bypass, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheBundledTests::test_constant_tensor_device_guards, 
test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheBundledTests::test_custom_autograd_function, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheBundledTests::test_custom_autograd_function_miss, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheBundledTests::test_custom_autograd_function_with_custom_triton_kernel, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheBundledTests::test_custom_autograd_function_with_custom_triton_kernel_cache_invalidation, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheBundledTests::test_dynamic_shapes_different_sizes, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheBundledTests::test_fx_graph_cache_off, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheBundledTests::test_inference_graph_cache_hit_with_compiled_autograd_enabled, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheBundledTests::test_invoke_subgraph, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheBundledTests::test_multi_graph_specialization, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheBundledTests::test_multiple_compile_triton_kernels, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheBundledTests::test_nn_module_with_params_global_constant, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheBundledTests::test_non_bundled_to_bundled_config_change, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheBundledTests::test_saved_tensors_hooks_autograd_cache, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheBundledTests::test_saved_tensors_hooks_autograd_cache_symbolic, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheBundledTests::test_symbol_specialization, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheBundledTests::test_triton_op_cache_invalidation, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheBundledTests::test_triton_op_cache_multiple_ops_invalidation, 
test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheBundledTests::test_unsafe_mark_cacheable_fn_select_allow_in_graph, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheBundledTests::test_unsafe_mark_cacheable_fn_select_tag_activation_checkpoint, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheBundledTests::test_view_replay, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCacheBundledTests::test_vmap, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCachePicklerTests::test_basic_hash_key, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCachePicklerTests::test_different_configs, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCachePicklerTests::test_different_global_configs, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCachePicklerTests::test_different_graphs, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCachePicklerTests::test_different_inputs, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCachePicklerTests::test_freezing, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCachePicklerTests::test_identical_graphs_and_configs, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCachePicklerTests::test_incompatible_function, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCachePicklerTests::test_nn_module_with_params, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCachePicklerTests::test_normal_torch_function, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCachePicklerTests::test_private_builtin, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCachePicklerTests::test_private_namespace, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCachePicklerTests::test_safe_torchfunction, test/dynamo/test_aot_autograd_cache.py::AOTAutogradCachePicklerTests::test_sanitize_gm_for_cache 2025-10-10T02:01:38.4667672Z 2025-10-10T02:01:41.2440839Z 2025-10-10T02:01:41.2441862Z dynamo/test_base_hop 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_base_hop_1.1_543d11ff3d3850da_.log 
2025-10-10T02:01:41.2445047Z Running 11 items in this shard: test/dynamo/test_base_hop.py::BaseHOPTest::test_aliasing_mutation_error, test/dynamo/test_base_hop.py::BaseHOPTest::test_aot_eager, test/dynamo/test_base_hop.py::BaseHOPTest::test_auto_functionalize, test/dynamo/test_base_hop.py::BaseHOPTest::test_dynamo, test/dynamo/test_base_hop.py::BaseHOPTest::test_eager_call, test/dynamo/test_base_hop.py::BaseHOPTest::test_int_input, test/dynamo/test_base_hop.py::BaseHOPTest::test_none_input, test/dynamo/test_base_hop.py::BaseHOPTest::test_schema_gen_pytree_in_out, test/dynamo/test_base_hop.py::BaseHOPTest::test_schema_gen_pytree_in_out_with_mutation, test/dynamo/test_base_hop.py::BaseHOPTest::test_schema_gen_single_return, test/dynamo/test_base_hop.py::BaseHOPTest::test_schema_gen_single_return_with_mutation 2025-10-10T02:01:41.2447858Z 2025-10-10T02:01:42.3102266Z Running dynamo/test_list 1/1 ... [2025-10-10 02:01:42.309692] 2025-10-10T02:01:42.3102718Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:01:42.3104408Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_list.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:01:42.310096] 2025-10-10T02:01:42.6092950Z 2025-10-10T02:01:42.6094017Z inductor/test_binary_folding 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_binary_folding_1.1_a1218328b8bd8e7c_.log 2025-10-10T02:01:42.6097850Z Running 6 items in this shard: test/inductor/test_binary_folding.py::FreezingCpuTests::test_conv_binary_folding_cpu, test/inductor/test_binary_folding.py::FreezingCpuTests::test_conv_bn_folding_cpu, test/inductor/test_binary_folding.py::FreezingCpuTests::test_linear_binary_folding_cpu, test/inductor/test_binary_folding.py::FreezingGpuTests::test_conv_binary_folding_cuda, test/inductor/test_binary_folding.py::FreezingGpuTests::test_conv_bn_folding_cuda, test/inductor/test_binary_folding.py::FreezingGpuTests::test_linear_binary_folding_cuda 2025-10-10T02:01:42.6100887Z 2025-10-10T02:01:45.1868741Z Running export/test_unflatten 1/1 ... [2025-10-10 02:01:45.186371] 2025-10-10T02:01:45.1869253Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:01:45.1870721Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_unflatten.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:01:45.186746] 2025-10-10T02:01:46.5341958Z 2025-10-10T02:01:46.5342803Z dynamo/test_list 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_list_1.1_4248bb7d7c30055b_.log 2025-10-10T02:01:46.5350727Z Running 39 items in this shard: test/dynamo/test_list.py::TupleTests::test___contains__, test/dynamo/test_list.py::TupleTests::test___getitem__, test/dynamo/test_list.py::TupleTests::test_binop_add, test/dynamo/test_list.py::TupleTests::test_binop_imul, test/dynamo/test_list.py::TupleTests::test_cmp_eq, test/dynamo/test_list.py::TupleTests::test_cmp_greater_than, test/dynamo/test_list.py::TupleTests::test_cmp_greater_than_or_equal, test/dynamo/test_list.py::TupleTests::test_cmp_less_than, test/dynamo/test_list.py::TupleTests::test_cmp_less_than_or_equal, test/dynamo/test_list.py::TupleTests::test_cmp_ne, test/dynamo/test_list.py::TupleTests::test_count, test/dynamo/test_list.py::TupleTests::test_index, test/dynamo/test_list.py::ListTests::test___contains__, test/dynamo/test_list.py::ListTests::test___delitem__, test/dynamo/test_list.py::ListTests::test___getitem__, test/dynamo/test_list.py::ListTests::test___setitem__, test/dynamo/test_list.py::ListTests::test_append, test/dynamo/test_list.py::ListTests::test_binop_add, test/dynamo/test_list.py::ListTests::test_binop_delitem_global_list, test/dynamo/test_list.py::ListTests::test_binop_iadd, test/dynamo/test_list.py::ListTests::test_binop_iadd_global_list, test/dynamo/test_list.py::ListTests::test_binop_imul, test/dynamo/test_list.py::ListTests::test_binop_imul_global_list, test/dynamo/test_list.py::ListTests::test_clear, test/dynamo/test_list.py::ListTests::test_cmp_eq, test/dynamo/test_list.py::ListTests::test_cmp_greater_than, test/dynamo/test_list.py::ListTests::test_cmp_greater_than_or_equal, test/dynamo/test_list.py::ListTests::test_cmp_less_than, test/dynamo/test_list.py::ListTests::test_cmp_less_than_or_equal, 
test/dynamo/test_list.py::ListTests::test_cmp_ne, test/dynamo/test_list.py::ListTests::test_copy, test/dynamo/test_list.py::ListTests::test_count, test/dynamo/test_list.py::ListTests::test_extend, test/dynamo/test_list.py::ListTests::test_index, test/dynamo/test_list.py::ListTests::test_insert, test/dynamo/test_list.py::ListTests::test_pop, test/dynamo/test_list.py::ListTests::test_remove, test/dynamo/test_list.py::ListTests::test_reverse, test/dynamo/test_list.py::ListTests::test_sort 2025-10-10T02:01:46.5358036Z 2025-10-10T02:01:46.5902054Z Running inductor/test_needs_exact_strides 1/1 ... [2025-10-10 02:01:46.589680] 2025-10-10T02:01:46.5902626Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:01:46.5906083Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_needs_exact_strides.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:01:46.590211] 2025-10-10T02:01:49.2095130Z 2025-10-10T02:01:49.2096420Z export/test_unflatten 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_unflatten_1.1_de28863d6eab6ae8_.log 2025-10-10T02:01:49.2110181Z Running 28 items in this shard: test/export/test_unflatten.py::TestUnflatten::test_assert_tensor_metadata_stack, test/export/test_unflatten.py::TestUnflatten::test_attr_as_submod_input, test/export/test_unflatten.py::TestUnflatten::test_dedup_sym_size, test/export/test_unflatten.py::TestUnflatten::test_double_nested_submodule, test/export/test_unflatten.py::TestUnflatten::test_duplicate_placeholder, test/export/test_unflatten.py::TestUnflatten::test_fx_trace, test/export/test_unflatten.py::TestUnflatten::test_nested_leaf_non_strict, test/export/test_unflatten.py::TestUnflatten::test_placeholder_and_get_attr_ordering_after_unflattened, test/export/test_unflatten.py::TestUnflatten::test_simple_alias, 
test/export/test_unflatten.py::TestUnflatten::test_unflatten_buffer_mutation, test/export/test_unflatten.py::TestUnflatten::test_unflatten_constant_obj, test/export/test_unflatten.py::TestUnflatten::test_unflatten_constant_tensor, test/export/test_unflatten.py::TestUnflatten::test_unflatten_container_type, test/export/test_unflatten.py::TestUnflatten::test_unflatten_eager, test/export/test_unflatten.py::TestUnflatten::test_unflatten_empty_branch, test/export/test_unflatten.py::TestUnflatten::test_unflatten_nested, test/export/test_unflatten.py::TestUnflatten::test_unflatten_nested_access, test/export/test_unflatten.py::TestUnflatten::test_unflatten_none, test/export/test_unflatten.py::TestUnflatten::test_unflatten_param_list_dict, test/export/test_unflatten.py::TestUnflatten::test_unflatten_preserve_signature, test/export/test_unflatten.py::TestUnflatten::test_unflatten_preserve_with_unused_input, test/export/test_unflatten.py::TestUnflatten::test_unflatten_requires_grad_param, test/export/test_unflatten.py::TestUnflatten::test_unflatten_shared_submodule, test/export/test_unflatten.py::TestUnflatten::test_unflatten_skipped_call_module, test/export/test_unflatten.py::TestUnflatten::test_unflatten_submodule_ordering, test/export/test_unflatten.py::TestUnflatten::test_unflatten_with_inplace_compile, test/export/test_unflatten.py::TestUnflatten::test_unflatten_wrong_input, test/export/test_unflatten.py::TestUnflatten::test_unflattened_module_nodes_has_meta_val 2025-10-10T02:01:49.2125075Z 2025-10-10T02:01:50.4821123Z Running dynamo/test_verify_correctness 1/1 ... [2025-10-10 02:01:50.481526] 2025-10-10T02:01:50.4821779Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:01:50.4823422Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_verify_correctness.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:01:50.481986] 2025-10-10T02:01:53.0639320Z Running export/test_export 1/1 ... [2025-10-10 02:01:53.063385] 2025-10-10T02:01:53.0639909Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:01:53.0643077Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_export.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:01:53.063887] 2025-10-10T02:01:53.9215067Z 2025-10-10T02:01:53.9216385Z inductor/test_needs_exact_strides 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_needs_exact_strides_1.1_1964273ba08a4005_.log 2025-10-10T02:01:53.9218376Z Running 2 items in this shard: test/inductor/test_needs_exact_strides.py::TestNeedsExactStrides::test_custom_op_float32, test/inductor/test_needs_exact_strides.py::TestNeedsExactStrides::test_custom_op_float8_e8m0fnu 2025-10-10T02:01:53.9219475Z 2025-10-10T02:01:54.5552076Z 2025-10-10T02:01:54.5553533Z dynamo/test_verify_correctness 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_verify_correctness_1.1_d0753fae103be4ae_.log 2025-10-10T02:01:54.5555796Z Running 4 items in this shard: test/dynamo/test_verify_correctness.py::TestVerifyCorrectness::test_example_inputs, test/dynamo/test_verify_correctness.py::TestVerifyCorrectness::test_incorrect_verify_false, test/dynamo/test_verify_correctness.py::TestVerifyCorrectness::test_incorrect_verify_true, test/dynamo/test_verify_correctness.py::TestVerifyCorrectness::test_torchscript 2025-10-10T02:01:54.5557425Z 2025-10-10T02:01:57.8429390Z Running inductor/test_minifier_isolate 1/1 ... 
[2025-10-10 02:01:57.842332] 2025-10-10T02:01:57.8429888Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:01:57.8431938Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_minifier_isolate.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:01:57.842817] 2025-10-10T02:01:58.5121372Z Running dynamo/test_logging 1/1 ... [2025-10-10 02:01:58.511579] 2025-10-10T02:01:58.5121960Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:01:58.5124608Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_logging.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:01:58.511966] 2025-10-10T02:02:05.0033195Z 2025-10-10T02:02:05.0034503Z export/test_export 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_export_1.1_799d906e592648bc_.log 2025-10-10T02:02:05.0155080Z Running 463 items in this shard: test/export/test_export.py::TestDynamismExpression::test_export_assume_static_by_default, test/export/test_export.py::TestDynamismExpression::test_export_constraints_error, test/export/test_export.py::TestDynamismExpression::test_export_constraints_error_not_in_range, test/export/test_export.py::TestDynamismExpression::test_export_inline_constraints, test/export/test_export.py::TestDynamismExpression::test_export_slice_maxsize, test/export/test_export.py::TestDynamismExpression::test_export_slice_unbacked_dim1, test/export/test_export.py::TestDynamismExpression::test_export_strict_narrow_unbacked_expr, test/export/test_export.py::TestDynamismExpression::test_no_grad_param_inplace, test/export/test_export.py::TestDynamismExpression::test_reshape_view_backed_size_oblivious, 
test/export/test_export.py::TestExport::test__scaled_dot_product_flash_attention, test/export/test_export.py::TestExport::test_additional_inputs_constants, test/export/test_export.py::TestExport::test_allow_explicit_guards_as_runtime_asserts, test/export/test_export.py::TestExport::test_args_type_checked, test/export/test_export.py::TestExport::test_aten_lift_fresh_copy, test/export/test_export.py::TestExport::test_attention, test/export/test_export.py::TestExport::test_attr_assignment_extra, test/export/test_export.py::TestExport::test_automatic_constrain_size, test/export/test_export.py::TestExport::test_automatic_dynamic_shapes_constant_relation, test/export/test_export.py::TestExport::test_automatic_dynamic_shapes_linear_relation, test/export/test_export.py::TestExport::test_automatic_dynamic_shapes_simple_equality, test/export/test_export.py::TestExport::test_baddbmm, test/export/test_export.py::TestExport::test_basic, test/export/test_export.py::TestExport::test_basic_non_strict_fake_tensor, test/export/test_export.py::TestExport::test_basic_non_strict_real_tensor, test/export/test_export.py::TestExport::test_bincount, test/export/test_export.py::TestExport::test_buffer_util, test/export/test_export.py::TestExport::test_capture_subclass_constructor, test/export/test_export.py::TestExport::test_capture_subclass_constructor_torch_ir, test/export/test_export.py::TestExport::test_capture_subclass_wrong, test/export/test_export.py::TestExport::test_ccode_python_mod, test/export/test_export.py::TestExport::test_cdist_forward_compute_mode_zero_export, test/export/test_export.py::TestExport::test_check_specialized_int, test/export/test_export.py::TestExport::test_checks_to_constrain_range, test/export/test_export.py::TestExport::test_cleanup_dynamic_markers, test/export/test_export.py::TestExport::test_colin_unbacked_backed_vr_sub, test/export/test_export.py::TestExport::test_colon_parameter, test/export/test_export.py::TestExport::test_compiling_state, 
test/export/test_export.py::TestExport::test_cond_access_identical_symint_closure, test/export/test_export.py::TestExport::test_cond_branches_return_constant_int, test/export/test_export.py::TestExport::test_cond_branches_return_same_int, test/export/test_export.py::TestExport::test_cond_buffers, test/export/test_export.py::TestExport::test_cond_contains_unbacked_no_escape, test/export/test_export.py::TestExport::test_cond_int_closure, test/export/test_export.py::TestExport::test_cond_unflatten, test/export/test_export.py::TestExport::test_cond_with_module_stack_export_with, test/export/test_export.py::TestExport::test_cond_with_module_stack_export_with_unflatten, test/export/test_export.py::TestExport::test_constant_aliasing, test/export/test_export.py::TestExport::test_constant_input_naming, test/export/test_export.py::TestExport::test_constant_no_user_inp, test/export/test_export.py::TestExport::test_constant_output, test/export/test_export.py::TestExport::test_constant_output_dup, test/export/test_export.py::TestExport::test_constant_requires_grad_const, test/export/test_export.py::TestExport::test_constant_return, test/export/test_export.py::TestExport::test_constant_tensor_mutation, test/export/test_export.py::TestExport::test_constant_tensor_with_non_functional, test/export/test_export.py::TestExport::test_constant_tensor_with_non_functional_nested, test/export/test_export.py::TestExport::test_constrain_decomp, test/export/test_export.py::TestExport::test_constrain_size_in_eager, test/export/test_export.py::TestExport::test_constrain_size_with_constrain_value, test/export/test_export.py::TestExport::test_constrain_size_with_various_cases, test/export/test_export.py::TestExport::test_conv_dynamic, test/export/test_export.py::TestExport::test_crop_like, test/export/test_export.py::TestExport::test_cse_for_symint, test/export/test_export.py::TestExport::test_custom_op_auto_functionalize, 
test/export/test_export.py::TestExport::test_custom_op_auto_functionalize_pre_dispatch, test/export/test_export.py::TestExport::test_custom_op_auto_warn_pre_dispatch, test/export/test_export.py::TestExport::test_custom_op_preserve, test/export/test_export.py::TestExport::test_custom_pytree, test/export/test_export.py::TestExport::test_custom_tag_metadata_re_export, test/export/test_export.py::TestExport::test_decomp_batch_norm_functional_predispatch, test/export/test_export.py::TestExport::test_decomp_item_in_prim_after_decomposition, test/export/test_export.py::TestExport::test_decomp_item_in_prim_before_decomposition, test/export/test_export.py::TestExport::test_default_decomposition_core_cia_ops, test/export/test_export.py::TestExport::test_derived_dim_1_2, test/export/test_export.py::TestExport::test_derived_dim_basic, test/export/test_export.py::TestExport::test_derived_dim_integer, test/export/test_export.py::TestExport::test_derived_dim_nested, test/export/test_export.py::TestExport::test_derived_dim_out_of_order, test/export/test_export.py::TestExport::test_derived_dim_out_of_order_repeat_derived, test/export/test_export.py::TestExport::test_derived_dim_out_of_order_simplified, test/export/test_export.py::TestExport::test_derived_dim_out_of_order_simplified_repeat_non_derived, test/export/test_export.py::TestExport::test_derived_dim_repeat_derived, test/export/test_export.py::TestExport::test_detect_leak_nonstrict, test/export/test_export.py::TestExport::test_detect_leak_nonstrict_with_stacktrace, test/export/test_export.py::TestExport::test_detect_leak_strict, test/export/test_export.py::TestExport::test_device_to_dynamic, test/export/test_export.py::TestExport::test_device_to_gpu, test/export/test_export.py::TestExport::test_device_to_mutation, test/export/test_export.py::TestExport::test_device_to_mutation_float, test/export/test_export.py::TestExport::test_device_to_static, test/export/test_export.py::TestExport::test_dim_1_2, 
test/export/test_export.py::TestExport::test_dim_auto_and_dim, test/export/test_export.py::TestExport::test_dim_dynamic, test/export/test_export.py::TestExport::test_dim_dynamic_divisibility, test/export/test_export.py::TestExport::test_dim_dynamic_specialization, test/export/test_export.py::TestExport::test_dim_hint_range_violations, test/export/test_export.py::TestExport::test_dim_hint_ranges, test/export/test_export.py::TestExport::test_disable_forced_specializations_errors, test/export/test_export.py::TestExport::test_disable_forced_specializations_ok, test/export/test_export.py::TestExport::test_distributed_all_gather, test/export/test_export.py::TestExport::test_distributed_all_gather_into_tensor, test/export/test_export.py::TestExport::test_distributed_all_reduce, test/export/test_export.py::TestExport::test_distributed_all_to_all_single, test/export/test_export.py::TestExport::test_distributed_reduce_scatter_tensor, test/export/test_export.py::TestExport::test_dont_duck_size_for_auto_dynamic, test/export/test_export.py::TestExport::test_double_lifted_constants, test/export/test_export.py::TestExport::test_draft_export_checks_aliasing, test/export/test_export.py::TestExport::test_draft_export_checks_mutation, test/export/test_export.py::TestExport::test_draft_export_checks_mutation_list, test/export/test_export.py::TestExport::test_draft_export_checks_mutation_with_nan, test/export/test_export.py::TestExport::test_draft_export_fake_kernel_inference_errors, test/export/test_export.py::TestExport::test_draft_export_infers_fake_kernel, test/export/test_export.py::TestExport::test_duplicate_modules_with_non_persistent_buffers, test/export/test_export.py::TestExport::test_dynamic_lr_shift, test/export/test_export.py::TestExport::test_dynamic_shapes_bounds, test/export/test_export.py::TestExport::test_dynamic_shapes_builder_basic, test/export/test_export.py::TestExport::test_dynamic_shapes_builder_kwargs, 
test/export/test_export.py::TestExport::test_dynamic_shapes_builder_pytree, test/export/test_export.py::TestExport::test_dynamic_shapes_dataclass, test/export/test_export.py::TestExport::test_dynamic_shapes_inferred_basic, test/export/test_export.py::TestExport::test_dynamic_shapes_serdes_generic, test/export/test_export.py::TestExport::test_dynamic_shapes_serdes_user_errors, test/export/test_export.py::TestExport::test_dynamic_shapes_serdes_various, test/export/test_export.py::TestExport::test_dynamic_shapes_spec_with_pytree, test/export/test_export.py::TestExport::test_dynamic_shapes_wrapped_with_shape_guards, test/export/test_export.py::TestExport::test_dynamic_sym_round, test/export/test_export.py::TestExport::test_ends_of_bounds_oblivious, test/export/test_export.py::TestExport::test_error_does_not_reference_eager_fallback, test/export/test_export.py::TestExport::test_error_when_passing_mutating_primitive_op, test/export/test_export.py::TestExport::test_exception, test/export/test_export.py::TestExport::test_expand_copy_export_handles_implicit_true, test/export/test_export.py::TestExport::test_export_api_with_dynamic_shapes, test/export/test_export.py::TestExport::test_export_as_backend, test/export/test_export.py::TestExport::test_export_associative_scan_lifted_buffers, test/export/test_export.py::TestExport::test_export_associative_scan_symbol_dim, test/export/test_export.py::TestExport::test_export_associative_scan_symbol_scandim, test/export/test_export.py::TestExport::test_export_aten_to_unflatten, test/export/test_export.py::TestExport::test_export_aten_to_unflatten_subclass, test/export/test_export.py::TestExport::test_export_aten_to_unflatten_subclass_pre_dispatch, test/export/test_export.py::TestExport::test_export_cond_preserve_torch_fn_for_subgraphs, test/export/test_export.py::TestExport::test_export_cond_symbool_pred, test/export/test_export.py::TestExport::test_export_cond_warns_constant_pred, 
test/export/test_export.py::TestExport::test_export_custom_decomp_table_basic_pop, test/export/test_export.py::TestExport::test_export_custom_decomp_table_container_methods, test/export/test_export.py::TestExport::test_export_custom_op_lib, test/export/test_export.py::TestExport::test_export_custom_triton_kernel, test/export/test_export.py::TestExport::test_export_custom_triton_kernel_mutable, test/export/test_export.py::TestExport::test_export_cyclic_reference_leak, test/export/test_export.py::TestExport::test_export_decomp_torture_case_1, test/export/test_export.py::TestExport::test_export_decomp_torture_case_2, test/export/test_export.py::TestExport::test_export_decomps_dynamic, test/export/test_export.py::TestExport::test_export_decomps_simple, test/export/test_export.py::TestExport::test_export_dynamo_config, test/export/test_export.py::TestExport::test_export_for_training_run_decomp, test/export/test_export.py::TestExport::test_export_for_training_with_container_type, test/export/test_export.py::TestExport::test_export_for_training_with_dynamic_shapes, test/export/test_export.py::TestExport::test_export_for_training_with_mutation, test/export/test_export.py::TestExport::test_export_for_training_with_state_dict_hooks, test/export/test_export.py::TestExport::test_export_func_with_default_kwargs, test/export/test_export.py::TestExport::test_export_func_with_keyword_only_args, test/export/test_export.py::TestExport::test_export_func_with_kwargs, test/export/test_export.py::TestExport::test_export_func_with_pytree_kwargs, test/export/test_export.py::TestExport::test_export_func_with_var_keyword_args, test/export/test_export.py::TestExport::test_export_func_with_var_keyword_pytree_args, test/export/test_export.py::TestExport::test_export_func_with_var_postional_args, test/export/test_export.py::TestExport::test_export_function_schema, test/export/test_export.py::TestExport::test_export_graph_with_no_inputs, 
test/export/test_export.py::TestExport::test_export_input_mutation_bug, test/export/test_export.py::TestExport::test_export_input_mutation_dynamic_shape, test/export/test_export.py::TestExport::test_export_input_mutation_static_shape, test/export/test_export.py::TestExport::test_export_leak_compile, test/export/test_export.py::TestExport::test_export_linear_preserve_dynamic_shape, test/export/test_export.py::TestExport::test_export_max_nonstrict, test/export/test_export.py::TestExport::test_export_max_onnx_reported, test/export/test_export.py::TestExport::test_export_method, test/export/test_export.py::TestExport::test_export_mod_constraints, test/export/test_export.py::TestExport::test_export_module, test/export/test_export.py::TestExport::test_export_preserve_linear_at_aot_level, test/export/test_export.py::TestExport::test_export_preserve_linear_but_not_custom_op, test/export/test_export.py::TestExport::test_export_rnn_variants_with_warning, test/export/test_export.py::TestExport::test_export_scan_pytree_output, test/export/test_export.py::TestExport::test_export_script_module, test/export/test_export.py::TestExport::test_export_statically_known_true, test/export/test_export.py::TestExport::test_export_then_compile_tensor_ctor, test/export/test_export.py::TestExport::test_export_with_autocast, test/export/test_export.py::TestExport::test_export_with_fake_tensor_inputs, test/export/test_export.py::TestExport::test_export_with_fake_tensor_inputs_on_cuda_devices, test/export/test_export.py::TestExport::test_export_with_inline_constraints, test/export/test_export.py::TestExport::test_export_with_inline_constraints_complex, test/export/test_export.py::TestExport::test_export_with_set_grad_enabled, test/export/test_export.py::TestExport::test_export_with_wrong_inputs, test/export/test_export.py::TestExport::test_external_call_non_strict_real_tensor, test/export/test_export.py::TestExport::test_fake_inputs, test/export/test_export.py::TestExport::test_fake_weights, 
test/export/test_export.py::TestExport::test_filter_traceback_frames, test/export/test_export.py::TestExport::test_flex_attention_export, test/export/test_export.py::TestExport::test_float_conversion, test/export/test_export.py::TestExport::test_float_conversion_from_int, test/export/test_export.py::TestExport::test_fqn, test/export/test_export.py::TestExport::test_from_node_metadata_export, test/export/test_export.py::TestExport::test_full_on_scalar_tensor, test/export/test_export.py::TestExport::test_function_holding_tensor, test/export/test_export.py::TestExport::test_hints_wrapper, test/export/test_export.py::TestExport::test_hoo_inline_users_issue, test/export/test_export.py::TestExport::test_if_functional, test/export/test_export.py::TestExport::test_if_post_autograd_op_preserved, test/export/test_export.py::TestExport::test_inductor_backend_inside_nonstrict, test/export/test_export.py::TestExport::test_inline_script_class_method, test/export/test_export.py::TestExport::test_inline_script_class_method_recursive, test/export/test_export.py::TestExport::test_inline_script_function, test/export/test_export.py::TestExport::test_inline_script_method, test/export/test_export.py::TestExport::test_int_shape_specialization, test/export/test_export.py::TestExport::test_intermediate_shape_comp, test/export/test_export.py::TestExport::test_is_exporting, test/export/test_export.py::TestExport::test_is_nonzero, test/export/test_export.py::TestExport::test_isnonzero, test/export/test_export.py::TestExport::test_issue_113041, test/export/test_export.py::TestExport::test_issue_157289, test/export/test_export.py::TestExport::test_issue_161902, test/export/test_export.py::TestExport::test_istft_op, test/export/test_export.py::TestExport::test_keep_composite_ops_invalid, test/export/test_export.py::TestExport::test_keep_composite_ops_linear_convd, test/export/test_export.py::TestExport::test_keep_composite_ops_linear_convd_for_training_ir, 
test/export/test_export.py::TestExport::test_kwarg_dynamic_shapes_diff_order, test/export/test_export.py::TestExport::test_kwargs_reorder, test/export/test_export.py::TestExport::test_layer_norm_unbacked_normalized_shape, test/export/test_export.py::TestExport::test_layer_sharing, test/export/test_export.py::TestExport::test_lazy_module_kwargs, test/export/test_export.py::TestExport::test_lifted_constants, test/export/test_export.py::TestExport::test_linear_conv, test/export/test_export.py::TestExport::test_malformed_fqn_from_source_name, test/export/test_export.py::TestExport::test_map, test/export/test_export.py::TestExport::test_map_buffers, test/export/test_export.py::TestExport::test_mask_nonzero_static, test/export/test_export.py::TestExport::test_masked_select_dynamic, test/export/test_export.py::TestExport::test_math_pow, test/export/test_export.py::TestExport::test_mismatched_dynamic_shapes, test/export/test_export.py::TestExport::test_mixed_input, test/export/test_export.py::TestExport::test_module, test/export/test_export.py::TestExport::test_module_dict_key, test/export/test_export.py::TestExport::test_module_input, test/export/test_export.py::TestExport::test_module_input_subclasses_parameterization_nested, test/export/test_export.py::TestExport::test_module_list_slice, test/export/test_export.py::TestExport::test_module_with_dict_container_inp_out, test/export/test_export.py::TestExport::test_modules_access_for_deleted_submodule, test/export/test_export.py::TestExport::test_more_multidimensional_slicing, test/export/test_export.py::TestExport::test_multidimensional_slicing, test/export/test_export.py::TestExport::test_multinomial_dynamic, test/export/test_export.py::TestExport::test_multiple_definitions_same_name_dim, test/export/test_export.py::TestExport::test_namedtuple_input_export, test/export/test_export.py::TestExport::test_native_multi_attention_head, test/export/test_export.py::TestExport::test_nested_dynamic_shapes_spec, 
test/export/test_export.py::TestExport::test_nested_module, test/export/test_export.py::TestExport::test_nested_module_fake_tensor_leak, test/export/test_export.py::TestExport::test_nested_module_with_constant_buffer, test/export/test_export.py::TestExport::test_nested_module_with_init_buffer, test/export/test_export.py::TestExport::test_nested_module_with_parameter, test/export/test_export.py::TestExport::test_nn_module_stack, test/export/test_export.py::TestExport::test_nn_module_stack_shared_submodule, test/export/test_export.py::TestExport::test_no_check_is_size_error, test/export/test_export.py::TestExport::test_no_suggested_fixes_for_data_dependent_errors, test/export/test_export.py::TestExport::test_no_tensor_computation, test/export/test_export.py::TestExport::test_no_tensor_computation_2, test/export/test_export.py::TestExport::test_no_tensor_computation_3, test/export/test_export.py::TestExport::test_no_tensor_computation_4, test/export/test_export.py::TestExport::test_non_arg_name_dynamic_shapes_api, test/export/test_export.py::TestExport::test_non_arg_name_dynamic_shapes_api_with_container_type, test/export/test_export.py::TestExport::test_non_arg_name_dynamic_shapes_api_with_kwarg, test/export/test_export.py::TestExport::test_non_persistent_buffer, test/export/test_export.py::TestExport::test_non_strict_dynamic_shapes, test/export/test_export.py::TestExport::test_non_strict_dynamic_shapes_suggested_fixes, test/export/test_export.py::TestExport::test_none_buffers, test/export/test_export.py::TestExport::test_nonstrict_retrace_preserves_metadata, test/export/test_export.py::TestExport::test_nonzero_2, test/export/test_export.py::TestExport::test_nonzero_dynamic, test/export/test_export.py::TestExport::test_not_registered_parameter, test/export/test_export.py::TestExport::test_operator_aten_tensor_mode_variant, test/export/test_export.py::TestExport::test_output_node_name, test/export/test_export.py::TestExport::test_pad_sequence, 
test/export/test_export.py::TestExport::test_param_util, test/export/test_export.py::TestExport::test_partial_patched_forward, test/export/test_export.py::TestExport::test_placeholder_naming_collisions, test/export/test_export.py::TestExport::test_placeholder_naming_collisions_hoo_subgraphs, test/export/test_export.py::TestExport::test_placeholder_naming_order, test/export/test_export.py::TestExport::test_placeholder_naming_order_variadic, test/export/test_export.py::TestExport::test_placeholder_update_preserving, test/export/test_export.py::TestExport::test_predispatch_cond, test/export/test_export.py::TestExport::test_predispatch_grad_wrappers, test/export/test_export.py::TestExport::test_preserve_annotation, test/export/test_export.py::TestExport::test_preserve_module_call_signature_unflatten_specialization, test/export/test_export.py::TestExport::test_preserve_requires_grad_placeholders, test/export/test_export.py::TestExport::test_preserve_shape_dynamism_for_unused_inputs, test/export/test_export.py::TestExport::test_profiling_code, test/export/test_export.py::TestExport::test_python_asserts_with_sym_int, test/export/test_export.py::TestExport::test_pytree_register_data_class, test/export/test_export.py::TestExport::test_pytree_register_nested_data_class, test/export/test_export.py::TestExport::test_raise_user_error_when_guard_on_data_dependent_operation, test/export/test_export.py::TestExport::test_range_constraints_with_replacement, test/export/test_export.py::TestExport::test_real_tensor_alias_dtype_mismatch, test/export/test_export.py::TestExport::test_real_tensor_bool_cast, test/export/test_export.py::TestExport::test_real_tensor_errors_on_aliasing_custom_op, test/export/test_export.py::TestExport::test_real_tensor_for_max_op, test/export/test_export.py::TestExport::test_real_tensor_size_mismatch, test/export/test_export.py::TestExport::test_redundant_assert_max_upper_bound, test/export/test_export.py::TestExport::test_redundant_asserts, 
test/export/test_export.py::TestExport::test_refine_dynamic_shapes_from_suggested_fixes, test/export/test_export.py::TestExport::test_register_constant, test/export/test_export.py::TestExport::test_repeat_interleave, test/export/test_export.py::TestExport::test_replace_unbacked_with_very_large_upperbound, test/export/test_export.py::TestExport::test_replaced_unbacked_bindings, test/export/test_export.py::TestExport::test_reshape_view_helper, test/export/test_export.py::TestExport::test_retracable_ep, test/export/test_export.py::TestExport::test_retrace_pre_autograd, test/export/test_export.py::TestExport::test_run_decomposition_supports_user_input_mutation, test/export/test_export.py::TestExport::test_run_decompositions_keep_metadata, test/export/test_export.py::TestExport::test_run_decompositions_keep_tensor_constant_metadata, test/export/test_export.py::TestExport::test_runtime_assert_for_prim, test/export/test_export.py::TestExport::test_runtime_assert_for_prm_str, test/export/test_export.py::TestExport::test_runtime_assert_with_size, test/export/test_export.py::TestExport::test_sdpa_gqa, test/export/test_export.py::TestExport::test_sequential_slicing, test/export/test_export.py::TestExport::test_set_example_inputs, test/export/test_export.py::TestExport::test_set_grad_as_side_effect, test/export/test_export.py::TestExport::test_set_grad_empty, test/export/test_export.py::TestExport::test_set_grad_unflatten, test/export/test_export.py::TestExport::test_setgrad_lifted_tensor, test/export/test_export.py::TestExport::test_shared_submodule_nn_module_stack, test/export/test_export.py::TestExport::test_simple_export_for_training, test/export/test_export.py::TestExport::test_simple_unbacked_view, test/export/test_export.py::TestExport::test_size_input, test/export/test_export.py::TestExport::test_slice_nn_module_stack, test/export/test_export.py::TestExport::test_solver_unsupported_sympy_function, 
test/export/test_export.py::TestExport::test_specialize_derived_dim_roots, test/export/test_export.py::TestExport::test_split_const_gm_with_lifted_constants, test/export/test_export.py::TestExport::test_stack_trace, test/export/test_export.py::TestExport::test_stack_trace_make_fx, test/export/test_export.py::TestExport::test_state_primitives, test/export/test_export.py::TestExport::test_state_shape_attribute_assignment, test/export/test_export.py::TestExport::test_state_tensors, test/export/test_export.py::TestExport::test_static_dim_constraints, test/export/test_export.py::TestExport::test_subclass_context, test/export/test_export.py::TestExport::test_subclass_nested_attr_access, test/export/test_export.py::TestExport::test_subclass_nested_attr_access_complicated_metadata, test/export/test_export.py::TestExport::test_subclass_nested_attr_access_const_metadata, test/export/test_export.py::TestExport::test_subclass_nested_attr_access_const_metadata_not_top_level, test/export/test_export.py::TestExport::test_subclass_nested_attr_access_submodule, test/export/test_export.py::TestExport::test_subclasses_parameterization, test/export/test_export.py::TestExport::test_subclasses_parameterization_nested, test/export/test_export.py::TestExport::test_suggest_torch_checks_with_non_negative_check, test/export/test_export.py::TestExport::test_suggest_torch_checks_with_regular_check, test/export/test_export.py::TestExport::test_suggested_fixes_for_data_dependent_errors_basic, test/export/test_export.py::TestExport::test_suggested_fixes_for_data_dependent_errors_puzzlers, test/export/test_export.py::TestExport::test_suggested_fixes_new_roots, test/export/test_export.py::TestExport::test_sym_float_operators, test/export/test_export.py::TestExport::test_sym_or_sym_and, test/export/test_export.py::TestExport::test_sym_sqrt, test/export/test_export.py::TestExport::test_symbool_item, test/export/test_export.py::TestExport::test_symfloat_item, 
test/export/test_export.py::TestExport::test_symint_input_additional_inputs, test/export/test_export.py::TestExport::test_symint_input_basic, test/export/test_export.py::TestExport::test_symint_input_ranges, test/export/test_export.py::TestExport::test_symint_input_shapes_collection, test/export/test_export.py::TestExport::test_symint_input_specialization, test/export/test_export.py::TestExport::test_symint_item, test/export/test_export.py::TestExport::test_symint_output, test/export/test_export.py::TestExport::test_symint_tensor_return, test/export/test_export.py::TestExport::test_tag_ac_export, test/export/test_export.py::TestExport::test_tensor_attribute_zero_args, test/export/test_export.py::TestExport::test_tensor_constant_aten_to, test/export/test_export.py::TestExport::test_tensor_constant_with_wrapped_method, test/export/test_export.py::TestExport::test_to_module_with_mutated_buffer, test/export/test_export.py::TestExport::test_to_module_with_mutated_buffer_multiple, test/export/test_export.py::TestExport::test_to_module_with_mutated_buffer_multiple_update_sub_later, test/export/test_export.py::TestExport::test_tolist, test/export/test_export.py::TestExport::test_torch_check_eq_commutativity, test/export/test_export.py::TestExport::test_torch_fn, test/export/test_export.py::TestExport::test_trace_under_fake, test/export/test_export.py::TestExport::test_train_eval_on_exported_preautograd_module, test/export/test_export.py::TestExport::test_unbacked_3d_matmul, test/export/test_export.py::TestExport::test_unbacked_bincount, test/export/test_export.py::TestExport::test_unbacked_bindings_for_divisible_u_symint, test/export/test_export.py::TestExport::test_unbacked_deferred_runtime_retrace, test/export/test_export.py::TestExport::test_unbacked_expand, test/export/test_export.py::TestExport::test_unbacked_infer_size, test/export/test_export.py::TestExport::test_unbacked_kth_value, test/export/test_export.py::TestExport::test_unbacked_linear_layer_norm_input, 
test/export/test_export.py::TestExport::test_unbacked_noncontig_lin, test/export/test_export.py::TestExport::test_unbacked_pad, test/export/test_export.py::TestExport::test_unbacked_scalar_constructor, test/export/test_export.py::TestExport::test_unbacked_slice_forward, test/export/test_export.py::TestExport::test_unbacked_slice_simple, test/export/test_export.py::TestExport::test_unbacked_stack, test/export/test_export.py::TestExport::test_unbacked_to_cond, test/export/test_export.py::TestExport::test_unbacked_to_cond_passthrough, test/export/test_export.py::TestExport::test_unbacked_unsqueeze, test/export/test_export.py::TestExport::test_unflatten_asserts, test/export/test_export.py::TestExport::test_unflatten_buffer_update_child2parent_swap, test/export/test_export.py::TestExport::test_unflatten_closure, test/export/test_export.py::TestExport::test_unflatten_isinstance, test/export/test_export.py::TestExport::test_unflatten_multiple_graphs_dispatch, test/export/test_export.py::TestExport::test_unflatten_multiple_graphs_preserve_signature_no_error, test/export/test_export.py::TestExport::test_unflatten_multiple_graphs_shared_submodule, test/export/test_export.py::TestExport::test_unflatten_multiple_graphs_state, test/export/test_export.py::TestExport::test_unflatten_no_unroll, test/export/test_export.py::TestExport::test_unflatten_placeholder_update_child2parent_swap, test/export/test_export.py::TestExport::test_unflatten_placeholder_update_grandchild2cousin_swap, test/export/test_export.py::TestExport::test_unflatten_random_dag_5, test/export/test_export.py::TestExport::test_unflatten_random_dag_6, test/export/test_export.py::TestExport::test_unflatten_random_dag_buf_8, test/export/test_export.py::TestExport::test_unflatten_random_dag_const_preserving_3, test/export/test_export.py::TestExport::test_unflatten_random_dag_const_preserving_3_1, test/export/test_export.py::TestExport::test_unflatten_random_dag_mutating_buf_4, 
test/export/test_export.py::TestExport::test_unflatten_random_dag_mutating_buf_6, test/export/test_export.py::TestExport::test_unflatten_random_dag_mutating_buf_9, test/export/test_export.py::TestExport::test_unflatten_random_dag_mutating_buf_preserving_10, test/export/test_export.py::TestExport::test_unflatten_random_dag_mutating_buf_preserving_4, test/export/test_export.py::TestExport::test_unflatten_random_dag_mutating_buf_preserving_4_1, test/export/test_export.py::TestExport::test_unflatten_random_dag_mutating_buf_preserving_5, test/export/test_export.py::TestExport::test_unflatten_random_dag_mutating_buf_preserving_7, test/export/test_export.py::TestExport::test_unflatten_random_dag_preserving_4, test/export/test_export.py::TestExport::test_unused_aliases, test/export/test_export.py::TestExport::test_unused_constant, test/export/test_export.py::TestExport::test_use_embedding_twice, test/export/test_export.py::TestExport::test_user_input_and_buffer_mutation, test/export/test_export.py::TestExport::test_vmap, test/export/test_export.py::TestExport::test_vmap_custom_autograd_function, test/export/test_export.py::TestExport::test_vmap_to_assert, test/export/test_export.py::TestExport::test_where_decomp, test/export/test_export.py::TestExport::test_while_loop_assert_separation, test/export/test_export.py::TestExport::test_while_loop_index_assertions, test/export/test_export.py::TestExport::test_while_loop_simple, test/export/test_export.py::TestExport::test_while_loop_tensor_constant_idx, test/export/test_export.py::TestExport::test_wrapper_module, test/export/test_export.py::TestOneOffModelExportResult::test_assert_tensor_metadata_device_index, test/export/test_export.py::TestOneOffModelExportResult::test_constant_fqn, test/export/test_export.py::TestOneOffModelExportResult::test_constant_name, test/export/test_export.py::TestOneOffModelExportResult::test_duplicated_getitem, 
test/export/test_export.py::TestOneOffModelExportResult::test_export_with_dict_input_nested_in_args, test/export/test_export.py::TestOneOffModelExportResult::test_hf_logging_logger, test/export/test_export.py::TestOneOffModelExportResult::test_input_output_no_stacktrace, test/export/test_export.py::TestOneOffModelExportResult::test_int_list_output, test/export/test_export.py::TestOneOffModelExportResult::test_logging_logger, test/export/test_export.py::TestOneOffModelExportResult::test_nested_retrace, test/export/test_export.py::TestOneOffModelExportResult::test_none_input_output, test/export/test_export.py::TestOneOffModelExportResult::test_primitive_constant_output, test/export/test_export.py::TestOneOffModelExportResult::test_print, test/export/test_export.py::TestOneOffModelExportResult::test_print_graph_signature, test/export/test_export.py::TestOneOffModelExportResult::test_scaled_dot_product_attention_cpu, test/export/test_export.py::TestOneOffModelExportResult::test_scaled_dot_product_attention_cuda, test/export/test_export.py::TestOneOffModelExportResult::test_strict_export_with_shared_parameters, test/export/test_export.py::TestOneOffModelExportResult::test_torchrec_jagged_tensor, test/export/test_export.py::TestOneOffModelExportResult::test_unbacked_sdpa, test/export/test_export.py::TestOneOffModelExportResult::test_warning, test/export/test_export.py::TestExportCustomClass::test_export_script_module, test/export/test_export.py::TestExportCustomClass::test_export_unbacked_lt, test/export/test_export.py::TestExportCustomClass::test_int_lift_constant, test/export/test_export.py::TestExportCustomClass::test_is_fx_tracing, test/export/test_export.py::TestExportCustomClass::test_item, test/export/test_export.py::TestExportCustomClass::test_lift_custom_obj, test/export/test_export.py::TestExportCustomClass::test_preserve_cia_op, test/export/test_export.py::TestExportCustomClass::test_preserve_non_cia_op, 
test/export/test_export.py::TestExportCustomClass::test_unbacked_contiguous, test/export/test_export.py::TestExportCustomClass::test_unbacked_select_index 2025-10-10T02:02:05.0270998Z 2025-10-10T02:02:07.1446007Z 2025-10-10T02:02:07.1447200Z dynamo/test_logging 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_logging_1.1_53ed6fd2cc9e4943_.log 2025-10-10T02:02:07.1467752Z Running 51 items in this shard: test/dynamo/test_logging.py::LoggingTests::test_all, test/dynamo/test_logging.py::LoggingTests::test_aot, test/dynamo/test_logging.py::LoggingTests::test_aot_graphs, test/dynamo/test_logging.py::LoggingTests::test_aot_joint_graph, test/dynamo/test_logging.py::LoggingTests::test_autotuning, test/dynamo/test_logging.py::LoggingTests::test_bytecode, test/dynamo/test_logging.py::LoggingTests::test_cudagraph_static_inputs, test/dynamo/test_logging.py::LoggingTests::test_cudagraphs, test/dynamo/test_logging.py::LoggingTests::test_custom_format, test/dynamo/test_logging.py::LoggingTests::test_custom_format_exc, test/dynamo/test_logging.py::LoggingTests::test_ddp_graphs, test/dynamo/test_logging.py::LoggingTests::test_default_logging, test/dynamo/test_logging.py::LoggingTests::test_distributed_rank_logging, test/dynamo/test_logging.py::LoggingTests::test_dump_compile_times, test/dynamo/test_logging.py::LoggingTests::test_dynamo_debug, test/dynamo/test_logging.py::LoggingTests::test_dynamo_debug_default_off_artifacts, test/dynamo/test_logging.py::LoggingTests::test_dynamo_error, test/dynamo/test_logging.py::LoggingTests::test_dynamo_info, test/dynamo/test_logging.py::LoggingTests::test_fusion, test/dynamo/test_logging.py::LoggingTests::test_graph_breaks, test/dynamo/test_logging.py::LoggingTests::test_graph_region_expansion, test/dynamo/test_logging.py::LoggingTests::test_guards_polyfill_sloc, test/dynamo/test_logging.py::LoggingTests::test_guards_recompiles, test/dynamo/test_logging.py::LoggingTests::test_guards_sloc, 
test/dynamo/test_logging.py::LoggingTests::test_guards_sloc_vr, test/dynamo/test_logging.py::LoggingTests::test_hierarchical_compile, test/dynamo/test_logging.py::LoggingTests::test_inductor_debug, test/dynamo/test_logging.py::LoggingTests::test_inductor_error, test/dynamo/test_logging.py::LoggingTests::test_inductor_info, test/dynamo/test_logging.py::LoggingTests::test_invalid_artifact_flag, test/dynamo/test_logging.py::LoggingTests::test_invalid_artifact_flag_error_msg, test/dynamo/test_logging.py::LoggingTests::test_kernel_code, test/dynamo/test_logging.py::LoggingTests::test_log_traced_frames, test/dynamo/test_logging.py::LoggingTests::test_logs_out, test/dynamo/test_logging.py::LoggingTests::test_multiline_format, test/dynamo/test_logging.py::LoggingTests::test_open_registration, test/dynamo/test_logging.py::LoggingTests::test_open_registration_python_api, test/dynamo/test_logging.py::LoggingTests::test_open_registration_with_registered_parent, test/dynamo/test_logging.py::LoggingTests::test_optimizer_non_static_param, test/dynamo/test_logging.py::LoggingTests::test_output_code, test/dynamo/test_logging.py::LoggingTests::test_recompiles, test/dynamo/test_logging.py::LoggingTests::test_schedule, test/dynamo/test_logging.py::LoggingTests::test_trace_call, test/dynamo/test_logging.py::LoggingTests::test_trace_call_graph_break, test/dynamo/test_logging.py::LoggingTests::test_trace_call_inline_call, test/dynamo/test_logging.py::LoggingTests::test_trace_call_prefix, test/dynamo/test_logging.py::LoggingTests::test_trace_source_cond, test/dynamo/test_logging.py::LoggingTests::test_trace_source_funcname, test/dynamo/test_logging.py::LoggingTests::test_trace_source_if_stmt, test/dynamo/test_logging.py::LoggingTests::test_trace_source_nested, test/dynamo/test_logging.py::LoggingTests::test_trace_source_simple 2025-10-10T02:02:07.1488784Z 2025-10-10T02:02:08.8933914Z Running dynamo/test_deviceguard 1/1 ... 
[2025-10-10 02:02:08.892887] 2025-10-10T02:02:08.8934519Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:02:08.8936616Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_deviceguard.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:02:08.893260] 2025-10-10T02:02:11.0438127Z Running dynamo/test_aot_autograd 1/1 ... [2025-10-10 02:02:11.043193] 2025-10-10T02:02:11.0438754Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:02:11.0439774Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_aot_autograd.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:02:11.043606] 2025-10-10T02:02:13.0170823Z 2025-10-10T02:02:13.0172041Z dynamo/test_deviceguard 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_deviceguard_1.1_9a49e3f89bda1ff3_.log 2025-10-10T02:02:13.0173853Z Running 4 items in this shard: test/dynamo/test_deviceguard.py::TestDeviceGuard::test_device_guard, test/dynamo/test_deviceguard.py::TestDeviceGuard::test_device_guard_no_index, test/dynamo/test_deviceguard.py::TestCUDADeviceGuard::test_device_guard, test/dynamo/test_deviceguard.py::TestCUDADeviceGuard::test_device_guard_no_index 2025-10-10T02:02:13.0175090Z 2025-10-10T02:02:15.2682826Z 2025-10-10T02:02:15.2683786Z dynamo/test_aot_autograd 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_aot_autograd_1.1_bd6312e2ff94e5c7_.log 2025-10-10T02:02:15.2702350Z Running 49 items in this shard: test/dynamo/test_aot_autograd.py::AotAutogradFallbackTests::test_LSTM, test/dynamo/test_aot_autograd.py::AotAutogradFallbackTests::test_alias_inputs, 
test/dynamo/test_aot_autograd.py::AotAutogradFallbackTests::test_aot_autograd_expand_mutation_backwards, test/dynamo/test_aot_autograd.py::AotAutogradFallbackTests::test_aot_autograd_expand_mutation_error, test/dynamo/test_aot_autograd.py::AotAutogradFallbackTests::test_aot_autograd_expand_mutation_functionalizes, test/dynamo/test_aot_autograd.py::AotAutogradFallbackTests::test_aot_autograd_raises_invalid_leaf_set, test/dynamo/test_aot_autograd.py::AotAutogradFallbackTests::test_aot_export_joint_simple_repro, test/dynamo/test_aot_autograd.py::AotAutogradFallbackTests::test_aot_grad_mode_mutation, test/dynamo/test_aot_autograd.py::AotAutogradFallbackTests::test_aot_sequence_nr, test/dynamo/test_aot_autograd.py::AotAutogradFallbackTests::test_arg_dupe_via_dynamo_recompiles, test/dynamo/test_aot_autograd.py::AotAutogradFallbackTests::test_arg_dupe_via_dynamo_recompiles_many_args, test/dynamo/test_aot_autograd.py::AotAutogradFallbackTests::test_arg_dupe_via_dynamo_recompiles_many_args_param, test/dynamo/test_aot_autograd.py::AotAutogradFallbackTests::test_arg_dupe_via_dynamo_recompiles_many_args_param_non_tensor_arg, test/dynamo/test_aot_autograd.py::AotAutogradFallbackTests::test_arg_dupe_via_dynamo_recompiles_many_args_param_non_tensor_arg_list, test/dynamo/test_aot_autograd.py::AotAutogradFallbackTests::test_arg_dupe_via_dynamo_recompiles_many_with_global, test/dynamo/test_aot_autograd.py::AotAutogradFallbackTests::test_autograd_function_tangent_mutation, test/dynamo/test_aot_autograd.py::AotAutogradFallbackTests::test_call_fn_with_non_const_inputs_aot_safe, test/dynamo/test_aot_autograd.py::AotAutogradFallbackTests::test_call_fn_with_non_const_inputs_aot_unsafe, test/dynamo/test_aot_autograd.py::AotAutogradFallbackTests::test_call_fn_with_non_const_inputs_aot_unsafe_control_flow, test/dynamo/test_aot_autograd.py::AotAutogradFallbackTests::test_data_ptr_access_copy, test/dynamo/test_aot_autograd.py::AotAutogradFallbackTests::test_data_ptr_access_fails_in_backward, 
test/dynamo/test_aot_autograd.py::AotAutogradFallbackTests::test_data_ptr_access_fails_in_forward, test/dynamo/test_aot_autograd.py::AotAutogradFallbackTests::test_different_inputs_overlapping_set_with_mutation, test/dynamo/test_aot_autograd.py::AotAutogradFallbackTests::test_donated_buffer1, test/dynamo/test_aot_autograd.py::AotAutogradFallbackTests::test_donated_buffer2, test/dynamo/test_aot_autograd.py::AotAutogradFallbackTests::test_donated_buffer3, test/dynamo/test_aot_autograd.py::AotAutogradFallbackTests::test_donated_buffer4, test/dynamo/test_aot_autograd.py::AotAutogradFallbackTests::test_donated_buffer5, test/dynamo/test_aot_autograd.py::AotAutogradFallbackTests::test_donated_buffer6, test/dynamo/test_aot_autograd.py::AotAutogradFallbackTests::test_donated_buffer_with_retain_or_create_graph1, test/dynamo/test_aot_autograd.py::AotAutogradFallbackTests::test_donated_buffer_with_retain_or_create_graph2, test/dynamo/test_aot_autograd.py::AotAutogradFallbackTests::test_donated_buffer_with_retain_or_create_graph3, test/dynamo/test_aot_autograd.py::AotAutogradFallbackTests::test_donated_buffer_with_retain_or_create_graph4, test/dynamo/test_aot_autograd.py::AotAutogradFallbackTests::test_double_backward_errors, test/dynamo/test_aot_autograd.py::AotAutogradFallbackTests::test_eager_sequence_nr, test/dynamo/test_aot_autograd.py::AotAutogradFallbackTests::test_grad_inputs_alias_inputs, test/dynamo/test_aot_autograd.py::AotAutogradFallbackTests::test_inputs_overlapping_with_mutation_recompile, test/dynamo/test_aot_autograd.py::AotAutogradFallbackTests::test_inputs_overlapping_with_mutation_stress, test/dynamo/test_aot_autograd.py::AotAutogradFallbackTests::test_joint_custom_pass, test/dynamo/test_aot_autograd.py::AotAutogradFallbackTests::test_multiple_aot_autograd_calls_dupe_args, test/dynamo/test_aot_autograd.py::AotAutogradFallbackTests::test_mutation, test/dynamo/test_aot_autograd.py::AotAutogradFallbackTests::test_mutation1, 
test/dynamo/test_aot_autograd.py::AotAutogradFallbackTests::test_negative_testing, test/dynamo/test_aot_autograd.py::AotAutogradFallbackTests::test_negative_testing_mutation, test/dynamo/test_aot_autograd.py::AotAutogradFallbackTests::test_nn_parameter_construction, test/dynamo/test_aot_autograd.py::AotAutogradFallbackTests::test_no_storage_overlap_guards_no_aliasing, test/dynamo/test_aot_autograd.py::AotAutogradFallbackTests::test_no_storage_overlap_guards_no_mutation, test/dynamo/test_aot_autograd.py::AotAutogradFallbackTests::test_requires_grad_fake_via_dynamo_recompiles, test/dynamo/test_aot_autograd.py::AotAutogradFallbackTests::test_split_with_sizes_aot_autograd_cleans_up_traceback_meta 2025-10-10T02:02:15.2719833Z 2025-10-10T02:02:17.0045138Z Running inductor/test_augmented_graph_helper 1/1 ... [2025-10-10 02:02:17.003833] 2025-10-10T02:02:17.0045902Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:02:17.0047724Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_augmented_graph_helper.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:02:17.004203] 2025-10-10T02:02:19.1221884Z Running dynamo/test_cudagraphs 1/1 ... [2025-10-10 02:02:19.121644] 2025-10-10T02:02:19.1222425Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:02:19.1223992Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_cudagraphs.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:02:19.122033] 2025-10-10T02:02:20.6265223Z 2025-10-10T02:02:20.6267440Z inductor/test_augmented_graph_helper 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_augmented_graph_helper_1.1_ef0ac65fad0c43d6_.log 2025-10-10T02:02:20.6278414Z Running 16 items in this shard: test/inductor/test_augmented_graph_helper.py::TestAugmentedGraphHelper::test_cycle_through_merge, test/inductor/test_augmented_graph_helper.py::TestAugmentedGraphHelper::test_cycle_with_extra_deps, test/inductor/test_augmented_graph_helper.py::TestAugmentedGraphHelper::test_extra_deps_with_merge, test/inductor/test_augmented_graph_helper.py::TestAugmentedGraphHelper::test_has_path_direct, test/inductor/test_augmented_graph_helper.py::TestAugmentedGraphHelper::test_has_path_through_merge, test/inductor/test_augmented_graph_helper.py::TestAugmentedGraphHelper::test_has_path_transitive, test/inductor/test_augmented_graph_helper.py::TestAugmentedGraphHelper::test_has_path_with_extra_deps, test/inductor/test_augmented_graph_helper.py::TestAugmentedGraphHelper::test_initial_state, test/inductor/test_augmented_graph_helper.py::TestAugmentedGraphHelper::test_merged_deps_collection, test/inductor/test_augmented_graph_helper.py::TestAugmentedGraphHelper::test_multiple_merge_unmerge, test/inductor/test_augmented_graph_helper.py::TestAugmentedGraphHelper::test_no_cycle_in_dag, test/inductor/test_augmented_graph_helper.py::TestAugmentedGraphHelper::test_simple_cycle_detection, test/inductor/test_augmented_graph_helper.py::TestAugmentedGraphHelper::test_simple_merge, test/inductor/test_augmented_graph_helper.py::TestAugmentedGraphHelper::test_transitive_merge, test/inductor/test_augmented_graph_helper.py::TestAugmentedGraphHelper::test_unmerge_from_singleton, test/inductor/test_augmented_graph_helper.py::TestAugmentedGraphHelper::test_unmerge_node 2025-10-10T02:02:20.6287773Z 2025-10-10T02:02:23.2452159Z 2025-10-10T02:02:23.2453333Z 
dynamo/test_cudagraphs 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_cudagraphs_1.1_9f1060c0752f2213_.log 2025-10-10T02:02:23.2456313Z Running 8 items in this shard: test/dynamo/test_cudagraphs.py::TestAotCudagraphs::test_basic, test/dynamo/test_cudagraphs.py::TestAotCudagraphs::test_dead_fill, test/dynamo/test_cudagraphs.py::TestAotCudagraphs::test_dtoh, test/dynamo/test_cudagraphs.py::TestAotCudagraphs::test_factory, test/dynamo/test_cudagraphs.py::TestAotCudagraphs::test_htod, test/dynamo/test_cudagraphs.py::TestAotCudagraphs::test_mutate_constant, test/dynamo/test_cudagraphs.py::TestAotCudagraphs::test_mutate_input, test/dynamo/test_cudagraphs.py::TestAotCudagraphs::test_mutated_metadata 2025-10-10T02:02:23.2458783Z 2025-10-10T02:02:24.5580371Z Running inductor/test_caching 1/1 ... [2025-10-10 02:02:24.557537] 2025-10-10T02:02:24.5580958Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:02:24.5584456Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_caching.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:02:24.558072] 2025-10-10T02:02:27.1544897Z Running export/test_upgrader 1/1 ... [2025-10-10 02:02:27.153981] 2025-10-10T02:02:27.1545471Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:02:27.1547416Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_upgrader.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:02:27.154353] 2025-10-10T02:02:28.7317663Z 2025-10-10T02:02:28.7318654Z inductor/test_caching 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_caching_1.1_8d229fa359ca9a0d_.log 2025-10-10T02:02:28.7387506Z Running 142 items in this shard: test/inductor/test_caching.py::ConfigTest::test_versioned_config_env_var_override_enabled_False, test/inductor/test_caching.py::ConfigTest::test_versioned_config_env_var_override_enabled_True, test/inductor/test_caching.py::ConfigTest::test_versioned_config_oss_default_enabled_False, test/inductor/test_caching.py::ConfigTest::test_versioned_config_oss_default_enabled_True, test/inductor/test_caching.py::ConfigTest::test_versioned_config_version_check_enabled_False, test/inductor/test_caching.py::ConfigTest::test_versioned_config_version_check_enabled_True, test/inductor/test_caching.py::ContextTest::test_all_or_none_isolation_context_all_runtime_context_False_all_compile_context_False, test/inductor/test_caching.py::ContextTest::test_all_or_none_isolation_context_all_runtime_context_False_all_compile_context_True, test/inductor/test_caching.py::ContextTest::test_all_or_none_isolation_context_all_runtime_context_True_all_compile_context_False, test/inductor/test_caching.py::ContextTest::test_all_or_none_isolation_context_all_runtime_context_True_all_compile_context_True, test/inductor/test_caching.py::ContextTest::test_isolation_key_is_distinct, test/inductor/test_caching.py::ContextTest::test_isolation_key_is_repeatable, test/inductor/test_caching.py::ContextTest::test_select_compile_context_matches_forms_of_context, test/inductor/test_caching.py::ContextTest::test_select_runtime_context_matches_forms_of_context, test/inductor/test_caching.py::ContextTest::test_selected_isolation_context_runtime_forms_of_context_selected0_compile_forms_of_context_selected0, 
test/inductor/test_caching.py::ContextTest::test_selected_isolation_context_runtime_forms_of_context_selected0_compile_forms_of_context_selected1, test/inductor/test_caching.py::ContextTest::test_selected_isolation_context_runtime_forms_of_context_selected0_compile_forms_of_context_selected10, test/inductor/test_caching.py::ContextTest::test_selected_isolation_context_runtime_forms_of_context_selected0_compile_forms_of_context_selected2, test/inductor/test_caching.py::ContextTest::test_selected_isolation_context_runtime_forms_of_context_selected0_compile_forms_of_context_selected3, test/inductor/test_caching.py::ContextTest::test_selected_isolation_context_runtime_forms_of_context_selected0_compile_forms_of_context_selected4, test/inductor/test_caching.py::ContextTest::test_selected_isolation_context_runtime_forms_of_context_selected0_compile_forms_of_context_selected5, test/inductor/test_caching.py::ContextTest::test_selected_isolation_context_runtime_forms_of_context_selected0_compile_forms_of_context_selected6, test/inductor/test_caching.py::ContextTest::test_selected_isolation_context_runtime_forms_of_context_selected0_compile_forms_of_context_selected7, test/inductor/test_caching.py::ContextTest::test_selected_isolation_context_runtime_forms_of_context_selected0_compile_forms_of_context_selected8, test/inductor/test_caching.py::ContextTest::test_selected_isolation_context_runtime_forms_of_context_selected0_compile_forms_of_context_selected9, test/inductor/test_caching.py::ContextTest::test_selected_isolation_context_runtime_forms_of_context_selected1_compile_forms_of_context_selected0, test/inductor/test_caching.py::ContextTest::test_selected_isolation_context_runtime_forms_of_context_selected1_compile_forms_of_context_selected1, test/inductor/test_caching.py::ContextTest::test_selected_isolation_context_runtime_forms_of_context_selected1_compile_forms_of_context_selected10, 
test/inductor/test_caching.py::ContextTest::test_selected_isolation_context_runtime_forms_of_context_selected1_compile_forms_of_context_selected2, test/inductor/test_caching.py::ContextTest::test_selected_isolation_context_runtime_forms_of_context_selected1_compile_forms_of_context_selected3, test/inductor/test_caching.py::ContextTest::test_selected_isolation_context_runtime_forms_of_context_selected1_compile_forms_of_context_selected4, test/inductor/test_caching.py::ContextTest::test_selected_isolation_context_runtime_forms_of_context_selected1_compile_forms_of_context_selected5, test/inductor/test_caching.py::ContextTest::test_selected_isolation_context_runtime_forms_of_context_selected1_compile_forms_of_context_selected6, test/inductor/test_caching.py::ContextTest::test_selected_isolation_context_runtime_forms_of_context_selected1_compile_forms_of_context_selected7, test/inductor/test_caching.py::ContextTest::test_selected_isolation_context_runtime_forms_of_context_selected1_compile_forms_of_context_selected8, test/inductor/test_caching.py::ContextTest::test_selected_isolation_context_runtime_forms_of_context_selected1_compile_forms_of_context_selected9, test/inductor/test_caching.py::ContextTest::test_selected_isolation_context_runtime_forms_of_context_selected2_compile_forms_of_context_selected0, test/inductor/test_caching.py::ContextTest::test_selected_isolation_context_runtime_forms_of_context_selected2_compile_forms_of_context_selected1, test/inductor/test_caching.py::ContextTest::test_selected_isolation_context_runtime_forms_of_context_selected2_compile_forms_of_context_selected10, test/inductor/test_caching.py::ContextTest::test_selected_isolation_context_runtime_forms_of_context_selected2_compile_forms_of_context_selected2, test/inductor/test_caching.py::ContextTest::test_selected_isolation_context_runtime_forms_of_context_selected2_compile_forms_of_context_selected3, 
test/inductor/test_caching.py::ContextTest::test_selected_isolation_context_runtime_forms_of_context_selected2_compile_forms_of_context_selected4, test/inductor/test_caching.py::ContextTest::test_selected_isolation_context_runtime_forms_of_context_selected2_compile_forms_of_context_selected5, test/inductor/test_caching.py::ContextTest::test_selected_isolation_context_runtime_forms_of_context_selected2_compile_forms_of_context_selected6, test/inductor/test_caching.py::ContextTest::test_selected_isolation_context_runtime_forms_of_context_selected2_compile_forms_of_context_selected7, test/inductor/test_caching.py::ContextTest::test_selected_isolation_context_runtime_forms_of_context_selected2_compile_forms_of_context_selected8, test/inductor/test_caching.py::ContextTest::test_selected_isolation_context_runtime_forms_of_context_selected2_compile_forms_of_context_selected9, test/inductor/test_caching.py::ContextTest::test_selected_isolation_context_runtime_forms_of_context_selected3_compile_forms_of_context_selected0, test/inductor/test_caching.py::ContextTest::test_selected_isolation_context_runtime_forms_of_context_selected3_compile_forms_of_context_selected1, test/inductor/test_caching.py::ContextTest::test_selected_isolation_context_runtime_forms_of_context_selected3_compile_forms_of_context_selected10, test/inductor/test_caching.py::ContextTest::test_selected_isolation_context_runtime_forms_of_context_selected3_compile_forms_of_context_selected2, test/inductor/test_caching.py::ContextTest::test_selected_isolation_context_runtime_forms_of_context_selected3_compile_forms_of_context_selected3, test/inductor/test_caching.py::ContextTest::test_selected_isolation_context_runtime_forms_of_context_selected3_compile_forms_of_context_selected4, test/inductor/test_caching.py::ContextTest::test_selected_isolation_context_runtime_forms_of_context_selected3_compile_forms_of_context_selected5, 
test/inductor/test_caching.py::ContextTest::test_selected_isolation_context_runtime_forms_of_context_selected3_compile_forms_of_context_selected6, test/inductor/test_caching.py::ContextTest::test_selected_isolation_context_runtime_forms_of_context_selected3_compile_forms_of_context_selected7, test/inductor/test_caching.py::ContextTest::test_selected_isolation_context_runtime_forms_of_context_selected3_compile_forms_of_context_selected8, test/inductor/test_caching.py::ContextTest::test_selected_isolation_context_runtime_forms_of_context_selected3_compile_forms_of_context_selected9, test/inductor/test_caching.py::ExceptionsTest::test_exception_is_CacheError_exception_typename_CacheError, test/inductor/test_caching.py::ExceptionsTest::test_exception_is_CacheError_exception_typename_FileLockTimeoutError, test/inductor/test_caching.py::ExceptionsTest::test_exception_is_CacheError_exception_typename_KeyEncodingError, test/inductor/test_caching.py::ExceptionsTest::test_exception_is_CacheError_exception_typename_KeyPicklingError, test/inductor/test_caching.py::ExceptionsTest::test_exception_is_CacheError_exception_typename_LockTimeoutError, test/inductor/test_caching.py::ExceptionsTest::test_exception_is_CacheError_exception_typename_SystemError, test/inductor/test_caching.py::ExceptionsTest::test_exception_is_CacheError_exception_typename_UserError, test/inductor/test_caching.py::ExceptionsTest::test_exception_is_CacheError_exception_typename_ValueDecodingError, test/inductor/test_caching.py::ExceptionsTest::test_exception_is_CacheError_exception_typename_ValueEncodingError, test/inductor/test_caching.py::ExceptionsTest::test_exception_is_CacheError_exception_typename_ValuePicklingError, test/inductor/test_caching.py::ExceptionsTest::test_exception_is_CacheError_exception_typename_ValueUnPicklingError, test/inductor/test_caching.py::ExceptionsTest::test_exception_other, test/inductor/test_caching.py::ImplementationsTest::test_get_impl_typename__InMemoryCacheImpl, 
test/inductor/test_caching.py::ImplementationsTest::test_get_impl_typename__OnDiskCacheImpl, test/inductor/test_caching.py::ImplementationsTest::test_insert_impl_typename__InMemoryCacheImpl, test/inductor/test_caching.py::ImplementationsTest::test_insert_impl_typename__OnDiskCacheImpl, test/inductor/test_caching.py::ImplementationsTest::test_insert_will_not_overwrite_impl_typename__InMemoryCacheImpl, test/inductor/test_caching.py::ImplementationsTest::test_insert_will_not_overwrite_impl_typename__OnDiskCacheImpl, test/inductor/test_caching.py::ImplementationsTest::test_key_encoding_impl_typename__InMemoryCacheImpl, test/inductor/test_caching.py::ImplementationsTest::test_key_encoding_impl_typename__OnDiskCacheImpl, test/inductor/test_caching.py::ImplementationsTest::test_value_decoding_impl_typename__InMemoryCacheImpl, test/inductor/test_caching.py::ImplementationsTest::test_value_decoding_impl_typename__OnDiskCacheImpl, test/inductor/test_caching.py::ImplementationsTest::test_value_encoding_impl_typename__InMemoryCacheImpl, test/inductor/test_caching.py::ImplementationsTest::test_value_encoding_impl_typename__OnDiskCacheImpl, test/inductor/test_caching.py::ImplementationsTest::test_version_mismatch_impl_typename__InMemoryCacheImpl, test/inductor/test_caching.py::ImplementationsTest::test_version_mismatch_impl_typename__OnDiskCacheImpl, test/inductor/test_caching.py::LocksTest::test_BLOCKING, test/inductor/test_caching.py::LocksTest::test_BLOCKING_WITH_TIMEOUT, test/inductor/test_caching.py::LocksTest::test_NON_BLOCKING, test/inductor/test_caching.py::LocksTest::test_acquire_with_timeout_lock_typename_FileLock_lock_timeout_BLOCKING_WITH_TIMEOUT_acquisition_mode_safe_release_after_timeout, test/inductor/test_caching.py::LocksTest::test_acquire_with_timeout_lock_typename_FileLock_lock_timeout_BLOCKING_WITH_TIMEOUT_acquisition_mode_safe_release_before_timeout, 
test/inductor/test_caching.py::LocksTest::test_acquire_with_timeout_lock_typename_FileLock_lock_timeout_BLOCKING_WITH_TIMEOUT_acquisition_mode_safe_release_never, test/inductor/test_caching.py::LocksTest::test_acquire_with_timeout_lock_typename_FileLock_lock_timeout_BLOCKING_WITH_TIMEOUT_acquisition_mode_safe_release_unlocked, test/inductor/test_caching.py::LocksTest::test_acquire_with_timeout_lock_typename_FileLock_lock_timeout_BLOCKING_WITH_TIMEOUT_acquisition_mode_unsafe_release_after_timeout, test/inductor/test_caching.py::LocksTest::test_acquire_with_timeout_lock_typename_FileLock_lock_timeout_BLOCKING_WITH_TIMEOUT_acquisition_mode_unsafe_release_before_timeout, test/inductor/test_caching.py::LocksTest::test_acquire_with_timeout_lock_typename_FileLock_lock_timeout_BLOCKING_WITH_TIMEOUT_acquisition_mode_unsafe_release_never, test/inductor/test_caching.py::LocksTest::test_acquire_with_timeout_lock_typename_FileLock_lock_timeout_BLOCKING_WITH_TIMEOUT_acquisition_mode_unsafe_release_unlocked, test/inductor/test_caching.py::LocksTest::test_acquire_with_timeout_lock_typename_FileLock_lock_timeout_BLOCKING_acquisition_mode_safe_release_after_timeout, test/inductor/test_caching.py::LocksTest::test_acquire_with_timeout_lock_typename_FileLock_lock_timeout_BLOCKING_acquisition_mode_safe_release_before_timeout, test/inductor/test_caching.py::LocksTest::test_acquire_with_timeout_lock_typename_FileLock_lock_timeout_BLOCKING_acquisition_mode_safe_release_never, test/inductor/test_caching.py::LocksTest::test_acquire_with_timeout_lock_typename_FileLock_lock_timeout_BLOCKING_acquisition_mode_safe_release_unlocked, test/inductor/test_caching.py::LocksTest::test_acquire_with_timeout_lock_typename_FileLock_lock_timeout_BLOCKING_acquisition_mode_unsafe_release_after_timeout, test/inductor/test_caching.py::LocksTest::test_acquire_with_timeout_lock_typename_FileLock_lock_timeout_BLOCKING_acquisition_mode_unsafe_release_before_timeout, 
test/inductor/test_caching.py::LocksTest::test_acquire_with_timeout_lock_typename_FileLock_lock_timeout_BLOCKING_acquisition_mode_unsafe_release_never, test/inductor/test_caching.py::LocksTest::test_acquire_with_timeout_lock_typename_FileLock_lock_timeout_BLOCKING_acquisition_mode_unsafe_release_unlocked, test/inductor/test_caching.py::LocksTest::test_acquire_with_timeout_lock_typename_FileLock_lock_timeout_NON_BLOCKING_acquisition_mode_safe_release_after_timeout, test/inductor/test_caching.py::LocksTest::test_acquire_with_timeout_lock_typename_FileLock_lock_timeout_NON_BLOCKING_acquisition_mode_safe_release_before_timeout, test/inductor/test_caching.py::LocksTest::test_acquire_with_timeout_lock_typename_FileLock_lock_timeout_NON_BLOCKING_acquisition_mode_safe_release_never, test/inductor/test_caching.py::LocksTest::test_acquire_with_timeout_lock_typename_FileLock_lock_timeout_NON_BLOCKING_acquisition_mode_safe_release_unlocked, test/inductor/test_caching.py::LocksTest::test_acquire_with_timeout_lock_typename_FileLock_lock_timeout_NON_BLOCKING_acquisition_mode_unsafe_release_after_timeout, test/inductor/test_caching.py::LocksTest::test_acquire_with_timeout_lock_typename_FileLock_lock_timeout_NON_BLOCKING_acquisition_mode_unsafe_release_before_timeout, test/inductor/test_caching.py::LocksTest::test_acquire_with_timeout_lock_typename_FileLock_lock_timeout_NON_BLOCKING_acquisition_mode_unsafe_release_never, test/inductor/test_caching.py::LocksTest::test_acquire_with_timeout_lock_typename_FileLock_lock_timeout_NON_BLOCKING_acquisition_mode_unsafe_release_unlocked, test/inductor/test_caching.py::LocksTest::test_acquire_with_timeout_lock_typename_Lock_lock_timeout_BLOCKING_WITH_TIMEOUT_acquisition_mode_safe_release_after_timeout, test/inductor/test_caching.py::LocksTest::test_acquire_with_timeout_lock_typename_Lock_lock_timeout_BLOCKING_WITH_TIMEOUT_acquisition_mode_safe_release_before_timeout, 
test/inductor/test_caching.py::LocksTest::test_acquire_with_timeout_lock_typename_Lock_lock_timeout_BLOCKING_WITH_TIMEOUT_acquisition_mode_safe_release_never, test/inductor/test_caching.py::LocksTest::test_acquire_with_timeout_lock_typename_Lock_lock_timeout_BLOCKING_WITH_TIMEOUT_acquisition_mode_safe_release_unlocked, test/inductor/test_caching.py::LocksTest::test_acquire_with_timeout_lock_typename_Lock_lock_timeout_BLOCKING_WITH_TIMEOUT_acquisition_mode_unsafe_release_after_timeout, test/inductor/test_caching.py::LocksTest::test_acquire_with_timeout_lock_typename_Lock_lock_timeout_BLOCKING_WITH_TIMEOUT_acquisition_mode_unsafe_release_before_timeout, test/inductor/test_caching.py::LocksTest::test_acquire_with_timeout_lock_typename_Lock_lock_timeout_BLOCKING_WITH_TIMEOUT_acquisition_mode_unsafe_release_never, test/inductor/test_caching.py::LocksTest::test_acquire_with_timeout_lock_typename_Lock_lock_timeout_BLOCKING_WITH_TIMEOUT_acquisition_mode_unsafe_release_unlocked, test/inductor/test_caching.py::LocksTest::test_acquire_with_timeout_lock_typename_Lock_lock_timeout_BLOCKING_acquisition_mode_safe_release_after_timeout, test/inductor/test_caching.py::LocksTest::test_acquire_with_timeout_lock_typename_Lock_lock_timeout_BLOCKING_acquisition_mode_safe_release_before_timeout, test/inductor/test_caching.py::LocksTest::test_acquire_with_timeout_lock_typename_Lock_lock_timeout_BLOCKING_acquisition_mode_safe_release_never, test/inductor/test_caching.py::LocksTest::test_acquire_with_timeout_lock_typename_Lock_lock_timeout_BLOCKING_acquisition_mode_safe_release_unlocked, test/inductor/test_caching.py::LocksTest::test_acquire_with_timeout_lock_typename_Lock_lock_timeout_BLOCKING_acquisition_mode_unsafe_release_after_timeout, test/inductor/test_caching.py::LocksTest::test_acquire_with_timeout_lock_typename_Lock_lock_timeout_BLOCKING_acquisition_mode_unsafe_release_before_timeout, 
test/inductor/test_caching.py::LocksTest::test_acquire_with_timeout_lock_typename_Lock_lock_timeout_BLOCKING_acquisition_mode_unsafe_release_never, test/inductor/test_caching.py::LocksTest::test_acquire_with_timeout_lock_typename_Lock_lock_timeout_BLOCKING_acquisition_mode_unsafe_release_unlocked, test/inductor/test_caching.py::LocksTest::test_acquire_with_timeout_lock_typename_Lock_lock_timeout_NON_BLOCKING_acquisition_mode_safe_release_after_timeout, test/inductor/test_caching.py::LocksTest::test_acquire_with_timeout_lock_typename_Lock_lock_timeout_NON_BLOCKING_acquisition_mode_safe_release_before_timeout, test/inductor/test_caching.py::LocksTest::test_acquire_with_timeout_lock_typename_Lock_lock_timeout_NON_BLOCKING_acquisition_mode_safe_release_never, test/inductor/test_caching.py::LocksTest::test_acquire_with_timeout_lock_typename_Lock_lock_timeout_NON_BLOCKING_acquisition_mode_safe_release_unlocked, test/inductor/test_caching.py::LocksTest::test_acquire_with_timeout_lock_typename_Lock_lock_timeout_NON_BLOCKING_acquisition_mode_unsafe_release_after_timeout, test/inductor/test_caching.py::LocksTest::test_acquire_with_timeout_lock_typename_Lock_lock_timeout_NON_BLOCKING_acquisition_mode_unsafe_release_before_timeout, test/inductor/test_caching.py::LocksTest::test_acquire_with_timeout_lock_typename_Lock_lock_timeout_NON_BLOCKING_acquisition_mode_unsafe_release_never, test/inductor/test_caching.py::LocksTest::test_acquire_with_timeout_lock_typename_Lock_lock_timeout_NON_BLOCKING_acquisition_mode_unsafe_release_unlocked, test/inductor/test_caching.py::UtilsTest::test_lru_cache, test/inductor/test_caching.py::UtilsTest::test_try_pickle_key_pickle_able_False, test/inductor/test_caching.py::UtilsTest::test_try_pickle_key_pickle_able_True, test/inductor/test_caching.py::UtilsTest::test_try_pickle_value_pickle_able_False, test/inductor/test_caching.py::UtilsTest::test_try_pickle_value_pickle_able_True, 
test/inductor/test_caching.py::UtilsTest::test_try_unpickle_value_unpickle_able_False, test/inductor/test_caching.py::UtilsTest::test_try_unpickle_value_unpickle_able_True 2025-10-10T02:02:28.7453439Z 2025-10-10T02:02:30.7771348Z 2025-10-10T02:02:30.7772294Z export/test_upgrader 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_upgrader_1.1_7c1498cc0631c845_.log 2025-10-10T02:02:30.7774840Z Running 6 items in this shard: test/export/test_upgrader.py::TestUpgrader::test_field_renaming_chain_from_v0_complete, test/export/test_upgrader.py::TestUpgrader::test_field_renaming_chain_from_v0_missing_field, test/export/test_upgrader.py::TestUpgrader::test_field_renaming_from_v1_partial_chain, test/export/test_upgrader.py::TestUpgrader::test_nn_module_stack_error_handling_invalid_type, test/export/test_upgrader.py::TestUpgrader::test_nn_module_stack_transformation_from_v0, test/export/test_upgrader.py::TestUpgrader::test_nodes_without_metadata_handled_gracefully 2025-10-10T02:02:30.7776841Z 2025-10-10T02:02:32.6632783Z Running dynamo/test_sets 1/1 ... [2025-10-10 02:02:32.662701] 2025-10-10T02:02:32.6633238Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:02:32.6634676Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_sets.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:02:32.663116] 2025-10-10T02:02:34.6061463Z Running dynamo/test_unspec 1/1 ... [2025-10-10 02:02:34.605592] 2025-10-10T02:02:34.6062157Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:02:34.6063892Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_unspec.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:02:34.605972] 2025-10-10T02:02:36.9866200Z 2025-10-10T02:02:36.9867209Z dynamo/test_sets 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_sets_1.1_9a443ccf099885fe_.log 2025-10-10T02:02:36.9898904Z Running 124 items in this shard: test/dynamo/test_sets.py::CustomSetTests::test_custom_add, test/dynamo/test_sets.py::CustomSetTests::test_custom_contains, test/dynamo/test_sets.py::MiscTests::test_isdisjoint_with_generator, test/dynamo/test_sets.py::TestSetGuards::test_in_guard, test/dynamo/test_sets.py::TestSetGuards::test_set_guard_on_keys_change, test/dynamo/test_sets.py::TestSetGuards::test_set_multiple_types, test/dynamo/test_sets.py::TestSetGuards::test_set_recompile_on_key_change, test/dynamo/test_sets.py::TestSetGuards::test_set_recompile_on_key_pop, test/dynamo/test_sets.py::TestSetGuards::test_set_with_function, test/dynamo/test_sets.py::TestSetGuards::test_set_with_tensors, test/dynamo/test_sets.py::FrozensetTests::test_binop_and, test/dynamo/test_sets.py::FrozensetTests::test_binop_or, test/dynamo/test_sets.py::FrozensetTests::test_binop_sub, test/dynamo/test_sets.py::FrozensetTests::test_binop_xor, test/dynamo/test_sets.py::FrozensetTests::test_cmp_eq, test/dynamo/test_sets.py::FrozensetTests::test_cmp_greater_than, test/dynamo/test_sets.py::FrozensetTests::test_cmp_greater_than_or_equal, test/dynamo/test_sets.py::FrozensetTests::test_cmp_less_than, test/dynamo/test_sets.py::FrozensetTests::test_cmp_less_than_or_equal, test/dynamo/test_sets.py::FrozensetTests::test_cmp_ne, test/dynamo/test_sets.py::FrozensetTests::test_constructor_iterable, test/dynamo/test_sets.py::FrozensetTests::test_contains, test/dynamo/test_sets.py::FrozensetTests::test_copy, test/dynamo/test_sets.py::FrozensetTests::test_difference, test/dynamo/test_sets.py::FrozensetTests::test_equality, test/dynamo/test_sets.py::FrozensetTests::test_in_frozenset, test/dynamo/test_sets.py::FrozensetTests::test_intersection, 
test/dynamo/test_sets.py::FrozensetTests::test_isdisjoint, test/dynamo/test_sets.py::FrozensetTests::test_issubset, test/dynamo/test_sets.py::FrozensetTests::test_issuperset, test/dynamo/test_sets.py::FrozensetTests::test_symmetric_difference, test/dynamo/test_sets.py::FrozensetTests::test_to_frozenset, test/dynamo/test_sets.py::FrozensetTests::test_to_set, test/dynamo/test_sets.py::FrozensetTests::test_union, test/dynamo/test_sets.py::SetTests::test_add, test/dynamo/test_sets.py::SetTests::test_binop_and, test/dynamo/test_sets.py::SetTests::test_binop_or, test/dynamo/test_sets.py::SetTests::test_binop_sub, test/dynamo/test_sets.py::SetTests::test_binop_xor, test/dynamo/test_sets.py::SetTests::test_clear, test/dynamo/test_sets.py::SetTests::test_cmp_eq, test/dynamo/test_sets.py::SetTests::test_cmp_greater_than, test/dynamo/test_sets.py::SetTests::test_cmp_greater_than_or_equal, test/dynamo/test_sets.py::SetTests::test_cmp_less_than, test/dynamo/test_sets.py::SetTests::test_cmp_less_than_or_equal, test/dynamo/test_sets.py::SetTests::test_cmp_ne, test/dynamo/test_sets.py::SetTests::test_constructor_iterable, test/dynamo/test_sets.py::SetTests::test_contains, test/dynamo/test_sets.py::SetTests::test_copy, test/dynamo/test_sets.py::SetTests::test_difference, test/dynamo/test_sets.py::SetTests::test_difference_update, test/dynamo/test_sets.py::SetTests::test_discard, test/dynamo/test_sets.py::SetTests::test_equality, test/dynamo/test_sets.py::SetTests::test_in_frozenset, test/dynamo/test_sets.py::SetTests::test_intersection, test/dynamo/test_sets.py::SetTests::test_intersection_update, test/dynamo/test_sets.py::SetTests::test_isdisjoint, test/dynamo/test_sets.py::SetTests::test_issubset, test/dynamo/test_sets.py::SetTests::test_issuperset, test/dynamo/test_sets.py::SetTests::test_pop, test/dynamo/test_sets.py::SetTests::test_remove, test/dynamo/test_sets.py::SetTests::test_symmetric_difference, test/dynamo/test_sets.py::SetTests::test_symmetric_difference_update, 
test/dynamo/test_sets.py::SetTests::test_to_frozenset, test/dynamo/test_sets.py::SetTests::test_to_set, test/dynamo/test_sets.py::SetTests::test_union, test/dynamo/test_sets.py::SetTests::test_update, test/dynamo/test_sets.py::UserDefinedSetTests::test_add, test/dynamo/test_sets.py::UserDefinedSetTests::test_binop_and, test/dynamo/test_sets.py::UserDefinedSetTests::test_binop_or, test/dynamo/test_sets.py::UserDefinedSetTests::test_binop_sub, test/dynamo/test_sets.py::UserDefinedSetTests::test_binop_xor, test/dynamo/test_sets.py::UserDefinedSetTests::test_clear, test/dynamo/test_sets.py::UserDefinedSetTests::test_cmp_eq, test/dynamo/test_sets.py::UserDefinedSetTests::test_cmp_greater_than, test/dynamo/test_sets.py::UserDefinedSetTests::test_cmp_greater_than_or_equal, test/dynamo/test_sets.py::UserDefinedSetTests::test_cmp_less_than, test/dynamo/test_sets.py::UserDefinedSetTests::test_cmp_less_than_or_equal, test/dynamo/test_sets.py::UserDefinedSetTests::test_cmp_ne, test/dynamo/test_sets.py::UserDefinedSetTests::test_constructor_iterable, test/dynamo/test_sets.py::UserDefinedSetTests::test_contains, test/dynamo/test_sets.py::UserDefinedSetTests::test_copy, test/dynamo/test_sets.py::UserDefinedSetTests::test_difference, test/dynamo/test_sets.py::UserDefinedSetTests::test_difference_update, test/dynamo/test_sets.py::UserDefinedSetTests::test_discard, test/dynamo/test_sets.py::UserDefinedSetTests::test_equality, test/dynamo/test_sets.py::UserDefinedSetTests::test_in_frozenset, test/dynamo/test_sets.py::UserDefinedSetTests::test_intersection, test/dynamo/test_sets.py::UserDefinedSetTests::test_intersection_update, test/dynamo/test_sets.py::UserDefinedSetTests::test_isdisjoint, test/dynamo/test_sets.py::UserDefinedSetTests::test_issubset, test/dynamo/test_sets.py::UserDefinedSetTests::test_issuperset, test/dynamo/test_sets.py::UserDefinedSetTests::test_pop, test/dynamo/test_sets.py::UserDefinedSetTests::test_remove, 
test/dynamo/test_sets.py::UserDefinedSetTests::test_symmetric_difference, test/dynamo/test_sets.py::UserDefinedSetTests::test_symmetric_difference_update, test/dynamo/test_sets.py::UserDefinedSetTests::test_to_frozenset, test/dynamo/test_sets.py::UserDefinedSetTests::test_to_set, test/dynamo/test_sets.py::UserDefinedSetTests::test_union, test/dynamo/test_sets.py::UserDefinedSetTests::test_update, test/dynamo/test_sets.py::UserDefinedFrozensetTests::test_binop_and, test/dynamo/test_sets.py::UserDefinedFrozensetTests::test_binop_or, test/dynamo/test_sets.py::UserDefinedFrozensetTests::test_binop_sub, test/dynamo/test_sets.py::UserDefinedFrozensetTests::test_binop_xor, test/dynamo/test_sets.py::UserDefinedFrozensetTests::test_cmp_eq, test/dynamo/test_sets.py::UserDefinedFrozensetTests::test_cmp_greater_than, test/dynamo/test_sets.py::UserDefinedFrozensetTests::test_cmp_greater_than_or_equal, test/dynamo/test_sets.py::UserDefinedFrozensetTests::test_cmp_less_than, test/dynamo/test_sets.py::UserDefinedFrozensetTests::test_cmp_less_than_or_equal, test/dynamo/test_sets.py::UserDefinedFrozensetTests::test_cmp_ne, test/dynamo/test_sets.py::UserDefinedFrozensetTests::test_constructor_iterable, test/dynamo/test_sets.py::UserDefinedFrozensetTests::test_contains, test/dynamo/test_sets.py::UserDefinedFrozensetTests::test_copy, test/dynamo/test_sets.py::UserDefinedFrozensetTests::test_difference, test/dynamo/test_sets.py::UserDefinedFrozensetTests::test_equality, test/dynamo/test_sets.py::UserDefinedFrozensetTests::test_in_frozenset, test/dynamo/test_sets.py::UserDefinedFrozensetTests::test_intersection, test/dynamo/test_sets.py::UserDefinedFrozensetTests::test_isdisjoint, test/dynamo/test_sets.py::UserDefinedFrozensetTests::test_issubset, test/dynamo/test_sets.py::UserDefinedFrozensetTests::test_issuperset, test/dynamo/test_sets.py::UserDefinedFrozensetTests::test_symmetric_difference, test/dynamo/test_sets.py::UserDefinedFrozensetTests::test_to_frozenset, 
test/dynamo/test_sets.py::UserDefinedFrozensetTests::test_to_set, test/dynamo/test_sets.py::UserDefinedFrozensetTests::test_union 2025-10-10T02:02:36.9927839Z 2025-10-10T02:02:38.9306913Z 2025-10-10T02:02:38.9308197Z dynamo/test_unspec 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_unspec_1.1_36fcfd22ed8d330c_.log 2025-10-10T02:02:38.9330241Z Running 51 items in this shard: test/dynamo/test_unspec.py::UnspecTests::test_argmin_coerces_symint_to_intlist_spec, test/dynamo/test_unspec.py::UnspecTests::test_bool_tensor_ctor, test/dynamo/test_unspec.py::UnspecTests::test_builtin_getitem, test/dynamo/test_unspec.py::UnspecTests::test_builtin_max_min, test/dynamo/test_unspec.py::UnspecTests::test_compiled_random_calls_are_random, test/dynamo/test_unspec.py::UnspecTests::test_conv1d_symint_padding, test/dynamo/test_unspec.py::UnspecTests::test_data_dependent_evaluate_expr_graph_break, test/dynamo/test_unspec.py::UnspecTests::test_defaults, test/dynamo/test_unspec.py::UnspecTests::test_exponential, test/dynamo/test_unspec.py::UnspecTests::test_feed_random_values_into_graph_only, test/dynamo/test_unspec.py::UnspecTests::test_isinstance_symint, test/dynamo/test_unspec.py::UnspecTests::test_item_max, test/dynamo/test_unspec.py::UnspecTests::test_mark_01_dynamic, test/dynamo/test_unspec.py::UnspecTests::test_mark_static_inside, test/dynamo/test_unspec.py::UnspecTests::test_mark_unbacked, test/dynamo/test_unspec.py::UnspecTests::test_mark_unbacked_channels_last, test/dynamo/test_unspec.py::UnspecTests::test_mark_unbacked_hint_consistency, test/dynamo/test_unspec.py::UnspecTests::test_multiple_consecutive_random_calls_before_graph, test/dynamo/test_unspec.py::UnspecTests::test_no_recompilations, test/dynamo/test_unspec.py::UnspecTests::test_no_recompilations_with_efficient_attention, test/dynamo/test_unspec.py::UnspecTests::test_no_recompiles, test/dynamo/test_unspec.py::UnspecTests::test_no_recompiles_prod_backward, 
test/dynamo/test_unspec.py::UnspecTests::test_numpy_correctness, test/dynamo/test_unspec.py::UnspecTests::test_propagate_dynamic_dim, test/dynamo/test_unspec.py::UnspecTests::test_prune_torch_check, test/dynamo/test_unspec.py::UnspecTests::test_random_call_with_while_loop, test/dynamo/test_unspec.py::UnspecTests::test_random_object, test/dynamo/test_unspec.py::UnspecTests::test_random_object_methods, test/dynamo/test_unspec.py::UnspecTests::test_random_object_overridden_methods, test/dynamo/test_unspec.py::UnspecTests::test_random_values_with_graph_break, test/dynamo/test_unspec.py::UnspecTests::test_rshift_dynamic, test/dynamo/test_unspec.py::UnspecTests::test_shape_graph_break, test/dynamo/test_unspec.py::UnspecTests::test_specializing_numpy_float_in_control_flow, test/dynamo/test_unspec.py::UnspecTests::test_split_aot_autograd, test/dynamo/test_unspec.py::UnspecTests::test_sum_dimlist_spec, test/dynamo/test_unspec.py::UnspecTests::test_sym_int_conversion, test/dynamo/test_unspec.py::UnspecTests::test_symbol_guard_limit_before_specialize, test/dynamo/test_unspec.py::UnspecTests::test_symfloat_no_replacement, test/dynamo/test_unspec.py::UnspecTests::test_symfloat_to_tensor, test/dynamo/test_unspec.py::UnspecTests::test_tensorfiy_python_scalars_1, test/dynamo/test_unspec.py::UnspecTests::test_tensorfiy_python_scalars_2, test/dynamo/test_unspec.py::UnspecTests::test_tensorfiy_python_scalars_3, test/dynamo/test_unspec.py::UnspecTests::test_to_tensor, test/dynamo/test_unspec.py::UnspecTests::test_unspec_float_input, test/dynamo/test_unspec.py::UnspecTests::test_unspec_float_input_f64, test/dynamo/test_unspec.py::UnspecTests::test_unspec_float_output, test/dynamo/test_unspec.py::UnspecTests::test_unspec_float_precision, test/dynamo/test_unspec.py::UnspecTests::test_unspec_roundtrip_float_input, test/dynamo/test_unspec.py::UnspecTests::test_unspecialized_float_multiply_precision, test/dynamo/test_unspec.py::UnspecTests::test_use_and_specialize, 
test/dynamo/test_unspec.py::UnspecTestsDeviceCUDA::test_builtin_functions_on_device_cuda 2025-10-10T02:02:38.9352580Z 2025-10-10T02:02:40.8955931Z Running dynamo/test_python_dispatcher 1/1 ... [2025-10-10 02:02:40.895024] 2025-10-10T02:02:40.8956570Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:02:40.8958238Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_python_dispatcher.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:02:40.895404] 2025-10-10T02:02:42.9145702Z Running dynamo/test_optimizers 1/1 ... [2025-10-10 02:02:42.913997] 2025-10-10T02:02:42.9146291Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:02:42.9148215Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_optimizers.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:02:42.914424] 2025-10-10T02:02:44.9683489Z 2025-10-10T02:02:44.9684785Z dynamo/test_python_dispatcher 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_python_dispatcher_1.1_31d628dd1efd6e18_.log 2025-10-10T02:02:44.9688895Z Running 6 items in this shard: test/dynamo/test_python_dispatcher.py::PythonDispatcherTests::test_dispatch_key1, test/dynamo/test_python_dispatcher.py::PythonDispatcherTests::test_dispatch_key2, test/dynamo/test_python_dispatcher.py::PythonDispatcherTests::test_dispatch_key3, test/dynamo/test_python_dispatcher.py::PythonDispatcherTests::test_dispatch_key4, test/dynamo/test_python_dispatcher.py::PythonDispatcherTests::test_dispatch_key_set_guard, test/dynamo/test_python_dispatcher.py::PythonDispatcherTests::test_functorch_interpreter 2025-10-10T02:02:44.9692166Z 2025-10-10T02:02:47.0877064Z 2025-10-10T02:02:47.0877855Z dynamo/test_optimizers 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_optimizers_1.1_5a96456b2482bde0_.log 2025-10-10T02:02:47.0879643Z Running 3 items in this shard: test/dynamo/test_optimizers.py::End2EndTests::test_init_group, test/dynamo/test_optimizers.py::End2EndTests::test_optimizing_over_tensor_with_requires_grad, test/dynamo/test_optimizers.py::End2EndTests::test_state_dict 2025-10-10T02:02:47.0880672Z 2025-10-10T02:02:48.9188364Z Running dynamo/test_flat_apply 1/1 ... [2025-10-10 02:02:48.918304] 2025-10-10T02:02:48.9189003Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:02:48.9195636Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_flat_apply.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:02:48.918707] 2025-10-10T02:02:50.9268993Z Running dynamo/test_higher_order_ops 1/1 ... 
[2025-10-10 02:02:50.926297] 2025-10-10T02:02:50.9270248Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:02:50.9272889Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_higher_order_ops.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:02:50.926716] 2025-10-10T02:02:53.0420797Z 2025-10-10T02:02:53.0421949Z dynamo/test_flat_apply 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_flat_apply_1.1_e475d81eda6118a4_.log 2025-10-10T02:02:53.0425072Z Running 4 items in this shard: test/dynamo/test_flat_apply.py::FlatApplyTests::test_non_tensor_output, test/dynamo/test_flat_apply.py::FlatApplyTests::test_nonstrict_trace_captured_tensor_post_aot_graph, test/dynamo/test_flat_apply.py::FlatApplyTests::test_nonstrict_trace_dynamo_graph, test/dynamo/test_flat_apply.py::FlatApplyTests::test_simple 2025-10-10T02:02:53.0427297Z 2025-10-10T02:02:56.9861257Z Running export/test_nativert 1/1 ... [2025-10-10 02:02:56.985564] 2025-10-10T02:02:56.9861786Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:02:56.9863772Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_nativert.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:02:56.985963] 2025-10-10T02:02:59.4092950Z 2025-10-10T02:02:59.4094034Z dynamo/test_higher_order_ops 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_higher_order_ops_1.1_0c512c3b1d5a9285_.log 2025-10-10T02:02:59.4174794Z Running 229 items in this shard: test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_access_module_attr, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_allow_python_side_effects_utility, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_capture_constants, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_capture_global_num, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_capture_global_num_adds_guard, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_capture_input_num, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_capture_numpy_number, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_capture_tracked, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_capture_tracked_nested, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_capture_untracked_global, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_capture_untracked_global_nested, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_capture_untracked_nonlocal, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_capture_value_created_in_subgraph, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_concat_unbacked_shape_tensor, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_cond_branches_no_arguments, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_cond_branches_no_arguments_no_closure, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_cond_free_variable_in_both_branches, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_cond_graph_break_in_one_branch, 
test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_cond_pytree_operands, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_cond_pytree_operands_with_non_tensor_leaves, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_cond_side_effect_in_one_branches, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_cond_source_fn_stack, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_cond_subgraph_name_is_valid, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_cond_with_constant_pred, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_cond_with_empty_operands, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_dynamic_shapes_over_vmap_batch_size, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_enum_arg, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_error_message_sane, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_fallback_on_graph_break_complicated, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_fallback_on_graph_break_simple, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_flat_list_output, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_fn_with_kwargs_in_torch_ops, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_freevars_as_inputs_to_wrap, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_grad_source_fn_stack, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_hints_wrapper, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_hints_wrapper_incorrect_type, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_hints_wrapper_no_hints, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_hints_wrapper_pytree_inputs, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_hooks, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_hopify_generic_wrap, 
test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_inlined_functions, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_internal_nonlocal, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_lift_tensor_constant, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_lift_tensors_with_compound_expressions, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_lift_tensors_with_shared_symbols, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_make_closure, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_map_example_value_metadata_consistent_with_eager, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_map_graph_break, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_map_kwargs, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_map_lowers_to_graph, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_map_multi_return, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_map_pytree_return, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_map_side_effect, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_map_source_fn_stack, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_map_subgraph_name_is_valid, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_map_symint_input, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_modules, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_nested_tuple_output, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_nested_wrap, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_no_freevars, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_output_with_dict, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_register_mode, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_register_subclass, 
test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_return_captured_var, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_return_captured_var_used_multiple_times, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_return_captured_vars, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_same_freevar_twice, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_side_effect_del_existing_attr_global_module, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_side_effect_del_existing_attr_global_obj, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_side_effect_del_existing_attr_nonlocal_module, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_side_effect_del_existing_attr_nonlocal_obj, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_side_effect_in_body, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_side_effect_local_list_append_no_graph_break, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_side_effect_mutate_global_list, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_side_effect_mutate_global_num, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_side_effect_mutate_global_num_builtin, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_side_effect_mutate_global_tensor, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_side_effect_mutate_global_tensor_builtin, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_side_effect_mutate_nonlocal_num, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_side_effect_mutate_nonlocal_num_builtin, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_side_effect_mutate_nonlocal_tensor, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_side_effect_mutate_nonlocal_tensor_builtin, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_side_effect_nested_nonlocal_list_append_graph_break, 
test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_side_effect_nonlocal_list_append_graph_break, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_side_effect_set_existing_attr_global_module, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_side_effect_set_existing_attr_global_obj, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_side_effect_set_existing_attr_nonlocal_module, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_side_effect_set_existing_attr_nonlocal_obj, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_side_effect_set_new_attr_global_module, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_side_effect_set_new_attr_global_obj, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_side_effect_set_new_attr_nonlocal_module, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_side_effect_set_new_attr_nonlocal_obj, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_support_float_in_output, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_symint_in_slice, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_symint_input, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_tensor_and_unbacked_symbol_closure, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_tensor_to_list_closure, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_tensor_with_unbacked_shape_closure, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_unbacked_symbol_closure, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_vmap_multiply_scalar, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_vmap_source_fn_stack, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_wrap_all_kwarg, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_wrap_allow_local_assign_in_body_fn, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_wrap_kwarg, 
test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_wrap_kwarg_default, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_wrap_kwarg_default_else_branch, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_wrap_kwarg_default_if_branch, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_wrap_kwarg_int, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_wrap_kwarg_only, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_wrap_kwarg_recompile, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_wrap_pytree_args_nested, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_wrap_pytree_args_not_const_symint_tensor, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_wrap_pytree_args_with_symint_constant, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_wrap_pytree_kwargs, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_wrap_source_fn_stack, test/dynamo/test_higher_order_ops.py::HigherOrderOpTests::test_wrap_subgraph_name_is_valid, test/dynamo/test_higher_order_ops.py::HigherOrderOpVmapGuardTests::test_dual_level_guard, test/dynamo/test_higher_order_ops.py::HigherOrderOpVmapGuardTests::test_emit_functorch_guard_if_active, test/dynamo/test_higher_order_ops.py::HigherOrderOpVmapGuardTests::test_grad_guard_fail, test/dynamo/test_higher_order_ops.py::HigherOrderOpVmapGuardTests::test_jvp_guard_fail, test/dynamo/test_higher_order_ops.py::HigherOrderOpVmapGuardTests::test_linearize_recompiles, test/dynamo/test_higher_order_ops.py::HigherOrderOpVmapGuardTests::test_vmap_grad_guard_ok, test/dynamo/test_higher_order_ops.py::HigherOrderOpVmapGuardTests::test_vmap_grad_vmap_guard_fail, test/dynamo/test_higher_order_ops.py::HigherOrderOpVmapGuardTests::test_vmap_guard_fail, test/dynamo/test_higher_order_ops.py::HigherOrderOpVmapGuardTests::test_vmap_guard_fail_different_state, 
test/dynamo/test_higher_order_ops.py::HigherOrderOpVmapGuardTests::test_vmap_guard_ok, test/dynamo/test_higher_order_ops.py::HigherOrderOpVmapGuardTests::test_vmap_recompile_different_states, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_functional_call, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_functional_call_disable_inline_nn_module, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_functional_call_sequential_params_and_buffers, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_grad, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_grad_call_compiled_backward_fn, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_grad_call_torch_compile_fn, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_grad_capture_tensor, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_grad_closure_scalar, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_grad_fn_with_kwargs, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_grad_freevar_python_scalar, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_grad_freevar_tensor, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_grad_has_aux, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_grad_non_tensor_input, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_grad_over_grad, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_grad_pytree, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_grad_recompile, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_grad_two_tensor_all_grad_has_aux, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_grad_two_tensor_has_aux, 
test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_grad_with_graph_break, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_grad_with_side_effect, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_hessian, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_hessian_argnums, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_jacfwd, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_jacfwd_has_aux, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_jacfwd_randomness, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_jacfwd_two_tensors_argnums, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_jacrev, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_jacrev_has_aux, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_jacrev_two_tensors_argnums, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_jvp_call_torch_compile_fn, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_jvp_freevar_python_scalar, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_jvp_freevar_tensor, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_jvp_has_aux, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_jvp_jvp, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_jvp_simple, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_jvp_two_tensors_disable_enable_disable_grad, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_jvp_two_tensors_disable_grad, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_jvp_two_tensors_has_aux, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_linearize_jvp_fn, 
test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_vjp, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_vjp_call_compiled_backward_fn, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_vjp_has_aux, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_vjp_multiple_outputs, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_vjp_multiple_outputs_python_struct, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_vmap, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_vmap_call_compiled_backward_fn, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_vmap_call_torch_compile_fn, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_vmap_free_const, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_vmap_free_tensor, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_vmap_get_wrapped, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_vmap_kwargs, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_vmap_multiple_invocation_in_dims, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_vmap_multiple_invocation_out_dims, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_vmap_multiple_outputs, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_vmap_multiple_outputs_diff_dims, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_vmap_multiple_outputs_out_dims_tuple, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_vmap_new_tensor_implicit_via_op, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_vmap_new_tensor_in_body, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_vmap_new_tensor_unused_in_body, 
test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_vmap_out_dims_None, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_vmap_over_vmap_captured, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_vmap_over_vmap_two_inputs, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_vmap_previous_illegal_op_no_graph_break, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_vmap_pytree_inputs, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_vmap_recompile, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_vmap_recompile_different_config, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_vmap_recompile_same_config, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_vmap_recompile_with_randomness, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_vmap_side_effects, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_vmap_side_effects_append_input, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_vmap_two_inputs, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_vmap_two_inputs_tuple_in_dims, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_vmap_with_conditional_graph_break, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_vmap_with_graph_break, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_vmap_with_graph_break_2, test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_vmap_with_graph_break_lambda, test/dynamo/test_higher_order_ops.py::ActivationCheckpointingTests::test_cond_with_invalid_kwargs, test/dynamo/test_higher_order_ops.py::ActivationCheckpointingTests::test_cond_with_kwargs, test/dynamo/test_higher_order_ops.py::ActivationCheckpointingTests::test_cond_with_mismatched_output, 
test/dynamo/test_higher_order_ops.py::ActivationCheckpointingTests::test_dropout, test/dynamo/test_higher_order_ops.py::ActivationCheckpointingTests::test_dropout_inductor, test/dynamo/test_higher_order_ops.py::ActivationCheckpointingTests::test_fallback, test/dynamo/test_higher_order_ops.py::ActivationCheckpointingTests::test_flop_counter_for_cond, test/dynamo/test_higher_order_ops.py::ActivationCheckpointingTests::test_flop_counter_for_cond_unbalanced_branches, test/dynamo/test_higher_order_ops.py::ActivationCheckpointingTests::test_flop_counter_for_nested_cond, test/dynamo/test_higher_order_ops.py::ActivationCheckpointingTests::test_function, test/dynamo/test_higher_order_ops.py::ActivationCheckpointingTests::test_function_with_kwargs, test/dynamo/test_higher_order_ops.py::ActivationCheckpointingTests::test_module, test/dynamo/test_higher_order_ops.py::ActivationCheckpointingTests::test_non_aliasing_util, test/dynamo/test_higher_order_ops.py::ActivationCheckpointingTests::test_override_fallthrough_dispatch_key, test/dynamo/test_higher_order_ops.py::TestHigherOrderOpsOpInfoCUDA::test_hops_compile_backend_aot_eager_auto_functionalize_simple_cuda_float32, test/dynamo/test_higher_order_ops.py::TestHigherOrderOpsOpInfoCUDA::test_hops_compile_backend_aot_eager_cond_simple_cuda_float32, test/dynamo/test_higher_order_ops.py::TestHigherOrderOpsOpInfoCUDA::test_hops_compile_backend_aot_eager_invoke_quant_packed_simple_cuda_float32, test/dynamo/test_higher_order_ops.py::TestHigherOrderOpsOpInfoCUDA::test_hops_compile_backend_aot_eager_invoke_quant_simple_cuda_float32, test/dynamo/test_higher_order_ops.py::TestHigherOrderOpsOpInfoCUDA::test_hops_compile_backend_aot_eager_invoke_subgraph_simple_cuda_float32, test/dynamo/test_higher_order_ops.py::TestHigherOrderOpsOpInfoCUDA::test_hops_compile_backend_aot_eager_while_loop_stack_output_simple_cuda_float32, 
test/dynamo/test_higher_order_ops.py::TestHigherOrderOpsOpInfoCUDA::test_hops_compile_backend_inductor_auto_functionalize_simple_cuda_float32, test/dynamo/test_higher_order_ops.py::TestHigherOrderOpsOpInfoCUDA::test_hops_compile_backend_inductor_cond_simple_cuda_float32, test/dynamo/test_higher_order_ops.py::TestHigherOrderOpsOpInfoCUDA::test_hops_compile_backend_inductor_invoke_quant_packed_simple_cuda_float32, test/dynamo/test_higher_order_ops.py::TestHigherOrderOpsOpInfoCUDA::test_hops_compile_backend_inductor_invoke_quant_simple_cuda_float32, test/dynamo/test_higher_order_ops.py::TestHigherOrderOpsOpInfoCUDA::test_hops_compile_backend_inductor_invoke_subgraph_simple_cuda_float32, test/dynamo/test_higher_order_ops.py::TestHigherOrderOpsOpInfoCUDA::test_hops_compile_backend_inductor_while_loop_stack_output_simple_cuda_float32 2025-10-10T02:02:59.4251194Z 2025-10-10T02:03:03.3200261Z Running inductor/test_cpu_repro 1/1 ... [2025-10-10 02:03:03.319330] 2025-10-10T02:03:03.3200811Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:03:03.3202953Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_cpu_repro.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:03:03.319753] 2025-10-10T02:03:04.5657019Z 2025-10-10T02:03:04.5658230Z export/test_nativert 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_nativert_1.1_ea78fa49e0fcb5eb_.log 2025-10-10T02:03:04.5660148Z Running 6 items in this shard: test/export/test_nativert.py::TestNativeRT::test_aoti_0_cpu, test/export/test_nativert.py::TestNativeRT::test_aoti_1_cpu, test/export/test_nativert.py::TestNativeRT::test_aoti_2_cpu, test/export/test_nativert.py::TestNativeRT::test_aoti_3_cuda, test/export/test_nativert.py::TestNativeRT::test_aoti_4_cuda, test/export/test_nativert.py::TestNativeRT::test_aoti_5_cuda 2025-10-10T02:03:04.5661538Z 2025-10-10T02:03:08.5363130Z Running dynamo/test_graph_deduplication 1/1 ... [2025-10-10 02:03:08.535798] 2025-10-10T02:03:08.5363756Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:03:08.5366935Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_graph_deduplication.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:03:08.536268] 2025-10-10T02:03:12.7101368Z 2025-10-10T02:03:12.7102604Z dynamo/test_graph_deduplication 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_graph_deduplication_1.1_fc2a050026445f44_.log 2025-10-10T02:03:12.7112984Z Running 18 items in this shard: test/dynamo/test_graph_deduplication.py::GraphDededuplicationTests::test_autocast_ordering, test/dynamo/test_graph_deduplication.py::GraphDededuplicationTests::test_cycle_detection_arg_and_additional_deps, test/dynamo/test_graph_deduplication.py::GraphDededuplicationTests::test_cycle_detection_complex, test/dynamo/test_graph_deduplication.py::GraphDededuplicationTests::test_cycle_detection_no_cycle, test/dynamo/test_graph_deduplication.py::GraphDededuplicationTests::test_cycle_detection_simple, test/dynamo/test_graph_deduplication.py::GraphDededuplicationTests::test_cycle_detection_single_node, test/dynamo/test_graph_deduplication.py::GraphDededuplicationTests::test_cycle_detection_two_node, test/dynamo/test_graph_deduplication.py::GraphDededuplicationTests::test_dependent_subgraphs, test/dynamo/test_graph_deduplication.py::GraphDededuplicationTests::test_input_aliasing, test/dynamo/test_graph_deduplication.py::GraphDededuplicationTests::test_input_mutation, test/dynamo/test_graph_deduplication.py::GraphDededuplicationTests::test_multiple_subgraphs, test/dynamo/test_graph_deduplication.py::GraphDededuplicationTests::test_mutation_ordering, test/dynamo/test_graph_deduplication.py::GraphDededuplicationTests::test_output_nodes_last, test/dynamo/test_graph_deduplication.py::GraphDededuplicationTests::test_param_transfer_to_submodule, test/dynamo/test_graph_deduplication.py::GraphDededuplicationTests::test_single_subgraph, test/dynamo/test_graph_deduplication.py::GraphDededuplicationTests::test_single_subgraph2, test/dynamo/test_graph_deduplication.py::GraphDededuplicationTests::test_tuple_inputs, 
test/dynamo/test_graph_deduplication.py::GraphDededuplicationTests::test_tuple_return 2025-10-10T02:03:12.7122392Z 2025-10-10T02:03:16.5816552Z Running dynamo/test_export 1/1 ... [2025-10-10 02:03:16.581128] 2025-10-10T02:03:16.5817137Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:03:16.5819575Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_export.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:03:16.581513] 2025-10-10T02:03:21.1062621Z 2025-10-10T02:03:21.1063786Z dynamo/test_export 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_export_1.1_015a8a0a286ca800_.log 2025-10-10T02:03:21.1116449Z Running 187 items in this shard: test/dynamo/test_export.py::ExportTests::test_access_class_method_from_user_class_attr, test/dynamo/test_export.py::ExportTests::test_access_class_method_from_user_class_builtin, test/dynamo/test_export.py::ExportTests::test_byte_tensor_does_not_crash, test/dynamo/test_export.py::ExportTests::test_capture_symbolic_tracing_simple_within_fake_mode, test/dynamo/test_export.py::ExportTests::test_capture_symbolic_tracing_within_fake_mode, test/dynamo/test_export.py::ExportTests::test_cond_free_variables_overlapping, test/dynamo/test_export.py::ExportTests::test_cond_op_param_buffer_lifted, test/dynamo/test_export.py::ExportTests::test_cond_raise_user_error_on_branch_args_mismatch, test/dynamo/test_export.py::ExportTests::test_cond_raise_user_error_on_branch_return_multiple_tensors, test/dynamo/test_export.py::ExportTests::test_cond_raise_user_error_on_branch_return_non_tensor, test/dynamo/test_export.py::ExportTests::test_cond_raise_user_error_on_mismatch_return_length, test/dynamo/test_export.py::ExportTests::test_cond_raise_user_error_on_mismatch_return_tensor_meta, 
test/dynamo/test_export.py::ExportTests::test_cond_raise_user_error_on_missing_args, test/dynamo/test_export.py::ExportTests::test_cond_raise_user_error_on_non_list_operands, test/dynamo/test_export.py::ExportTests::test_cond_raise_user_error_on_non_tensor_operands, test/dynamo/test_export.py::ExportTests::test_cond_raise_user_error_on_unsupported_pred, test/dynamo/test_export.py::ExportTests::test_cond_supported_pred_types, test/dynamo/test_export.py::ExportTests::test_constraint_violation_error_messages, test/dynamo/test_export.py::ExportTests::test_dataclass_input_output, test/dynamo/test_export.py::ExportTests::test_dict_return, test/dynamo/test_export.py::ExportTests::test_dict_return_with_aten_graph, test/dynamo/test_export.py::ExportTests::test_dupes, test/dynamo/test_export.py::ExportTests::test_dupes_2, test/dynamo/test_export.py::ExportTests::test_dupes_2_with_aten_graph, test/dynamo/test_export.py::ExportTests::test_dupes_and_bypass, test/dynamo/test_export.py::ExportTests::test_dupes_and_bypass_reorder_with_non_tensor_arg, test/dynamo/test_export.py::ExportTests::test_dupes_and_bypass_reorder_with_non_tensor_arg_with_aten_graph, test/dynamo/test_export.py::ExportTests::test_dupes_and_bypass_with_aten_graph, test/dynamo/test_export.py::ExportTests::test_dupes_and_bypass_with_non_tensor_arg, test/dynamo/test_export.py::ExportTests::test_dupes_and_bypass_with_non_tensor_arg_with_aten_graph, test/dynamo/test_export.py::ExportTests::test_dupes_and_bypass_with_non_tensor_output, test/dynamo/test_export.py::ExportTests::test_dupes_and_bypass_with_non_tensor_output_with_aten_graph, test/dynamo/test_export.py::ExportTests::test_dupes_with_aten_graph, test/dynamo/test_export.py::ExportTests::test_dynamic_slicing, test/dynamo/test_export.py::ExportTests::test_dynamic_slicing_invalid, test/dynamo/test_export.py::ExportTests::test_dynamic_slicing_simple, test/dynamo/test_export.py::ExportTests::test_dynamo_enum_in_tuple, 
test/dynamo/test_export.py::ExportTests::test_dynamo_list_index, test/dynamo/test_export.py::ExportTests::test_empty, test/dynamo/test_export.py::ExportTests::test_enforce_equalities, test/dynamo/test_export.py::ExportTests::test_export, test/dynamo/test_export.py::ExportTests::test_export_compare_optimize_with_make_fx, test/dynamo/test_export.py::ExportTests::test_export_cond_in_aten_symbolic, test/dynamo/test_export.py::ExportTests::test_export_control_flow_with_getattr, test/dynamo/test_export.py::ExportTests::test_export_decomp, test/dynamo/test_export.py::ExportTests::test_export_decomp_asserts_bad_args, test/dynamo/test_export.py::ExportTests::test_export_defaults_ok, test/dynamo/test_export.py::ExportTests::test_export_dynamic_control_flow_error, test/dynamo/test_export.py::ExportTests::test_export_dynamic_dim_cleanup, test/dynamo/test_export.py::ExportTests::test_export_dynamic_dim_not_1, test/dynamo/test_export.py::ExportTests::test_export_dynamic_dim_range_constraint, test/dynamo/test_export.py::ExportTests::test_export_graph_bypass, test/dynamo/test_export.py::ExportTests::test_export_graph_bypass_with_aten_graph, test/dynamo/test_export.py::ExportTests::test_export_graph_with_complex_reorder, test/dynamo/test_export.py::ExportTests::test_export_graph_with_complex_reorder_with_aten_graph, test/dynamo/test_export.py::ExportTests::test_export_graph_with_list, test/dynamo/test_export.py::ExportTests::test_export_graph_with_list_with_aten_graph, test/dynamo/test_export.py::ExportTests::test_export_identity, test/dynamo/test_export.py::ExportTests::test_export_masking_with_no_grad, test/dynamo/test_export.py::ExportTests::test_export_meta, test/dynamo/test_export.py::ExportTests::test_export_meta_val, test/dynamo/test_export.py::ExportTests::test_export_mismatched_out, test/dynamo/test_export.py::ExportTests::test_export_mismatched_out_2, test/dynamo/test_export.py::ExportTests::test_export_mismatched_out_2_with_aten_graph, 
test/dynamo/test_export.py::ExportTests::test_export_mismatched_out_with_aten_graph, test/dynamo/test_export.py::ExportTests::test_export_module_specify_constraints_signature, test/dynamo/test_export.py::ExportTests::test_export_multi_dynamic_dim_constraint, test/dynamo/test_export.py::ExportTests::test_export_multi_dynamic_dim_unsafe_relationship, test/dynamo/test_export.py::ExportTests::test_export_nn_module_stack_patched_module, test/dynamo/test_export.py::ExportTests::test_export_no_raise, test/dynamo/test_export.py::ExportTests::test_export_no_tensor_computation_with_aten_graph, test/dynamo/test_export.py::ExportTests::test_export_pass_arg_by_name, test/dynamo/test_export.py::ExportTests::test_export_pass_arg_by_name_star_args, test/dynamo/test_export.py::ExportTests::test_export_persist_assert, test/dynamo/test_export.py::ExportTests::test_export_preserve_constraints_as_metadata_tensor, test/dynamo/test_export.py::ExportTests::test_export_preserves_nn_module_stack_for_get_attr, test/dynamo/test_export.py::ExportTests::test_export_raise_guard_full_constraint, test/dynamo/test_export.py::ExportTests::test_export_raise_guard_partial_constraint, test/dynamo/test_export.py::ExportTests::test_export_raise_on_relationship, test/dynamo/test_export.py::ExportTests::test_export_shape_control_flow_1, test/dynamo/test_export.py::ExportTests::test_export_specialized_int, test/dynamo/test_export.py::ExportTests::test_export_symbolic_shape, test/dynamo/test_export.py::ExportTests::test_export_with_args_and_empty_kwargs, test/dynamo/test_export.py::ExportTests::test_export_with_args_with_default_None, test/dynamo/test_export.py::ExportTests::test_export_with_args_with_default_float, test/dynamo/test_export.py::ExportTests::test_export_with_args_with_default_tensor, test/dynamo/test_export.py::ExportTests::test_export_with_args_with_default_tuple, test/dynamo/test_export.py::ExportTests::test_export_with_aten_graph, 
test/dynamo/test_export.py::ExportTests::test_export_with_builtin_op_on_assume_constant, test/dynamo/test_export.py::ExportTests::test_export_with_cond_branches_calling_methods, test/dynamo/test_export.py::ExportTests::test_export_with_cond_closure, test/dynamo/test_export.py::ExportTests::test_export_with_cond_dynamic_shape_pred, test/dynamo/test_export.py::ExportTests::test_export_with_cond_with_closed_function, test/dynamo/test_export.py::ExportTests::test_export_with_constant_dict_values, test/dynamo/test_export.py::ExportTests::test_export_with_constant_free_function, test/dynamo/test_export.py::ExportTests::test_export_with_constant_free_function_and_class_method, test/dynamo/test_export.py::ExportTests::test_export_with_constant_free_function_and_class_method_multiarg, test/dynamo/test_export.py::ExportTests::test_export_with_constant_free_function_and_class_method_multiarg_diff, test/dynamo/test_export.py::ExportTests::test_export_with_constant_global_function, test/dynamo/test_export.py::ExportTests::test_export_with_constant_in_unspecialized_nn_module, test/dynamo/test_export.py::ExportTests::test_export_with_constant_list_nonzero, test/dynamo/test_export.py::ExportTests::test_export_with_constant_list_nonzero_free_function, test/dynamo/test_export.py::ExportTests::test_export_with_constant_method_on_module, test/dynamo/test_export.py::ExportTests::test_export_with_constant_method_on_module_invoke_twice, test/dynamo/test_export.py::ExportTests::test_export_with_constant_none_control_flow, test/dynamo/test_export.py::ExportTests::test_export_with_constant_none_control_flow_free_func, test/dynamo/test_export.py::ExportTests::test_export_with_constant_not_none_control_flow, test/dynamo/test_export.py::ExportTests::test_export_with_constant_not_none_control_flow_free_func, test/dynamo/test_export.py::ExportTests::test_export_with_constant_not_none_control_flow_pos, test/dynamo/test_export.py::ExportTests::test_export_with_constant_not_return_const, 
test/dynamo/test_export.py::ExportTests::test_export_with_constant_tuple_nonzero, test/dynamo/test_export.py::ExportTests::test_export_with_functools_wrapped_fn, test/dynamo/test_export.py::ExportTests::test_export_with_functools_wrapped_method, test/dynamo/test_export.py::ExportTests::test_export_with_kwargs, test/dynamo/test_export.py::ExportTests::test_export_with_kwargs_and_empty_args, test/dynamo/test_export.py::ExportTests::test_export_with_kwargs_with_default_None, test/dynamo/test_export.py::ExportTests::test_export_with_kwargs_with_default_float, test/dynamo/test_export.py::ExportTests::test_export_with_kwargs_with_default_tensor, test/dynamo/test_export.py::ExportTests::test_export_with_kwargs_with_default_tuple, test/dynamo/test_export.py::ExportTests::test_export_with_map_cond, test/dynamo/test_export.py::ExportTests::test_export_with_map_zero_sized_tensor, test/dynamo/test_export.py::ExportTests::test_export_with_map_zero_sized_tensor_suppress_errors, test/dynamo/test_export.py::ExportTests::test_export_with_module_layer, test/dynamo/test_export.py::ExportTests::test_export_with_nonzero_static, test/dynamo/test_export.py::ExportTests::test_export_with_shallow_list_copy_with_side_effects, test/dynamo/test_export.py::ExportTests::test_export_with_shallow_list_copy_wo_side_effects, test/dynamo/test_export.py::ExportTests::test_export_with_stack_trace, test/dynamo/test_export.py::ExportTests::test_export_with_symbool_inputs, test/dynamo/test_export.py::ExportTests::test_export_with_wrapped_fn, test/dynamo/test_export.py::ExportTests::test_exported_graph_serialization, test/dynamo/test_export.py::ExportTests::test_func_return, test/dynamo/test_export.py::ExportTests::test_func_return_with_aten_graph, test/dynamo/test_export.py::ExportTests::test_fx_pytree, test/dynamo/test_export.py::ExportTests::test_immutable_list_dict, test/dynamo/test_export.py::ExportTests::test_input_container_type, test/dynamo/test_export.py::ExportTests::test_invalid_input_global, 
test/dynamo/test_export.py::ExportTests::test_invalid_input_global_multiple_access, test/dynamo/test_export.py::ExportTests::test_invalid_input_nonlocal, test/dynamo/test_export.py::ExportTests::test_invalid_input_unused_nonlocal_ok, test/dynamo/test_export.py::ExportTests::test_list_contains, test/dynamo/test_export.py::ExportTests::test_list_not_contains, test/dynamo/test_export.py::ExportTests::test_list_unpack, test/dynamo/test_export.py::ExportTests::test_list_unpack_with_aten_graph, test/dynamo/test_export.py::ExportTests::test_map_cond_param_buffer_lifted, test/dynamo/test_export.py::ExportTests::test_mixed_real_and_fake_inputs, test/dynamo/test_export.py::ExportTests::test_multiple_outputs_op_with_evaluator, test/dynamo/test_export.py::ExportTests::test_nested_cond_op_param_buffer_lifted, test/dynamo/test_export.py::ExportTests::test_no_tensor_computation, test/dynamo/test_export.py::ExportTests::test_no_tensor_computation_2, test/dynamo/test_export.py::ExportTests::test_no_tensor_computation_2_with_aten_graph, test/dynamo/test_export.py::ExportTests::test_no_tensor_computation_fail, test/dynamo/test_export.py::ExportTests::test_not_functionalize, test/dynamo/test_export.py::ExportTests::test_param_buffer_safe_from_mutation_recurse, test/dynamo/test_export.py::ExportTests::test_param_buffer_safe_from_mutation_simple, test/dynamo/test_export.py::ExportTests::test_pre_dispatch_simple, test/dynamo/test_export.py::ExportTests::test_predispatch_with_for_out_dtype, test/dynamo/test_export.py::ExportTests::test_predispatch_with_for_out_dtype_nested, test/dynamo/test_export.py::ExportTests::test_predispatch_with_higher_order, test/dynamo/test_export.py::ExportTests::test_predispatch_with_higher_order_nested, test/dynamo/test_export.py::ExportTests::test_preserve_fx_node_metadata, test/dynamo/test_export.py::ExportTests::test_preserve_fx_node_metadata_graph_break, test/dynamo/test_export.py::ExportTests::test_preserve_fx_node_metadata_inline, 
test/dynamo/test_export.py::ExportTests::test_preserve_fx_node_metadata_recompile, test/dynamo/test_export.py::ExportTests::test_remove_redundant_dynamic_dim_in_error_message, test/dynamo/test_export.py::ExportTests::test_retracibility, test/dynamo/test_export.py::ExportTests::test_retracibility_dict_container_inp_out, test/dynamo/test_export.py::ExportTests::test_retracibility_nested_list_out, test/dynamo/test_export.py::ExportTests::test_round_dynamic_shapes, test/dynamo/test_export.py::ExportTests::test_strict_fake_tensor_prop_real_tensors, test/dynamo/test_export.py::ExportTests::test_subclass_parameters, test/dynamo/test_export.py::ExportTests::test_sum_param, test/dynamo/test_export.py::ExportTests::test_sym_contains, test/dynamo/test_export.py::ExportTests::test_symbolic_tracing_within_fake_mode_with_constraints, test/dynamo/test_export.py::ExportTests::test_symbolic_tracing_within_fake_mode_with_constraints_with_parameters, test/dynamo/test_export.py::ExportTests::test_symbool, test/dynamo/test_export.py::ExportTests::test_torch_inference_mode_ctx, test/dynamo/test_export.py::ExportTests::test_trivial_constraint, test/dynamo/test_export.py::ExportTests::test_uncaptured_higher_order_op_error_not_suppresed, test/dynamo/test_export.py::ExportTests::test_untracked_inputs_in_constraints, test/dynamo/test_export.py::ExportTests::test_zeroes_in_and_out_different_shape_on_test, test/dynamo/test_export.py::ExportTests::test_zeroes_in_and_out_different_shape_on_test_with_aten_graph, test/dynamo/test_export.py::ExportTests::test_zeroes_in_new_shape_scalar_out, test/dynamo/test_export.py::ExportTests::test_zeroes_in_new_shape_scalar_out_permute, test/dynamo/test_export.py::ExportTests::test_zeroes_in_new_shape_scalar_out_permute_dupe_and_bypass, test/dynamo/test_export.py::ExportTestsDeviceCUDA::test_export_fast_binary_broadcast_check_cuda, test/dynamo/test_export.py::ExportTestsDeviceCUDA::test_export_fast_binary_broadcast_check_unbacked_cuda, 
test/dynamo/test_export.py::ExportTestsDeviceCUDA::test_export_with_parameters_cuda 2025-10-10T02:03:21.1166806Z 2025-10-10T02:03:24.9415282Z Running dynamo/test_error_messages 1/1 ... [2025-10-10 02:03:24.941012] 2025-10-10T02:03:24.9415875Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:03:24.9417777Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_error_messages.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:03:24.941410] 2025-10-10T02:03:29.1654471Z 2025-10-10T02:03:29.1655573Z dynamo/test_error_messages 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_error_messages_1.1_052d645fe43b2353_.log 2025-10-10T02:03:29.1669322Z Running 39 items in this shard: test/dynamo/test_error_messages.py::ErrorMessagesTest::test_assert_failure_in_generic_ctx_mgr, test/dynamo/test_error_messages.py::ErrorMessagesTest::test_backend_fake_tensor_exc, test/dynamo/test_error_messages.py::ErrorMessagesTest::test_class_property, test/dynamo/test_error_messages.py::ErrorMessagesTest::test_cpp_extension_recommends_custom_ops, test/dynamo/test_error_messages.py::ErrorMessagesTest::test_data_dependent_branching_fullgraph, test/dynamo/test_error_messages.py::ErrorMessagesTest::test_data_dependent_branching_gb, test/dynamo/test_error_messages.py::ErrorMessagesTest::test_data_dependent_operator2, test/dynamo/test_error_messages.py::ErrorMessagesTest::test_dict_items_input, test/dynamo/test_error_messages.py::ErrorMessagesTest::test_disable_message, test/dynamo/test_error_messages.py::ErrorMessagesTest::test_dynamic_shape_operator_no_meta_kernel, test/dynamo/test_error_messages.py::ErrorMessagesTest::test_dynamo_graph_break_fn, test/dynamo/test_error_messages.py::ErrorMessagesTest::test_dynamo_graph_break_fn_with_msg, 
test/dynamo/test_error_messages.py::ErrorMessagesTest::test_faketensor_nyi, test/dynamo/test_error_messages.py::ErrorMessagesTest::test_generic_ctx_mgr_graph_break, test/dynamo/test_error_messages.py::ErrorMessagesTest::test_graph_break_in_buggy_resume_prologue, test/dynamo/test_error_messages.py::ErrorMessagesTest::test_graph_break_traceback_above_dynamo_shows_user_code, test/dynamo/test_error_messages.py::ErrorMessagesTest::test_graph_break_traceback_collapsed_resume_frames, test/dynamo/test_error_messages.py::ErrorMessagesTest::test_internal_compiler_stacktrace_verbose, test/dynamo/test_error_messages.py::ErrorMessagesTest::test_load_build_class, test/dynamo/test_error_messages.py::ErrorMessagesTest::test_lru_cache_warning_logs_nested_call, test/dynamo/test_error_messages.py::ErrorMessagesTest::test_lru_cache_warning_logs_user_stack_trace, test/dynamo/test_error_messages.py::ErrorMessagesTest::test_nested_compile_user_frames, test/dynamo/test_error_messages.py::ErrorMessagesTest::test_no_internal_compiler_stacktrace, test/dynamo/test_error_messages.py::ErrorMessagesTest::test_observed_exception, test/dynamo/test_error_messages.py::ErrorMessagesTest::test_optree_graph_break_message, test/dynamo/test_error_messages.py::ErrorMessagesTest::test_reconstruction_failure, test/dynamo/test_error_messages.py::ErrorMessagesTest::test_reconstruction_failure_gb, test/dynamo/test_error_messages.py::ErrorMessagesTest::test_skipfile_call, test/dynamo/test_error_messages.py::ErrorMessagesTest::test_skipfile_dynamo_call, test/dynamo/test_error_messages.py::ErrorMessagesTest::test_skipfile_inline, test/dynamo/test_error_messages.py::ErrorMessagesTest::test_slice_with_tensor, test/dynamo/test_error_messages.py::ErrorMessagesTest::test_sort_with_nonconstant_keys, test/dynamo/test_error_messages.py::ErrorMessagesTest::test_super_call_function, test/dynamo/test_error_messages.py::ErrorMessagesTest::test_super_call_method, 
test/dynamo/test_error_messages.py::ErrorMessagesTest::test_uninitialized_module, test/dynamo/test_error_messages.py::ErrorMessagesTest::test_unsupported_builtin, test/dynamo/test_error_messages.py::ErrorMessagesTest::test_unsupported_bytecode, test/dynamo/test_error_messages.py::ErrorMessagesTest::test_unsupported_context, test/dynamo/test_error_messages.py::ErrorMessagesTest::test_warnings 2025-10-10T02:03:29.1681586Z 2025-10-10T02:03:33.0558685Z Running export/test_hop 1/1 ... [2025-10-10 02:03:33.055342] 2025-10-10T02:03:33.0559284Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:03:33.0561212Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_hop.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:03:33.055723] 2025-10-10T02:03:37.8813158Z 2025-10-10T02:03:37.8814051Z export/test_hop 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_hop_1.1_44ab375d262f0764_.log 2025-10-10T02:03:37.8828985Z Running 44 items in this shard: test/export/test_hop.py::TestHOPCUDA::test_aot_export_auto_functionalize_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_aot_export_cond_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_aot_export_flex_attention_backward_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_aot_export_flex_attention_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_aot_export_invoke_quant_packed_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_aot_export_invoke_quant_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_aot_export_invoke_subgraph_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_aot_export_local_map_hop_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_aot_export_scan_simple_cuda_float32, 
test/export/test_hop.py::TestHOPCUDA::test_aot_export_while_loop_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_aot_export_while_loop_stack_output_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_pre_dispatch_export_auto_functionalize_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_pre_dispatch_export_cond_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_pre_dispatch_export_flex_attention_backward_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_pre_dispatch_export_flex_attention_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_pre_dispatch_export_invoke_quant_packed_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_pre_dispatch_export_invoke_quant_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_pre_dispatch_export_invoke_subgraph_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_pre_dispatch_export_local_map_hop_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_pre_dispatch_export_scan_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_pre_dispatch_export_while_loop_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_pre_dispatch_export_while_loop_stack_output_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_retrace_export_auto_functionalize_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_retrace_export_cond_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_retrace_export_flex_attention_backward_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_retrace_export_flex_attention_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_retrace_export_invoke_quant_packed_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_retrace_export_invoke_quant_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_retrace_export_invoke_subgraph_simple_cuda_float32, 
test/export/test_hop.py::TestHOPCUDA::test_retrace_export_local_map_hop_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_retrace_export_scan_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_retrace_export_while_loop_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_retrace_export_while_loop_stack_output_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_serialize_export_auto_functionalize_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_serialize_export_cond_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_serialize_export_flex_attention_backward_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_serialize_export_flex_attention_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_serialize_export_invoke_quant_packed_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_serialize_export_invoke_quant_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_serialize_export_invoke_subgraph_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_serialize_export_local_map_hop_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_serialize_export_scan_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_serialize_export_while_loop_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_serialize_export_while_loop_stack_output_simple_cuda_float32 2025-10-10T02:03:37.8843552Z 2025-10-10T02:03:41.7749354Z Running dynamo/test_cudagraphs_expandable_segments 1/1 ... [2025-10-10 02:03:41.774293] 2025-10-10T02:03:41.7750511Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:03:41.7751725Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_cudagraphs_expandable_segments.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:03:41.774660] 2025-10-10T02:03:45.9480014Z 2025-10-10T02:03:45.9481192Z dynamo/test_cudagraphs_expandable_segments 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_cudagraphs_expandable_segments_1.1_a7bea9564038dfdc_.log 2025-10-10T02:03:45.9484969Z Running 8 items in this shard: test/dynamo/test_cudagraphs_expandable_segments.py::TestAotCudagraphs::test_basic, test/dynamo/test_cudagraphs_expandable_segments.py::TestAotCudagraphs::test_dead_fill, test/dynamo/test_cudagraphs_expandable_segments.py::TestAotCudagraphs::test_dtoh, test/dynamo/test_cudagraphs_expandable_segments.py::TestAotCudagraphs::test_factory, test/dynamo/test_cudagraphs_expandable_segments.py::TestAotCudagraphs::test_htod, test/dynamo/test_cudagraphs_expandable_segments.py::TestAotCudagraphs::test_mutate_constant, test/dynamo/test_cudagraphs_expandable_segments.py::TestAotCudagraphs::test_mutate_input, test/dynamo/test_cudagraphs_expandable_segments.py::TestAotCudagraphs::test_mutated_metadata 2025-10-10T02:03:45.9487773Z 2025-10-10T02:03:49.8212805Z Running dynamo/test_recompile_ux 1/1 ... [2025-10-10 02:03:49.820791] 2025-10-10T02:03:49.8213447Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:03:49.8215919Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_recompile_ux.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:03:49.821211] 2025-10-10T02:03:53.9955103Z 2025-10-10T02:03:53.9955982Z dynamo/test_recompile_ux 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_recompile_ux_1.1_fcd2102d267c6683_.log 2025-10-10T02:03:53.9959615Z Running 10 items in this shard: test/dynamo/test_recompile_ux.py::RecompileUxTests::test_drop_cache_on_skip, test/dynamo/test_recompile_ux.py::RecompileUxTests::test_dynamic_input, test/dynamo/test_recompile_ux.py::RecompileUxTests::test_fail_on_recompile_limit_hit, test/dynamo/test_recompile_ux.py::RecompileUxTests::test_loop_torture, test/dynamo/test_recompile_ux.py::RecompileUxTests::test_mismatched_type, test/dynamo/test_recompile_ux.py::RecompileUxTests::test_multiple_guard_fails, test/dynamo/test_recompile_ux.py::RecompileUxTests::test_multiple_guard_fails_report_all, test/dynamo/test_recompile_ux.py::RecompileUxTests::test_nvfuser_guards, test/dynamo/test_recompile_ux.py::RecompileUxTests::test_recompile_child_run_only, test/dynamo/test_recompile_ux.py::RecompileUxTests::test_verbose_tensor_check 2025-10-10T02:03:53.9962474Z 2025-10-10T02:03:57.9095695Z Running inductor/test_mmdecomp 1/1 ... [2025-10-10 02:03:57.909030] 2025-10-10T02:03:57.9096323Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:03:57.9098583Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_mmdecomp.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:03:57.909428] 2025-10-10T02:04:05.2395494Z 2025-10-10T02:04:05.2396735Z inductor/test_mmdecomp 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_mmdecomp_1.1_a1642879794c1991_.log 2025-10-10T02:04:05.2406989Z Running 28 items in this shard: test/inductor/test_mmdecomp.py::TestDecompCUDA::test_batched_mm_bfloat16_bs_10_cuda_bfloat16, test/inductor/test_mmdecomp.py::TestDecompCUDA::test_batched_mm_bfloat16_bs_1_cuda_bfloat16, test/inductor/test_mmdecomp.py::TestDecompCUDA::test_batched_mm_bfloat16_bs_2_cuda_bfloat16, test/inductor/test_mmdecomp.py::TestDecompCUDA::test_batched_mm_bfloat16_bs_4_cuda_bfloat16, test/inductor/test_mmdecomp.py::TestDecompCUDA::test_batched_mm_float32_bs_10_cuda_float32, test/inductor/test_mmdecomp.py::TestDecompCUDA::test_batched_mm_float32_bs_1_cuda_float32, test/inductor/test_mmdecomp.py::TestDecompCUDA::test_batched_mm_float32_bs_2_cuda_float32, test/inductor/test_mmdecomp.py::TestDecompCUDA::test_batched_mm_float32_bs_4_cuda_float32, test/inductor/test_mmdecomp.py::TestDecompCUDA::test_bmm_batch2_last_dim_size_is_one_cuda, test/inductor/test_mmdecomp.py::TestDecompCUDA::test_dynamic_shape_mm_bfloat16_cuda_bfloat16, test/inductor/test_mmdecomp.py::TestDecompCUDA::test_dynamic_shape_mm_float32_cuda_float32, test/inductor/test_mmdecomp.py::TestDecompCUDA::test_simple_mm_bfloat16_cuda_bfloat16, test/inductor/test_mmdecomp.py::TestDecompCUDA::test_simple_mm_float32_cuda_float32, test/inductor/test_mmdecomp.py::TestDecompCUDA::test_some_batched_bfloat16_bs_10_cuda_bfloat16, test/inductor/test_mmdecomp.py::TestDecompCUDA::test_some_batched_bfloat16_bs_1_cuda_bfloat16, test/inductor/test_mmdecomp.py::TestDecompCUDA::test_some_batched_bfloat16_bs_2_cuda_bfloat16, test/inductor/test_mmdecomp.py::TestDecompCUDA::test_some_batched_bfloat16_bs_4_cuda_bfloat16, test/inductor/test_mmdecomp.py::TestDecompCUDA::test_some_batched_float32_bs_10_cuda_float32, 
test/inductor/test_mmdecomp.py::TestDecompCUDA::test_some_batched_float32_bs_1_cuda_float32, test/inductor/test_mmdecomp.py::TestDecompCUDA::test_some_batched_float32_bs_2_cuda_float32, test/inductor/test_mmdecomp.py::TestDecompCUDA::test_some_batched_float32_bs_4_cuda_float32, test/inductor/test_mmdecomp.py::TestDecompCUDA::test_some_batched_int32_bs_10_cuda_int32, test/inductor/test_mmdecomp.py::TestDecompCUDA::test_some_batched_int32_bs_1_cuda_int32, test/inductor/test_mmdecomp.py::TestDecompCUDA::test_some_batched_int32_bs_2_cuda_int32, test/inductor/test_mmdecomp.py::TestDecompCUDA::test_some_batched_int32_bs_4_cuda_int32, test/inductor/test_mmdecomp.py::TestDecompCUDA::test_some_bfloat16_cuda_bfloat16, test/inductor/test_mmdecomp.py::TestDecompCUDA::test_some_float32_cuda_float32, test/inductor/test_mmdecomp.py::TestDecompCUDA::test_some_int32_cuda_int32 2025-10-10T02:04:05.2416203Z 2025-10-10T02:04:06.7020694Z Uploading artifacts took 1.46 seconds 2025-10-10T02:04:09.1178201Z Running dynamo/test_precompile_context 1/1 ... [2025-10-10 02:04:09.117197] 2025-10-10T02:04:09.1179054Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:04:09.1180586Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_precompile_context.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:04:09.117622] 2025-10-10T02:04:16.9021819Z 2025-10-10T02:04:16.9023024Z dynamo/test_precompile_context 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_precompile_context_1.1_aa671318c4c80024_.log 2025-10-10T02:04:16.9025564Z Running 3 items in this shard: test/dynamo/test_precompile_context.py::PrecompileContextTests::test_basic, test/dynamo/test_precompile_context.py::PrecompileContextTests::test_editable, test/dynamo/test_precompile_context.py::PrecompileContextTests::test_serialize_by_key 2025-10-10T02:04:16.9027313Z 2025-10-10T02:04:19.1203907Z 2025-10-10T02:04:19.1204869Z inductor/test_minifier_isolate 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_minifier_isolate_1.1_1234750d7788ee70_.log 2025-10-10T02:04:19.1206349Z Running 2 items in this shard: test/inductor/test_minifier_isolate.py::MinifierIsolateTests::test_after_aot_cpu_runtime_error, test/inductor/test_minifier_isolate.py::MinifierIsolateTests::test_after_aot_gpu_runtime_error 2025-10-10T02:04:19.1207282Z 2025-10-10T02:04:21.1839315Z Running dynamo/test_bytecode_utils 1/1 ... [2025-10-10 02:04:21.183386] 2025-10-10T02:04:21.1851656Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:04:21.1854712Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_bytecode_utils.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:04:21.183830] 2025-10-10T02:04:23.5158008Z Running export/test_pass_infra 1/1 ... 
[2025-10-10 02:04:23.515228] 2025-10-10T02:04:23.5158589Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:04:23.5160613Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_pass_infra.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:04:23.515596] 2025-10-10T02:04:23.8168340Z 2025-10-10T02:04:23.8169610Z inductor/test_cpu_repro 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_cpu_repro_1.1_8c0256faf5df4382_.log 2025-10-10T02:04:23.8604992Z Running 745 items in this shard: test/inductor/test_cpu_repro.py::CPUReproTests::test_ModularIndexing_range_issue_103133, test/inductor/test_cpu_repro.py::CPUReproTests::test__adaptive_avg_pool2d, test/inductor/test_cpu_repro.py::CPUReproTests::test_acosh_with_negative_large_input, test/inductor/test_cpu_repro.py::CPUReproTests::test_add_layernorm, test/inductor/test_cpu_repro.py::CPUReproTests::test_argmax_argmin_with_nan_value, test/inductor/test_cpu_repro.py::CPUReproTests::test_argmin, test/inductor/test_cpu_repro.py::CPUReproTests::test_asinh_with_corner_inputs, test/inductor/test_cpu_repro.py::CPUReproTests::test_aten_normal_dtype, test/inductor/test_cpu_repro.py::CPUReproTests::test_atomic_add_lowp_fp, test/inductor/test_cpu_repro.py::CPUReproTests::test_attention_size_mismatch, test/inductor/test_cpu_repro.py::CPUReproTests::test_auto_simd, test/inductor/test_cpu_repro.py::CPUReproTests::test_auto_zvec_vsx_simd, test/inductor/test_cpu_repro.py::CPUReproTests::test_avx2_bool_constant_pad_nd, test/inductor/test_cpu_repro.py::CPUReproTests::test_bf16_zeros, test/inductor/test_cpu_repro.py::CPUReproTests::test_bitwise_logical_op_bool, test/inductor/test_cpu_repro.py::CPUReproTests::test_bitwise_right_shift, test/inductor/test_cpu_repro.py::CPUReproTests::test_bitwise_shift_corner_inputs, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_bool_max, test/inductor/test_cpu_repro.py::CPUReproTests::test_bool_reduction_vec, test/inductor/test_cpu_repro.py::CPUReproTests::test_broadcast_mul_lowp_fp, test/inductor/test_cpu_repro.py::CPUReproTests::test_broadcast_scalar_cpp_tile_2d_kernel, test/inductor/test_cpu_repro.py::CPUReproTests::test_cat_mul, test/inductor/test_cpu_repro.py::CPUReproTests::test_channel_shuffle_cl_output, test/inductor/test_cpu_repro.py::CPUReproTests::test_channels_last_view_as_complex, test/inductor/test_cpu_repro.py::CPUReproTests::test_complex_cholesky_mh_view_fallback, test/inductor/test_cpu_repro.py::CPUReproTests::test_complex_memory_overlap, test/inductor/test_cpu_repro.py::CPUReproTests::test_concat_inner_vec, test/inductor/test_cpu_repro.py::CPUReproTests::test_consistent_remove_buffers, test/inductor/test_cpu_repro.py::CPUReproTests::test_constant_bool_vec, test/inductor/test_cpu_repro.py::CPUReproTests::test_constant_store, test/inductor/test_cpu_repro.py::CPUReproTests::test_conv1d_strided_weight_torch_compile, test/inductor/test_cpu_repro.py::CPUReproTests::test_conv2d_autocast, test/inductor/test_cpu_repro.py::CPUReproTests::test_conv2d_bn_mixed_dtype, test/inductor/test_cpu_repro.py::CPUReproTests::test_conv2d_packed, test/inductor/test_cpu_repro.py::CPUReproTests::test_conv_in_channel_1_dynamic_shapes, test/inductor/test_cpu_repro.py::CPUReproTests::test_conv_stride_constraints, test/inductor/test_cpu_repro.py::CPUReproTests::test_conv_transpose2d_has_output_size_input, test/inductor/test_cpu_repro.py::CPUReproTests::test_conv_transpose2d_packed_cpu, test/inductor/test_cpu_repro.py::CPUReproTests::test_conv_used_from_multiple_places, test/inductor/test_cpu_repro.py::CPUReproTests::test_convert_double_to_fp32_vec, test/inductor/test_cpu_repro.py::CPUReproTests::test_convert_fp32_int64_oob_vec, test/inductor/test_cpu_repro.py::CPUReproTests::test_convert_fp32_to_double_vec, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_convert_fp32_to_int64_vec, test/inductor/test_cpu_repro.py::CPUReproTests::test_convert_int32_to_int64_vec, test/inductor/test_cpu_repro.py::CPUReproTests::test_convert_int64_to_fp32_vec, test/inductor/test_cpu_repro.py::CPUReproTests::test_convert_int64_to_int32_vec, test/inductor/test_cpu_repro.py::CPUReproTests::test_convert_int8_to_half_vec, test/inductor/test_cpu_repro.py::CPUReproTests::test_cpp_kernel_profile, test/inductor/test_cpu_repro.py::CPUReproTests::test_cpu_vec_cosim, test/inductor/test_cpu_repro.py::CPUReproTests::test_decomposed_dequant_relu_quant_int8, test/inductor/test_cpu_repro.py::CPUReproTests::test_decomposed_dequant_relu_quant_uint8, test/inductor/test_cpu_repro.py::CPUReproTests::test_decomposed_fake_quant_per_channel, test/inductor/test_cpu_repro.py::CPUReproTests::test_dequant_maxpool2d_lowering_int8, test/inductor/test_cpu_repro.py::CPUReproTests::test_dequant_maxpool2d_lowering_uint8, test/inductor/test_cpu_repro.py::CPUReproTests::test_dequant_quant_lowering_fp8_e4m3, test/inductor/test_cpu_repro.py::CPUReproTests::test_dequant_quant_lowering_fp8_e5m2, test/inductor/test_cpu_repro.py::CPUReproTests::test_dequant_quant_lowering_int8, test/inductor/test_cpu_repro.py::CPUReproTests::test_dequant_quant_lowering_uint8, test/inductor/test_cpu_repro.py::CPUReproTests::test_dequant_relu_quant_dequant_relu_quant_lowering_int8, test/inductor/test_cpu_repro.py::CPUReproTests::test_dequant_relu_quant_dequant_relu_quant_lowering_uint8, test/inductor/test_cpu_repro.py::CPUReproTests::test_disabled_amp_is_inference_False, test/inductor/test_cpu_repro.py::CPUReproTests::test_disabled_amp_is_inference_True, test/inductor/test_cpu_repro.py::CPUReproTests::test_do_not_insert_to_dtype_for_memory_copy_only_kernel, test/inductor/test_cpu_repro.py::CPUReproTests::test_double_pointwise_vec, test/inductor/test_cpu_repro.py::CPUReproTests::test_double_reduction_vec, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_dropout, test/inductor/test_cpu_repro.py::CPUReproTests::test_eliminate_meaningless_copy, test/inductor/test_cpu_repro.py::CPUReproTests::test_embedding_vec, test/inductor/test_cpu_repro.py::CPUReproTests::test_embedding_vec_bf16, test/inductor/test_cpu_repro.py::CPUReproTests::test_expr_vec_non_contiguous, test/inductor/test_cpu_repro.py::CPUReproTests::test_float32_to_uint8, test/inductor/test_cpu_repro.py::CPUReproTests::test_for_loop_collapsed, test/inductor/test_cpu_repro.py::CPUReproTests::test_fp32_load_with_to_lowp_fp, test/inductor/test_cpu_repro.py::CPUReproTests::test_fp8_cast_bfloat16_shape_15,3,13, test/inductor/test_cpu_repro.py::CPUReproTests::test_fp8_cast_bfloat16_shape_4,2048,4096, test/inductor/test_cpu_repro.py::CPUReproTests::test_fp8_cast_float16_shape_15,3,13, test/inductor/test_cpu_repro.py::CPUReproTests::test_fp8_cast_float16_shape_4,2048,4096, test/inductor/test_cpu_repro.py::CPUReproTests::test_fp8_cast_float32_shape_15,3,13, test/inductor/test_cpu_repro.py::CPUReproTests::test_fp8_cast_float32_shape_4,2048,4096, test/inductor/test_cpu_repro.py::CPUReproTests::test_fractional_max_pool2d_3d_input, test/inductor/test_cpu_repro.py::CPUReproTests::test_frexp, test/inductor/test_cpu_repro.py::CPUReproTests::test_full_bits_lowp, test/inductor/test_cpu_repro.py::CPUReproTests::test_full_boolean_dynamic_shape, test/inductor/test_cpu_repro.py::CPUReproTests::test_fused_attention_conv, test/inductor/test_cpu_repro.py::CPUReproTests::test_fused_node, test/inductor/test_cpu_repro.py::CPUReproTests::test_group_norm_backward_symint_divisible_channels, test/inductor/test_cpu_repro.py::CPUReproTests::test_group_norm_large_input, test/inductor/test_cpu_repro.py::CPUReproTests::test_group_norm_large_size, test/inductor/test_cpu_repro.py::CPUReproTests::test_group_norm_vec, test/inductor/test_cpu_repro.py::CPUReproTests::test_highp_to_lowp_cse_var_cache_with_store, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_horizontal_fusion, test/inductor/test_cpu_repro.py::CPUReproTests::test_in_out_buffer, test/inductor/test_cpu_repro.py::CPUReproTests::test_index_add, test/inductor/test_cpu_repro.py::CPUReproTests::test_index_propagation_issue_102065, test/inductor/test_cpu_repro.py::CPUReproTests::test_index_put, test/inductor/test_cpu_repro.py::CPUReproTests::test_index_put2, test/inductor/test_cpu_repro.py::CPUReproTests::test_inplace_add_alpha, test/inductor/test_cpu_repro.py::CPUReproTests::test_inplace_squeeze_needed, test/inductor/test_cpu_repro.py::CPUReproTests::test_insert_to_dtype_count, test/inductor/test_cpu_repro.py::CPUReproTests::test_int32_pointwise_vec, test/inductor/test_cpu_repro.py::CPUReproTests::test_int32_reduction_vec, test/inductor/test_cpu_repro.py::CPUReproTests::test_int64_pointwise_vec, test/inductor/test_cpu_repro.py::CPUReproTests::test_int64_reduction_vec, test/inductor/test_cpu_repro.py::CPUReproTests::test_int_div, test/inductor/test_cpu_repro.py::CPUReproTests::test_int_div_vec, test/inductor/test_cpu_repro.py::CPUReproTests::test_invalid_dropout_args, test/inductor/test_cpu_repro.py::CPUReproTests::test_invalid_index_of_empty_tensor, test/inductor/test_cpu_repro.py::CPUReproTests::test_ir_node_str, test/inductor/test_cpu_repro.py::CPUReproTests::test_issue122380, test/inductor/test_cpu_repro.py::CPUReproTests::test_issue_148058, test/inductor/test_cpu_repro.py::CPUReproTests::test_large_mean, test/inductor/test_cpu_repro.py::CPUReproTests::test_linear_buffer_reuse, test/inductor/test_cpu_repro.py::CPUReproTests::test_linear_float64, test/inductor/test_cpu_repro.py::CPUReproTests::test_linear_packed, test/inductor/test_cpu_repro.py::CPUReproTests::test_linear_used_from_multiple_places, test/inductor/test_cpu_repro.py::CPUReproTests::test_linear_with_no_default_contiguous_input, test/inductor/test_cpu_repro.py::CPUReproTests::test_linear_with_reshape, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_load_half, test/inductor/test_cpu_repro.py::CPUReproTests::test_load_inf_bf16, test/inductor/test_cpu_repro.py::CPUReproTests::test_load_same_bool_tensor_twice, test/inductor/test_cpu_repro.py::CPUReproTests::test_local_buffer_in_outer_loop_fusion, test/inductor/test_cpu_repro.py::CPUReproTests::test_local_buffer_with_line_reuse, test/inductor/test_cpu_repro.py::CPUReproTests::test_logical_op_store_to_lowp_data_dtype, test/inductor/test_cpu_repro.py::CPUReproTests::test_low_fp_index_expr_issue_147279, test/inductor/test_cpu_repro.py::CPUReproTests::test_lowp_fp_neg_abs, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_change_input_sizes_cpu_unbatched_False_input_size_2_hidden_size_5_num_layers_3_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_2_seq_len_3, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_1, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_7, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_1, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_7, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_1, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_7, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_1, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_7, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_1, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_7, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_1, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_7, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_1, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_7, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_1, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_7, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_1, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_7, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_1, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_7, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_1, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_7, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_1, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_7, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_1, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_7, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_1, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_7, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_1, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_7, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_1, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_7, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_1, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_7, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_1, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_7, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_1, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_7, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_1, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_7, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_1, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_7, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_1, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_7, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_1, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_7, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_1, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_7, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_1, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_7, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_1, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_7, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_1, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_7, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_1, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_7, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_1, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_masked_fill_softmax, test/inductor/test_cpu_repro.py::CPUReproTests::test_masked_fill_with_inf_or_nan_value, test/inductor/test_cpu_repro.py::CPUReproTests::test_masked_load_int64_vec, test/inductor/test_cpu_repro.py::CPUReproTests::test_max_reduction_lowp_fp, test/inductor/test_cpu_repro.py::CPUReproTests::test_maxpool2d_cpu_only, test/inductor/test_cpu_repro.py::CPUReproTests::test_maxpool2d_with_pre_loop_collapse_cpu_only, test/inductor/test_cpu_repro.py::CPUReproTests::test_memory_copy_with_fusion, test/inductor/test_cpu_repro.py::CPUReproTests::test_meta_device, test/inductor/test_cpu_repro.py::CPUReproTests::test_mkl_linear, test/inductor/test_cpu_repro.py::CPUReproTests::test_module_buffer_mutation, test/inductor/test_cpu_repro.py::CPUReproTests::test_multihead_attention_cpu, test/inductor/test_cpu_repro.py::CPUReproTests::test_new_vec_op_cpu_only, test/inductor/test_cpu_repro.py::CPUReproTests::test_nn_fold, test/inductor/test_cpu_repro.py::CPUReproTests::test_nn_param_assign, test/inductor/test_cpu_repro.py::CPUReproTests::test_nn_param_assign_wrapped, test/inductor/test_cpu_repro.py::CPUReproTests::test_no_op_squeeze, test/inductor/test_cpu_repro.py::CPUReproTests::test_no_redundant_to_dtypes_between_fused_scheduler_node, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_non_contiguous_index_with_constant_stride, test/inductor/test_cpu_repro.py::CPUReproTests::test_non_contiguous_load_buf_quant_int8, test/inductor/test_cpu_repro.py::CPUReproTests::test_non_contiguous_load_buf_quant_uint8, test/inductor/test_cpu_repro.py::CPUReproTests::test_non_contiguous_reduction_store, test/inductor/test_cpu_repro.py::CPUReproTests::test_ops_masked_with_bool_input, test/inductor/test_cpu_repro.py::CPUReproTests::test_outer_looop_fusion_with_local_buf, test/inductor/test_cpu_repro.py::CPUReproTests::test_outer_loop_fusion, test/inductor/test_cpu_repro.py::CPUReproTests::test_outer_loop_fusion_buffer_remove, test/inductor/test_cpu_repro.py::CPUReproTests::test_pack_padded_sequence_lstm, test/inductor/test_cpu_repro.py::CPUReproTests::test_pad_with_nan_value, test/inductor/test_cpu_repro.py::CPUReproTests::test_parallel_num_threads, test/inductor/test_cpu_repro.py::CPUReproTests::test_parallel_reduction_vectorization, test/inductor/test_cpu_repro.py::CPUReproTests::test_per_channel_fake_quant_int8, test/inductor/test_cpu_repro.py::CPUReproTests::test_per_channel_fake_quant_int8_bf16_input, test/inductor/test_cpu_repro.py::CPUReproTests::test_per_channel_fake_quant_module_uint8, test/inductor/test_cpu_repro.py::CPUReproTests::test_per_channel_fake_quant_uint8, test/inductor/test_cpu_repro.py::CPUReproTests::test_per_channel_fake_quant_uint8_bf16_input, test/inductor/test_cpu_repro.py::CPUReproTests::test_per_tensor_fake_quant_int8, test/inductor/test_cpu_repro.py::CPUReproTests::test_per_tensor_fake_quant_uint8, test/inductor/test_cpu_repro.py::CPUReproTests::test_pow_cos, test/inductor/test_cpu_repro.py::CPUReproTests::test_randint_symint_input, test/inductor/test_cpu_repro.py::CPUReproTests::test_reduce_with_masked, test/inductor/test_cpu_repro.py::CPUReproTests::test_reduction_cpu_only, test/inductor/test_cpu_repro.py::CPUReproTests::test_reduction_float_to_int64, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_reduction_with_dynamic_threads, test/inductor/test_cpu_repro.py::CPUReproTests::test_redundant_to_node_elimination_lowp_fp, test/inductor/test_cpu_repro.py::CPUReproTests::test_relu_permute_reshape_reinterpret_view, test/inductor/test_cpu_repro.py::CPUReproTests::test_relu_with_inf_value, test/inductor/test_cpu_repro.py::CPUReproTests::test_repeat_interleave, test/inductor/test_cpu_repro.py::CPUReproTests::test_repeated_exp, test/inductor/test_cpu_repro.py::CPUReproTests::test_require_stride_order_non_owning, test/inductor/test_cpu_repro.py::CPUReproTests::test_scalar_mul_bfloat16, test/inductor/test_cpu_repro.py::CPUReproTests::test_scalar_sign_with_min, test/inductor/test_cpu_repro.py::CPUReproTests::test_scatter_using_atomic_add, test/inductor/test_cpu_repro.py::CPUReproTests::test_scatter_using_atomic_add_vec, test/inductor/test_cpu_repro.py::CPUReproTests::test_select_tiliing_with_index_expr, test/inductor/test_cpu_repro.py::CPUReproTests::test_set_source_Tensor, test/inductor/test_cpu_repro.py::CPUReproTests::test_share_local_buffers_in_outer_loop_fusion, test/inductor/test_cpu_repro.py::CPUReproTests::test_sigmoid_with_reduction, test/inductor/test_cpu_repro.py::CPUReproTests::test_sign_cpu_only, test/inductor/test_cpu_repro.py::CPUReproTests::test_skip_cpp_codegen, test/inductor/test_cpu_repro.py::CPUReproTests::test_slice_scatter_default_end_value, test/inductor/test_cpu_repro.py::CPUReproTests::test_slice_scatter_issue122291, test/inductor/test_cpu_repro.py::CPUReproTests::test_store_reduction, test/inductor/test_cpu_repro.py::CPUReproTests::test_symbolic_shape_scalar_value_reduction, test/inductor/test_cpu_repro.py::CPUReproTests::test_tanh_atan2, test/inductor/test_cpu_repro.py::CPUReproTests::test_tanh_atan2_use_decompose_tanh, test/inductor/test_cpu_repro.py::CPUReproTests::test_tile2d_load_decomposed_dequant_add_relu_quant_int8, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_tile2d_load_decomposed_dequant_add_relu_quant_uint8, test/inductor/test_cpu_repro.py::CPUReproTests::test_tile2d_store_channel_shuffle_cl_quant_output_int8, test/inductor/test_cpu_repro.py::CPUReproTests::test_tile2d_store_channel_shuffle_cl_quant_output_uint8, test/inductor/test_cpu_repro.py::CPUReproTests::test_timed_cpu_only, test/inductor/test_cpu_repro.py::CPUReproTests::test_to_channels_last_lowp_fp, test/inductor/test_cpu_repro.py::CPUReproTests::test_to_dtype_bool_float, test/inductor/test_cpu_repro.py::CPUReproTests::test_to_dtype_float_bool, test/inductor/test_cpu_repro.py::CPUReproTests::test_to_uint8_rounding_method, test/inductor/test_cpu_repro.py::CPUReproTests::test_torch_linalg_qr_tuple_slice, test/inductor/test_cpu_repro.py::CPUReproTests::test_torch_logit, test/inductor/test_cpu_repro.py::CPUReproTests::test_transpose_copy, test/inductor/test_cpu_repro.py::CPUReproTests::test_transpose_mxn_16_16_bf16_fp16, test/inductor/test_cpu_repro.py::CPUReproTests::test_transpose_mxn_32_32_bf16_fp16, test/inductor/test_cpu_repro.py::CPUReproTests::test_transpose_non_contiguous, test/inductor/test_cpu_repro.py::CPUReproTests::test_transpose_sum2d_cpu_only, test/inductor/test_cpu_repro.py::CPUReproTests::test_transpose_sum_outer, test/inductor/test_cpu_repro.py::CPUReproTests::test_transpose_vertical_sum_cpu_only, test/inductor/test_cpu_repro.py::CPUReproTests::test_transpose_with_norm, test/inductor/test_cpu_repro.py::CPUReproTests::test_two_local_buffers_in_outer_loop_fusion, test/inductor/test_cpu_repro.py::CPUReproTests::test_two_local_buffers_in_outer_loop_fusion_case2, test/inductor/test_cpu_repro.py::CPUReproTests::test_uint32_pointwise_vec, test/inductor/test_cpu_repro.py::CPUReproTests::test_uint32_reduction_vec, test/inductor/test_cpu_repro.py::CPUReproTests::test_uint64_pointwise_vec, test/inductor/test_cpu_repro.py::CPUReproTests::test_uint64_reduction_vec, 
test/inductor/test_cpu_repro.py::CPUReproTests::test_uint8_add, test/inductor/test_cpu_repro.py::CPUReproTests::test_uint8_sub, test/inductor/test_cpu_repro.py::CPUReproTests::test_unrolled_bool_prod_vectorized, test/inductor/test_cpu_repro.py::CPUReproTests::test_unsupported_conv_transpose, test/inductor/test_cpu_repro.py::CPUReproTests::test_vec_bitwise, test/inductor/test_cpu_repro.py::CPUReproTests::test_vec_compare_op_cpu_only, test/inductor/test_cpu_repro.py::CPUReproTests::test_vec_contiguous_ModularIndexing, test/inductor/test_cpu_repro.py::CPUReproTests::test_vec_cpu_only_for_all_available_isa, test/inductor/test_cpu_repro.py::CPUReproTests::test_vec_dynamic_shapes, test/inductor/test_cpu_repro.py::CPUReproTests::test_vec_indirect_load_cse_cache, test/inductor/test_cpu_repro.py::CPUReproTests::test_vec_kernel_cpu_only, test/inductor/test_cpu_repro.py::CPUReproTests::test_vec_logical, test/inductor/test_cpu_repro.py::CPUReproTests::test_vec_randn, test/inductor/test_cpu_repro.py::CPUReproTests::test_vec_remainder, test/inductor/test_cpu_repro.py::CPUReproTests::test_vec_transpose_lowp_fp, test/inductor/test_cpu_repro.py::CPUReproTests::test_vector_norm_compile, test/inductor/test_cpu_repro.py::CPUReproTests::test_vertical_sum_cpu_only, test/inductor/test_cpu_repro.py::CPUReproTests::test_view_dtype 2025-10-10T02:04:23.9016569Z 2025-10-10T02:04:25.5088353Z 2025-10-10T02:04:25.5089584Z dynamo/test_bytecode_utils 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_bytecode_utils_1.1_67cc7b7c051e6119_.log 2025-10-10T02:04:25.5096185Z Running 20 items in this shard: test/dynamo/test_bytecode_utils.py::BytecodeTests::test_bytecode_analysis_jump_backward_no_interrupt, test/dynamo/test_bytecode_utils.py::BytecodeTests::test_bytecode_from_template, test/dynamo/test_bytecode_utils.py::BytecodeTests::test_bytecode_from_template_noprefix, 
test/dynamo/test_bytecode_utils.py::BytecodeTests::test_bytecode_from_template_noreturn1, test/dynamo/test_bytecode_utils.py::BytecodeTests::test_bytecode_from_template_noreturn2, test/dynamo/test_bytecode_utils.py::BytecodeTests::test_bytecode_from_template_noreturn_const, test/dynamo/test_bytecode_utils.py::BytecodeTests::test_compute_exception_table_nested, test/dynamo/test_bytecode_utils.py::BytecodeTests::test_exception_table_e2e, test/dynamo/test_bytecode_utils.py::BytecodeTests::test_exception_table_e2e_2, test/dynamo/test_bytecode_utils.py::BytecodeTests::test_exception_table_encode_varint, test/dynamo/test_bytecode_utils.py::BytecodeTests::test_exception_table_entry_propagation, test/dynamo/test_bytecode_utils.py::BytecodeTests::test_exception_table_parsing, test/dynamo/test_bytecode_utils.py::BytecodeTests::test_extended_args_starts_line, test/dynamo/test_bytecode_utils.py::BytecodeTests::test_if_tensor_is_none, test/dynamo/test_bytecode_utils.py::BytecodeTests::test_linetable_310_writer, test/dynamo/test_bytecode_utils.py::BytecodeTests::test_linetable_311_writer1, test/dynamo/test_bytecode_utils.py::BytecodeTests::test_linetable_311_writer2, test/dynamo/test_bytecode_utils.py::BytecodeTests::test_py311_jump_offset, test/dynamo/test_bytecode_utils.py::BytecodeTests::test_remove_dead_code_with_exn_table_entries, test/dynamo/test_bytecode_utils.py::BytecodeHookTests::test_bytecode_hook 2025-10-10T02:04:25.5102610Z 2025-10-10T02:04:27.4890287Z 2025-10-10T02:04:27.4891430Z export/test_pass_infra 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_pass_infra_1.1_10f22a096d017b60_.log 2025-10-10T02:04:27.4893754Z Running 5 items in this shard: test/export/test_pass_infra.py::TestPassInfra::test_cond, test/export/test_pass_infra.py::TestPassInfra::test_export_pass_base, test/export/test_pass_infra.py::TestPassInfra::test_graph_signature_updated_after_transformation, 
test/export/test_pass_infra.py::TestPassInfra::test_node_name_stability, test/export/test_pass_infra.py::TestPassInfra::test_replace_hook_basic 2025-10-10T02:04:27.4895262Z 2025-10-10T02:04:27.7403629Z Running dynamo/test_guard_manager 1/1 ... [2025-10-10 02:04:27.739742] 2025-10-10T02:04:27.7404224Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:04:27.7406390Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_guard_manager.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:04:27.740178] 2025-10-10T02:04:29.4276906Z Running dynamo/test_minifier 1/1 ... [2025-10-10 02:04:29.427066] 2025-10-10T02:04:29.4277380Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:04:29.4278390Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_minifier.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:04:29.427436] 2025-10-10T02:04:31.4236060Z Running export/test_converter 1/1 ... [2025-10-10 02:04:31.423007] 2025-10-10T02:04:31.4236686Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:04:31.4237676Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_converter.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:04:31.423378] 2025-10-10T02:04:32.3645732Z 2025-10-10T02:04:32.3647268Z dynamo/test_guard_manager 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_guard_manager_1.1_15db491b2515d834_.log 2025-10-10T02:04:32.3665130Z Running 37 items in this shard: test/dynamo/test_guard_manager.py::GuardManagerTests::test_attr_guard_manager, test/dynamo/test_guard_manager.py::GuardManagerTests::test_call_function_no_args_guard, test/dynamo/test_guard_manager.py::GuardManagerTests::test_clone, test/dynamo/test_guard_manager.py::GuardManagerTests::test_default_device_guard, test/dynamo/test_guard_manager.py::GuardManagerTests::test_dict_contains_guard, test/dynamo/test_guard_manager.py::GuardManagerTests::test_dict_getitem_accessor, test/dynamo/test_guard_manager.py::GuardManagerTests::test_dict_guard_manager, test/dynamo/test_guard_manager.py::GuardManagerTests::test_dict_version_guard, test/dynamo/test_guard_manager.py::GuardManagerTests::test_diff_guard_manager, test/dynamo/test_guard_manager.py::GuardManagerTests::test_dynamic_indices_guard, test/dynamo/test_guard_manager.py::GuardManagerTests::test_equals_guard, test/dynamo/test_guard_manager.py::GuardManagerTests::test_framelocals_accessor, test/dynamo/test_guard_manager.py::GuardManagerTests::test_framelocals_guard_e2e, test/dynamo/test_guard_manager.py::GuardManagerTests::test_global_state_guard, test/dynamo/test_guard_manager.py::GuardManagerTests::test_global_state_reason, test/dynamo/test_guard_manager.py::GuardManagerTests::test_global_weakref, test/dynamo/test_guard_manager.py::GuardManagerTests::test_globals, test/dynamo/test_guard_manager.py::GuardManagerTests::test_guard_manager_leaf_guard, test/dynamo/test_guard_manager.py::GuardManagerTests::test_id_guard, test/dynamo/test_guard_manager.py::GuardManagerTests::test_item_guard_manager, test/dynamo/test_guard_manager.py::GuardManagerTests::test_lambda_manager, 
test/dynamo/test_guard_manager.py::GuardManagerTests::test_length_check_guard, test/dynamo/test_guard_manager.py::GuardManagerTests::test_no_hasattr_guard, test/dynamo/test_guard_manager.py::GuardManagerTests::test_no_tensor_aliasing_guard, test/dynamo/test_guard_manager.py::GuardManagerTests::test_python_lambda_leaf_guard, test/dynamo/test_guard_manager.py::GuardManagerTests::test_tensor_aliasing_guard, test/dynamo/test_guard_manager.py::GuardManagerTests::test_tensor_match_guard, test/dynamo/test_guard_manager.py::GuardManagerTests::test_tuple_iterator_getitem, test/dynamo/test_guard_manager.py::GuardManagerTests::test_type_guard, test/dynamo/test_guard_manager.py::GuardManagerTests::test_type_manager, test/dynamo/test_guard_manager.py::GuardManagerTests::test_weakref_alive_guard, test/dynamo/test_guard_manager.py::TypePropagationTests::test_basic_types, test/dynamo/test_guard_manager.py::TagSafetyChecks::test_dict_tag_safe, test/dynamo/test_guard_manager.py::TagSafetyChecks::test_immutable_tag_safe, test/dynamo/test_guard_manager.py::TagSafetyChecks::test_nn_module_tag_overridden_getattr_safe, test/dynamo/test_guard_manager.py::TagSafetyChecks::test_nn_module_tag_safe, test/dynamo/test_guard_manager.py::RecursiveDictGuardTests::test_disabling 2025-10-10T02:04:32.3684241Z 2025-10-10T02:04:33.4002931Z 2025-10-10T02:04:33.4004138Z dynamo/test_minifier 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_minifier_1.1_2347d38b77264edb_.log 2025-10-10T02:04:33.4010111Z Running 15 items in this shard: test/dynamo/test_minifier.py::MinifierTestsCUDA::test_after_dynamo_cpu_accuracy_backend_passes_cuda, test/dynamo/test_minifier.py::MinifierTestsCUDA::test_after_dynamo_cpu_accuracy_error_cuda, test/dynamo/test_minifier.py::MinifierTestsCUDA::test_after_dynamo_cpu_compile_backend_passes_cuda, test/dynamo/test_minifier.py::MinifierTestsCUDA::test_after_dynamo_cpu_compile_error_cuda, 
test/dynamo/test_minifier.py::MinifierTestsCUDA::test_after_dynamo_cpu_runtime_backend_passes_cuda, test/dynamo/test_minifier.py::MinifierTestsCUDA::test_after_dynamo_cpu_runtime_error_cuda, test/dynamo/test_minifier.py::MinifierTestsCUDA::test_after_dynamo_cuda_accuracy_backend_passes_cuda, test/dynamo/test_minifier.py::MinifierTestsCUDA::test_after_dynamo_cuda_accuracy_error_cuda, test/dynamo/test_minifier.py::MinifierTestsCUDA::test_after_dynamo_cuda_compile_backend_passes_cuda, test/dynamo/test_minifier.py::MinifierTestsCUDA::test_after_dynamo_cuda_compile_error_cuda, test/dynamo/test_minifier.py::MinifierTestsCUDA::test_after_dynamo_cuda_runtime_backend_passes_cuda, test/dynamo/test_minifier.py::MinifierTestsCUDA::test_after_dynamo_cuda_runtime_error_cuda, test/dynamo/test_minifier.py::MinifierTestsCUDA::test_after_dynamo_non_leaf_compile_error_cuda, test/dynamo/test_minifier.py::MinifierTestsCUDA::test_cpu_cuda_module_after_dynamo_cuda, test/dynamo/test_minifier.py::MinifierTestsCUDA::test_if_graph_minified_cuda 2025-10-10T02:04:33.4015537Z 2025-10-10T02:04:36.2272241Z Running export/test_experimental 1/1 ... [2025-10-10 02:04:36.226676] 2025-10-10T02:04:36.2272808Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:04:36.2275361Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_experimental.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:04:36.227095] 2025-10-10T02:04:37.2936296Z Running dynamo/test_input_attr_tracking 1/1 ... 
[2025-10-10 02:04:37.293078] 2025-10-10T02:04:37.2936942Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:04:37.2938870Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_input_attr_tracking.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:04:37.293459] 2025-10-10T02:04:40.2998532Z 2025-10-10T02:04:40.2999791Z export/test_experimental 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_experimental_1.1_d2bd020472bbf1da_.log 2025-10-10T02:04:40.3005408Z Running 10 items in this shard: test/export/test_experimental.py::TestExperiment::test_export_add_in_out_info, test/export/test_experimental.py::TestExperiment::test_export_leaf, test/export/test_experimental.py::TestExperiment::test_joint_basic, test/export/test_experimental.py::TestExperiment::test_joint_buffer_input_mutations, test/export/test_experimental.py::TestExperiment::test_joint_cifar10_backwards, test/export/test_experimental.py::TestExperiment::test_joint_dynamic, test/export/test_experimental.py::TestExperiment::test_joint_loss_index, test/export/test_experimental.py::TestExperiment::test_sticky_export, test/export/test_experimental.py::TestExperiment::test_sticky_export_dynamic, test/export/test_experimental.py::TestExperiment::test_sticky_export_nested_inp 2025-10-10T02:04:40.3010018Z 2025-10-10T02:04:41.4678073Z 2025-10-10T02:04:41.4679129Z dynamo/test_input_attr_tracking 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_input_attr_tracking_1.1_355e67848140ae83_.log 2025-10-10T02:04:41.4684770Z Running 12 items in this shard: test/dynamo/test_input_attr_tracking.py::TestInputAttrTracking::test_complex_attr_access_with_graph_breaks, 
test/dynamo/test_input_attr_tracking.py::TestInputAttrTracking::test_complex_attr_access_with_inline_reconstruct, test/dynamo/test_input_attr_tracking.py::TestInputAttrTracking::test_complex_attr_access_without_graph_breaks, test/dynamo/test_input_attr_tracking.py::TestInputAttrTracking::test_const_property_assigned_on_tensor, test/dynamo/test_input_attr_tracking.py::TestInputAttrTracking::test_const_property_on_tensor, test/dynamo/test_input_attr_tracking.py::TestInputAttrTracking::test_guards_correctly_property_assigned_on_tensor_type_change, test/dynamo/test_input_attr_tracking.py::TestInputAttrTracking::test_guards_correctly_property_assigned_on_tensor_type_change_inductor, test/dynamo/test_input_attr_tracking.py::TestInputAttrTracking::test_set_data_on_input_tensor, test/dynamo/test_input_attr_tracking.py::TestInputAttrTracking::test_set_data_on_scoped_tensor, test/dynamo/test_input_attr_tracking.py::TestInputAttrTracking::test_set_data_on_user_defined_class_input_tensor, test/dynamo/test_input_attr_tracking.py::TestInputAttrTracking::test_tensor_property_assigned_on_tensor, test/dynamo/test_input_attr_tracking.py::TestInputAttrTracking::test_tensor_property_on_tensor 2025-10-10T02:04:41.4689850Z 2025-10-10T02:04:44.2339198Z Running dynamo/test_exc 1/1 ... [2025-10-10 02:04:44.233268] 2025-10-10T02:04:44.2339895Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:04:44.2341830Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_exc.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:04:44.233642] 2025-10-10T02:04:45.3565958Z Running dynamo/test_hooks 1/1 ... 
[2025-10-10 02:04:45.356010] 2025-10-10T02:04:45.3566618Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:04:45.3569332Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_hooks.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:04:45.356456] 2025-10-10T02:04:48.1565477Z 2025-10-10T02:04:48.1566457Z dynamo/test_exc 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_exc_1.1_ec59563413aec021_.log 2025-10-10T02:04:48.1569605Z Running 10 items in this shard: test/dynamo/test_exc.py::ExcTests::test_backend_suppress_line, test/dynamo/test_exc.py::ExcTests::test_graph_break_log, test/dynamo/test_exc.py::ExcTests::test_graph_break_log_generic_jump, test/dynamo/test_exc.py::ExcTests::test_internal_error_no_suppress, test/dynamo/test_exc.py::ExcTests::test_internal_error_suppress_errors, test/dynamo/test_exc.py::ExcTests::test_not_implemented_error, test/dynamo/test_exc.py::ExcTests::test_trigger_bisect_on_error, test/dynamo/test_exc.py::ExcTests::test_trigger_on_error, test/dynamo/test_exc.py::ExcTests::test_unsupported_error, test/dynamo/test_exc.py::ExcTests::test_unsupported_real_stack 2025-10-10T02:04:48.1572210Z 2025-10-10T02:04:49.5796509Z 2025-10-10T02:04:49.5797517Z dynamo/test_hooks 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_hooks_1.1_b526d113aa1fb3f5_.log 2025-10-10T02:04:49.5814121Z Running 34 items in this shard: test/dynamo/test_hooks.py::HooksTests::test_complex_state_mutation_in_intermediary_hooks_same_on_inductor, test/dynamo/test_hooks.py::HooksTests::test_complex_state_mutation_in_intermediary_hooks_same_on_inductor_with_graph_break, test/dynamo/test_hooks.py::HooksTests::test_functools_arg_vary, test/dynamo/test_hooks.py::HooksTests::test_global_module_forward_pre_hook, 
test/dynamo/test_hooks.py::HooksTests::test_hook_with_closure, test/dynamo/test_hooks.py::HooksTests::test_hook_with_nested_closure, test/dynamo/test_hooks.py::HooksTests::test_input_hooks_same, test/dynamo/test_hooks.py::HooksTests::test_intermediary_hooks, test/dynamo/test_hooks.py::HooksTests::test_intermediary_hooks_same_on_aot_eager, test/dynamo/test_hooks.py::HooksTests::test_intermediary_hooks_same_on_inductor, test/dynamo/test_hooks.py::HooksTests::test_intermediate_hook_with_closure_aot, test/dynamo/test_hooks.py::HooksTests::test_intermediate_hook_with_closure_eager, test/dynamo/test_hooks.py::HooksTests::test_nnmodule_hook_guards, test/dynamo/test_hooks.py::HooksTests::test_no_recompile_on_hook_identity_change, test/dynamo/test_hooks.py::HooksTests::test_no_recompile_on_same_hook, test/dynamo/test_hooks.py::HooksTests::test_post_acc_grad_hook, test/dynamo/test_hooks.py::HooksTests::test_recompile, test/dynamo/test_hooks.py::HooksTests::test_register_hook_partial_guarding, test/dynamo/test_hooks.py::HooksTests::test_removed_handle_return, test/dynamo/test_hooks.py::HooksTests::test_tensor_only_register_hook_in_graph_lambda, test/dynamo/test_hooks.py::HooksTests::test_tensor_only_register_hook_in_graph_local, test/dynamo/test_hooks.py::HooksTests::test_tensor_only_register_hook_in_graph_local_inner, test/dynamo/test_hooks.py::HooksTests::test_tensor_register_global_hook, test/dynamo/test_hooks.py::HooksTests::test_tensor_register_global_hooks_handles_in_list, test/dynamo/test_hooks.py::HooksTests::test_tensor_register_hook_in_graph_break_handle_lambda, test/dynamo/test_hooks.py::HooksTests::test_tensor_register_hook_in_graph_break_handle_local, test/dynamo/test_hooks.py::HooksTests::test_tensor_register_hook_in_graph_lambda, test/dynamo/test_hooks.py::HooksTests::test_tensor_register_hook_in_graph_local, test/dynamo/test_hooks.py::HooksTests::test_tensor_register_hook_multi_handle_return, 
test/dynamo/test_hooks.py::HooksTests::test_tensor_register_hook_repeated_handle_not_local, test/dynamo/test_hooks.py::HooksTests::test_tensor_register_hook_repeated_handle_return, test/dynamo/test_hooks.py::HooksTests::test_tensor_register_multiple_hooks, test/dynamo/test_hooks.py::HooksTests::test_tensor_register_multiple_hooks_handles_in_list, test/dynamo/test_hooks.py::HooksTests::test_wrap_top_frame_with_hooks 2025-10-10T02:04:49.5831796Z 2025-10-10T02:04:49.9755535Z 2025-10-10T02:04:49.9756611Z export/test_converter 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_converter_1.1_dd2e25c8f5e52a44_.log 2025-10-10T02:04:49.9770868Z Running 48 items in this shard: test/export/test_converter.py::TestConverter::test_aten___getitem___dict, test/export/test_converter.py::TestConverter::test_aten___getitem___list, test/export/test_converter.py::TestConverter::test_aten___is__, test/export/test_converter.py::TestConverter::test_aten___isnot__, test/export/test_converter.py::TestConverter::test_aten___not__, test/export/test_converter.py::TestConverter::test_aten_add_t, test/export/test_converter.py::TestConverter::test_aten_append_t, test/export/test_converter.py::TestConverter::test_aten_dim, test/export/test_converter.py::TestConverter::test_aten_floordiv, test/export/test_converter.py::TestConverter::test_aten_len, test/export/test_converter.py::TestConverter::test_aten_tensor_dtype_int, test/export/test_converter.py::TestConverter::test_aten_tensor_dynamic, test/export/test_converter.py::TestConverter::test_aten_tensor_prim_dtype, test/export/test_converter.py::TestConverter::test_aten_to_dtype_with_mutating_storage, test/export/test_converter.py::TestConverter::test_context_manager, test/export/test_converter.py::TestConverter::test_convert_func_without_param, test/export/test_converter.py::TestConverter::test_convert_if_basic, test/export/test_converter.py::TestConverter::test_convert_if_duplicate_attr_names, 
test/export/test_converter.py::TestConverter::test_convert_if_multiple_out, test/export/test_converter.py::TestConverter::test_convert_if_tuple_out, test/export/test_converter.py::TestConverter::test_convert_nn_module_with_nested_buffer, test/export/test_converter.py::TestConverter::test_convert_nn_module_with_nested_if_and_buffer, test/export/test_converter.py::TestConverter::test_convert_nn_module_with_nested_if_and_param, test/export/test_converter.py::TestConverter::test_convert_nn_module_with_nested_param, test/export/test_converter.py::TestConverter::test_convert_retrace_nested_scripted_modules, test/export/test_converter.py::TestConverter::test_convert_script_object, test/export/test_converter.py::TestConverter::test_get_tensor_constants, test/export/test_converter.py::TestConverter::test_hidden_input_name, test/export/test_converter.py::TestConverter::test_implicit_constant_to_tensor_handling, test/export/test_converter.py::TestConverter::test_prim_SetAttr, test/export/test_converter.py::TestConverter::test_prim_device, test/export/test_converter.py::TestConverter::test_prim_device_cuda, test/export/test_converter.py::TestConverter::test_prim_dtype, test/export/test_converter.py::TestConverter::test_prim_max, test/export/test_converter.py::TestConverter::test_prim_min, test/export/test_converter.py::TestConverter::test_prim_tolist, test/export/test_converter.py::TestConverter::test_profiler__record_function, test/export/test_converter.py::TestConverter::test_raise_exception, test/export/test_converter.py::TestConverter::test_ts2ep_convert_quantized_model, test/export/test_converter.py::TestConverter::test_ts2ep_convert_quantized_model_with_opcontext, test/export/test_converter.py::TestConverter::test_ts2ep_convert_quantized_model_with_opcontext_and_constant, test/export/test_converter.py::TestConverter::test_ts2ep_converter_basic, test/export/test_converter.py::TestConverter::test_ts2ep_converter_container_output, 
test/export/test_converter.py::TestConverter::test_ts2ep_converter_contains, test/export/test_converter.py::TestConverter::test_ts2ep_converter_custom_op, test/export/test_converter.py::TestConverter::test_ts2ep_converter_unpack, test/export/test_converter.py::TestConverter::test_ts2ep_multi_outputs_on_call_ops, test/export/test_converter.py::TestConverter::test_ts2ep_with_loop 2025-10-10T02:04:49.9783641Z 2025-10-10T02:04:52.1237461Z Running dynamo/test_trace_rules 1/1 ... [2025-10-10 02:04:52.123180] 2025-10-10T02:04:52.1238095Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:04:52.1239895Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_trace_rules.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:04:52.123586] 2025-10-10T02:04:53.4679316Z Running dynamo/test_exceptions 1/1 ... [2025-10-10 02:04:53.467213] 2025-10-10T02:04:53.4680238Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:04:53.4681917Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_exceptions.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:04:53.467582] 2025-10-10T02:04:53.8373338Z Running export/test_schema 1/1 ... [2025-10-10 02:04:53.836747] 2025-10-10T02:04:53.8373877Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:04:53.8375932Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_schema.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:04:53.837154] 2025-10-10T02:04:56.0964796Z 2025-10-10T02:04:56.0965884Z dynamo/test_trace_rules 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_trace_rules_1.1_b2b5e22c651d65d4_.log 2025-10-10T02:04:56.0969138Z Running 7 items in this shard: test/dynamo/test_trace_rules.py::TraceRuleTests::test_almost_impossible_missing_name, test/dynamo/test_trace_rules.py::TraceRuleTests::test_force_inline_custom_function, test/dynamo/test_trace_rules.py::TraceRuleTests::test_force_inline_torch_function, test/dynamo/test_trace_rules.py::TraceRuleTests::test_no_special_handlers_for_torch_non_c_bindings, test/dynamo/test_trace_rules.py::TraceRuleTests::test_skipfiles_inlinelist, test/dynamo/test_trace_rules.py::TraceRuleTests::test_torch_name_rule_map_updated, test/dynamo/test_trace_rules.py::TestModuleSurviveSkipFiles::test_module_survive_skip_files 2025-10-10T02:04:56.0971643Z 2025-10-10T02:04:57.7913238Z 2025-10-10T02:04:57.7914526Z dynamo/test_exceptions 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_exceptions_1.1_4112bd861ff8dc84_.log 2025-10-10T02:04:57.7936149Z Running 49 items in this shard: test/dynamo/test_exceptions.py::ExceptionTests::test_atrribute_error, test/dynamo/test_exceptions.py::ExceptionTests::test_attribute_error_from_getattr, test/dynamo/test_exceptions.py::ExceptionTests::test_autocast_with_exception, test/dynamo/test_exceptions.py::ExceptionTests::test_block_stack_cleanup, test/dynamo/test_exceptions.py::ExceptionTests::test_custom_getattr_on_module_exception, test/dynamo/test_exceptions.py::ExceptionTests::test_dict_pop, test/dynamo/test_exceptions.py::ExceptionTests::test_dynamo_undo_kw_names, test/dynamo/test_exceptions.py::ExceptionTests::test_ensure_exception_is_active_after_try_except_block, test/dynamo/test_exceptions.py::ExceptionTests::test_ensure_exception_is_active_inside_try_except_block, 
test/dynamo/test_exceptions.py::ExceptionTests::test_exception, test/dynamo/test_exceptions.py::ExceptionTests::test_exception2, test/dynamo/test_exceptions.py::ExceptionTests::test_exception3, test/dynamo/test_exceptions.py::ExceptionTests::test_exception4, test/dynamo/test_exceptions.py::ExceptionTests::test_exception_else, test/dynamo/test_exceptions.py::ExceptionTests::test_exception_kwargs, test/dynamo/test_exceptions.py::ExceptionTests::test_exception_raised_from_child, test/dynamo/test_exceptions.py::ExceptionTests::test_exception_with_another_exception, test/dynamo/test_exceptions.py::ExceptionTests::test_exception_with_another_exception2, test/dynamo/test_exceptions.py::ExceptionTests::test_exception_with_ctx_manager, test/dynamo/test_exceptions.py::ExceptionTests::test_exception_with_vars, test/dynamo/test_exceptions.py::ExceptionTests::test_handle_all_exceptions, test/dynamo/test_exceptions.py::ExceptionTests::test_isinstance_CustomException, test/dynamo/test_exceptions.py::ExceptionTests::test_key_error, test/dynamo/test_exceptions.py::ExceptionTests::test_nn_module_getattr, test/dynamo/test_exceptions.py::ExceptionTests::test_nn_reraise, test/dynamo/test_exceptions.py::ExceptionTests::test_propagate_exception_inside_ctx_manager, test/dynamo/test_exceptions.py::ExceptionTests::test_raise_GeneratorExit, test/dynamo/test_exceptions.py::ExceptionTests::test_raise_custom_exception, test/dynamo/test_exceptions.py::ExceptionTests::test_raise_custom_exception_with_args, test/dynamo/test_exceptions.py::ExceptionTests::test_raise_finally_simple, test/dynamo/test_exceptions.py::ExceptionTests::test_raise_from_None, test/dynamo/test_exceptions.py::ExceptionTests::test_raise_from_None_2, test/dynamo/test_exceptions.py::ExceptionTests::test_raise_from_other, test/dynamo/test_exceptions.py::ExceptionTests::test_raise_match, test/dynamo/test_exceptions.py::ExceptionTests::test_raise_set___context__, 
test/dynamo/test_exceptions.py::ExceptionTests::test_reconstruct___context__, test/dynamo/test_exceptions.py::ExceptionTests::test_reconstruct_exception_2, test/dynamo/test_exceptions.py::ExceptionTests::test_reraise, test/dynamo/test_exceptions.py::ExceptionTests::test_reraise_first_exc, test/dynamo/test_exceptions.py::ExceptionTests::test_set___cause___CustomException, test/dynamo/test_exceptions.py::ExceptionTests::test_set___cause___TypeError, test/dynamo/test_exceptions.py::ExceptionTests::test_set___cause___error_CustomException, test/dynamo/test_exceptions.py::ExceptionTests::test_set___cause___error_RuntimeError, test/dynamo/test_exceptions.py::ExceptionTests::test_set_cause_with_arg, test/dynamo/test_exceptions.py::ExceptionTests::test_set_cause_with_arg_error, test/dynamo/test_exceptions.py::ExceptionTests::test_speculation_exception, test/dynamo/test_exceptions.py::ExceptionTests::test_stop_iteration, test/dynamo/test_exceptions.py::ExceptionTests::test_user_defined_exception_variable, test/dynamo/test_exceptions.py::ExceptionTests::test_user_defined_exception_with_args 2025-10-10T02:04:57.7956737Z 2025-10-10T02:04:57.8603197Z 2025-10-10T02:04:57.8604087Z export/test_schema 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_schema_1.1_97e06027c4d60131_.log 2025-10-10T02:04:57.8606121Z Running 5 items in this shard: test/export/test_schema.py::TestSchema::test_schema_check, test/export/test_schema.py::TestSchema::test_schema_comparison, test/export/test_schema.py::TestSchema::test_schema_compatibility, test/export/test_schema.py::TestSchema::test_schema_diff, test/export/test_schema.py::TestSchema::test_thrift_schema_unchanged 2025-10-10T02:04:57.8607488Z 2025-10-10T02:05:00.0162728Z Running inductor/test_mps_basic 1/1 ... 
[2025-10-10 02:05:00.015616] 2025-10-10T02:05:00.0163470Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:05:00.0166208Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_mps_basic.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:05:00.016004] 2025-10-10T02:05:01.6817139Z Running inductor/test_cudagraph_trees_expandable_segments 1/1 ... [2025-10-10 02:05:01.681191] 2025-10-10T02:05:01.6817840Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:05:01.6819557Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_cudagraph_trees_expandable_segments.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:05:01.681597] 2025-10-10T02:05:01.8250118Z Running dynamo/test_subclasses 1/1 ... [2025-10-10 02:05:01.824466] 2025-10-10T02:05:01.8250572Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:05:01.8253983Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_subclasses.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:05:01.824859] 2025-10-10T02:05:07.4955146Z 2025-10-10T02:05:07.4956001Z inductor/test_mps_basic 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_mps_basic_1.1_7bb770612b0fea89_.log 2025-10-10T02:05:07.4956628Z 2025-10-10T02:05:09.5558406Z 2025-10-10T02:05:09.5559373Z dynamo/test_subclasses 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_subclasses_1.1_14d5f88b9fc210d4_.log 2025-10-10T02:05:09.5601066Z Running 125 items in this shard: test/dynamo/test_subclasses.py::SubclassTests::test_as_subclass_attr_mutation, test/dynamo/test_subclasses.py::SubclassTests::test_base_torch_function_tracing, test/dynamo/test_subclasses.py::SubclassTests::test_compile_higher_order_with_functionalization, test/dynamo/test_subclasses.py::SubclassTests::test_compile_with_fake_tensor_automatic_dynamic, test/dynamo/test_subclasses.py::SubclassTests::test_compile_with_fake_tensor_dynamic_dim, test/dynamo/test_subclasses.py::SubclassTests::test_compile_with_functionalization, test/dynamo/test_subclasses.py::SubclassTests::test_disable_all_torch_function, test/dynamo/test_subclasses.py::SubclassTests::test_disable_all_torch_function_restore_values, test/dynamo/test_subclasses.py::SubclassTests::test_disable_all_torch_function_restore_values_graph_break, test/dynamo/test_subclasses.py::SubclassTests::test_has_torch_function, test/dynamo/test_subclasses.py::SubclassTests::test_make_subclass, test/dynamo/test_subclasses.py::SubclassTests::test_mark_static_with_subclass_desugaring_dynamic_False, test/dynamo/test_subclasses.py::SubclassTests::test_mark_static_with_subclass_desugaring_dynamic_True, test/dynamo/test_subclasses.py::SubclassTests::test_newly_constructed_tensor_subclass_attr_mutation, test/dynamo/test_subclasses.py::SubclassTests::test_njt_subclass_from_buffer, test/dynamo/test_subclasses.py::SubclassTests::test_njt_subclass_from_cat, 
test/dynamo/test_subclasses.py::SubclassTests::test_njt_subclass_simple, test/dynamo/test_subclasses.py::SubclassTests::test_no_call_to_new, test/dynamo/test_subclasses.py::SubclassTests::test_no_torch_function_on_size_bytecode, test/dynamo/test_subclasses.py::SubclassTests::test_no_torch_function_recompiles, test/dynamo/test_subclasses.py::SubclassTests::test_nontraceable_tensor_subclass, test/dynamo/test_subclasses.py::SubclassTests::test_overridden_method_guarding, test/dynamo/test_subclasses.py::SubclassTests::test_parameter_subclass_custom_torch_func_and_dynamic_attr, test/dynamo/test_subclasses.py::SubclassTests::test_parameter_subclass_with_old_torch_function, test/dynamo/test_subclasses.py::SubclassTests::test_recompile_with_symbool_inputs, test/dynamo/test_subclasses.py::SubclassTests::test_recompiles_with_optional_inner_tensor, test/dynamo/test_subclasses.py::SubclassTests::test_return_as_subclass, test/dynamo/test_subclasses.py::SubclassTests::test_return_local_subclass, test/dynamo/test_subclasses.py::SubclassTests::test_return_subclass, test/dynamo/test_subclasses.py::SubclassTests::test_subclass_TwoTensor_TwoTensor_TwoTensor, test/dynamo/test_subclasses.py::SubclassTests::test_subclass_TwoTensor_nested_diff_sizes, test/dynamo/test_subclasses.py::SubclassTests::test_subclass_constructor_proxying, test/dynamo/test_subclasses.py::SubclassTests::test_subclass_dont_invoke_torch_function_on_overridden_attr, test/dynamo/test_subclasses.py::SubclassTests::test_subclass_dont_invoke_torch_function_on_overridden_method, test/dynamo/test_subclasses.py::SubclassTests::test_subclass_override_shape_and_to, test/dynamo/test_subclasses.py::SubclassTests::test_subclass_views_dynamic_False, test/dynamo/test_subclasses.py::SubclassTests::test_subclass_views_dynamic_True, test/dynamo/test_subclasses.py::SubclassTests::test_subclass_with_disabled_torch_function, test/dynamo/test_subclasses.py::SubclassTests::test_support_bases, 
test/dynamo/test_subclasses.py::SubclassTests::test_tensor_subclass_TwoTensor_automatic_dynamic_shapes, test/dynamo/test_subclasses.py::SubclassTests::test_tensor_subclass_TwoTensor_clone_view, test/dynamo/test_subclasses.py::SubclassTests::test_tensor_subclass_TwoTensor_different_shape, test/dynamo/test_subclasses.py::SubclassTests::test_tensor_subclass_TwoTensor_mark_dynamic_shapes, test/dynamo/test_subclasses.py::SubclassTests::test_tensor_subclass_TwoTensor_mul, test/dynamo/test_subclasses.py::SubclassTests::test_tensor_subclass_TwoTensor_nested, test/dynamo/test_subclasses.py::SubclassTests::test_tensor_subclass_TwoTensor_return_multiple, test/dynamo/test_subclasses.py::SubclassTests::test_tensor_subclass_TwoTensor_return_shape, test/dynamo/test_subclasses.py::SubclassTests::test_tensor_subclass_TwoTensor_return_tensor_and_subclass, test/dynamo/test_subclasses.py::SubclassTests::test_tensor_subclass_TwoTensor_simple, test/dynamo/test_subclasses.py::SubclassTests::test_tensor_subclass_TwoTensor_view, test/dynamo/test_subclasses.py::SubclassTests::test_tensor_subclass_TwoTensor_view_mul, test/dynamo/test_subclasses.py::SubclassTests::test_tensor_subclass_attr_codegen_tos, test/dynamo/test_subclasses.py::SubclassTests::test_tensor_subclass_ctx_custom_guards_error_arg_num, test/dynamo/test_subclasses.py::SubclassTests::test_tensor_subclass_ctx_custom_guards_error_not_classmethod, test/dynamo/test_subclasses.py::SubclassTests::test_tensor_subclass_ctx_custom_guards_override, test/dynamo/test_subclasses.py::SubclassTests::test_tensor_subclass_ctx_guards, test/dynamo/test_subclasses.py::SubclassTests::test_tensor_subclass_ctx_recursive_guards, test/dynamo/test_subclasses.py::SubclassTests::test_tensor_subclass_custom_attr, test/dynamo/test_subclasses.py::SubclassTests::test_tensor_subclass_with_non_classmethod_torch_function, test/dynamo/test_subclasses.py::SubclassTests::test_torch_dispatch_subclass_guard_recompile, 
test/dynamo/test_subclasses.py::SubclassTests::test_torch_function_call_on_attr, test/dynamo/test_subclasses.py::SubclassTests::test_torch_function_call_on_method, test/dynamo/test_subclasses.py::SubclassTests::test_torch_function_call_on_method_arg, test/dynamo/test_subclasses.py::SubclassTests::test_torch_function_list_args, test/dynamo/test_subclasses.py::SubclassTests::test_torch_function_state_graph_break, test/dynamo/test_subclasses.py::SubclassTests::test_torch_function_state_guards, test/dynamo/test_subclasses.py::SubclassTests::test_torch_function_state_nested, test/dynamo/test_subclasses.py::SubclassTests::test_torch_function_state_tracing, test/dynamo/test_subclasses.py::SubclassTests::test_torch_function_subclass_survives_into_aot_autograd, test/dynamo/test_subclasses.py::SubclassTests::test_torch_function_wrapper_class, test/dynamo/test_subclasses.py::SubclassTests::test_torch_function_wrapper_class_with_kwargs, test/dynamo/test_subclasses.py::SubclassTests::test_type_check_equality_subclass, test/dynamo/test_subclasses.py::SubclassTests::test_type_check_equality_tensor, test/dynamo/test_subclasses.py::SubclassTests::test_type_check_identity_subclass, test/dynamo/test_subclasses.py::SubclassTests::test_type_check_identity_tensor, test/dynamo/test_subclasses.py::SubclassTests::test_type_check_isinstance_subclass, test/dynamo/test_subclasses.py::SubclassTests::test_type_check_isinstance_tensor, test/dynamo/test_subclasses.py::SubclassTests::test_user_overridden_attr_unsupported, test/dynamo/test_subclasses.py::SubclassTests::test_user_overridden_method_unsupported, test/dynamo/test_subclasses.py::SubclassTests::test_user_overridden_property_unsupported, test/dynamo/test_subclasses.py::SubclassTests::test_wrapper_subclass_dynamo_attribute_access_on_intermediate, test/dynamo/test_subclasses.py::SubclassTests::test_wrapper_subclass_guards_on_inner_tensor, 
test/dynamo/test_subclasses.py::SubclassTests::test_wrapper_subclass_with_differently_sized_inner_tensor, test/dynamo/test_subclasses.py::SubclassTests::test_wrapper_subclass_with_same_sized_inner_tensor, test/dynamo/test_subclasses.py::TestNestedTensor::test_basic_autograd, test/dynamo/test_subclasses.py::TestNestedTensor::test_basic_autograd_inductor, test/dynamo/test_subclasses.py::TestNestedTensor::test_binary_does_not_recompile, test/dynamo/test_subclasses.py::TestNestedTensor::test_binary_recompiles, test/dynamo/test_subclasses.py::TestNestedTensor::test_in_graph_construction_from_input, test/dynamo/test_subclasses.py::TestNestedTensor::test_in_graph_construction_from_input_2, test/dynamo/test_subclasses.py::TestNestedTensor::test_in_graph_construction_from_input_4, test/dynamo/test_subclasses.py::TestNestedTensor::test_in_graph_construction_from_input_5, test/dynamo/test_subclasses.py::TestNestedTensor::test_in_graph_construction_from_input_6, test/dynamo/test_subclasses.py::TestNestedTensor::test_in_graph_construction_from_intermediate, test/dynamo/test_subclasses.py::TestNestedTensor::test_in_graph_construction_from_intermediate_2, test/dynamo/test_subclasses.py::TestNestedTensor::test_in_graph_construction_from_intermediate_3, test/dynamo/test_subclasses.py::TestNestedTensor::test_in_graph_construction_from_intermediate_4, test/dynamo/test_subclasses.py::TestNestedTensor::test_in_graph_construction_from_intermediate_5, test/dynamo/test_subclasses.py::TestNestedTensor::test_in_graph_construction_mixed, test/dynamo/test_subclasses.py::TestNestedTensor::test_in_graph_construction_mixed_2, test/dynamo/test_subclasses.py::TestNestedTensor::test_in_graph_construction_mixed_3, test/dynamo/test_subclasses.py::TestNestedTensor::test_in_graph_is_nested_call, test/dynamo/test_subclasses.py::TestNestedTensor::test_inference_tensor, test/dynamo/test_subclasses.py::TestNestedTensor::test_inline_nested_tensor_from_jagged, 
test/dynamo/test_subclasses.py::TestNestedTensor::test_inputs_to_compiled_fn_are_views_nt_view_name_base_is_nt_False_basic, test/dynamo/test_subclasses.py::TestNestedTensor::test_inputs_to_compiled_fn_are_views_nt_view_name_base_is_nt_False_leaf_False_False, test/dynamo/test_subclasses.py::TestNestedTensor::test_inputs_to_compiled_fn_are_views_nt_view_name_base_is_nt_False_leaf_False_True, test/dynamo/test_subclasses.py::TestNestedTensor::test_inputs_to_compiled_fn_are_views_nt_view_name_base_is_nt_False_leaf_True_False, test/dynamo/test_subclasses.py::TestNestedTensor::test_inputs_to_compiled_fn_are_views_nt_view_name_base_is_nt_False_leaf_True_True, test/dynamo/test_subclasses.py::TestNestedTensor::test_inputs_to_compiled_fn_are_views_nt_view_name_base_is_nt_False_obscure, test/dynamo/test_subclasses.py::TestNestedTensor::test_inputs_to_compiled_fn_are_views_nt_view_name_base_is_nt_True_basic, test/dynamo/test_subclasses.py::TestNestedTensor::test_inputs_to_compiled_fn_are_views_nt_view_name_base_is_nt_True_leaf_False_False, test/dynamo/test_subclasses.py::TestNestedTensor::test_inputs_to_compiled_fn_are_views_nt_view_name_base_is_nt_True_leaf_False_True, test/dynamo/test_subclasses.py::TestNestedTensor::test_inputs_to_compiled_fn_are_views_nt_view_name_base_is_nt_True_leaf_True_False, test/dynamo/test_subclasses.py::TestNestedTensor::test_inputs_to_compiled_fn_are_views_nt_view_name_base_is_nt_True_leaf_True_True, test/dynamo/test_subclasses.py::TestNestedTensor::test_inputs_to_compiled_fn_are_views_nt_view_name_base_is_nt_True_obscure, test/dynamo/test_subclasses.py::TestNestedTensor::test_inputs_to_compiled_fn_are_views_nt_view_name_dense_subclass_dense_subclass, test/dynamo/test_subclasses.py::TestNestedTensor::test_inputs_to_compiled_fn_are_views_nt_view_name_subclass_dense, test/dynamo/test_subclasses.py::TestNestedTensor::test_param_subclass_isinstance_input, test/dynamo/test_subclasses.py::TestNestedTensor::test_return_shape, 
test/dynamo/test_subclasses.py::TestNestedTensor::test_subclass_dense_subclass_dense_view, test/dynamo/test_subclasses.py::TestNestedTensor::test_subclass_gives_static_shapes_when_dynamic_false, test/dynamo/test_subclasses.py::TestNestedTensor::test_subclass_with_mutation_in_graph, test/dynamo/test_subclasses.py::TestNestedTensor::test_unary_does_not_recompile, test/dynamo/test_subclasses.py::TestNestedTensor::test_unbind 2025-10-10T02:05:09.5641030Z 2025-10-10T02:05:11.4690521Z Running dynamo/test_repros 1/1 ... [2025-10-10 02:05:11.468092] 2025-10-10T02:05:11.4690964Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:05:11.4691956Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_repros.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:05:11.468507] 2025-10-10T02:05:11.5163477Z 2025-10-10T02:05:11.5164739Z inductor/test_cudagraph_trees_expandable_segments 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_cudagraph_trees_expandable_segments_1.1_bdc9ffac270e9589_.log 2025-10-10T02:05:11.5230306Z Running 147 items in this shard: test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_accumulate_grad, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_accumulate_multiple_recordings, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_alias_of_parameter, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_aliased_output_checkpoint, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_aliased_static_parameter, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_aliased_storage_single_weakref, 
test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_aliasing_static_ref, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_amp_cache_disabled, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_backward_gets_cached_cudagraphs, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_cache_hit_forward_miss_backward, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_cached_boxed_forward_device_index, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_cached_forward_backward, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_checkpoint_shared_output_storage_deallocation, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_checkpointing_resets_persistent_refs, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_cleanup, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_compiled_autograd_static_input_params, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_constant_output, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_conv_benchmark, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_cpp_wrapper, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_cudagraph_capture_sizes, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_cudagraph_capture_sizes1, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_cudagraph_capture_sizes2, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_cudagraph_or_error, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_dynamic_backward, 
test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_dynamic_warmup, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_empty_cpu_tensor, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_empty_storage, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_end_recording_early, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_error_on_dealloc_use, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_error_on_dealloc_use2, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_execution_into_recording, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_expanded_inputs, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_fallback_to_eager_if_recompiling_too_many_times, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_fallback_to_eager_if_recompiling_too_many_times_due_to_cudagraph_managed_tensor, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_fallback_to_eager_if_recompiling_too_many_times_warn_only_once, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_forward_backward, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_forward_backward_not_called_backend_cudagraphs, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_forward_backward_not_called_backend_inductor, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_forward_generation, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_forward_with_skipped_cudagraphed_backward, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_frozen_fn, 
test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_function_compiled_multiple_times, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_buffer_reuse, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_condition_op, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_cpu_only, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_cpu_op_and_dynamic_shapes, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_cpu_scalar1, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_cpu_scalar2, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_cpu_scalar3, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_cpu_scalar4, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_cpu_scalar_device_put, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_cpu_scalar_multiple, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_cpu_scalar_mutation, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_cpu_tensor_symints, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_custom_op, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_custom_op_dynamoc_shapes, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_custom_op_mutation, 
test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_custom_op_mutation_late_free, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_custom_op_no_split, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_custom_rule, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_dynamic_scalar_inputs, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_dynamic_shapes, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_foreach_op, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_forward_backward, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_forward_backward_not_called, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_forward_with_skipped_cudagraphed_backward, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_fused_scheduler_node, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_gc, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_item, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_log_message, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_multiple_devices_msg, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_reduce_overhead_mode_effectiveness, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_reorder_cpu_and_gpu, 
test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_reorder_cpu_and_gpu_interleave, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_reorder_custom_op_with_no_dependency, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_reorder_custom_op_with_no_dependency1, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_simple, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_symint, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_symint_cat_backward, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_symint_from_mutation_index, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_symint_from_nested_indirect_indexing, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_unbacked_symint, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_unbacked_symint_multi_output_layout, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_incompatible_cudagraph_ops_item, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_incompatible_cudagraph_ops_nonzero, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_incompatible_cudagraph_ops_nonzero_backend, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_incompatible_cudagraph_ops_nonzero_graph_breaks, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_index_put, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_live_outputs_multiple_graphs, 
test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_manager_per_device, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_mark_step, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_meta_tensor, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_multi_dispatch_child_node, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_multi_dispatch_custom_module, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_multi_dispatch_custom_module_buffer, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_multi_dispatch_parent_node, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_multi_dispatch_single_compile_builtin_module, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_multi_dispatch_single_compile_builtin_module_buffers, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_multi_dispatch_single_compile_param_inputs, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_multinomial, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_multiple_devices_msg_backend_cudagraphs, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_multiple_devices_msg_backend_inductor, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_multiple_insert_removal_caching, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_mutation_cudagraph_managed_tensor_warn_backend_cudagraphs, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_mutation_cudagraph_managed_tensor_warn_backend_inductor, 
test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_mutation_cudagraph_managed_tensor_warn_only_once_backend_cudagraphs, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_mutation_cudagraph_managed_tensor_warn_only_once_backend_inductor, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_mutation_cudagraph_managed_tensors_backend_cudagraphs, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_mutation_cudagraph_managed_tensors_backend_inductor, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_mutation_cudagraph_managed_tensors_config_backend_cudagraphs, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_mutation_cudagraph_managed_tensors_config_backend_inductor, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_mutation_on_inp_backend_cudagraphs, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_mutation_on_inp_backend_inductor, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_mutation_reinplaced, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_no_rerecord_with_mark_static_address, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_not_fallback_to_eager_if_have_not_recompiling_too_many_times, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_output_alias, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_peristed_output_livenes, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_remove_hooks_on_cached_tensors, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_rerecord_if_static_input_address_changed, 
test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_rng_non_trees, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_rng_trees, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_run_simple, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_separate_recordings, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_side_stream_memory_allocation, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_single_stream_use, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_skip_cpp_wrapper, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_skip_cudagraph_unsafe_ops, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_skip_if_dynamic_shape_limit_reached1, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_skip_if_dynamic_shape_limit_reached2, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_skip_symbolic, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_sparsity, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_static_inputs_address_mutation_log, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_storage_access_error, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_tensor_constant_mutation, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_tensor_dies_between_checkpoint, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_tensor_no_longer_in_pool, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_unaligned_static_input_no_cudagraphs, 
test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_unaligned_static_input_non_trees, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_unaligned_static_input_trees, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_unaligned_static_parameter, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_unstable_ptr, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_warmup_stream_sync, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_warn_on_pending_backward, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_warn_once_if_dynamic_shape_limit_reached, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_workspace_allocation_error 2025-10-10T02:05:11.5292267Z 2025-10-10T02:05:13.4875137Z Running dynamo/test_reorder_logs 1/1 ... [2025-10-10 02:05:13.486947] 2025-10-10T02:05:13.4875581Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:05:13.4878599Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_reorder_logs.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:05:13.487342] 2025-10-10T02:05:15.3829205Z Running dynamo/test_generator 1/1 ... [2025-10-10 02:05:15.382406] 2025-10-10T02:05:15.3829656Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:05:15.3831179Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_generator.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:05:15.382781] 2025-10-10T02:05:16.3936887Z 2025-10-10T02:05:16.3938174Z dynamo/test_repros 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_repros_1.1_a2f8d394a273405c_.log 2025-10-10T02:05:16.4021641Z Running 337 items in this shard: test/dynamo/test_repros.py::LRUCacheWarningTests::test_lru_cache_warning_issued_during_tracing, test/dynamo/test_repros.py::ReproTests::test_311_resume_block_keyerror, test/dynamo/test_repros.py::ReproTests::test_312_local_cell_overlap, test/dynamo/test_repros.py::ReproTests::test_Size, test/dynamo/test_repros.py::ReproTests::test_abc_setattr, test/dynamo/test_repros.py::ReproTests::test_add_complex_conj, test/dynamo/test_repros.py::ReproTests::test_add_sub_alpha_out, test/dynamo/test_repros.py::ReproTests::test_addr_alpha_beta_out, test/dynamo/test_repros.py::ReproTests::test_amp_foreach_fake_impl, test/dynamo/test_repros.py::ReproTests::test_aot_autograd_runtime_wrapper_prologue_profiled, test/dynamo/test_repros.py::ReproTests::test_as_strided_on_base_with_mutation_works, test/dynamo/test_repros.py::ReproTests::test_as_strided_on_existing_view_banned, test/dynamo/test_repros.py::ReproTests::test_attached_attribute_in_dir, test/dynamo/test_repros.py::ReproTests::test_autograd_function_graph_break, test/dynamo/test_repros.py::ReproTests::test_avoid_dupe_specialization, test/dynamo/test_repros.py::ReproTests::test_batch_encoding_clone_inputs, test/dynamo/test_repros.py::ReproTests::test_batch_norm_act, test/dynamo/test_repros.py::ReproTests::test_batchnorm_e2e, test/dynamo/test_repros.py::ReproTests::test_bigbird_unsqueeze_inplace, test/dynamo/test_repros.py::ReproTests::test_bitwise_op_guard, test/dynamo/test_repros.py::ReproTests::test_bitwise_print_precedence, test/dynamo/test_repros.py::ReproTests::test_boxes_len, test/dynamo/test_repros.py::ReproTests::test_build_map_unpack_with_call, test/dynamo/test_repros.py::ReproTests::test_c_defined_metaclass, 
test/dynamo/test_repros.py::ReproTests::test_cells_unsupported_step_exception, test/dynamo/test_repros.py::ReproTests::test_changing_stride, test/dynamo/test_repros.py::ReproTests::test_chunk_reformer_ff, test/dynamo/test_repros.py::ReproTests::test_class_member, test/dynamo/test_repros.py::ReproTests::test_classmethod_with_slots, test/dynamo/test_repros.py::ReproTests::test_clone_not_memory_dense, test/dynamo/test_repros.py::ReproTests::test_compilation_metrics_on_error, test/dynamo/test_repros.py::ReproTests::test_compile_complex_conj, test/dynamo/test_repros.py::ReproTests::test_compile_copy__int_overload, test/dynamo/test_repros.py::ReproTests::test_const_dict_keyerror, test/dynamo/test_repros.py::ReproTests::test_contains_range_constprop, test/dynamo/test_repros.py::ReproTests::test_convert_boxes_to_pooler_format, test/dynamo/test_repros.py::ReproTests::test_copy_weird_strides, test/dynamo/test_repros.py::ReproTests::test_create_rand_mask_from_inputs, test/dynamo/test_repros.py::ReproTests::test_dalle2_maybe, test/dynamo/test_repros.py::ReproTests::test_data_attr_mutation_after_saved_for_bw, test/dynamo/test_repros.py::ReproTests::test_dataclass_in_module, test/dynamo/test_repros.py::ReproTests::test_dataclass_init_with_default_factory_with_inputs, test/dynamo/test_repros.py::ReproTests::test_ddp_checkpoint, test/dynamo/test_repros.py::ReproTests::test_dedup_global, test/dynamo/test_repros.py::ReproTests::test_deferred_runtime_asserts, test/dynamo/test_repros.py::ReproTests::test_delattr, test/dynamo/test_repros.py::ReproTests::test_delattr_raises, test/dynamo/test_repros.py::ReproTests::test_delattr_return, test/dynamo/test_repros.py::ReproTests::test_delete_local_error, test/dynamo/test_repros.py::ReproTests::test_deleted_compile_wrapper_segfault, test/dynamo/test_repros.py::ReproTests::test_delsubscr, test/dynamo/test_repros.py::ReproTests::test_delsubscr_raises, test/dynamo/test_repros.py::ReproTests::test_detectron2_instances_cat, 
test/dynamo/test_repros.py::ReproTests::test_disabling_unpack_hooks_within_compiled_region, test/dynamo/test_repros.py::ReproTests::test_distributions_subclass, test/dynamo/test_repros.py::ReproTests::test_do_paste_mask, test/dynamo/test_repros.py::ReproTests::test_dont_aggressively_write_assert, test/dynamo/test_repros.py::ReproTests::test_dropout_inline, test/dynamo/test_repros.py::ReproTests::test_dynamic_shape_disable_duck_size, test/dynamo/test_repros.py::ReproTests::test_dynamic_shapes_double_not_equal, test/dynamo/test_repros.py::ReproTests::test_dynamic_shapes_float_guard, test/dynamo/test_repros.py::ReproTests::test_dynamic_shapes_implicit_guard, test/dynamo/test_repros.py::ReproTests::test_dynamic_shapes_right_side, test/dynamo/test_repros.py::ReproTests::test_ellipsis, test/dynamo/test_repros.py::ReproTests::test_embedding_backward_broadcasting_decomp, test/dynamo/test_repros.py::ReproTests::test_empty_graph_nested_calls_fullgraph_False, test/dynamo/test_repros.py::ReproTests::test_empty_graph_nested_calls_fullgraph_True, test/dynamo/test_repros.py::ReproTests::test_empty_list_contains_with_jump, test/dynamo/test_repros.py::ReproTests::test_empty_out_dynamic, test/dynamo/test_repros.py::ReproTests::test_enum, test/dynamo/test_repros.py::ReproTests::test_ephemeral_module, test/dynamo/test_repros.py::ReproTests::test_error_return_without_exception_set, test/dynamo/test_repros.py::ReproTests::test_exception_in_dynamo_handling, test/dynamo/test_repros.py::ReproTests::test_exec_import, test/dynamo/test_repros.py::ReproTests::test_exec_wildcard_import, test/dynamo/test_repros.py::ReproTests::test_export_vs_dynamo_for_multiheadattention, test/dynamo/test_repros.py::ReproTests::test_flip_bad_accuracy, test/dynamo/test_repros.py::ReproTests::test_for_loop_graph_break, test/dynamo/test_repros.py::ReproTests::test_for_loop_graph_break_before, test/dynamo/test_repros.py::ReproTests::test_foreach_decomp_arg_names, 
test/dynamo/test_repros.py::ReproTests::test_fsdp_set_input_mutation_applied_when_input_gets_no_gradients, test/dynamo/test_repros.py::ReproTests::test_function_in_skipfiles, test/dynamo/test_repros.py::ReproTests::test_functools_wraps, test/dynamo/test_repros.py::ReproTests::test_gan_repro_trying_to_backward_through_the_graph_a_second_time, test/dynamo/test_repros.py::ReproTests::test_generator_dealloc, test/dynamo/test_repros.py::ReproTests::test_get_parameter_dtype, test/dynamo/test_repros.py::ReproTests::test_get_type_hints, test/dynamo/test_repros.py::ReproTests::test_global_fn_mutation, test/dynamo/test_repros.py::ReproTests::test_grad, test/dynamo/test_repros.py::ReproTests::test_grad_mode_carrying_correct_state_after_graph_break, test/dynamo/test_repros.py::ReproTests::test_grad_references_cleared, test/dynamo/test_repros.py::ReproTests::test_graph_break_on_jit_isinstance, test/dynamo/test_repros.py::ReproTests::test_graph_break_on_jit_isinstance_pep585, test/dynamo/test_repros.py::ReproTests::test_graph_break_unsupported_fake, test/dynamo/test_repros.py::ReproTests::test_guard_default_device, test/dynamo/test_repros.py::ReproTests::test_guard_fail_nested_tuple, test/dynamo/test_repros.py::ReproTests::test_guard_fail_tensor_bool, test/dynamo/test_repros.py::ReproTests::test_guard_ordering_shape_fail, test/dynamo/test_repros.py::ReproTests::test_guard_with_tuple_mutation, test/dynamo/test_repros.py::ReproTests::test_hasattr_builtin, test/dynamo/test_repros.py::ReproTests::test_hf_bigbird_unsqueeze, test/dynamo/test_repros.py::ReproTests::test_hf_classinstantier, test/dynamo/test_repros.py::ReproTests::test_hf_gelu_inline, test/dynamo/test_repros.py::ReproTests::test_hf_model_output, test/dynamo/test_repros.py::ReproTests::test_hf_t5_forward, test/dynamo/test_repros.py::ReproTests::test_hf_xsoftmax_inference, test/dynamo/test_repros.py::ReproTests::test_hf_xsoftmax_training, test/dynamo/test_repros.py::ReproTests::test_iadd_graph_break, 
test/dynamo/test_repros.py::ReproTests::test_incompatible_configs, test/dynamo/test_repros.py::ReproTests::test_indexing_with_list, test/dynamo/test_repros.py::ReproTests::test_inductor_dynamic_shapes_broadcasting, test/dynamo/test_repros.py::ReproTests::test_inductor_no_recursionerror_on_for_loops, test/dynamo/test_repros.py::ReproTests::test_inductor_rng_default_dtype, test/dynamo/test_repros.py::ReproTests::test_inference_mode_dynamic_shapes, test/dynamo/test_repros.py::ReproTests::test_inlining_cornercase, test/dynamo/test_repros.py::ReproTests::test_inplace_unsqueeze_input, test/dynamo/test_repros.py::ReproTests::test_int_format, test/dynamo/test_repros.py::ReproTests::test_intermediate_leaf_requires_grad, test/dynamo/test_repros.py::ReproTests::test_invalid_seq_unpack, test/dynamo/test_repros.py::ReproTests::test_is_make_fx_tracing, test/dynamo/test_repros.py::ReproTests::test_is_symbolic_tracing, test/dynamo/test_repros.py::ReproTests::test_isinstance_dtype, test/dynamo/test_repros.py::ReproTests::test_isinstance_storage, test/dynamo/test_repros.py::ReproTests::test_issue111522, test/dynamo/test_repros.py::ReproTests::test_issue111918, test/dynamo/test_repros.py::ReproTests::test_issue114171, test/dynamo/test_repros.py::ReproTests::test_issue126128, test/dynamo/test_repros.py::ReproTests::test_issue134451, test/dynamo/test_repros.py::ReproTests::test_issue1466_size_aot_autograd, test/dynamo/test_repros.py::ReproTests::test_issue175, test/dynamo/test_repros.py::ReproTests::test_jit_script_defaults, test/dynamo/test_repros.py::ReproTests::test_jit_trace_errors, test/dynamo/test_repros.py::ReproTests::test_kwargs_out_list_variable, test/dynamo/test_repros.py::ReproTests::test_list_aliasing, test/dynamo/test_repros.py::ReproTests::test_list_index, test/dynamo/test_repros.py::ReproTests::test_list_index_not_found, test/dynamo/test_repros.py::ReproTests::test_list_index_tensor_unsupported, test/dynamo/test_repros.py::ReproTests::test_list_reverse, 
test/dynamo/test_repros.py::ReproTests::test_list_self_reference, test/dynamo/test_repros.py::ReproTests::test_listcomp, test/dynamo/test_repros.py::ReproTests::test_longformer_chunk, test/dynamo/test_repros.py::ReproTests::test_longtensor_list, test/dynamo/test_repros.py::ReproTests::test_lru_cache_tracing, test/dynamo/test_repros.py::ReproTests::test_maml_item_capture, test/dynamo/test_repros.py::ReproTests::test_maml_no_item_capture, test/dynamo/test_repros.py::ReproTests::test_many_overlapping_inputs_does_not_explode_guards, test/dynamo/test_repros.py::ReproTests::test_many_views_with_mutation, test/dynamo/test_repros.py::ReproTests::test_map_with_multiple_args, test/dynamo/test_repros.py::ReproTests::test_maybe_multiply_symint, test/dynamo/test_repros.py::ReproTests::test_merge_criteria_processor_list1, test/dynamo/test_repros.py::ReproTests::test_merge_criteria_processor_list2, test/dynamo/test_repros.py::ReproTests::test_method_overriding, test/dynamo/test_repros.py::ReproTests::test_module_in_skipfiles, test/dynamo/test_repros.py::ReproTests::test_modules, test/dynamo/test_repros.py::ReproTests::test_multi_dot_import, test/dynamo/test_repros.py::ReproTests::test_multi_import, test/dynamo/test_repros.py::ReproTests::test_named_buffers, test/dynamo/test_repros.py::ReproTests::test_nanmean_out, test/dynamo/test_repros.py::ReproTests::test_negative_floor_div_solve, test/dynamo/test_repros.py::ReproTests::test_negative_shape_guard, test/dynamo/test_repros.py::ReproTests::test_nested_while_loop_graph_break, test/dynamo/test_repros.py::ReproTests::test_nn_module_callable, test/dynamo/test_repros.py::ReproTests::test_nn_module_property_closure, test/dynamo/test_repros.py::ReproTests::test_nn_module_stack_bc, test/dynamo/test_repros.py::ReproTests::test_nn_param_freevar_codegen, test/dynamo/test_repros.py::ReproTests::test_nn_parameter, test/dynamo/test_repros.py::ReproTests::test_nn_parameter_ctor_graph_breaks, 
test/dynamo/test_repros.py::ReproTests::test_nn_parametrize, test/dynamo/test_repros.py::ReproTests::test_no_grad_inline, test/dynamo/test_repros.py::ReproTests::test_no_tracing_into_eval_frame, test/dynamo/test_repros.py::ReproTests::test_no_tracing_into_eval_frame_ctx_manager, test/dynamo/test_repros.py::ReproTests::test_nonconst_issubclass, test/dynamo/test_repros.py::ReproTests::test_not_rewrite_assert_for_other_errors, test/dynamo/test_repros.py::ReproTests::test_nullcontext1, test/dynamo/test_repros.py::ReproTests::test_nullcontext2, test/dynamo/test_repros.py::ReproTests::test_numpy_not_ndarray_recompiles, test/dynamo/test_repros.py::ReproTests::test_numpy_tobytes_no_error, test/dynamo/test_repros.py::ReproTests::test_odict_get_item_index_name, test/dynamo/test_repros.py::ReproTests::test_omegaconf_dictconfig, test/dynamo/test_repros.py::ReproTests::test_omegaconf_listconfig_contains, test/dynamo/test_repros.py::ReproTests::test_omegaconf_listconfig_iter, test/dynamo/test_repros.py::ReproTests::test_ones_out_dynamic, test/dynamo/test_repros.py::ReproTests::test_optim_state_references_cleared, test/dynamo/test_repros.py::ReproTests::test_optimized_deepcopy, test/dynamo/test_repros.py::ReproTests::test_optimized_module_patched_init, test/dynamo/test_repros.py::ReproTests::test_optimized_module_training, test/dynamo/test_repros.py::ReproTests::test_os_fspath, test/dynamo/test_repros.py::ReproTests::test_out_nested_cell_shape_change, test/dynamo/test_repros.py::ReproTests::test_out_nested_cell_tuple_shape_change, test/dynamo/test_repros.py::ReproTests::test_out_none, test/dynamo/test_repros.py::ReproTests::test_out_overload_non_contiguous, test/dynamo/test_repros.py::ReproTests::test_out_root_cell_shape_change, test/dynamo/test_repros.py::ReproTests::test_out_root_cell_tuple_shape_change, test/dynamo/test_repros.py::ReproTests::test_output_aliases_intermediate, test/dynamo/test_repros.py::ReproTests::test_overlapping_inputs_with_dynamic_shapes_error, 
test/dynamo/test_repros.py::ReproTests::test_overwriting_params, test/dynamo/test_repros.py::ReproTests::test_partially_initialized_module_property, test/dynamo/test_repros.py::ReproTests::test_partitioner_activation_memory_budget_with_unbacked_symints, test/dynamo/test_repros.py::ReproTests::test_partitioner_cse_respects_mutation_boundaries, test/dynamo/test_repros.py::ReproTests::test_pointless_graph_removal, test/dynamo/test_repros.py::ReproTests::test_preserve_stride_with_clone, test/dynamo/test_repros.py::ReproTests::test_primtorch, test/dynamo/test_repros.py::ReproTests::test_primtorch_no_graph_break, test/dynamo/test_repros.py::ReproTests::test_randint_out_dynamic, test/dynamo/test_repros.py::ReproTests::test_recursive_map, test/dynamo/test_repros.py::ReproTests::test_reformer_eval, test/dynamo/test_repros.py::ReproTests::test_reformer_min_chunk_len, test/dynamo/test_repros.py::ReproTests::test_reformer_sorting, test/dynamo/test_repros.py::ReproTests::test_reformer_train, test/dynamo/test_repros.py::ReproTests::test_reinplacing, test/dynamo/test_repros.py::ReproTests::test_relative_import, test/dynamo/test_repros.py::ReproTests::test_relative_import_no_modulename, test/dynamo/test_repros.py::ReproTests::test_requires_grad_guards_with_grad_mode1, test/dynamo/test_repros.py::ReproTests::test_requires_grad_guards_with_grad_mode2, test/dynamo/test_repros.py::ReproTests::test_restricted_list_subclass1, test/dynamo/test_repros.py::ReproTests::test_restricted_list_subclass2, test/dynamo/test_repros.py::ReproTests::test_restricted_list_subclass3, test/dynamo/test_repros.py::ReproTests::test_return_value_duplication_mixed_grad, test/dynamo/test_repros.py::ReproTests::test_return_value_duplication_scalar, test/dynamo/test_repros.py::ReproTests::test_return_value_duplication_tensor, test/dynamo/test_repros.py::ReproTests::test_return_weakref, test/dynamo/test_repros.py::ReproTests::test_rewrite_assert_dont_change_bytecode, 
test/dynamo/test_repros.py::ReproTests::test_rewrite_assert_noop, test/dynamo/test_repros.py::ReproTests::test_rewrite_assert_with_msg, test/dynamo/test_repros.py::ReproTests::test_rewrite_assert_with_non_string_msg, test/dynamo/test_repros.py::ReproTests::test_rewrite_assert_without_msg, test/dynamo/test_repros.py::ReproTests::test_rng_state, test/dynamo/test_repros.py::ReproTests::test_seq_append_list, test/dynamo/test_repros.py::ReproTests::test_setattr_requires_grad_graph_breaks, test/dynamo/test_repros.py::ReproTests::test_setitem_boolean_mask_diff, test/dynamo/test_repros.py::ReproTests::test_setitem_tensor_prop, test/dynamo/test_repros.py::ReproTests::test_setitem_tuple_boolean_mask_diff, test/dynamo/test_repros.py::ReproTests::test_sigmoid_out, test/dynamo/test_repros.py::ReproTests::test_sigmoid_out2, test/dynamo/test_repros.py::ReproTests::test_size_typematch, test/dynamo/test_repros.py::ReproTests::test_slice_into_list_mutable, test/dynamo/test_repros.py::ReproTests::test_slicing_dynamic_shape, test/dynamo/test_repros.py::ReproTests::test_slicing_dynamic_shape_setitem, test/dynamo/test_repros.py::ReproTests::test_sort_out, test/dynamo/test_repros.py::ReproTests::test_sort_out2, test/dynamo/test_repros.py::ReproTests::test_specialized_stride, test/dynamo/test_repros.py::ReproTests::test_split_with_sizes_aot_autograd, test/dynamo/test_repros.py::ReproTests::test_staticmethod_allow_in_graph, test/dynamo/test_repros.py::ReproTests::test_stk_sdd_is_transposed, test/dynamo/test_repros.py::ReproTests::test_stop_iteration_reconstruct, test/dynamo/test_repros.py::ReproTests::test_str_isalnum, test/dynamo/test_repros.py::ReproTests::test_string_format, test/dynamo/test_repros.py::ReproTests::test_subclass_graph_output_repro, test/dynamo/test_repros.py::ReproTests::test_super_classmethod, test/dynamo/test_repros.py::ReproTests::test_super_classmethod_inheritance, test/dynamo/test_repros.py::ReproTests::test_super_diamond, 
test/dynamo/test_repros.py::ReproTests::test_super_in_staticmethod, test/dynamo/test_repros.py::ReproTests::test_super_staticmethod, test/dynamo/test_repros.py::ReproTests::test_swin_base_tensor_attr, test/dynamo/test_repros.py::ReproTests::test_symint_bitwise, test/dynamo/test_repros.py::ReproTests::test_symnode_is_not_op, test/dynamo/test_repros.py::ReproTests::test_symnode_is_op, test/dynamo/test_repros.py::ReproTests::test_sys_monitoring, test/dynamo/test_repros.py::ReproTests::test_tensor_data_kwarg, test/dynamo/test_repros.py::ReproTests::test_tensor_isinstance_tuple, test/dynamo/test_repros.py::ReproTests::test_tensor_item, test/dynamo/test_repros.py::ReproTests::test_tensor_random, test/dynamo/test_repros.py::ReproTests::test_tensor_set_data_backend_aot_eager_func_name_func1, test/dynamo/test_repros.py::ReproTests::test_tensor_set_data_backend_aot_eager_func_name_func2, test/dynamo/test_repros.py::ReproTests::test_tensor_set_data_backend_aot_eager_func_name_func3, test/dynamo/test_repros.py::ReproTests::test_tensor_set_data_backend_eager_func_name_func1, test/dynamo/test_repros.py::ReproTests::test_tensor_set_data_backend_eager_func_name_func2, test/dynamo/test_repros.py::ReproTests::test_tensor_set_data_backend_eager_func_name_func3, test/dynamo/test_repros.py::ReproTests::test_tensor_set_data_backend_inductor_func_name_func1, test/dynamo/test_repros.py::ReproTests::test_tensor_set_data_backend_inductor_func_name_func2, test/dynamo/test_repros.py::ReproTests::test_tensor_set_data_backend_inductor_func_name_func3, test/dynamo/test_repros.py::ReproTests::test_tensor_set_data_mismatched_dtype, test/dynamo/test_repros.py::ReproTests::test_tensor_split, test/dynamo/test_repros.py::ReproTests::test_tensor_split_within_device_cm, test/dynamo/test_repros.py::ReproTests::test_tensor_uniform, test/dynamo/test_repros.py::ReproTests::test_threading_local, test/dynamo/test_repros.py::ReproTests::test_tokenization, 
test/dynamo/test_repros.py::ReproTests::test_torch_compile_in_compile_frame, test/dynamo/test_repros.py::ReproTests::test_torch_ops_aten, test/dynamo/test_repros.py::ReproTests::test_torch_tensor_ops, test/dynamo/test_repros.py::ReproTests::test_torch_tensor_ops_no_graph_break, test/dynamo/test_repros.py::ReproTests::test_torch_variable_type, test/dynamo/test_repros.py::ReproTests::test_torchname, test/dynamo/test_repros.py::ReproTests::test_trace_functional_tensor_with, test/dynamo/test_repros.py::ReproTests::test_tuple_enum_as_key_dict, test/dynamo/test_repros.py::ReproTests::test_typed_dict, test/dynamo/test_repros.py::ReproTests::test_typed_dict_total, test/dynamo/test_repros.py::ReproTests::test_udf_classes_reconstruction, test/dynamo/test_repros.py::ReproTests::test_unbacked_arange_in_bounds, test/dynamo/test_repros.py::ReproTests::test_unbind_copy_out, test/dynamo/test_repros.py::ReproTests::test_unpack_hooks_can_be_disabled, test/dynamo/test_repros.py::ReproTests::test_unpack_hooks_dont_run_during_tracing, test/dynamo/test_repros.py::ReproTests::test_unspecialized_nn_module_with_torch_variable_attribute, test/dynamo/test_repros.py::ReproTests::test_unsqueeze_mul_strides, test/dynamo/test_repros.py::ReproTests::test_user_ctor_ctx_manager, test/dynamo/test_repros.py::ReproTests::test_user_ctor_ctx_manager_custom_init, test/dynamo/test_repros.py::ReproTests::test_user_ctor_ctx_manager_custom_init_graph_break, test/dynamo/test_repros.py::ReproTests::test_user_defined_iter, test/dynamo/test_repros.py::ReproTests::test_user_defined_object_callable, test/dynamo/test_repros.py::ReproTests::test_validate_model_kwargs, test/dynamo/test_repros.py::ReproTests::test_vc_bumped_in_inference_graph, test/dynamo/test_repros.py::ReproTests::test_vdd_duplicate_error, test/dynamo/test_repros.py::ReproTests::test_view_dtype_overload, test/dynamo/test_repros.py::ReproTests::test_weakref, test/dynamo/test_repros.py::ReproTests::test_weakref_callback, 
test/dynamo/test_repros.py::ReproTests::test_weakref_construction, test/dynamo/test_repros.py::ReproTests::test_weakref_del, test/dynamo/test_repros.py::ReproTests::test_weakref_proxy, test/dynamo/test_repros.py::ReproTests::test_weakref_reconstruct, test/dynamo/test_repros.py::ReproTests::test_while_loop_graph_break, test/dynamo/test_repros.py::ReproTests::test_while_loop_graph_break_inside_call_function, test/dynamo/test_repros.py::ReproTests::test_with_on_graph_break_inst, test/dynamo/test_repros.py::ReproTests::test_with_on_graph_break_nested, test/dynamo/test_repros.py::ReproTests::test_zeros_out_dynamic, test/dynamo/test_repros.py::ReproTestsDeviceCUDA::test_cuda_sync_cuda, test/dynamo/test_repros.py::ReproTestsDeviceCUDA::test_data_dependent_error_log_no_print_cuda, test/dynamo/test_repros.py::ReproTestsDeviceCUDA::test_deepcopy_constant_tensor_in_aot_bwd_cuda, test/dynamo/test_repros.py::ReproTestsDeviceCUDA::test_filter_safe_grad_warning_cuda, test/dynamo/test_repros.py::ReproTestsDeviceCUDA::test_filter_user_warnings_cuda, test/dynamo/test_repros.py::ReproTestsDeviceCUDA::test_filter_warnings_cuda, test/dynamo/test_repros.py::ReproTestsDeviceCUDA::test_flash_attn_backward_mixed_strides_cuda, test/dynamo/test_repros.py::ReproTestsDeviceCUDA::test_getattr_return_cuda, test/dynamo/test_repros.py::ReproTestsDeviceCUDA::test_guard_default_device_cuda, test/dynamo/test_repros.py::ReproTestsDeviceCUDA::test_megablocks_moe_cuda, test/dynamo/test_repros.py::ReproTestsDeviceCUDA::test_memleak_when_graph_input_has_tensor_attr_cuda, test/dynamo/test_repros.py::ReproTestsDeviceCUDA::test_module_attribute_error_cuda, test/dynamo/test_repros.py::ReproTestsDeviceCUDA::test_named_tuple_vt_clone_cuda, test/dynamo/test_repros.py::ReproTestsDeviceCUDA::test_norm_dtype_cuda, test/dynamo/test_repros.py::ReproTestsDeviceCUDA::test_partial_export_cuda, test/dynamo/test_repros.py::ReproTestsDeviceCUDA::test_partitioner_saves_weights_for_bw_cuda, 
test/dynamo/test_repros.py::ReproTestsDeviceCUDA::test_sdpa_dynamic_shapes_cuda, test/dynamo/test_repros.py::ReproTestsDeviceCUDA::test_sub_alpha_scalar_repro_cuda, test/dynamo/test_repros.py::ReproTestsDeviceCUDA::test_tensor_size_hasattr_cuda, test/dynamo/test_repros.py::ReproTestsDeviceCUDA::test_torch_cuda_is_initialized_cuda, test/dynamo/test_repros.py::ReproTestsDeviceCUDA::test_truthiness_of_symints_no_recompiles_cuda, test/dynamo/test_repros.py::ReproTestsDeviceCUDA::test_udf_class_source_cuda, test/dynamo/test_repros.py::ReproTestsDeviceCUDA::test_zero_dim_param_mixed_device_grad_cuda 2025-10-10T02:05:16.4103141Z 2025-10-10T02:05:17.6606152Z 2025-10-10T02:05:17.6606963Z dynamo/test_reorder_logs 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_reorder_logs_1.1_16aea957efe8413a_.log 2025-10-10T02:05:17.6612937Z Running 14 items in this shard: test/dynamo/test_reorder_logs.py::IgnoreLogsTests::test_ignore_logger_ignore_method0_fn0_should_ignore_logger_False, test/dynamo/test_reorder_logs.py::IgnoreLogsTests::test_ignore_logger_ignore_method1_fn1_should_ignore_logger_False, test/dynamo/test_reorder_logs.py::IgnoreLogsTests::test_ignore_logger_ignore_method2_fn2_should_ignore_logger_False, test/dynamo/test_reorder_logs.py::IgnoreLogsTests::test_ignore_logger_ignore_method3_fn3_should_ignore_logger_False, test/dynamo/test_reorder_logs.py::IgnoreLogsTests::test_ignore_logger_ignore_method4_fn4_should_ignore_logger_True, test/dynamo/test_reorder_logs.py::IgnoreLogsTests::test_ignore_logger_ignore_method5_fn5_should_ignore_logger_True, test/dynamo/test_reorder_logs.py::IgnoreLogsTests::test_ignore_logger_ignore_method6_fn6_should_ignore_logger_True, test/dynamo/test_reorder_logs.py::IgnoreLogsTests::test_ignore_logger_ignore_method7_fn7_should_ignore_logger_True, test/dynamo/test_reorder_logs.py::ReorderLogsTests::test_constant_mutation, test/dynamo/test_reorder_logs.py::ReorderLogsTests::test_dont_reorder_print, 
test/dynamo/test_reorder_logs.py::ReorderLogsTests::test_reorder_custom_log_fn, test/dynamo/test_reorder_logs.py::ReorderLogsTests::test_reorder_print, test/dynamo/test_reorder_logs.py::ReorderLogsTests::test_reorder_print_graph_break, test/dynamo/test_reorder_logs.py::ReorderLogsTests::test_reorder_warnings 2025-10-10T02:05:17.6618217Z 2025-10-10T02:05:19.7062400Z 2025-10-10T02:05:19.7063548Z dynamo/test_generator 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_generator_1.1_3a0ec19c03d8910f_.log 2025-10-10T02:05:19.7087883Z Running 78 items in this shard: test/dynamo/test_generator.py::GeneratorTests::test_cleanup_throw, test/dynamo/test_generator.py::GeneratorTests::test_deque_extendleft, test/dynamo/test_generator.py::GeneratorTests::test_dict_tuple_list_generator_container0, test/dynamo/test_generator.py::GeneratorTests::test_dict_tuple_list_generator_container1, test/dynamo/test_generator.py::GeneratorTests::test_dict_tuple_list_generator_container2, test/dynamo/test_generator.py::GeneratorTests::test_dict_tuple_list_generator_container3, test/dynamo/test_generator.py::GeneratorTests::test_dynamo_disable_generator, test/dynamo/test_generator.py::GeneratorTests::test_dynamo_disable_sub_generator, test/dynamo/test_generator.py::GeneratorTests::test_generator___contains__, test/dynamo/test_generator.py::GeneratorTests::test_generator___contains___side_effects, test/dynamo/test_generator.py::GeneratorTests::test_generator_as_argument, test/dynamo/test_generator.py::GeneratorTests::test_generator_as_argument_2, test/dynamo/test_generator.py::GeneratorTests::test_generator_as_argument_3, test/dynamo/test_generator.py::GeneratorTests::test_generator_as_argument_4, test/dynamo/test_generator.py::GeneratorTests::test_generator_simple, test/dynamo/test_generator.py::GeneratorTests::test_generator_with_side_effects, test/dynamo/test_generator.py::GeneratorTests::test_generator_with_side_effects_graph_break, 
test/dynamo/test_generator.py::GeneratorTests::test_generator_with_side_effects_graph_break_2, test/dynamo/test_generator.py::GeneratorTests::test_graph_break_and_reconstruct_generator, test/dynamo/test_generator.py::GeneratorTests::test_graph_break_before_calling_generator, test/dynamo/test_generator.py::GeneratorTests::test_graph_break_in_generator, test/dynamo/test_generator.py::GeneratorTests::test_graph_break_in_generator_2, test/dynamo/test_generator.py::GeneratorTests::test_graph_break_in_generator_while_reconstructing, test/dynamo/test_generator.py::GeneratorTests::test_graph_break_outside_generator, test/dynamo/test_generator.py::GeneratorTests::test_infinite_generator, test/dynamo/test_generator.py::GeneratorTests::test_infinite_generator_2, test/dynamo/test_generator.py::GeneratorTests::test_infinite_generator_3, test/dynamo/test_generator.py::GeneratorTests::test_islice_chain, test/dynamo/test_generator.py::GeneratorTests::test_iter, test/dynamo/test_generator.py::GeneratorTests::test_list_extend, test/dynamo/test_generator.py::GeneratorTests::test_list_zip_generator, test/dynamo/test_generator.py::GeneratorTests::test_reconstruct_generator_tensor_mutation, test/dynamo/test_generator.py::GeneratorTests::test_reconstruct_generator_with_dict_mutation, test/dynamo/test_generator.py::GeneratorTests::test_reconstruct_generator_with_dict_mutation_before, test/dynamo/test_generator.py::GeneratorTests::test_reconstruct_generator_with_local_var_mutation, test/dynamo/test_generator.py::GeneratorTests::test_reconstruct_generator_with_object_mutation, test/dynamo/test_generator.py::GeneratorTests::test_reconstruct_generator_with_object_mutation_before, test/dynamo/test_generator.py::GeneratorTests::test_return_advanced_generator, test/dynamo/test_generator.py::GeneratorTests::test_return_exhaust_generator, test/dynamo/test_generator.py::GeneratorTests::test_return_generator, test/dynamo/test_generator.py::GeneratorTests::test_return_subgenerator, 
test/dynamo/test_generator.py::GeneratorTests::test_return_tuple_generator, test/dynamo/test_generator.py::GeneratorTests::test_subgenerator, test/dynamo/test_generator.py::GeneratorTests::test_subgenerator_with_side_effects, test/dynamo/test_generator.py::GeneratorTests::test_zip_generator, test/dynamo/test_generator.py::GeneratorTests::test_zip_generator_2, test/dynamo/test_generator.py::GeneratorTests::test_zip_infinite_generator, test/dynamo/test_generator.py::GeneratorTests::test_zip_subgenerator, test/dynamo/test_generator.py::TestGeneratorSend::test_send, test/dynamo/test_generator.py::TestGeneratorSend::test_send_stop_iteration_fullgraph_False, test/dynamo/test_generator.py::TestGeneratorSend::test_send_stop_iteration_fullgraph_True, test/dynamo/test_generator.py::TestGeneratorClose::test_close, test/dynamo/test_generator.py::TestGeneratorClose::test_close_after_close, test/dynamo/test_generator.py::TestGeneratorClose::test_close_after_exception, test/dynamo/test_generator.py::TestGeneratorClose::test_close_capture_GeneratorExit_fullgraph_False, test/dynamo/test_generator.py::TestGeneratorClose::test_close_capture_GeneratorExit_fullgraph_True, test/dynamo/test_generator.py::TestGeneratorClose::test_close_capture_GeneratorExit_return, test/dynamo/test_generator.py::TestGeneratorClose::test_close_capture_and_reraise_GeneratorExit, test/dynamo/test_generator.py::TestGeneratorClose::test_close_capture_and_reraise_exc_exc0, test/dynamo/test_generator.py::TestGeneratorClose::test_close_capture_and_reraise_exc_exc1, test/dynamo/test_generator.py::TestGeneratorClose::test_close_handling_finally, test/dynamo/test_generator.py::TestGeneratorClose::test_close_subgen, test/dynamo/test_generator.py::TestGeneratorClose::test_close_with_side_effects, test/dynamo/test_generator.py::TestGeneratorClose::test_close_with_subgen, test/dynamo/test_generator.py::TestGeneratorClose::test_next_after_close_fullgraph_False, 
test/dynamo/test_generator.py::TestGeneratorClose::test_next_after_close_fullgraph_True, test/dynamo/test_generator.py::TestGeneratorThrow::test_exception_context_with_yield, test/dynamo/test_generator.py::TestGeneratorThrow::test_return_None_in_except_and_finally, test/dynamo/test_generator.py::TestGeneratorThrow::test_return_const_value_in_except_and_finally, test/dynamo/test_generator.py::TestGeneratorThrow::test_return_value_in_except_and_finally, test/dynamo/test_generator.py::TestGeneratorThrow::test_throw, test/dynamo/test_generator.py::TestGeneratorThrow::test_throw_no_yield_after_throw, test/dynamo/test_generator.py::TestGeneratorThrow::test_throw_not_catch, test/dynamo/test_generator.py::TestGeneratorThrow::test_throw_raise_difference_exc, test/dynamo/test_generator.py::TestGeneratorThrow::test_throw_try_except_finally, test/dynamo/test_generator.py::TestGeneratorThrow::test_throw_with_finally, test/dynamo/test_generator.py::TestGeneratorThrow::test_throw_without_finally, test/dynamo/test_generator.py::TestGeneratorThrow::test_throw_yield_finally 2025-10-10T02:05:19.7111010Z 2025-10-10T02:05:20.3218562Z Running export/test_lift_unlift 1/1 ... [2025-10-10 02:05:20.321316] 2025-10-10T02:05:20.3219157Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:05:20.3221051Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_lift_unlift.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:05:20.321750] 2025-10-10T02:05:21.5927788Z Running export/test_verifier 1/1 ... 
[2025-10-10 02:05:21.592231] 2025-10-10T02:05:21.5928578Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:05:21.5930145Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_verifier.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:05:21.592656] 2025-10-10T02:05:23.6161352Z Running profiler/test_profiler 1/1 ... [2025-10-10 02:05:23.615400] 2025-10-10T02:05:23.6162323Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:05:23.6163528Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'profiler/test_profiler.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:05:23.615818] 2025-10-10T02:05:24.1949580Z 2025-10-10T02:05:24.1950879Z export/test_lift_unlift 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_lift_unlift_1.1_0661475e62a3ad5d_.log 2025-10-10T02:05:24.1953472Z Running 5 items in this shard: test/export/test_lift_unlift.py::TestLift::test_duplicate_constant_access, test/export/test_lift_unlift.py::TestLift::test_lift_basic, test/export/test_lift_unlift.py::TestLift::test_lift_nested, test/export/test_lift_unlift.py::TestLift::test_unlift_nonpersistent_buffer, test/export/test_lift_unlift.py::ConstantAttrMapTest::test_dict_api 2025-10-10T02:05:24.1955335Z 2025-10-10T02:05:25.5663482Z 2025-10-10T02:05:25.5669941Z export/test_verifier 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_verifier_1.1_960dacb2e6b526fd_.log 2025-10-10T02:05:25.5683454Z Running 10 items in this shard: test/export/test_verifier.py::TestVerifier::test_ep_verifier_basic, test/export/test_verifier.py::TestVerifier::test_ep_verifier_buffer_mutate, 
test/export/test_verifier.py::TestVerifier::test_ep_verifier_invalid_buffer, test/export/test_verifier.py::TestVerifier::test_ep_verifier_invalid_output, test/export/test_verifier.py::TestVerifier::test_ep_verifier_invalid_param, test/export/test_verifier.py::TestVerifier::test_verifier_basic, test/export/test_verifier.py::TestVerifier::test_verifier_call_module, test/export/test_verifier.py::TestVerifier::test_verifier_higher_order, test/export/test_verifier.py::TestVerifier::test_verifier_nested_invalid_module, test/export/test_verifier.py::TestVerifier::test_verifier_no_functional 2025-10-10T02:05:25.5693102Z 2025-10-10T02:05:27.7396742Z 2025-10-10T02:05:27.7397705Z profiler/test_profiler 1/1 was successful, full logs can be found in artifacts with path test/test-reports/profiler.test_profiler_1.1_3bfa04acb9efaa7e_.log 2025-10-10T02:05:27.7420155Z Running 73 items in this shard: test/profiler/test_profiler.py::TestProfilerCUDA::test_cudagraph_profiling_workaround, test/profiler/test_profiler.py::TestProfilerCUDA::test_custom_module_input_op_ids, test/profiler/test_profiler.py::TestProfilerCUDA::test_mem_leak, test/profiler/test_profiler.py::TestProfilerITT::test_custom_module_input_op_ids, test/profiler/test_profiler.py::TestProfiler::test_basic_chrome_trace, test/profiler/test_profiler.py::TestProfiler::test_basic_profile, test/profiler/test_profiler.py::TestProfiler::test_concrete_inputs_profiling, test/profiler/test_profiler.py::TestProfiler::test_concrete_inputs_profiling_toggling, test/profiler/test_profiler.py::TestProfiler::test_cpu_annotation_overlap, test/profiler/test_profiler.py::TestProfiler::test_disable_external_correlation, test/profiler/test_profiler.py::TestProfiler::test_dynamic_toggle, test/profiler/test_profiler.py::TestProfiler::test_event_list, test/profiler/test_profiler.py::TestProfiler::test_export_stacks, test/profiler/test_profiler.py::TestProfiler::test_flops, test/profiler/test_profiler.py::TestProfiler::test_forked_process, 
test/profiler/test_profiler.py::TestProfiler::test_guarded_record_function_fast, test/profiler/test_profiler.py::TestProfiler::test_high_level_trace, test/profiler/test_profiler.py::TestProfiler::test_is_profiler_enabled, test/profiler/test_profiler.py::TestProfiler::test_kineto, test/profiler/test_profiler.py::TestProfiler::test_kineto_multigpu, test/profiler/test_profiler.py::TestProfiler::test_kineto_profiler_api, test/profiler/test_profiler.py::TestProfiler::test_kineto_profiler_multiple_steppers, test/profiler/test_profiler.py::TestProfiler::test_kineto_profiler_with_environment_variable, test/profiler/test_profiler.py::TestProfiler::test_lazy_build_tree, test/profiler/test_profiler.py::TestProfiler::test_memory_profiler, test/profiler/test_profiler.py::TestProfiler::test_module_hierarchy, test/profiler/test_profiler.py::TestProfiler::test_nested_tensor_with_shapes, test/profiler/test_profiler.py::TestProfiler::test_oom_tracing, test/profiler/test_profiler.py::TestProfiler::test_override_time_units, test/profiler/test_profiler.py::TestProfiler::test_profile_all_threads, test/profiler/test_profiler.py::TestProfiler::test_profiler_correlation_id, test/profiler/test_profiler.py::TestProfiler::test_profiler_cuda_sync_events, test/profiler/test_profiler.py::TestProfiler::test_profiler_disable_fwd_bwd_link, test/profiler/test_profiler.py::TestProfiler::test_profiler_fwd_bwd_link, test/profiler/test_profiler.py::TestProfiler::test_profiler_metadata, test/profiler/test_profiler.py::TestProfiler::test_profiler_op_event_args, test/profiler/test_profiler.py::TestProfiler::test_profiler_op_event_kwargs, test/profiler/test_profiler.py::TestProfiler::test_profiler_op_event_kwargs_list_of_strings, test/profiler/test_profiler.py::TestProfiler::test_profiler_strides, test/profiler/test_profiler.py::TestProfiler::test_profiler_time_scale, test/profiler/test_profiler.py::TestProfiler::test_profiler_tracing, test/profiler/test_profiler.py::TestProfiler::test_profiler_type, 
test/profiler/test_profiler.py::TestProfiler::test_python_gc_event, test/profiler/test_profiler.py::TestProfiler::test_record_function_fast, test/profiler/test_profiler.py::TestProfiler::test_schedule_function_count, test/profiler/test_profiler.py::TestProfiler::test_skip_first_wait, test/profiler/test_profiler.py::TestProfiler::test_source, test/profiler/test_profiler.py::TestProfiler::test_tensorboard_trace_handler, test/profiler/test_profiler.py::TestProfiler::test_user_annotation, test/profiler/test_profiler.py::TestExperimentalUtils::test_bfs, test/profiler/test_profiler.py::TestExperimentalUtils::test_dfs, test/profiler/test_profiler.py::TestExperimentalUtils::test_expose_kineto_event_metadata, test/profiler/test_profiler.py::TestExperimentalUtils::test_fuzz_symbolize, test/profiler/test_profiler.py::TestExperimentalUtils::test_profiler_conv2d_bias_followed_by_batchnorm2d_pattern, test/profiler/test_profiler.py::TestExperimentalUtils::test_profiler_debug_autotuner, test/profiler/test_profiler.py::TestExperimentalUtils::test_profiler_extra_cuda_copy_pattern, test/profiler/test_profiler.py::TestExperimentalUtils::test_profiler_extra_cuda_copy_pattern_benchmark, test/profiler/test_profiler.py::TestExperimentalUtils::test_profiler_for_loop_indexing_pattern, test/profiler/test_profiler.py::TestExperimentalUtils::test_profiler_fp32_matmul_pattern, test/profiler/test_profiler.py::TestExperimentalUtils::test_profiler_grad_not_set_to_none_pattern, test/profiler/test_profiler.py::TestExperimentalUtils::test_profiler_matmul_dim_fp16_pattern, test/profiler/test_profiler.py::TestExperimentalUtils::test_profiler_name_pattern, test/profiler/test_profiler.py::TestExperimentalUtils::test_profiler_optimizer_single_tensor_pattern, test/profiler/test_profiler.py::TestExperimentalUtils::test_profiler_overload_names, test/profiler/test_profiler.py::TestExperimentalUtils::test_profiler_pattern_match_helper, 
test/profiler/test_profiler.py::TestExperimentalUtils::test_profiler_pattern_matcher_json_report, test/profiler/test_profiler.py::TestExperimentalUtils::test_profiler_synchronized_dataloader_pattern, test/profiler/test_profiler.py::TestExperimentalUtils::test_utils_compute_idle_time, test/profiler/test_profiler.py::TestExperimentalUtils::test_utils_compute_queue_depth, test/profiler/test_profiler.py::TestExperimentalUtils::test_utils_compute_queue_depth_when_no_cuda_events, test/profiler/test_profiler.py::TestExperimentalUtils::test_utils_compute_self_time, test/profiler/test_profiler.py::TestExperimentalUtils::test_utils_get_optimizable_events, test/profiler/test_profiler.py::TestExperimentalUtils::test_utils_intervals_overlap 2025-10-10T02:05:27.7441045Z 2025-10-10T02:05:28.1539055Z Running dynamo/test_misc 1/1 ... [2025-10-10 02:05:28.153338] 2025-10-10T02:05:28.1539796Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:05:28.1541623Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_misc.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:05:28.153785] 2025-10-10T02:05:29.4834090Z Running export/test_draft_export 1/1 ... [2025-10-10 02:05:29.482838] 2025-10-10T02:05:29.4834689Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:05:29.4836541Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_draft_export.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:05:29.483274] 2025-10-10T02:05:31.6700000Z Running export/test_sparse 1/1 ... 
[2025-10-10 02:05:31.669456] 2025-10-10T02:05:31.6700464Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:05:31.6703533Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_sparse.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:05:31.669825] 2025-10-10T02:05:33.6063932Z 2025-10-10T02:05:33.6064920Z export/test_draft_export 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_draft_export_1.1_a4ad47099ea70073_.log 2025-10-10T02:05:33.6072046Z Running 21 items in this shard: test/export/test_draft_export.py::TestDraftExport::test_complex_data_dependent_expr, test/export/test_draft_export.py::TestDraftExport::test_constantify_unbacked_symbol, test/export/test_draft_export.py::TestDraftExport::test_cuda_memory_usage, test/export/test_draft_export.py::TestDraftExport::test_data_dependent_failure, test/export/test_draft_export.py::TestDraftExport::test_dedup_data_dependent_failure, test/export/test_draft_export.py::TestDraftExport::test_fake_infer_dense_in_memory_check, test/export/test_draft_export.py::TestDraftExport::test_masked_linear, test/export/test_draft_export.py::TestDraftExport::test_missing_meta_kernel_custom_op_basic, test/export/test_draft_export.py::TestDraftExport::test_missing_meta_kernel_custom_op_multiple_profiles, test/export/test_draft_export.py::TestDraftExport::test_missing_meta_kernel_custom_op_update_profile, test/export/test_draft_export.py::TestDraftExport::test_missing_meta_kernel_guard, test/export/test_draft_export.py::TestDraftExport::test_missing_meta_kernel_impl, test/export/test_draft_export.py::TestDraftExport::test_offsets, test/export/test_draft_export.py::TestDraftExport::test_override_incorrectly_aliasing_kernel, 
test/export/test_draft_export.py::TestDraftExport::test_override_mismatched_fake_kernel_with_unbacked_symbols, test/export/test_draft_export.py::TestDraftExport::test_override_size_and_dtype_mismatched_fake_kernels, test/export/test_draft_export.py::TestDraftExport::test_shape_failure, test/export/test_draft_export.py::TestDraftExport::test_side_effect1, test/export/test_draft_export.py::TestDraftExport::test_side_effect_inps, test/export/test_draft_export.py::TestDraftExport::test_torchbind, test/export/test_draft_export.py::TestDraftExport::test_unbacked_div_mod_replacement 2025-10-10T02:05:33.6078494Z 2025-10-10T02:05:34.7825069Z 2025-10-10T02:05:34.7826440Z dynamo/test_misc 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_misc_1.1_4ba039d9e46540fc_.log 2025-10-10T02:05:34.7969887Z Running 617 items in this shard: test/dynamo/test_misc.py::MiscTests::test_312_binary_slice_with_graph_break1, test/dynamo/test_misc.py::MiscTests::test_312_binary_slice_with_graph_break2, test/dynamo/test_misc.py::MiscTests::test_RAISE_VARARGS_0, test/dynamo/test_misc.py::MiscTests::test_T_tensor_attribute, test/dynamo/test_misc.py::MiscTests::test_add_sizes, test/dynamo/test_misc.py::MiscTests::test_add_to_set, test/dynamo/test_misc.py::MiscTests::test_anomaly_aot_autograd, test/dynamo/test_misc.py::MiscTests::test_any_all_symnode, test/dynamo/test_misc.py::MiscTests::test_aot_autograd_propagate_unbacked_symints_shape, test/dynamo/test_misc.py::MiscTests::test_arange_length_with_float32_dtype, test/dynamo/test_misc.py::MiscTests::test_argwhere_with_dynamic_shapes, test/dynamo/test_misc.py::MiscTests::test_assert, test/dynamo/test_misc.py::MiscTests::test_assert_size_stride, test/dynamo/test_misc.py::MiscTests::test_assigning_function_to_class_attribute, test/dynamo/test_misc.py::MiscTests::test_assigning_function_to_object_attribute, test/dynamo/test_misc.py::MiscTests::test_backend_match_guard, 
test/dynamo/test_misc.py::MiscTests::test_backend_match_guard_multi_threads, test/dynamo/test_misc.py::MiscTests::test_backward_deterministic_mode_mismatch_warning, test/dynamo/test_misc.py::MiscTests::test_boolarg, test/dynamo/test_misc.py::MiscTests::test_bound_shape_checks, test/dynamo/test_misc.py::MiscTests::test_build_tuple_unpack, test/dynamo/test_misc.py::MiscTests::test_builder_for_class_with_metaclass, test/dynamo/test_misc.py::MiscTests::test_builtin_abs, test/dynamo/test_misc.py::MiscTests::test_builtin_bool_on_symbool, test/dynamo/test_misc.py::MiscTests::test_builtin_bool_on_symfloat, test/dynamo/test_misc.py::MiscTests::test_builtin_bool_on_symint, test/dynamo/test_misc.py::MiscTests::test_builtin_complex, test/dynamo/test_misc.py::MiscTests::test_builtin_complex_args, test/dynamo/test_misc.py::MiscTests::test_builtin_isinstance, test/dynamo/test_misc.py::MiscTests::test_builtin_str_on_user_defined_function, test/dynamo/test_misc.py::MiscTests::test_builtin_subclasses_as_method_on_class_type, test/dynamo/test_misc.py::MiscTests::test_builtin_subclasses_as_method_on_var, test/dynamo/test_misc.py::MiscTests::test_call_parent_non_class_methods_from_child, test/dynamo/test_misc.py::MiscTests::test_callpacked, test/dynamo/test_misc.py::MiscTests::test_cannot_trace_mark_dynamic, test/dynamo/test_misc.py::MiscTests::test_cannot_trace_mark_dynamic_safe_unreached, test/dynamo/test_misc.py::MiscTests::test_cast, test/dynamo/test_misc.py::MiscTests::test_cat_unbacked, test/dynamo/test_misc.py::MiscTests::test_catch_watchings1, test/dynamo/test_misc.py::MiscTests::test_catch_watchings2, test/dynamo/test_misc.py::MiscTests::test_cell_captured_by_existing_func_but_not_root_frame, test/dynamo/test_misc.py::MiscTests::test_cell_output1, test/dynamo/test_misc.py::MiscTests::test_cell_output2, test/dynamo/test_misc.py::MiscTests::test_check_simplification, test/dynamo/test_misc.py::MiscTests::test_class_binop, 
test/dynamo/test_misc.py::MiscTests::test_class_duner_flags, test/dynamo/test_misc.py::MiscTests::test_class_duner_mro, test/dynamo/test_misc.py::MiscTests::test_class_has_instancecheck_method, test/dynamo/test_misc.py::MiscTests::test_clone_sparse_input, test/dynamo/test_misc.py::MiscTests::test_closure_out_of_scope_cell, test/dynamo/test_misc.py::MiscTests::test_closure_out_of_scope_cell_with_cond, test/dynamo/test_misc.py::MiscTests::test_closure_out_of_scope_cell_with_mutation, test/dynamo/test_misc.py::MiscTests::test_closure_recompiles, test/dynamo/test_misc.py::MiscTests::test_closure_with_mutation_and_graph_break, test/dynamo/test_misc.py::MiscTests::test_closure_write_across_functions, test/dynamo/test_misc.py::MiscTests::test_compare_shapes_eq, test/dynamo/test_misc.py::MiscTests::test_compare_shapes_neq, test/dynamo/test_misc.py::MiscTests::test_compare_shapes_tuple_eq, test/dynamo/test_misc.py::MiscTests::test_compare_shapes_tuple_neq, test/dynamo/test_misc.py::MiscTests::test_compare_shapes_with_constant, test/dynamo/test_misc.py::MiscTests::test_compare_tensor_with_none, test/dynamo/test_misc.py::MiscTests::test_compilation_metrics_size_limit, test/dynamo/test_misc.py::MiscTests::test_compiled_class_graph_break, test/dynamo/test_misc.py::MiscTests::test_cond, test/dynamo/test_misc.py::MiscTests::test_cond_export, test/dynamo/test_misc.py::MiscTests::test_cond_export_single_arg, test/dynamo/test_misc.py::MiscTests::test_cond_nested, test/dynamo/test_misc.py::MiscTests::test_cond_side_effects, test/dynamo/test_misc.py::MiscTests::test_cond_with_quantization, test/dynamo/test_misc.py::MiscTests::test_conditional_list_comp_in_context, test/dynamo/test_misc.py::MiscTests::test_config_getattr_default, test/dynamo/test_misc.py::MiscTests::test_config_obj, test/dynamo/test_misc.py::MiscTests::test_const_dict_variable_python_type, test/dynamo/test_misc.py::MiscTests::test_constant_getattr, 
test/dynamo/test_misc.py::MiscTests::test_cross_entropy_loss_fancy_ctor1, test/dynamo/test_misc.py::MiscTests::test_cross_entropy_loss_fancy_ctor2, test/dynamo/test_misc.py::MiscTests::test_cross_entropy_loss_simple_ctor, test/dynamo/test_misc.py::MiscTests::test_custom_dict, test/dynamo/test_misc.py::MiscTests::test_custom_module_free, test/dynamo/test_misc.py::MiscTests::test_data_access_in_inference_mode, test/dynamo/test_misc.py::MiscTests::test_data_ptr_graph_break_aten, test/dynamo/test_misc.py::MiscTests::test_data_ptr_graph_break_builtin, test/dynamo/test_misc.py::MiscTests::test_dataclass, test/dynamo/test_misc.py::MiscTests::test_dataclass_fields, test/dynamo/test_misc.py::MiscTests::test_dataclass_local_hasattr, test/dynamo/test_misc.py::MiscTests::test_default_args_device_dtype, test/dynamo/test_misc.py::MiscTests::test_default_dtype_change, test/dynamo/test_misc.py::MiscTests::test_defaultdict, test/dynamo/test_misc.py::MiscTests::test_deque_append_left, test/dynamo/test_misc.py::MiscTests::test_deque_input, test/dynamo/test_misc.py::MiscTests::test_derpy_nn_module_usage, test/dynamo/test_misc.py::MiscTests::test_descriptor, test/dynamo/test_misc.py::MiscTests::test_descriptor_side_effect, test/dynamo/test_misc.py::MiscTests::test_deterministic_algorithms_mutated, test/dynamo/test_misc.py::MiscTests::test_dictcomp, test/dynamo/test_misc.py::MiscTests::test_dim_order, test/dynamo/test_misc.py::MiscTests::test_disable_flag, test/dynamo/test_misc.py::MiscTests::test_dtypes_no_graphbreaks, test/dynamo/test_misc.py::MiscTests::test_dunder_methods, test/dynamo/test_misc.py::MiscTests::test_dunder_new_function_inlining, test/dynamo/test_misc.py::MiscTests::test_dunder_new_function_inlining1, test/dynamo/test_misc.py::MiscTests::test_dunder_new_function_inlining2, test/dynamo/test_misc.py::MiscTests::test_dunder_new_function_inlining3, test/dynamo/test_misc.py::MiscTests::test_dunder_new_function_inlining4, 
test/dynamo/test_misc.py::MiscTests::test_dunder_weakref, test/dynamo/test_misc.py::MiscTests::test_duplicate_graph_break_log, test/dynamo/test_misc.py::MiscTests::test_dynamic_one_hot, test/dynamo/test_misc.py::MiscTests::test_dynamic_shapes_as_strided, test/dynamo/test_misc.py::MiscTests::test_dynamic_sources_dynamic_override, test/dynamo/test_misc.py::MiscTests::test_dynamic_sources_dynamic_override_regex, test/dynamo/test_misc.py::MiscTests::test_dynamic_sources_force_parameter_static_shapes_and_property_static_shapes_override, test/dynamo/test_misc.py::MiscTests::test_dynamic_sources_graph_break, test/dynamo/test_misc.py::MiscTests::test_dynamic_sources_int, test/dynamo/test_misc.py::MiscTests::test_dynamic_sources_precedence_over_int_specialization, test/dynamo/test_misc.py::MiscTests::test_dynamic_sources_tensor, test/dynamo/test_misc.py::MiscTests::test_dynamo_cache_invalidate, test/dynamo/test_misc.py::MiscTests::test_dynamo_cache_move_to_front, test/dynamo/test_misc.py::MiscTests::test_dynamo_compiling_fake_tensor_to_vararg_int, test/dynamo/test_misc.py::MiscTests::test_dynamo_disabled_in_custom_op_kernels, test/dynamo/test_misc.py::MiscTests::test_dynamo_min_operator_with_shape, test/dynamo/test_misc.py::MiscTests::test_dynamo_reset_clears_cache, test/dynamo/test_misc.py::MiscTests::test_empty_list, test/dynamo/test_misc.py::MiscTests::test_enum_as_dict_key, test/dynamo/test_misc.py::MiscTests::test_enum_as_dict_key_with_overloaded_str, test/dynamo/test_misc.py::MiscTests::test_enum_guards, test/dynamo/test_misc.py::MiscTests::test_enum_method, test/dynamo/test_misc.py::MiscTests::test_enum_no_graphbreaks, test/dynamo/test_misc.py::MiscTests::test_enum_subclass, test/dynamo/test_misc.py::MiscTests::test_error_on_nested_fx_trace, test/dynamo/test_misc.py::MiscTests::test_error_on_recompile, test/dynamo/test_misc.py::MiscTests::test_escaping_closure_var_with_backward_hook, test/dynamo/test_misc.py::MiscTests::test_escaping_closure_var_with_nonlocal_var, 
test/dynamo/test_misc.py::MiscTests::test_existing_func_that_creates_capturing_nested_func, test/dynamo/test_misc.py::MiscTests::test_fail_on_recompile_error_message, test/dynamo/test_misc.py::MiscTests::test_flat_name_to_original_fqn, test/dynamo/test_misc.py::MiscTests::test_float_speculation_log_divergence, test/dynamo/test_misc.py::MiscTests::test_fn_hasattr__name__1, test/dynamo/test_misc.py::MiscTests::test_fn_hasattr__name__2, test/dynamo/test_misc.py::MiscTests::test_fn_hasattr__name__3, test/dynamo/test_misc.py::MiscTests::test_fold, test/dynamo/test_misc.py::MiscTests::test_free_var_and_local_name_collision, test/dynamo/test_misc.py::MiscTests::test_frozen_dataclass_attr_access, test/dynamo/test_misc.py::MiscTests::test_frozen_dataclass_default_factory, test/dynamo/test_misc.py::MiscTests::test_frozen_dataclass_default_value, test/dynamo/test_misc.py::MiscTests::test_frozen_dataclass_hashable, test/dynamo/test_misc.py::MiscTests::test_frozen_dataclass_kw_only, test/dynamo/test_misc.py::MiscTests::test_frozen_dict, test/dynamo/test_misc.py::MiscTests::test_frozenset_of_non_literals, test/dynamo/test_misc.py::MiscTests::test_frozenset_torch_func_contains, test/dynamo/test_misc.py::MiscTests::test_fullgraph_capture, test/dynamo/test_misc.py::MiscTests::test_funcname_cache, test/dynamo/test_misc.py::MiscTests::test_function_annotation, test/dynamo/test_misc.py::MiscTests::test_function_generic_alias_annotation, test/dynamo/test_misc.py::MiscTests::test_generate_tensor_from_list_of_numpy_primitive_type, test/dynamo/test_misc.py::MiscTests::test_generate_trivial_abstract_impl, test/dynamo/test_misc.py::MiscTests::test_get_attr_function, test/dynamo/test_misc.py::MiscTests::test_get_cache_entry, test/dynamo/test_misc.py::MiscTests::test_get_custom_tensor_attribute, test/dynamo/test_misc.py::MiscTests::test_get_instruction_source_311, test/dynamo/test_misc.py::MiscTests::test_getattr_dict, 
test/dynamo/test_misc.py::MiscTests::test_getattrvariable_as_python_constant, test/dynamo/test_misc.py::MiscTests::test_getset_descriptor, test/dynamo/test_misc.py::MiscTests::test_global_state_guard_serialization, test/dynamo/test_misc.py::MiscTests::test_grad, test/dynamo/test_misc.py::MiscTests::test_grad_non_none, test/dynamo/test_misc.py::MiscTests::test_grad_none, test/dynamo/test_misc.py::MiscTests::test_grad_state_mutated, test/dynamo/test_misc.py::MiscTests::test_graph_break_compilation_metrics, test/dynamo/test_misc.py::MiscTests::test_graph_break_compilation_metrics_on_failure, test/dynamo/test_misc.py::MiscTests::test_graph_break_correctly_when_passing_numpy_ndarray_to_torch_function, test/dynamo/test_misc.py::MiscTests::test_guard_failure_fn, test/dynamo/test_misc.py::MiscTests::test_guard_failure_fn2, test/dynamo/test_misc.py::MiscTests::test_guard_failure_fn_shape_control, test/dynamo/test_misc.py::MiscTests::test_guard_failure_fn_tensor_iter, test/dynamo/test_misc.py::MiscTests::test_guard_filter_fn_by_id, test/dynamo/test_misc.py::MiscTests::test_guard_filter_fn_by_is_global, test/dynamo/test_misc.py::MiscTests::test_guard_filter_fn_by_name_and_value, test/dynamo/test_misc.py::MiscTests::test_guard_filter_globals, test/dynamo/test_misc.py::MiscTests::test_guard_filter_inbuilt_nn_modules, test/dynamo/test_misc.py::MiscTests::test_guard_filter_nn_modules, test/dynamo/test_misc.py::MiscTests::test_guard_filter_tensors, test/dynamo/test_misc.py::MiscTests::test_guard_function_builder_with_cse, test/dynamo/test_misc.py::MiscTests::test_guard_size_oblivious, test/dynamo/test_misc.py::MiscTests::test_guard_size_oblivious_backed, test/dynamo/test_misc.py::MiscTests::test_guard_sym_node_fstring_when_used, test/dynamo/test_misc.py::MiscTests::test_guards_cse_pass_multiple, test/dynamo/test_misc.py::MiscTests::test_guards_cse_pass_single, test/dynamo/test_misc.py::MiscTests::test_guards_strip_function_call, 
test/dynamo/test_misc.py::MiscTests::test_hasattr_nn_module_guard, test/dynamo/test_misc.py::MiscTests::test_hash_getitem_slice, test/dynamo/test_misc.py::MiscTests::test_hash_hop, test/dynamo/test_misc.py::MiscTests::test_id_guarded_class, test/dynamo/test_misc.py::MiscTests::test_id_guarded_module, test/dynamo/test_misc.py::MiscTests::test_id_guarded_object, test/dynamo/test_misc.py::MiscTests::test_id_of_nn_module, test/dynamo/test_misc.py::MiscTests::test_id_tensor, test/dynamo/test_misc.py::MiscTests::test_if_cond_nn_mod1, test/dynamo/test_misc.py::MiscTests::test_if_cond_nn_mod2, test/dynamo/test_misc.py::MiscTests::test_if_cond_nn_mod3, test/dynamo/test_misc.py::MiscTests::test_if_cond_user_defined_object, test/dynamo/test_misc.py::MiscTests::test_if_cond_user_defined_object2, test/dynamo/test_misc.py::MiscTests::test_if_cond_user_defined_object3, test/dynamo/test_misc.py::MiscTests::test_inference_mode, test/dynamo/test_misc.py::MiscTests::test_inference_mode_param, test/dynamo/test_misc.py::MiscTests::test_inline_closure_not_loaded_by_parent, test/dynamo/test_misc.py::MiscTests::test_inline_closure_returned_by_another_function_and_captures, test/dynamo/test_misc.py::MiscTests::test_inline_dict_function, test/dynamo/test_misc.py::MiscTests::test_inline_dict_function_passed_as_arg, test/dynamo/test_misc.py::MiscTests::test_inline_dict_mutation, test/dynamo/test_misc.py::MiscTests::test_inline_func_jump_on_tensor_condition, test/dynamo/test_misc.py::MiscTests::test_inline_list_mutation, test/dynamo/test_misc.py::MiscTests::test_inline_local_dict_clear, test/dynamo/test_misc.py::MiscTests::test_inline_module_attr_dict_clear, test/dynamo/test_misc.py::MiscTests::test_inline_user_defined_dict_attr_clear, test/dynamo/test_misc.py::MiscTests::test_inplace, test/dynamo/test_misc.py::MiscTests::test_inplace_desugaring, test/dynamo/test_misc.py::MiscTests::test_inplace_param_update, test/dynamo/test_misc.py::MiscTests::test_inplace_view_on_graph_input, 
test/dynamo/test_misc.py::MiscTests::test_input_cell_mutation, test/dynamo/test_misc.py::MiscTests::test_inspect_signature_bind, test/dynamo/test_misc.py::MiscTests::test_inspect_signature_bind_non_user_function, test/dynamo/test_misc.py::MiscTests::test_inspect_signature_parameters, test/dynamo/test_misc.py::MiscTests::test_int_int_comparisons, test/dynamo/test_misc.py::MiscTests::test_int_list, test/dynamo/test_misc.py::MiscTests::test_int_neg, test/dynamo/test_misc.py::MiscTests::test_int_shape_binops, test/dynamo/test_misc.py::MiscTests::test_int_shape_comparisons, test/dynamo/test_misc.py::MiscTests::test_int_shape_inplace_binops, test/dynamo/test_misc.py::MiscTests::test_intermediary_tensor_grad_access, test/dynamo/test_misc.py::MiscTests::test_invalid_args_builtin, test/dynamo/test_misc.py::MiscTests::test_is_compiling, test/dynamo/test_misc.py::MiscTests::test_is_floating_point, test/dynamo/test_misc.py::MiscTests::test_is_floating_point2, test/dynamo/test_misc.py::MiscTests::test_is_tensor, test/dynamo/test_misc.py::MiscTests::test_is_tensor2, test/dynamo/test_misc.py::MiscTests::test_is_tensor_like, test/dynamo/test_misc.py::MiscTests::test_is_tensor_like2, test/dynamo/test_misc.py::MiscTests::test_item, test/dynamo/test_misc.py::MiscTests::test_item_changes, test/dynamo/test_misc.py::MiscTests::test_item_changes_new_shape, test/dynamo/test_misc.py::MiscTests::test_iter_set, test/dynamo/test_misc.py::MiscTests::test_iter_type, test/dynamo/test_misc.py::MiscTests::test_iterator_limit, test/dynamo/test_misc.py::MiscTests::test_itertools_accumulate_symint_default_sum, test/dynamo/test_misc.py::MiscTests::test_itertools_accumulate_tensors_builtins, test/dynamo/test_misc.py::MiscTests::test_itertools_accumulate_tensors_default_sum, test/dynamo/test_misc.py::MiscTests::test_itertools_accumulate_tensors_kwargs, test/dynamo/test_misc.py::MiscTests::test_itertools_accumulate_tensors_user_defined, 
test/dynamo/test_misc.py::MiscTests::test_itertools_groupby_pure_python_default_identify_func, test/dynamo/test_misc.py::MiscTests::test_itertools_groupby_pure_python_key_func, test/dynamo/test_misc.py::MiscTests::test_itertools_infinite_count, test/dynamo/test_misc.py::MiscTests::test_itertools_infinite_cycle, test/dynamo/test_misc.py::MiscTests::test_itertools_infinite_repeat, test/dynamo/test_misc.py::MiscTests::test_itertools_infinite_repeat_mutation, test/dynamo/test_misc.py::MiscTests::test_itertools_islice, test/dynamo/test_misc.py::MiscTests::test_itertools_islice_default_end, test/dynamo/test_misc.py::MiscTests::test_itertools_islice_default_step, test/dynamo/test_misc.py::MiscTests::test_itertools_repeat, test/dynamo/test_misc.py::MiscTests::test_itertools_tee, test/dynamo/test_misc.py::MiscTests::test_large_reduction_list, test/dynamo/test_misc.py::MiscTests::test_linear_module_free, test/dynamo/test_misc.py::MiscTests::test_list_append_return_none, test/dynamo/test_misc.py::MiscTests::test_list_class, test/dynamo/test_misc.py::MiscTests::test_list_hasattr1, test/dynamo/test_misc.py::MiscTests::test_list_hasattr2, test/dynamo/test_misc.py::MiscTests::test_list_iadd_side_effect, test/dynamo/test_misc.py::MiscTests::test_list_iadd_with_shape, test/dynamo/test_misc.py::MiscTests::test_list_iterator_contains, test/dynamo/test_misc.py::MiscTests::test_list_mul, test/dynamo/test_misc.py::MiscTests::test_list_slice_mul, test/dynamo/test_misc.py::MiscTests::test_listcomp, test/dynamo/test_misc.py::MiscTests::test_load_fast_and_clear_graph_break, test/dynamo/test_misc.py::MiscTests::test_mandelbrot_numpy, test/dynamo/test_misc.py::MiscTests::test_map_side_effects, test/dynamo/test_misc.py::MiscTests::test_map_with_quantization, test/dynamo/test_misc.py::MiscTests::test_mark_dynamic_with_ranges, test/dynamo/test_misc.py::MiscTests::test_mark_static, test/dynamo/test_misc.py::MiscTests::test_mark_unbacked_strict, test/dynamo/test_misc.py::MiscTests::test_matmul1, 
test/dynamo/test_misc.py::MiscTests::test_min_max_over_iterable, test/dynamo/test_misc.py::MiscTests::test_module_complex_iter, test/dynamo/test_misc.py::MiscTests::test_module_deepcopy, test/dynamo/test_misc.py::MiscTests::test_module_not_callable, test/dynamo/test_misc.py::MiscTests::test_mro_type_tensor_no_source, test/dynamo/test_misc.py::MiscTests::test_multiple_inheritance, test/dynamo/test_misc.py::MiscTests::test_mutable_mapping_multiple_inheritance, test/dynamo/test_misc.py::MiscTests::test_named_parameters, test/dynamo/test_misc.py::MiscTests::test_namedtuple1, test/dynamo/test_misc.py::MiscTests::test_namedtuple2, test/dynamo/test_misc.py::MiscTests::test_namedtuple3, test/dynamo/test_misc.py::MiscTests::test_namedtuple_class, test/dynamo/test_misc.py::MiscTests::test_namedtuple_source_dynamic_attributes, test/dynamo/test_misc.py::MiscTests::test_namedtuple_sourceless_dynamic_attributes, test/dynamo/test_misc.py::MiscTests::test_namedtuple_with_custom_getitem, test/dynamo/test_misc.py::MiscTests::test_nan, test/dynamo/test_misc.py::MiscTests::test_ne_operator_with_custom_eq, test/dynamo/test_misc.py::MiscTests::test_ne_operator_with_custom_graphbreak_eq, test/dynamo/test_misc.py::MiscTests::test_ne_operator_with_custom_ne, test/dynamo/test_misc.py::MiscTests::test_nested_closure, test/dynamo/test_misc.py::MiscTests::test_nested_closure_mutation, test/dynamo/test_misc.py::MiscTests::test_nested_dataclass_reconstruct, test/dynamo/test_misc.py::MiscTests::test_nested_frozen_dataclass_hashable, test/dynamo/test_misc.py::MiscTests::test_nested_function_resuming_with_correct_globals, test/dynamo/test_misc.py::MiscTests::test_nested_optimize, test/dynamo/test_misc.py::MiscTests::test_nested_optimize_decorator, test/dynamo/test_misc.py::MiscTests::test_nested_optimize_run, test/dynamo/test_misc.py::MiscTests::test_nested_sequential_try, test/dynamo/test_misc.py::MiscTests::test_nested_sequential_try_with, 
test/dynamo/test_misc.py::MiscTests::test_nested_sequential_try_with_graph_break, test/dynamo/test_misc.py::MiscTests::test_nested_sequential_with, test/dynamo/test_misc.py::MiscTests::test_nested_wraps, test/dynamo/test_misc.py::MiscTests::test_nesteduserfunction_setattr, test/dynamo/test_misc.py::MiscTests::test_new_with_int_list, test/dynamo/test_misc.py::MiscTests::test_newly_constructed_tensor_attr_mutation, test/dynamo/test_misc.py::MiscTests::test_nn_functional_reduction, test/dynamo/test_misc.py::MiscTests::test_nn_module_getattr, test/dynamo/test_misc.py::MiscTests::test_nn_module_getattribute, test/dynamo/test_misc.py::MiscTests::test_nn_sequential_invocation, test/dynamo/test_misc.py::MiscTests::test_nn_sequential_invocation_reposition_indices, test/dynamo/test_misc.py::MiscTests::test_no_error_on_nested_fx_trace, test/dynamo/test_misc.py::MiscTests::test_no_guard_for_unused_sym_node_fstring, test/dynamo/test_misc.py::MiscTests::test_no_raise_guard_partial_constraint, test/dynamo/test_misc.py::MiscTests::test_no_raise_guard_partial_constraint_across_break, test/dynamo/test_misc.py::MiscTests::test_non_pt2_compliant_ops_graph_break, test/dynamo/test_misc.py::MiscTests::test_not_dynamic_scope, test/dynamo/test_misc.py::MiscTests::test_numel, test/dynamo/test_misc.py::MiscTests::test_numpy_array_of_arrays, test/dynamo/test_misc.py::MiscTests::test_numpy_as_global, test/dynamo/test_misc.py::MiscTests::test_numpy_fallback_on_eager, test/dynamo/test_misc.py::MiscTests::test_numpy_force, test/dynamo/test_misc.py::MiscTests::test_numpy_gt, test/dynamo/test_misc.py::MiscTests::test_numpy_int_constant, test/dynamo/test_misc.py::MiscTests::test_numpy_iter, test/dynamo/test_misc.py::MiscTests::test_numpy_min, test/dynamo/test_misc.py::MiscTests::test_numpy_ndarray_graph_break, test/dynamo/test_misc.py::MiscTests::test_numpy_ndarray_graph_break_with_multiple_outputs, test/dynamo/test_misc.py::MiscTests::test_numpy_ndarray_works_with_builtin_function, 
test/dynamo/test_misc.py::MiscTests::test_numpy_no_raise, test/dynamo/test_misc.py::MiscTests::test_numpy_non_torch_dtype, test/dynamo/test_misc.py::MiscTests::test_numpy_random_config_to_numpy, test/dynamo/test_misc.py::MiscTests::test_numpy_readonly, test/dynamo/test_misc.py::MiscTests::test_numpy_recompilation_scalar, test/dynamo/test_misc.py::MiscTests::test_numpy_size_attr, test/dynamo/test_misc.py::MiscTests::test_numpy_subdtype, test/dynamo/test_misc.py::MiscTests::test_numpy_take_along_axis, test/dynamo/test_misc.py::MiscTests::test_numpy_tolist, test/dynamo/test_misc.py::MiscTests::test_numpy_torch_operators, test/dynamo/test_misc.py::MiscTests::test_numpy_ufunc_out, test/dynamo/test_misc.py::MiscTests::test_numpy_ufunc_out_graph_break, test/dynamo/test_misc.py::MiscTests::test_numpy_unique_f16, test/dynamo/test_misc.py::MiscTests::test_numpy_variable_isinstance, test/dynamo/test_misc.py::MiscTests::test_numpy_with_builtin_type, test/dynamo/test_misc.py::MiscTests::test_object_classmethod, test/dynamo/test_misc.py::MiscTests::test_object_setattr, test/dynamo/test_misc.py::MiscTests::test_object_staticmethod, test/dynamo/test_misc.py::MiscTests::test_onnx_shape_as_tensor, test/dynamo/test_misc.py::MiscTests::test_optimize_on_module, test/dynamo/test_misc.py::MiscTests::test_ordered_dict_alias_reconstruct, test/dynamo/test_misc.py::MiscTests::test_ordered_dict_move_to_end, test/dynamo/test_misc.py::MiscTests::test_os_environ_get, test/dynamo/test_misc.py::MiscTests::test_os_environ_set_graph_break, test/dynamo/test_misc.py::MiscTests::test_out_variant_custom_op, test/dynamo/test_misc.py::MiscTests::test_out_variants_with_resizing_on_graph_inputs, test/dynamo/test_misc.py::MiscTests::test_out_variants_with_resizing_on_graph_inputs_with_dynamic, test/dynamo/test_misc.py::MiscTests::test_out_variants_with_resizing_on_graph_inputs_with_dynamic1, test/dynamo/test_misc.py::MiscTests::test_outside_linear_module_free, 
test/dynamo/test_misc.py::MiscTests::test_overridden_getattribute, test/dynamo/test_misc.py::MiscTests::test_packaging_version_parse, test/dynamo/test_misc.py::MiscTests::test_pair, test/dynamo/test_misc.py::MiscTests::test_param_shape_binops, test/dynamo/test_misc.py::MiscTests::test_parameter_free, test/dynamo/test_misc.py::MiscTests::test_patched_builtin_functions, test/dynamo/test_misc.py::MiscTests::test_pep0479_convert_stopiteration, test/dynamo/test_misc.py::MiscTests::test_precompile_entries, test/dynamo/test_misc.py::MiscTests::test_precompile_entry_hit, test/dynamo/test_misc.py::MiscTests::test_precompile_entry_miss, test/dynamo/test_misc.py::MiscTests::test_precompile_fail_on_recompile, test/dynamo/test_misc.py::MiscTests::test_proxy_frozen_dataclass, test/dynamo/test_misc.py::MiscTests::test_pt2_compliant_ops_are_allowed, test/dynamo/test_misc.py::MiscTests::test_pt2_compliant_overload, test/dynamo/test_misc.py::MiscTests::test_pure_python_accumulate, test/dynamo/test_misc.py::MiscTests::test_py_guards_mark_dynamic, test/dynamo/test_misc.py::MiscTests::test_python_slice, test/dynamo/test_misc.py::MiscTests::test_raise_guard_full_constraint, test/dynamo/test_misc.py::MiscTests::test_raise_guard_indirect_full_constraint, test/dynamo/test_misc.py::MiscTests::test_raise_guard_partial_constraint_across_break, test/dynamo/test_misc.py::MiscTests::test_raise_guard_partial_constraint_no_graph_break, test/dynamo/test_misc.py::MiscTests::test_raise_on_backend_error, test/dynamo/test_misc.py::MiscTests::test_raises, test/dynamo/test_misc.py::MiscTests::test_raises_importerror1, test/dynamo/test_misc.py::MiscTests::test_raises_importerror2, test/dynamo/test_misc.py::MiscTests::test_range_input, test/dynamo/test_misc.py::MiscTests::test_range_iter_guards, test/dynamo/test_misc.py::MiscTests::test_range_iter_side_effects, test/dynamo/test_misc.py::MiscTests::test_range_with_shape, test/dynamo/test_misc.py::MiscTests::test_real_imag_tensor_attribute, 
test/dynamo/test_misc.py::MiscTests::test_recompile_message_on_parameter, test/dynamo/test_misc.py::MiscTests::test_recompile_on_disable_1, test/dynamo/test_misc.py::MiscTests::test_recompile_on_disable_2, test/dynamo/test_misc.py::MiscTests::test_recompile_on_global_state_change, test/dynamo/test_misc.py::MiscTests::test_reconstruct_frozen_dataclass, test/dynamo/test_misc.py::MiscTests::test_reconstruct_set_across_graph_break, test/dynamo/test_misc.py::MiscTests::test_recursion_depth_guards, test/dynamo/test_misc.py::MiscTests::test_recursive_inline_list_mutation, test/dynamo/test_misc.py::MiscTests::test_recursive_tensor_attribute, test/dynamo/test_misc.py::MiscTests::test_release_input_memory, test/dynamo/test_misc.py::MiscTests::test_release_module_memory, test/dynamo/test_misc.py::MiscTests::test_release_scope_memory, test/dynamo/test_misc.py::MiscTests::test_remove_set, test/dynamo/test_misc.py::MiscTests::test_repeat_interleave_graphbreaks, test/dynamo/test_misc.py::MiscTests::test_repro_graph_breaks_in__get_item_by_idx, test/dynamo/test_misc.py::MiscTests::test_restore_graphstate, test/dynamo/test_misc.py::MiscTests::test_return_dict_with_graph_break_and_update, test/dynamo/test_misc.py::MiscTests::test_return_nested_function, test/dynamo/test_misc.py::MiscTests::test_returning_func_with_captured_func_and_tensor, test/dynamo/test_misc.py::MiscTests::test_returning_nested_func_with_captured_tensor, test/dynamo/test_misc.py::MiscTests::test_running_func_with_captured_func_and_tensor, test/dynamo/test_misc.py::MiscTests::test_running_nested_func_with_captured_tensor, test/dynamo/test_misc.py::MiscTests::test_runtime_assert_replacement, test/dynamo/test_misc.py::MiscTests::test_sample_input, test/dynamo/test_misc.py::MiscTests::test_scalar_device_movement, test/dynamo/test_misc.py::MiscTests::test_scalar_tensor_is_equivalent_to_int_list_argument, test/dynamo/test_misc.py::MiscTests::test_scalar_tensor_is_equivalent_to_symint_argument, 
test/dynamo/test_misc.py::MiscTests::test_scalar_tensor_is_equivalent_to_symint_list_argument, test/dynamo/test_misc.py::MiscTests::test_sequential_module_free, test/dynamo/test_misc.py::MiscTests::test_set_aliasing_recompiles, test/dynamo/test_misc.py::MiscTests::test_set_custom_tensor_attribute, test/dynamo/test_misc.py::MiscTests::test_set_descriptor, test/dynamo/test_misc.py::MiscTests::test_set_discard, test/dynamo/test_misc.py::MiscTests::test_set_update, test/dynamo/test_misc.py::MiscTests::test_setattr_mutation1, test/dynamo/test_misc.py::MiscTests::test_setattr_mutation2, test/dynamo/test_misc.py::MiscTests::test_setattr_mutation3, test/dynamo/test_misc.py::MiscTests::test_shape_and_tuple_equality, test/dynamo/test_misc.py::MiscTests::test_shape_env_equal_constructor, test/dynamo/test_misc.py::MiscTests::test_shape_env_equal_create_symbolic_sizes_strides_storage_offset, test/dynamo/test_misc.py::MiscTests::test_shape_env_equal_empty, test/dynamo/test_misc.py::MiscTests::test_shape_env_equal_evaluate_expr_divisible, test/dynamo/test_misc.py::MiscTests::test_shape_env_equal_evaluate_expr_refinement, test/dynamo/test_misc.py::MiscTests::test_shape_env_equal_evaluate_expr_replacement, test/dynamo/test_misc.py::MiscTests::test_shape_env_equal_runtime_assert, test/dynamo/test_misc.py::MiscTests::test_shape_env_equal_unbacked, test/dynamo/test_misc.py::MiscTests::test_shape_env_no_recording, test/dynamo/test_misc.py::MiscTests::test_shape_env_recorded_function_fallback, test/dynamo/test_misc.py::MiscTests::test_shape_int_comparisons, test/dynamo/test_misc.py::MiscTests::test_shape_int_inplace_binops, test/dynamo/test_misc.py::MiscTests::test_shape_type, test/dynamo/test_misc.py::MiscTests::test_shape_unpack, test/dynamo/test_misc.py::MiscTests::test_side_effects_codegen_update_mutated, test/dynamo/test_misc.py::MiscTests::test_simple_set_usage, test/dynamo/test_misc.py::MiscTests::test_size_dim, test/dynamo/test_misc.py::MiscTests::test_size_input, 
test/dynamo/test_misc.py::MiscTests::test_slice_input, test/dynamo/test_misc.py::MiscTests::test_source_non_input_grad_access, test/dynamo/test_misc.py::MiscTests::test_sourceless_namedtuple, test/dynamo/test_misc.py::MiscTests::test_storage_return, test/dynamo/test_misc.py::MiscTests::test_str_format_assert1, test/dynamo/test_misc.py::MiscTests::test_str_format_assert2, test/dynamo/test_misc.py::MiscTests::test_str_format_return1, test/dynamo/test_misc.py::MiscTests::test_str_format_return2, test/dynamo/test_misc.py::MiscTests::test_stride_dim, test/dynamo/test_misc.py::MiscTests::test_structseq1, test/dynamo/test_misc.py::MiscTests::test_structseq2, test/dynamo/test_misc.py::MiscTests::test_super_after_graph_break, test/dynamo/test_misc.py::MiscTests::test_super_calling_with_metaclass, test/dynamo/test_misc.py::MiscTests::test_sym_and_terms, test/dynamo/test_misc.py::MiscTests::test_sym_constrain_range_on_replaced_unbacked_symbol, test/dynamo/test_misc.py::MiscTests::test_symint_as_device_kwarg_multi_gpu, test/dynamo/test_misc.py::MiscTests::test_symint_as_device_kwarg_non_strict_export, test/dynamo/test_misc.py::MiscTests::test_symint_copy_into_unbacked_slice, test/dynamo/test_misc.py::MiscTests::test_symint_fold_nontrivial_product_modulo, test/dynamo/test_misc.py::MiscTests::test_sys_modules, test/dynamo/test_misc.py::MiscTests::test_tagging_tensors_mix_used_unused_structure, test/dynamo/test_misc.py::MiscTests::test_tagging_tensors_simple, test/dynamo/test_misc.py::MiscTests::test_tensor_build_list_unpack, test/dynamo/test_misc.py::MiscTests::test_tensor_ctor_list_of_tensor, test/dynamo/test_misc.py::MiscTests::test_tensor_data, test/dynamo/test_misc.py::MiscTests::test_tensor_dict1, test/dynamo/test_misc.py::MiscTests::test_tensor_dict2, test/dynamo/test_misc.py::MiscTests::test_tensor_dict3, test/dynamo/test_misc.py::MiscTests::test_tensor_dot_grad_no_graph_break, test/dynamo/test_misc.py::MiscTests::test_tensor_dynamic_method, 
test/dynamo/test_misc.py::MiscTests::test_tensor_hasattr, test/dynamo/test_misc.py::MiscTests::test_tensor_interacts_with_numpy_ndarray, test/dynamo/test_misc.py::MiscTests::test_tensor_is_contiguous, test/dynamo/test_misc.py::MiscTests::test_tensor_item_capture, test/dynamo/test_misc.py::MiscTests::test_tensor_item_no_capture, test/dynamo/test_misc.py::MiscTests::test_tensor_iter, test/dynamo/test_misc.py::MiscTests::test_tensor_layout, test/dynamo/test_misc.py::MiscTests::test_tensor_setattr_getset_descriptor, test/dynamo/test_misc.py::MiscTests::test_tensor_types, test/dynamo/test_misc.py::MiscTests::test_thread_local_setattr, test/dynamo/test_misc.py::MiscTests::test_tolist, test/dynamo/test_misc.py::MiscTests::test_tolist_0d, test/dynamo/test_misc.py::MiscTests::test_tolist_1d, test/dynamo/test_misc.py::MiscTests::test_tolist_float, test/dynamo/test_misc.py::MiscTests::test_tolist_kd, test/dynamo/test_misc.py::MiscTests::test_tolist_kd_dynamic, test/dynamo/test_misc.py::MiscTests::test_tolist_scalar, test/dynamo/test_misc.py::MiscTests::test_top_package_import, test/dynamo/test_misc.py::MiscTests::test_torch_check, test/dynamo/test_misc.py::MiscTests::test_torch_check_nonnegative, test/dynamo/test_misc.py::MiscTests::test_torch_check_symbolic_shape_rel, test/dynamo/test_misc.py::MiscTests::test_torch_compile_ctx_on_forward_and_training_step, test/dynamo/test_misc.py::MiscTests::test_torch_distributions_lazy_property, test/dynamo/test_misc.py::MiscTests::test_torch_dtype_python_type, test/dynamo/test_misc.py::MiscTests::test_torch_dynamo_codegen_pow, test/dynamo/test_misc.py::MiscTests::test_torch_generator_set_state, test/dynamo/test_misc.py::MiscTests::test_torch_guards_stack_frame_register_inlining, test/dynamo/test_misc.py::MiscTests::test_torch_guards_stack_frame_register_inlining_deep, test/dynamo/test_misc.py::MiscTests::test_torch_nn_parameter_isinstance, test/dynamo/test_misc.py::MiscTests::test_torch_objects_as_keys, 
test/dynamo/test_misc.py::MiscTests::test_torch_package_working_with_trace, test/dynamo/test_misc.py::MiscTests::test_torch_seed, test/dynamo/test_misc.py::MiscTests::test_torch_size, test/dynamo/test_misc.py::MiscTests::test_torch_size_numel, test/dynamo/test_misc.py::MiscTests::test_torch_size_numel_dynamic, test/dynamo/test_misc.py::MiscTests::test_torch_variable_hasattr, test/dynamo/test_misc.py::MiscTests::test_trace_ndarray_frame, test/dynamo/test_misc.py::MiscTests::test_trace_ndarray_frame_2, test/dynamo/test_misc.py::MiscTests::test_tuple_class, test/dynamo/test_misc.py::MiscTests::test_tuple_from_tuple_iter, test/dynamo/test_misc.py::MiscTests::test_tuple_hasattr, test/dynamo/test_misc.py::MiscTests::test_tuple_iadd_with_shape, test/dynamo/test_misc.py::MiscTests::test_tuple_mul, test/dynamo/test_misc.py::MiscTests::test_tuple_mul_with_shape, test/dynamo/test_misc.py::MiscTests::test_type_copy, test/dynamo/test_misc.py::MiscTests::test_typing_dict, test/dynamo/test_misc.py::MiscTests::test_typing_typevar, test/dynamo/test_misc.py::MiscTests::test_typing_union_and_optional, test/dynamo/test_misc.py::MiscTests::test_typing_variable_isinstance, test/dynamo/test_misc.py::MiscTests::test_unbacked_2d_expand, test/dynamo/test_misc.py::MiscTests::test_unbacked_empty_tensor, test/dynamo/test_misc.py::MiscTests::test_unbacked_repeat_cat, test/dynamo/test_misc.py::MiscTests::test_unbacked_sources_scalar, test/dynamo/test_misc.py::MiscTests::test_unbacked_sources_tensor, test/dynamo/test_misc.py::MiscTests::test_unbacked_strict_mode, test/dynamo/test_misc.py::MiscTests::test_unbacked_symint_split, test/dynamo/test_misc.py::MiscTests::test_unhandled_exception_in_dynamo, test/dynamo/test_misc.py::MiscTests::test_unhandled_exception_in_dynamo2, test/dynamo/test_misc.py::MiscTests::test_unique_consecutive, test/dynamo/test_misc.py::MiscTests::test_unpack4, test/dynamo/test_misc.py::MiscTests::test_unpack5, 
test/dynamo/test_misc.py::MiscTests::test_unpack_tensor_shape_mismatch, test/dynamo/test_misc.py::MiscTests::test_update_locals_and_stack_uses_shared_cache, test/dynamo/test_misc.py::MiscTests::test_user_code_statically_known, test/dynamo/test_misc.py::MiscTests::test_user_defined_binop, test/dynamo/test_misc.py::MiscTests::test_user_defined_class_name, test/dynamo/test_misc.py::MiscTests::test_user_defined_class_python_type, test/dynamo/test_misc.py::MiscTests::test_user_defined_iter, test/dynamo/test_misc.py::MiscTests::test_user_defined_object_class_interaction, test/dynamo/test_misc.py::MiscTests::test_user_defined_setattr1, test/dynamo/test_misc.py::MiscTests::test_user_defined_setattr2, test/dynamo/test_misc.py::MiscTests::test_user_function_variable_supports_enum_argument, test/dynamo/test_misc.py::MiscTests::test_user_function_variable_supports_function_argument, test/dynamo/test_misc.py::MiscTests::test_user_function_variable_supports_type_abcmeta_argument, test/dynamo/test_misc.py::MiscTests::test_user_getattr1, test/dynamo/test_misc.py::MiscTests::test_user_getattr2, test/dynamo/test_misc.py::MiscTests::test_user_getattribute, test/dynamo/test_misc.py::MiscTests::test_user_property, test/dynamo/test_misc.py::MiscTests::test_usr_cls_classmethod, test/dynamo/test_misc.py::MiscTests::test_usr_cls_staticmethod, test/dynamo/test_misc.py::MiscTests::test_validate_outputs_unbacked, test/dynamo/test_misc.py::MiscTests::test_validate_outputs_unbacked_by_custom_op, test/dynamo/test_misc.py::MiscTests::test_variable_access_in_exception, test/dynamo/test_misc.py::MiscTests::test_variable_tracker_recursively_contains, test/dynamo/test_misc.py::MiscTests::test_version_ci, test/dynamo/test_misc.py::MiscTests::test_with_builtin_type, test/dynamo/test_misc.py::MiscTests::test_write_to_cells_with_name_shadowing, test/dynamo/test_misc.py::MiscTests::test_write_to_closures_in_inlining, test/dynamo/test_misc.py::MiscTests::test_writes_to_cells_across_frames1, 
test/dynamo/test_misc.py::MiscTests::test_writes_to_cells_across_frames2, test/dynamo/test_misc.py::MiscTests::test_yield_from, test/dynamo/test_misc.py::MiscTests::test_yield_from_in_a_loop, test/dynamo/test_misc.py::MiscTests::test_yield_from_user_stop_iteration, test/dynamo/test_misc.py::MiscTests::test_yield_gen_and_from, test/dynamo/test_misc.py::MiscTests::test_yield_send_to_subgenerator_graph_break, test/dynamo/test_misc.py::MiscTestsPyTree::test_pytree_tree_flatten_unflatten_cxx, test/dynamo/test_misc.py::MiscTestsPyTree::test_pytree_tree_flatten_unflatten_python, test/dynamo/test_misc.py::MiscTestsPyTree::test_pytree_tree_leaves_cxx, test/dynamo/test_misc.py::MiscTestsPyTree::test_pytree_tree_leaves_python, test/dynamo/test_misc.py::MiscTestsPyTree::test_pytree_tree_map_cxx, test/dynamo/test_misc.py::MiscTestsPyTree::test_pytree_tree_map_only_cxx, test/dynamo/test_misc.py::MiscTestsPyTree::test_pytree_tree_map_only_python, test/dynamo/test_misc.py::MiscTestsPyTree::test_pytree_tree_map_python, test/dynamo/test_misc.py::MiscTestsPyTree::test_tracing_nested_dicts_cxx, test/dynamo/test_misc.py::MiscTestsPyTree::test_tracing_nested_dicts_python, test/dynamo/test_misc.py::MiscTestsPyTree::test_tracing_nested_mixed_all_cxx, test/dynamo/test_misc.py::MiscTestsPyTree::test_tracing_nested_mixed_all_python, test/dynamo/test_misc.py::MiscTestsPyTree::test_tracing_nested_pytree_cxx, test/dynamo/test_misc.py::MiscTestsPyTree::test_tracing_nested_pytree_python, test/dynamo/test_misc.py::MiscTestsPyTree::test_tracing_nested_tensor_subclass_cxx, test/dynamo/test_misc.py::MiscTestsPyTree::test_tracing_nested_tensor_subclass_python, test/dynamo/test_misc.py::MiscTestsPyTree::test_tracing_nested_tuples_cxx, test/dynamo/test_misc.py::MiscTestsPyTree::test_tracing_nested_tuples_python, test/dynamo/test_misc.py::MiscTestsPyTree::test_tracing_pytree_cxx, test/dynamo/test_misc.py::MiscTestsPyTree::test_tracing_pytree_python, test/dynamo/test_misc.py::TestTracer::test_jit_save, 
test/dynamo/test_misc.py::TestCustomFunction::test_autograd_function_with_matmul_folding_at_output, test/dynamo/test_misc.py::TestCustomFunction::test_retain_grad, test/dynamo/test_misc.py::MiscTestsDeviceCUDA::test_dynamic_fill_diagonal__cuda, test/dynamo/test_misc.py::MiscTestsDeviceCUDA::test_dynamic_float_scalar_tensor_coersion_cuda, test/dynamo/test_misc.py::MiscTestsDeviceCUDA::test_full_graph_capture_dynamic_output_shape_ops_cuda, test/dynamo/test_misc.py::MiscTestsDeviceCUDA::test_full_graph_capture_scalar_outputs_cuda, test/dynamo/test_misc.py::MiscTestsDeviceCUDA::test_get_device_cuda, test/dynamo/test_misc.py::MiscTestsDeviceCUDA::test_gpu_set_device_cuda, test/dynamo/test_misc.py::MiscTestsDeviceCUDA::test_interpolate_propagate_real_tensors_cuda, test/dynamo/test_misc.py::MiscTestsDeviceCUDA::test_legacy_cuda_tensor_cuda, test/dynamo/test_misc.py::MiscTestsDeviceCUDA::test_parsing_sdpa_cuda, test/dynamo/test_misc.py::MiscTestsDeviceCUDA::test_rand_cuda, test/dynamo/test_misc.py::MiscTestsDeviceCUDA::test_randint_no_graphbreak_cuda, test/dynamo/test_misc.py::MiscTestsDeviceCUDA::test_scalar_isin_decomposition_cuda, test/dynamo/test_misc.py::MiscTestsDeviceCUDA::test_symint_as_device_kwarg_cuda, test/dynamo/test_misc.py::MiscTestsDeviceCUDA::test_torch_cudnn_is_acceptable_bad_inputs_cuda, test/dynamo/test_misc.py::MiscTestsDeviceCUDA::test_torch_cudnn_is_acceptable_cuda, test/dynamo/test_misc.py::MiscTestsDeviceCUDA::test_torch_device_is_available_cuda, test/dynamo/test_misc.py::MiscTestsDeviceCUDA::test_torch_device_python_type_cuda 2025-10-10T02:05:34.8108322Z 2025-10-10T02:05:35.9946646Z 2025-10-10T02:05:35.9947768Z export/test_sparse 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_sparse_1.1_468d13a07751fdee_.log 2025-10-10T02:05:36.0007887Z Running 203 items in this shard: test/export/test_sparse.py::TestSparseProp::test_activation_coo, test/export/test_sparse.py::TestSparseProp::test_activation_csr, 
test/export/test_sparse.py::TestSparseProp::test_add, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_bfloat16_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_bfloat16_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_bfloat16_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_bfloat16_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_bfloat16_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_bfloat16_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_bfloat16_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_bfloat16_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_bfloat16_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_bfloat16_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float16_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float16_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float16_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float16_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float16_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float16_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float16_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float16_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float16_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float16_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float32_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float32_int32_SparseBSR, 
test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float32_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float32_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float32_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float32_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float32_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float32_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float32_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float32_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float64_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float64_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float64_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float64_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float64_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float64_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float64_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float64_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float64_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float64_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_int64_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_int64_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_int64_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_int64_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_int64_int32_SparseCSR, 
test/export/test_sparse.py::TestSparseProp::test_eltwisenet_int64_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_int64_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_int64_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_int64_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_int64_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_idnet_bfloat16_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_idnet_bfloat16_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_idnet_bfloat16_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_idnet_bfloat16_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_idnet_bfloat16_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_idnet_bfloat16_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_idnet_bfloat16_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_idnet_bfloat16_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_idnet_bfloat16_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_idnet_bfloat16_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_idnet_float16_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_idnet_float16_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_idnet_float16_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_idnet_float16_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_idnet_float16_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_idnet_float16_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_idnet_float16_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_idnet_float16_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_idnet_float16_int64_SparseCSC, 
test/export/test_sparse.py::TestSparseProp::test_idnet_float16_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_idnet_float32_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_idnet_float32_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_idnet_float32_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_idnet_float32_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_idnet_float32_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_idnet_float32_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_idnet_float32_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_idnet_float32_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_idnet_float32_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_idnet_float32_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_idnet_float64_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_idnet_float64_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_idnet_float64_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_idnet_float64_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_idnet_float64_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_idnet_float64_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_idnet_float64_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_idnet_float64_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_idnet_float64_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_idnet_float64_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_idnet_int64_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_idnet_int64_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_idnet_int64_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_idnet_int64_int32_SparseCSC, 
test/export/test_sparse.py::TestSparseProp::test_idnet_int64_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_idnet_int64_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_idnet_int64_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_idnet_int64_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_idnet_int64_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_idnet_int64_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_bfloat16_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_bfloat16_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_bfloat16_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_sumnet_bfloat16_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_bfloat16_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_bfloat16_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_bfloat16_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_bfloat16_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_sumnet_bfloat16_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_bfloat16_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_float16_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_float16_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_float16_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_sumnet_float16_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_float16_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_float16_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_float16_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_float16_int64_SparseCOO, 
test/export/test_sparse.py::TestSparseProp::test_sumnet_float16_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_float16_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_float32_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_float32_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_float32_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_sumnet_float32_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_float32_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_float32_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_float32_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_float32_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_sumnet_float32_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_float32_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_float64_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_float64_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_float64_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_sumnet_float64_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_float64_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_float64_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_float64_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_float64_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_sumnet_float64_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_float64_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_int64_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_int64_int32_SparseBSR, 
test/export/test_sparse.py::TestSparseProp::test_sumnet_int64_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_sumnet_int64_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_int64_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_int64_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_int64_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_int64_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_sumnet_int64_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_int64_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_bfloat16_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_bfloat16_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_bfloat16_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_todensenet_bfloat16_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_bfloat16_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_bfloat16_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_bfloat16_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_bfloat16_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_todensenet_bfloat16_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_bfloat16_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_float16_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_float16_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_float16_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_todensenet_float16_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_float16_int32_SparseCSR, 
test/export/test_sparse.py::TestSparseProp::test_todensenet_float16_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_float16_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_float16_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_todensenet_float16_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_float16_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_float32_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_float32_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_float32_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_todensenet_float32_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_float32_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_float32_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_float32_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_float32_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_todensenet_float32_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_float32_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_float64_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_float64_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_float64_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_todensenet_float64_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_float64_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_float64_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_float64_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_float64_int64_SparseCOO, 
test/export/test_sparse.py::TestSparseProp::test_todensenet_float64_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_float64_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_int64_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_int64_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_int64_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_todensenet_int64_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_int64_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_int64_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_int64_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_int64_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_todensenet_int64_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_int64_int64_SparseCSR 2025-10-10T02:05:36.0067109Z 2025-10-10T02:05:37.6301561Z Running dynamo/test_comptime 1/1 ... [2025-10-10 02:05:37.629547] 2025-10-10T02:05:37.6301999Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:05:37.6303292Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_comptime.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:05:37.629936] 2025-10-10T02:05:38.7266010Z Running dynamo/test_python_autograd 1/1 ... [2025-10-10 02:05:38.725981] 2025-10-10T02:05:38.7266655Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:05:38.7267887Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_python_autograd.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:05:38.726408] 2025-10-10T02:05:39.8799268Z Running functorch/test_rearrange 1/1 ... [2025-10-10 02:05:39.879433] 2025-10-10T02:05:39.8799873Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:05:39.8801729Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'functorch/test_rearrange.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:05:39.879808] 2025-10-10T02:05:41.8535525Z 2025-10-10T02:05:41.8536626Z dynamo/test_comptime 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_comptime_1.1_12f0992c6518ae4b_.log 2025-10-10T02:05:41.8540374Z Running 12 items in this shard: test/dynamo/test_comptime.py::ComptimeTests::test_get_local, test/dynamo/test_comptime.py::ComptimeTests::test_get_local_closure_variable, test/dynamo/test_comptime.py::ComptimeTests::test_graph_break, test/dynamo/test_comptime.py::ComptimeTests::test_print_bt, test/dynamo/test_comptime.py::ComptimeTests::test_print_direct, test/dynamo/test_comptime.py::ComptimeTests::test_print_disas, test/dynamo/test_comptime.py::ComptimeTests::test_print_graph, test/dynamo/test_comptime.py::ComptimeTests::test_print_guards, test/dynamo/test_comptime.py::ComptimeTests::test_print_locals, test/dynamo/test_comptime.py::ComptimeTests::test_print_single, test/dynamo/test_comptime.py::ComptimeTests::test_print_value_stack, test/dynamo/test_comptime.py::ComptimeTests::test_sleep 2025-10-10T02:05:41.8543281Z 2025-10-10T02:05:42.9003917Z 2025-10-10T02:05:42.9005200Z dynamo/test_python_autograd 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_python_autograd_1.1_baf474ecf16b23fa_.log 2025-10-10T02:05:42.9008122Z Running 5 items in this shard: test/dynamo/test_python_autograd.py::TestPythonAutograd::test_backwards1, 
test/dynamo/test_python_autograd.py::TestPythonAutograd::test_backwards2, test/dynamo/test_python_autograd.py::TestPythonAutograd::test_forwards1, test/dynamo/test_python_autograd.py::TestPythonAutograd::test_forwards2, test/dynamo/test_python_autograd.py::TestPythonAutograd::test_split 2025-10-10T02:05:42.9009973Z 2025-10-10T02:05:43.8530361Z 2025-10-10T02:05:43.8531181Z functorch/test_rearrange 1/1 was successful, full logs can be found in artifacts with path test/test-reports/functorch.test_rearrange_1.1_0bdbdf8367fdfb7a_.log 2025-10-10T02:05:43.8534606Z Running 10 items in this shard: test/functorch/test_rearrange.py::TestRearrange::test_0_dim_tensor, test/functorch/test_rearrange.py::TestRearrange::test_collapsed_ellipsis_errors_out, test/functorch/test_rearrange.py::TestRearrange::test_concatenations_and_stacking, test/functorch/test_rearrange.py::TestRearrange::test_dimension_mismatch_no_ellipsis, test/functorch/test_rearrange.py::TestRearrange::test_dimension_mismatch_with_ellipsis, test/functorch/test_rearrange.py::TestRearrange::test_ellipsis_ops, test/functorch/test_rearrange.py::TestRearrange::test_rearrange_consistency, test/functorch/test_rearrange.py::TestRearrange::test_rearrange_permutations, test/functorch/test_rearrange.py::TestRearrange::test_squeeze, test/functorch/test_rearrange.py::TestRearrange::test_unsqueeze 2025-10-10T02:05:43.8537515Z 2025-10-10T02:05:45.8238021Z Running functorch/test_parsing 1/1 ... [2025-10-10 02:05:45.823093] 2025-10-10T02:05:45.8238461Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:05:45.8240532Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'functorch/test_parsing.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:05:45.823471] 2025-10-10T02:05:46.8639727Z Running test_package 1/1 ... 
[2025-10-10 02:05:46.863493] 2025-10-10T02:05:46.8640161Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:05:46.8643524Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_package.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:05:46.863882] 2025-10-10T02:05:47.7861661Z Running test_comparison_utils 1/1 ... [2025-10-10 02:05:47.785596] 2025-10-10T02:05:47.7862106Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:05:47.7863919Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_comparison_utils.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:05:47.786000] 2025-10-10T02:05:49.7968581Z 2025-10-10T02:05:49.7969377Z functorch/test_parsing 1/1 was successful, full logs can be found in artifacts with path test/test-reports/functorch.test_parsing_1.1_5f2548c18dc309d7_.log 2025-10-10T02:05:49.7974264Z Running 12 items in this shard: test/functorch/test_parsing.py::TestAnonymousAxis::test_anonymous_axes, test/functorch/test_parsing.py::TestParsedExpression::test_elementary_axis_name, test/functorch/test_parsing.py::TestParsedExpression::test_invalid_expressions, test/functorch/test_parsing.py::TestParsedExpression::test_parse_expression, test/functorch/test_parsing.py::TestParsingUtils::test_ellipsis_invalid_identifier, test/functorch/test_parsing.py::TestParsingUtils::test_ellipsis_matching, test/functorch/test_parsing.py::TestParsingUtils::test_left_parenthesized_ellipsis, test/functorch/test_parsing.py::TestParsingUtils::test_parse_pattern_number_of_arrows, test/functorch/test_parsing.py::TestValidateRearrangeExpressions::test_identifier_mismatch, 
test/functorch/test_parsing.py::TestValidateRearrangeExpressions::test_non_unitary_anonymous_axes_raises_error, test/functorch/test_parsing.py::TestValidateRearrangeExpressions::test_unexpected_axes_lengths, test/functorch/test_parsing.py::TestValidateRearrangeExpressions::test_validate_axes_lengths_are_integers 2025-10-10T02:05:49.7978660Z 2025-10-10T02:05:51.2877567Z 2025-10-10T02:05:51.2878749Z test_package 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_package_1.1_cbd9eae8a849bf4d_.log 2025-10-10T02:05:51.2911180Z Running 137 items in this shard: test/test_package.py::TestAnalyze::test_trace_dependencies, test/test_package.py::TestDependencyAPI::test_allow_empty_with_error, test/test_package.py::TestDependencyAPI::test_broken_dependency, test/test_package.py::TestDependencyAPI::test_deny, test/test_package.py::TestDependencyAPI::test_deny_glob, test/test_package.py::TestDependencyAPI::test_extern, test/test_package.py::TestDependencyAPI::test_extern_glob, test/test_package.py::TestDependencyAPI::test_extern_glob_allow_empty, test/test_package.py::TestDependencyAPI::test_externing_c_extension, test/test_package.py::TestDependencyAPI::test_implicit_intern, test/test_package.py::TestDependencyAPI::test_intern_error, test/test_package.py::TestDependencyAPI::test_invalid_import, test/test_package.py::TestDependencyAPI::test_mock, test/test_package.py::TestDependencyAPI::test_mock_glob, test/test_package.py::TestDependencyAPI::test_mock_glob_allow_empty, test/test_package.py::TestDependencyAPI::test_pickle_mocked, test/test_package.py::TestDependencyAPI::test_pickle_mocked_all, test/test_package.py::TestDependencyAPI::test_repackage_mocked_module, test/test_package.py::TestDependencyHooks::test_extern_and_mock_hook, test/test_package.py::TestDependencyHooks::test_multiple_extern_hooks, test/test_package.py::TestDependencyHooks::test_multiple_mock_hooks, test/test_package.py::TestDependencyHooks::test_remove_hooks, 
test/test_package.py::TestDependencyHooks::test_single_hook, test/test_package.py::TestDiGraph::test_all_paths, test/test_package.py::TestDiGraph::test_contains, test/test_package.py::TestDiGraph::test_contains_non_hashable, test/test_package.py::TestDiGraph::test_edges, test/test_package.py::TestDiGraph::test_forward_closure, test/test_package.py::TestDiGraph::test_iter, test/test_package.py::TestDiGraph::test_node_attr_update, test/test_package.py::TestDiGraph::test_node_attrs, test/test_package.py::TestDiGraph::test_predecessor_not_in_graph, test/test_package.py::TestDiGraph::test_predecessors, test/test_package.py::TestDiGraph::test_successor_not_in_graph, test/test_package.py::TestDiGraph::test_successors, test/test_package.py::DirectoryReaderTest::test_importer_access, test/test_package.py::DirectoryReaderTest::test_loading_has_record, test/test_package.py::DirectoryReaderTest::test_loading_module, test/test_package.py::DirectoryReaderTest::test_loading_pickle, test/test_package.py::DirectoryReaderTest::test_package_resource_access, test/test_package.py::DirectoryReaderTest::test_resource_access_by_path, test/test_package.py::DirectoryReaderTest::test_resource_reader, test/test_package.py::DirectoryReaderTest::test_scriptobject_failure_message, test/test_package.py::TestGlobGroup::test_exclude, test/test_package.py::TestGlobGroup::test_exclude_from_all, test/test_package.py::TestGlobGroup::test_invalid_raw, test/test_package.py::TestGlobGroup::test_list_include_exclude, test/test_package.py::TestGlobGroup::test_one_star, test/test_package.py::TestGlobGroup::test_one_star_middle, test/test_package.py::TestGlobGroup::test_one_star_multiple_in_component, test/test_package.py::TestGlobGroup::test_one_star_partial, test/test_package.py::TestGlobGroup::test_one_star_partial_extension, test/test_package.py::TestGlobGroup::test_raw_two_star, test/test_package.py::TestGlobGroup::test_two_star, test/test_package.py::TestGlobGroup::test_two_star_end, 
test/test_package.py::TestGlobGroup::test_two_star_middle, test/test_package.py::TestGlobGroup::test_two_star_multiple, test/test_package.py::TestImporter::test_ordered_importer_basic, test/test_package.py::TestImporter::test_ordered_importer_whichmodule, test/test_package.py::TestImporter::test_package_importer_whichmodule_no_dunder_module, test/test_package.py::TestImporter::test_single_ordered_importer, test/test_package.py::TestImporter::test_sys_importer, test/test_package.py::TestImporter::test_sys_importer_roundtrip, test/test_package.py::TestLoadBCPackages::test_load_bc_packages_fx_module, test/test_package.py::TestLoadBCPackages::test_load_bc_packages_nn_module, test/test_package.py::TestLoadBCPackages::test_load_bc_packages_torchscript_module, test/test_package.py::TestMangling::test_demangle_base, test/test_package.py::TestMangling::test_demangler_multiple_manglers, test/test_package.py::TestMangling::test_is_mangled, test/test_package.py::TestMangling::test_mangle_empty_errors, test/test_package.py::TestMangling::test_mangle_prefix, test/test_package.py::TestMangling::test_mangler_is_consistent, test/test_package.py::TestMangling::test_package_mangler, test/test_package.py::TestMangling::test_roundtrip_mangling, test/test_package.py::TestMangling::test_unique_manglers, test/test_package.py::TestMangling::test_unique_module_names, test/test_package.py::TestMisc::test_dunder_package_present, test/test_package.py::TestMisc::test_dunder_package_works_from_package, test/test_package.py::TestMisc::test_exporter_content_lists, test/test_package.py::TestMisc::test_file_structure, test/test_package.py::TestMisc::test_file_structure_has_file, test/test_package.py::TestMisc::test_inspect_class, test/test_package.py::TestMisc::test_is_from_package, test/test_package.py::TestMisc::test_load_python_version_from_package, test/test_package.py::TestMisc::test_loaders_that_remap_files_work_ok, test/test_package.py::TestMisc::test_python_version, 
test/test_package.py::TestMisc::test_std_lib_sys_hackery_checks, test/test_package.py::ModelTest::test_model_save, test/test_package.py::ModelTest::test_resnet, test/test_package.py::ModelTest::test_script_resnet, test/test_package.py::TestPackageFX::test_package_fx_custom_tracer, test/test_package.py::TestPackageFX::test_package_fx_package, test/test_package.py::TestPackageFX::test_package_fx_simple, test/test_package.py::TestPackageFX::test_package_fx_with_imports, test/test_package.py::TestPackageFX::test_package_fx_wrap, test/test_package.py::TestPackageFX::test_package_gm_preserve_stack_trace, test/test_package.py::TestPackageFX::test_package_then_fx, test/test_package.py::TestPackageScript::test_different_package_interface, test/test_package.py::TestPackageScript::test_different_package_script_class, test/test_package.py::TestPackageScript::test_load_shared_scriptmodules, test/test_package.py::TestPackageScript::test_load_shared_tensors, test/test_package.py::TestPackageScript::test_load_shared_tensors_repackaged, test/test_package.py::TestPackageScript::test_mixing_packaged_and_inline_modules, test/test_package.py::TestPackageScript::test_mixing_packaged_and_inline_modules_shared_code, test/test_package.py::TestPackageScript::test_package_interface, test/test_package.py::TestPackageScript::test_package_script_class, test/test_package.py::TestPackageScript::test_package_script_class_referencing_self, test/test_package.py::TestPackageScript::test_save_eager_mods_sharing_scriptmodule, test/test_package.py::TestPackageScript::test_save_independent_scriptmodules, test/test_package.py::TestPackageScript::test_save_repeat_scriptmodules, test/test_package.py::TestPackageScript::test_save_scriptmodule, test/test_package.py::TestPackageScript::test_save_scriptmodule_file, test/test_package.py::TestPackageScript::test_save_scriptmodule_only_necessary_code, test/test_package.py::TestPackageScript::test_save_scriptmodule_with_submods, 
test/test_package.py::TestPackageScript::test_save_scriptmodules_in_container, test/test_package.py::TestPackageScript::test_save_scriptmodules_submod_redefinition, test/test_package.py::TestPackageScript::test_save_shared_tensors, test/test_package.py::TestPackageScript::test_saving_and_scripting_packaged_mod, test/test_package.py::TestPackageScript::test_scriptmodules_repeat_save, test/test_package.py::TestPackageScript::test_tensor_sharing_pickle, test/test_package.py::TestRepackage::test_repackage_import_indirectly_via_parent_module, test/test_package.py::TestResources::test_importer_access, test/test_package.py::TestResources::test_package_resource_access, test/test_package.py::TestResources::test_resource_access_by_path, test/test_package.py::TestResources::test_resource_reader, test/test_package.py::TestSaveLoad::test_bad_dunder_imports, test/test_package.py::TestSaveLoad::test_dunder_imports, test/test_package.py::TestSaveLoad::test_exporting_mismatched_code, test/test_package.py::TestSaveLoad::test_pickle, test/test_package.py::TestSaveLoad::test_pickle_long_name_with_protocol_4, test/test_package.py::TestSaveLoad::test_save_imported_module, test/test_package.py::TestSaveLoad::test_save_imported_module_using_package_importer, test/test_package.py::TestSaveLoad::test_save_load_fp8, test/test_package.py::TestSaveLoad::test_save_module, test/test_package.py::TestSaveLoad::test_save_module_binary, test/test_package.py::TestSaveLoad::test_saving_source, test/test_package.py::TestSaveLoad::test_saving_string 2025-10-10T02:05:51.2941491Z 2025-10-10T02:05:51.6217464Z 2025-10-10T02:05:51.6218361Z test_comparison_utils 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_comparison_utils_1.1_abb85c565d058c55_.log 2025-10-10T02:05:51.6221184Z Running 7 items in this shard: test/test_comparison_utils.py::TestComparisonUtils::test_all_equal_no_assert, test/test_comparison_utils.py::TestComparisonUtils::test_all_equal_no_assert_nones, 
test/test_comparison_utils.py::TestComparisonUtils::test_assert_device, test/test_comparison_utils.py::TestComparisonUtils::test_assert_dtype, test/test_comparison_utils.py::TestComparisonUtils::test_assert_layout, test/test_comparison_utils.py::TestComparisonUtils::test_assert_sizes, test/test_comparison_utils.py::TestComparisonUtils::test_assert_strides 2025-10-10T02:05:51.6223170Z 2025-10-10T02:05:53.7455757Z Running test_mkl_verbose 1/1 ... [2025-10-10 02:05:53.745092] 2025-10-10T02:05:53.7456328Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:05:53.7458196Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_mkl_verbose.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:05:53.745466] 2025-10-10T02:05:55.2301920Z Running functorch/test_ac_logging 1/1 ... [2025-10-10 02:05:55.229572] 2025-10-10T02:05:55.2302559Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:05:55.2304202Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'functorch/test_ac_logging.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:05:55.230041] 2025-10-10T02:05:55.5472635Z Running test_mkldnn_verbose 1/1 ... [2025-10-10 02:05:55.546715] 2025-10-10T02:05:55.5473223Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:05:55.5474759Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_mkldnn_verbose.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:05:55.547087] 2025-10-10T02:05:57.6183713Z 2025-10-10T02:05:57.6184434Z test_mkl_verbose 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_mkl_verbose_1.1_e7c198c0f1e698db_.log 2025-10-10T02:05:57.6185733Z Running 2 items in this shard: test/test_mkl_verbose.py::TestMKLVerbose::test_verbose_off, test/test_mkl_verbose.py::TestMKLVerbose::test_verbose_on 2025-10-10T02:05:57.6186373Z 2025-10-10T02:05:58.8531346Z 2025-10-10T02:05:58.8532600Z functorch/test_ac_logging 1/1 was successful, full logs can be found in artifacts with path test/test-reports/functorch.test_ac_logging_1.1_d0124ae9a3552e21_.log 2025-10-10T02:05:58.8534749Z Running 4 items in this shard: test/functorch/test_ac_logging.py::TestAcLogging::test_create_activation_checkpointing_logging_structure_payload, test/functorch/test_ac_logging.py::TestAcLogging::test_create_joint_graph_edges, test/functorch/test_ac_logging.py::TestAcLogging::test_create_joint_graph_node_information, test/functorch/test_ac_logging.py::TestAcLogging::test_create_structured_trace_for_min_cut_info 2025-10-10T02:05:58.8536510Z 2025-10-10T02:05:59.4205336Z 2025-10-10T02:05:59.4206386Z test_mkldnn_verbose 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_mkldnn_verbose_1.1_202119cb08eecedf_.log 2025-10-10T02:05:59.4207835Z Running 2 items in this shard: test/test_mkldnn_verbose.py::TestMKLDNNVerbose::test_verbose_off, test/test_mkldnn_verbose.py::TestMKLDNNVerbose::test_verbose_on 2025-10-10T02:05:59.4208501Z 2025-10-10T02:06:01.5683633Z Running profiler/test_kineto 1/1 ... 
[2025-10-10 02:06:01.567776] 2025-10-10T02:06:01.5684203Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:06:01.5686039Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'profiler/test_kineto.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:06:01.568223] 2025-10-10T02:06:02.7427267Z Running test_matmul_cuda 1/1 ... [2025-10-10 02:06:02.742110] 2025-10-10T02:06:02.7427878Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:06:02.7429863Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_matmul_cuda.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:06:02.742598] 2025-10-10T02:06:03.3380783Z Running test_transformers 1/1 ... [2025-10-10 02:06:03.337571] 2025-10-10T02:06:03.3381236Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:06:03.3383013Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_transformers.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:06:03.337951] 2025-10-10T02:06:05.3911099Z 2025-10-10T02:06:05.3912316Z profiler/test_kineto 1/1 was successful, full logs can be found in artifacts with path test/test-reports/profiler.test_kineto_1.1_bd1f24c6684fac74_.log 2025-10-10T02:06:05.3913522Z Running 1 items in this shard: test/profiler/test_kineto.py::SimpleKinetoInitializationTest::test_kineto_profiler_with_environment_variable 2025-10-10T02:06:05.3914130Z 2025-10-10T02:06:08.8200096Z 2025-10-10T02:06:08.8201022Z test_matmul_cuda 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_matmul_cuda_1.1_4558612648566e2a_.log 2025-10-10T02:06:08.8736130Z Running 1197 items in this shard: test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_1_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_1_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_1_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_1_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_1_K_1_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_1_K_1_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_1_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_1_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_1_K_32_batch_size_1_backend_cublas_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_1_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_1_K_32_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_1_K_32_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_1_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_1_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_1_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_1_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_1_K_64_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_1_K_64_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_32_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_32_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_32_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_32_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_32_K_1_batch_size_32_backend_cublas_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_32_K_1_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_32_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_32_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_32_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_32_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_32_K_32_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_32_K_32_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_32_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_32_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_32_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_32_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_32_K_64_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_32_K_64_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_64_K_1_batch_size0_backend_cublas_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_64_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_64_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_64_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_64_K_1_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_64_K_1_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_64_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_64_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_64_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_64_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_64_K_32_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_64_K_32_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_64_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_64_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_64_K_64_batch_size_1_backend_cublas_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_64_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_64_K_64_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_64_K_64_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_1_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_1_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_1_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_1_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_1_K_1_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_1_K_1_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_1_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_1_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_1_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_1_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_1_K_32_batch_size_32_backend_cublas_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_1_K_32_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_1_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_1_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_1_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_1_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_1_K_64_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_1_K_64_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_32_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_32_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_32_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_32_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_32_K_1_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_32_K_1_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_32_K_32_batch_size0_backend_cublas_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_32_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_32_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_32_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_32_K_32_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_32_K_32_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_32_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_32_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_32_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_32_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_32_K_64_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_32_K_64_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_64_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_64_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_64_K_1_batch_size_1_backend_cublas_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_64_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_64_K_1_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_64_K_1_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_64_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_64_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_64_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_64_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_64_K_32_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_64_K_32_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_64_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_64_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_64_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_64_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_64_K_64_batch_size_32_backend_cublas_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_64_K_64_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_1_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_1_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_1_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_1_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_1_K_1_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_1_K_1_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_1_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_1_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_1_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_1_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_1_K_32_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_1_K_32_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_1_K_64_batch_size0_backend_cublas_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_1_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_1_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_1_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_1_K_64_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_1_K_64_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_32_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_32_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_32_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_32_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_32_K_1_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_32_K_1_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_32_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_32_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_32_K_32_batch_size_1_backend_cublas_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_32_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_32_K_32_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_32_K_32_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_32_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_32_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_32_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_32_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_32_K_64_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_32_K_64_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_64_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_64_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_64_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_64_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_64_K_1_batch_size_32_backend_cublas_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_64_K_1_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_64_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_64_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_64_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_64_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_64_K_32_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_64_K_32_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_64_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_64_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_64_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_64_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_64_K_64_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_64_K_64_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_1_K_1_batch_size0_backend_cublas_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_1_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_1_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_1_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_1_K_1_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_1_K_1_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_1_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_1_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_1_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_1_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_1_K_32_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_1_K_32_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_1_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_1_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_1_K_64_batch_size_1_backend_cublas_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_1_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_1_K_64_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_1_K_64_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_32_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_32_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_32_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_32_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_32_K_1_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_32_K_1_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_32_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_32_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_32_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_32_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_32_K_32_batch_size_32_backend_cublas_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_32_K_32_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_32_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_32_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_32_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_32_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_32_K_64_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_32_K_64_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_64_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_64_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_64_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_64_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_64_K_1_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_64_K_1_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_64_K_32_batch_size0_backend_cublas_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_64_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_64_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_64_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_64_K_32_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_64_K_32_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_64_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_64_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_64_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_64_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_64_K_64_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_64_K_64_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_1_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_1_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_1_K_1_batch_size_1_backend_cublas_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_1_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_1_K_1_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_1_K_1_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_1_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_1_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_1_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_1_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_1_K_32_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_1_K_32_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_1_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_1_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_1_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_1_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_1_K_64_batch_size_32_backend_cublas_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_1_K_64_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_32_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_32_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_32_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_32_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_32_K_1_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_32_K_1_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_32_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_32_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_32_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_32_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_32_K_32_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_32_K_32_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_32_K_64_batch_size0_backend_cublas_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_32_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_32_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_32_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_32_K_64_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_32_K_64_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_64_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_64_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_64_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_64_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_64_K_1_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_64_K_1_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_64_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_64_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_64_K_32_batch_size_1_backend_cublas_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_64_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_64_K_32_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_64_K_32_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_64_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_64_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_64_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_64_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_64_K_64_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_64_K_64_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_1_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_1_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_1_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_1_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_1_K_1_batch_size_32_backend_cublas_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_1_K_1_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_1_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_1_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_1_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_1_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_1_K_32_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_1_K_32_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_1_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_1_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_1_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_1_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_1_K_64_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_1_K_64_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_32_K_1_batch_size0_backend_cublas_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_32_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_32_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_32_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_32_K_1_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_32_K_1_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_32_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_32_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_32_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_32_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_32_K_32_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_32_K_32_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_32_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_32_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_32_K_64_batch_size_1_backend_cublas_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_32_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_32_K_64_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_32_K_64_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_64_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_64_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_64_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_64_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_64_K_1_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_64_K_1_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_64_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_64_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_64_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_64_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_64_K_32_batch_size_32_backend_cublas_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_64_K_32_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_64_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_64_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_64_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_64_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_64_K_64_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_64_K_64_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_1_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_1_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_1_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_1_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_1_K_1_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_1_K_1_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_1_K_32_batch_size0_backend_cublas_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_1_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_1_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_1_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_1_K_32_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_1_K_32_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_1_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_1_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_1_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_1_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_1_K_64_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_1_K_64_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_32_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_32_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_32_K_1_batch_size_1_backend_cublas_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_32_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_32_K_1_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_32_K_1_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_32_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_32_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_32_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_32_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_32_K_32_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_32_K_32_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_32_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_32_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_32_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_32_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_32_K_64_batch_size_32_backend_cublas_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_32_K_64_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_64_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_64_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_64_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_64_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_64_K_1_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_64_K_1_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_64_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_64_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_64_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_64_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_64_K_32_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_64_K_32_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_64_K_64_batch_size0_backend_cublas_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_64_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_64_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_64_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_64_K_64_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_64_K_64_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_1_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_1_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_1_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_1_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_1_K_1_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_1_K_1_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_1_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_1_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_1_K_32_batch_size_1_backend_cublas_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_1_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_1_K_32_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_1_K_32_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_1_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_1_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_1_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_1_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_1_K_64_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_1_K_64_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_32_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_32_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_32_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_32_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_32_K_1_batch_size_32_backend_cublas_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_32_K_1_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_32_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_32_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_32_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_32_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_32_K_32_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_32_K_32_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_32_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_32_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_32_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_32_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_32_K_64_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_32_K_64_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_64_K_1_batch_size0_backend_cublas_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_64_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_64_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_64_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_64_K_1_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_64_K_1_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_64_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_64_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_64_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_64_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_64_K_32_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_64_K_32_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_64_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_64_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_64_K_64_batch_size_1_backend_cublas_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_64_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_64_K_64_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_64_K_64_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_1_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_1_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_1_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_1_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_1_K_1_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_1_K_1_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_1_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_1_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_1_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_1_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_1_K_32_batch_size_32_backend_cublas_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_1_K_32_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_1_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_1_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_1_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_1_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_1_K_64_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_1_K_64_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_32_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_32_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_32_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_32_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_32_K_1_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_32_K_1_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_32_K_32_batch_size0_backend_cublas_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_32_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_32_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_32_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_32_K_32_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_32_K_32_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_32_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_32_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_32_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_32_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_32_K_64_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_32_K_64_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_64_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_64_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_64_K_1_batch_size_1_backend_cublas_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_64_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_64_K_1_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_64_K_1_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_64_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_64_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_64_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_64_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_64_K_32_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_64_K_32_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_64_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_64_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_64_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_64_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_64_K_64_batch_size_32_backend_cublas_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_64_K_64_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_alignment_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_no_reduced_precision_small_size_4_size_32768_backend_cublas_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_no_reduced_precision_small_size_4_size_32768_backend_cublaslt_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_no_reduced_precision_small_size_8_size_32768_backend_cublas_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_no_reduced_precision_small_size_8_size_32768_backend_cublaslt_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_reduced_precision_fp16_accumulate_size_10000_backend_cublas_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_reduced_precision_fp16_accumulate_size_10000_backend_cublas_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_reduced_precision_fp16_accumulate_size_10000_backend_cublaslt_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_reduced_precision_fp16_accumulate_size_10000_backend_cublaslt_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_reduced_precision_fp16_accumulate_size_1000_backend_cublas_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_reduced_precision_fp16_accumulate_size_1000_backend_cublas_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_reduced_precision_fp16_accumulate_size_1000_backend_cublaslt_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_reduced_precision_fp16_accumulate_size_1000_backend_cublaslt_cuda_float16, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_reduced_precision_fp16_accumulate_size_100_backend_cublas_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_reduced_precision_fp16_accumulate_size_100_backend_cublas_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_reduced_precision_fp16_accumulate_size_100_backend_cublaslt_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_reduced_precision_fp16_accumulate_size_100_backend_cublaslt_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_reduced_precision_size_10000_backend_cublas_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_reduced_precision_size_10000_backend_cublas_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_reduced_precision_size_10000_backend_cublaslt_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_reduced_precision_size_10000_backend_cublaslt_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_reduced_precision_size_1000_backend_cublas_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_reduced_precision_size_1000_backend_cublas_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_reduced_precision_size_1000_backend_cublaslt_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_reduced_precision_size_1000_backend_cublaslt_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_reduced_precision_size_100_backend_cublas_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_reduced_precision_size_100_backend_cublas_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_reduced_precision_size_100_backend_cublaslt_cuda_bfloat16, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_reduced_precision_size_100_backend_cublaslt_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_size_10000_backend_cublas_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_size_10000_backend_cublas_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_size_10000_backend_cublas_cuda_float32, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_size_10000_backend_cublaslt_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_size_10000_backend_cublaslt_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_size_10000_backend_cublaslt_cuda_float32, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_size_1000_backend_cublas_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_size_1000_backend_cublas_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_size_1000_backend_cublas_cuda_float32, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_size_1000_backend_cublaslt_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_size_1000_backend_cublaslt_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_size_1000_backend_cublaslt_cuda_float32, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_size_100_backend_cublas_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_size_100_backend_cublas_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_size_100_backend_cublas_cuda_float32, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_size_100_backend_cublaslt_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_size_100_backend_cublaslt_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_size_100_backend_cublaslt_cuda_float32, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_and_lt_reduced_precision_fp16_accumulate_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_baddbmm_large_input_1_10000_10000_10000_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_baddbmm_large_input_1_10000_10000_10000_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_baddbmm_large_input_1_10000_10000_10000_cuda_float32, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_baddbmm_large_input_1_10000_1000_10000_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_baddbmm_large_input_1_10000_1000_10000_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_baddbmm_large_input_1_10000_1000_10000_cuda_float32, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_baddbmm_large_input_2_1000_1000_1000_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_baddbmm_large_input_2_1000_1000_1000_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_baddbmm_large_input_2_1000_1000_1000_cuda_float32, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_baddbmm_large_input_2_100_100_100_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_baddbmm_large_input_2_100_100_100_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_baddbmm_large_input_2_100_100_100_cuda_float32, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_deterministic_shape_1024_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_deterministic_shape_1024_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_deterministic_shape_1024_cuda_float32, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_deterministic_shape_128_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_deterministic_shape_128_cuda_float16, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_deterministic_shape_128_cuda_float32, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_deterministic_shape_2048_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_deterministic_shape_2048_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_deterministic_shape_2048_cuda_float32, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_deterministic_shape_256_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_deterministic_shape_256_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_deterministic_shape_256_cuda_float32, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_deterministic_shape_32_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_deterministic_shape_32_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_deterministic_shape_32_cuda_float32, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_deterministic_shape_4096_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_deterministic_shape_4096_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_deterministic_shape_4096_cuda_float32, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_deterministic_shape_512_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_deterministic_shape_512_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_deterministic_shape_512_cuda_float32, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_deterministic_shape_64_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_deterministic_shape_64_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_deterministic_shape_64_cuda_float32, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_deterministic_shape_8192_cuda_bfloat16, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_deterministic_shape_8192_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_deterministic_shape_8192_cuda_float32, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_fp16_accum_and_fp32_out_failure_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_fp16_accum_and_fp32_out_failure_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_fp16_accum_and_fp32_out_failure_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_fp16_accum_and_fp32_out_failure_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_2d_2d_strided_False_a_row_major_False_b_row_major_False_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_2d_2d_strided_False_a_row_major_False_b_row_major_False_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_2d_2d_strided_False_a_row_major_False_b_row_major_False_cuda_float32, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_2d_2d_strided_False_a_row_major_False_b_row_major_True_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_2d_2d_strided_False_a_row_major_False_b_row_major_True_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_2d_2d_strided_False_a_row_major_False_b_row_major_True_cuda_float32, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_2d_2d_strided_False_a_row_major_True_b_row_major_False_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_2d_2d_strided_False_a_row_major_True_b_row_major_False_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_2d_2d_strided_False_a_row_major_True_b_row_major_False_cuda_float32, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_2d_2d_strided_False_a_row_major_True_b_row_major_True_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_2d_2d_strided_False_a_row_major_True_b_row_major_True_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_2d_2d_strided_False_a_row_major_True_b_row_major_True_cuda_float32, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_2d_2d_strided_True_a_row_major_False_b_row_major_False_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_2d_2d_strided_True_a_row_major_False_b_row_major_False_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_2d_2d_strided_True_a_row_major_False_b_row_major_False_cuda_float32, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_2d_2d_strided_True_a_row_major_False_b_row_major_True_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_2d_2d_strided_True_a_row_major_False_b_row_major_True_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_2d_2d_strided_True_a_row_major_False_b_row_major_True_cuda_float32, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_2d_2d_strided_True_a_row_major_True_b_row_major_False_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_2d_2d_strided_True_a_row_major_True_b_row_major_False_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_2d_2d_strided_True_a_row_major_True_b_row_major_False_cuda_float32, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_2d_2d_strided_True_a_row_major_True_b_row_major_True_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_2d_2d_strided_True_a_row_major_True_b_row_major_True_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_2d_2d_strided_True_a_row_major_True_b_row_major_True_cuda_float32, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_2d_3d_strided_False_a_row_major_False_b_row_major_False_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_2d_3d_strided_False_a_row_major_False_b_row_major_False_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_2d_3d_strided_False_a_row_major_False_b_row_major_False_cuda_float32, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_2d_3d_strided_False_a_row_major_False_b_row_major_True_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_2d_3d_strided_False_a_row_major_False_b_row_major_True_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_2d_3d_strided_False_a_row_major_False_b_row_major_True_cuda_float32, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_2d_3d_strided_False_a_row_major_True_b_row_major_False_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_2d_3d_strided_False_a_row_major_True_b_row_major_False_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_2d_3d_strided_False_a_row_major_True_b_row_major_False_cuda_float32, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_2d_3d_strided_False_a_row_major_True_b_row_major_True_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_2d_3d_strided_False_a_row_major_True_b_row_major_True_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_2d_3d_strided_False_a_row_major_True_b_row_major_True_cuda_float32, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_2d_3d_strided_True_a_row_major_False_b_row_major_False_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_2d_3d_strided_True_a_row_major_False_b_row_major_False_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_2d_3d_strided_True_a_row_major_False_b_row_major_False_cuda_float32, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_2d_3d_strided_True_a_row_major_False_b_row_major_True_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_2d_3d_strided_True_a_row_major_False_b_row_major_True_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_2d_3d_strided_True_a_row_major_False_b_row_major_True_cuda_float32, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_2d_3d_strided_True_a_row_major_True_b_row_major_False_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_2d_3d_strided_True_a_row_major_True_b_row_major_False_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_2d_3d_strided_True_a_row_major_True_b_row_major_False_cuda_float32, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_2d_3d_strided_True_a_row_major_True_b_row_major_True_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_2d_3d_strided_True_a_row_major_True_b_row_major_True_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_2d_3d_strided_True_a_row_major_True_b_row_major_True_cuda_float32, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_3d_2d_strided_False_a_row_major_False_b_row_major_False_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_3d_2d_strided_False_a_row_major_False_b_row_major_False_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_3d_2d_strided_False_a_row_major_False_b_row_major_False_cuda_float32, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_3d_2d_strided_False_a_row_major_False_b_row_major_True_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_3d_2d_strided_False_a_row_major_False_b_row_major_True_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_3d_2d_strided_False_a_row_major_False_b_row_major_True_cuda_float32, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_3d_2d_strided_False_a_row_major_True_b_row_major_False_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_3d_2d_strided_False_a_row_major_True_b_row_major_False_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_3d_2d_strided_False_a_row_major_True_b_row_major_False_cuda_float32, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_3d_2d_strided_False_a_row_major_True_b_row_major_True_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_3d_2d_strided_False_a_row_major_True_b_row_major_True_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_3d_2d_strided_False_a_row_major_True_b_row_major_True_cuda_float32, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_3d_2d_strided_True_a_row_major_False_b_row_major_False_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_3d_2d_strided_True_a_row_major_False_b_row_major_False_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_3d_2d_strided_True_a_row_major_False_b_row_major_False_cuda_float32, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_3d_2d_strided_True_a_row_major_False_b_row_major_True_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_3d_2d_strided_True_a_row_major_False_b_row_major_True_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_3d_2d_strided_True_a_row_major_False_b_row_major_True_cuda_float32, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_3d_2d_strided_True_a_row_major_True_b_row_major_False_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_3d_2d_strided_True_a_row_major_True_b_row_major_False_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_3d_2d_strided_True_a_row_major_True_b_row_major_False_cuda_float32, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_3d_2d_strided_True_a_row_major_True_b_row_major_True_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_3d_2d_strided_True_a_row_major_True_b_row_major_True_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_3d_2d_strided_True_a_row_major_True_b_row_major_True_cuda_float32, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_3d_3d_strided_False_a_row_major_False_b_row_major_False_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_3d_3d_strided_False_a_row_major_False_b_row_major_False_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_3d_3d_strided_False_a_row_major_False_b_row_major_False_cuda_float32, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_3d_3d_strided_False_a_row_major_False_b_row_major_True_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_3d_3d_strided_False_a_row_major_False_b_row_major_True_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_3d_3d_strided_False_a_row_major_False_b_row_major_True_cuda_float32, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_3d_3d_strided_False_a_row_major_True_b_row_major_False_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_3d_3d_strided_False_a_row_major_True_b_row_major_False_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_3d_3d_strided_False_a_row_major_True_b_row_major_False_cuda_float32, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_3d_3d_strided_False_a_row_major_True_b_row_major_True_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_3d_3d_strided_False_a_row_major_True_b_row_major_True_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_3d_3d_strided_False_a_row_major_True_b_row_major_True_cuda_float32, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_3d_3d_strided_True_a_row_major_False_b_row_major_False_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_3d_3d_strided_True_a_row_major_False_b_row_major_False_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_3d_3d_strided_True_a_row_major_False_b_row_major_False_cuda_float32, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_3d_3d_strided_True_a_row_major_False_b_row_major_True_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_3d_3d_strided_True_a_row_major_False_b_row_major_True_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_3d_3d_strided_True_a_row_major_False_b_row_major_True_cuda_float32, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_3d_3d_strided_True_a_row_major_True_b_row_major_False_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_3d_3d_strided_True_a_row_major_True_b_row_major_False_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_3d_3d_strided_True_a_row_major_True_b_row_major_False_cuda_float32, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_3d_3d_strided_True_a_row_major_True_b_row_major_True_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_3d_3d_strided_True_a_row_major_True_b_row_major_True_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_3d_3d_strided_True_a_row_major_True_b_row_major_True_cuda_float32, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_compiled_op_2d/2d_a_row_major_False_b_row_major_False_max_autotune_False_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_compiled_op_2d/2d_a_row_major_False_b_row_major_False_max_autotune_True_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_compiled_op_2d/2d_a_row_major_False_b_row_major_True_max_autotune_False_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_compiled_op_2d/2d_a_row_major_False_b_row_major_True_max_autotune_True_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_compiled_op_2d/2d_a_row_major_True_b_row_major_False_max_autotune_False_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_compiled_op_2d/2d_a_row_major_True_b_row_major_False_max_autotune_True_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_compiled_op_2d/2d_a_row_major_True_b_row_major_True_max_autotune_False_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_compiled_op_2d/2d_a_row_major_True_b_row_major_True_max_autotune_True_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_compiled_op_2d/3d_a_row_major_False_b_row_major_False_max_autotune_False_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_compiled_op_2d/3d_a_row_major_False_b_row_major_False_max_autotune_True_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_compiled_op_2d/3d_a_row_major_False_b_row_major_True_max_autotune_False_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_compiled_op_2d/3d_a_row_major_False_b_row_major_True_max_autotune_True_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_compiled_op_2d/3d_a_row_major_True_b_row_major_False_max_autotune_False_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_compiled_op_2d/3d_a_row_major_True_b_row_major_False_max_autotune_True_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_compiled_op_2d/3d_a_row_major_True_b_row_major_True_max_autotune_False_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_compiled_op_2d/3d_a_row_major_True_b_row_major_True_max_autotune_True_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_compiled_op_3d/2d_a_row_major_False_b_row_major_False_max_autotune_False_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_compiled_op_3d/2d_a_row_major_False_b_row_major_False_max_autotune_True_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_compiled_op_3d/2d_a_row_major_False_b_row_major_True_max_autotune_False_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_compiled_op_3d/2d_a_row_major_False_b_row_major_True_max_autotune_True_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_compiled_op_3d/2d_a_row_major_True_b_row_major_False_max_autotune_False_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_compiled_op_3d/2d_a_row_major_True_b_row_major_False_max_autotune_True_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_compiled_op_3d/2d_a_row_major_True_b_row_major_True_max_autotune_False_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_compiled_op_3d/2d_a_row_major_True_b_row_major_True_max_autotune_True_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_compiled_op_3d/3d_a_row_major_False_b_row_major_False_max_autotune_False_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_compiled_op_3d/3d_a_row_major_False_b_row_major_False_max_autotune_True_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_compiled_op_3d/3d_a_row_major_False_b_row_major_True_max_autotune_False_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_compiled_op_3d/3d_a_row_major_False_b_row_major_True_max_autotune_True_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_compiled_op_3d/3d_a_row_major_True_b_row_major_False_max_autotune_False_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_compiled_op_3d/3d_a_row_major_True_b_row_major_False_max_autotune_True_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_compiled_op_3d/3d_a_row_major_True_b_row_major_True_max_autotune_False_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_compiled_op_3d/3d_a_row_major_True_b_row_major_True_max_autotune_True_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_input_dimension_checking_out_dtype_ops0_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_input_dimension_checking_out_dtype_ops1_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_input_dimension_checking_out_dtype_ops2_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_input_dimension_checking_out_dtype_ops3_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_1_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_1_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_1_K_1_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_1_K_1_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_1_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_1_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_1_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_1_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_1_K_32_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_1_K_32_batch_size_16_backend_cublaslt_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_1_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_1_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_1_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_1_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_1_K_64_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_1_K_64_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_1_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_1_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_32_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_32_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_32_K_1_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_32_K_1_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_32_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_32_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_32_K_32_batch_size0_backend_cublas_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_32_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_32_K_32_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_32_K_32_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_32_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_32_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_32_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_32_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_32_K_64_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_32_K_64_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_32_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_32_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_64_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_64_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_64_K_1_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_64_K_1_batch_size_16_backend_cublaslt_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_64_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_64_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_64_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_64_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_64_K_32_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_64_K_32_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_64_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_64_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_64_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_64_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_64_K_64_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_64_K_64_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_64_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_64_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_1_K_1_batch_size0_backend_cublas_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_1_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_1_K_1_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_1_K_1_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_1_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_1_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_1_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_1_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_1_K_32_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_1_K_32_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_1_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_1_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_1_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_1_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_1_K_64_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_1_K_64_batch_size_16_backend_cublaslt_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_1_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_1_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_32_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_32_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_32_K_1_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_32_K_1_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_32_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_32_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_32_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_32_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_32_K_32_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_32_K_32_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_32_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_32_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_32_K_64_batch_size0_backend_cublas_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_32_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_32_K_64_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_32_K_64_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_32_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_32_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_64_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_64_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_64_K_1_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_64_K_1_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_64_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_64_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_64_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_64_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_64_K_32_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_64_K_32_batch_size_16_backend_cublaslt_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_64_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_64_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_64_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_64_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_64_K_64_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_64_K_64_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_64_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_64_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_1_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_1_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_1_K_1_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_1_K_1_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_1_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_1_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_1_K_32_batch_size0_backend_cublas_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_1_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_1_K_32_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_1_K_32_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_1_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_1_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_1_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_1_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_1_K_64_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_1_K_64_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_1_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_1_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_32_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_32_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_32_K_1_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_32_K_1_batch_size_16_backend_cublaslt_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_32_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_32_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_32_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_32_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_32_K_32_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_32_K_32_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_32_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_32_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_32_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_32_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_32_K_64_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_32_K_64_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_32_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_32_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_64_K_1_batch_size0_backend_cublas_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_64_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_64_K_1_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_64_K_1_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_64_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_64_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_64_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_64_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_64_K_32_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_64_K_32_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_64_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_64_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_64_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_64_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_64_K_64_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_64_K_64_batch_size_16_backend_cublaslt_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_64_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_64_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_1_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_1_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_1_K_1_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_1_K_1_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_1_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_1_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_1_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_1_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_1_K_32_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_1_K_32_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_1_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_1_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_1_K_64_batch_size0_backend_cublas_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_1_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_1_K_64_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_1_K_64_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_1_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_1_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_32_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_32_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_32_K_1_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_32_K_1_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_32_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_32_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_32_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_32_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_32_K_32_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_32_K_32_batch_size_16_backend_cublaslt_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_32_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_32_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_32_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_32_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_32_K_64_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_32_K_64_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_32_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_32_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_64_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_64_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_64_K_1_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_64_K_1_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_64_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_64_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_64_K_32_batch_size0_backend_cublas_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_64_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_64_K_32_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_64_K_32_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_64_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_64_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_64_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_64_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_64_K_64_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_64_K_64_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_64_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_64_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_1_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_1_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_1_K_1_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_1_K_1_batch_size_16_backend_cublaslt_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_1_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_1_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_1_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_1_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_1_K_32_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_1_K_32_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_1_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_1_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_1_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_1_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_1_K_64_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_1_K_64_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_1_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_1_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_32_K_1_batch_size0_backend_cublas_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_32_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_32_K_1_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_32_K_1_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_32_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_32_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_32_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_32_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_32_K_32_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_32_K_32_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_32_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_32_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_32_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_32_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_32_K_64_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_32_K_64_batch_size_16_backend_cublaslt_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_32_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_32_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_64_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_64_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_64_K_1_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_64_K_1_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_64_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_64_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_64_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_64_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_64_K_32_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_64_K_32_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_64_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_64_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_64_K_64_batch_size0_backend_cublas_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_64_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_64_K_64_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_64_K_64_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_64_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_64_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_1_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_1_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_1_K_1_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_1_K_1_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_1_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_1_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_1_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_1_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_1_K_32_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_1_K_32_batch_size_16_backend_cublaslt_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_1_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_1_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_1_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_1_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_1_K_64_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_1_K_64_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_1_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_1_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_32_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_32_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_32_K_1_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_32_K_1_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_32_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_32_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_32_K_32_batch_size0_backend_cublas_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_32_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_32_K_32_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_32_K_32_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_32_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_32_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_32_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_32_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_32_K_64_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_32_K_64_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_32_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_32_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_64_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_64_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_64_K_1_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_64_K_1_batch_size_16_backend_cublaslt_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_64_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_64_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_64_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_64_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_64_K_32_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_64_K_32_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_64_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_64_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_64_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_64_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_64_K_64_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_64_K_64_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_64_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_64_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_1_K_1_batch_size0_backend_cublas_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_1_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_1_K_1_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_1_K_1_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_1_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_1_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_1_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_1_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_1_K_32_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_1_K_32_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_1_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_1_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_1_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_1_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_1_K_64_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_1_K_64_batch_size_16_backend_cublaslt_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_1_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_1_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_32_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_32_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_32_K_1_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_32_K_1_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_32_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_32_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_32_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_32_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_32_K_32_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_32_K_32_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_32_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_32_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_32_K_64_batch_size0_backend_cublas_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_32_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_32_K_64_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_32_K_64_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_32_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_32_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_64_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_64_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_64_K_1_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_64_K_1_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_64_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_64_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_64_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_64_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_64_K_32_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_64_K_32_batch_size_16_backend_cublaslt_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_64_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_64_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_64_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_64_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_64_K_64_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_64_K_64_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_64_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_64_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_1_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_1_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_1_K_1_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_1_K_1_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_1_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_1_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_1_K_32_batch_size0_backend_cublas_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_1_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_1_K_32_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_1_K_32_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_1_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_1_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_1_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_1_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_1_K_64_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_1_K_64_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_1_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_1_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_32_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_32_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_32_K_1_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_32_K_1_batch_size_16_backend_cublaslt_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_32_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_32_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_32_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_32_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_32_K_32_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_32_K_32_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_32_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_32_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_32_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_32_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_32_K_64_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_32_K_64_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_32_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_32_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_64_K_1_batch_size0_backend_cublas_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_64_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_64_K_1_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_64_K_1_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_64_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_64_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_64_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_64_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_64_K_32_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_64_K_32_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_64_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_64_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_64_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_64_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_64_K_64_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_64_K_64_batch_size_16_backend_cublaslt_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_64_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_64_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_1_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_1_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_1_K_1_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_1_K_1_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_1_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_1_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_1_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_1_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_1_K_32_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_1_K_32_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_1_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_1_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_1_K_64_batch_size0_backend_cublas_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_1_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_1_K_64_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_1_K_64_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_1_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_1_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_32_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_32_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_32_K_1_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_32_K_1_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_32_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_32_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_32_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_32_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_32_K_32_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_32_K_32_batch_size_16_backend_cublaslt_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_32_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_32_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_32_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_32_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_32_K_64_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_32_K_64_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_32_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_32_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_64_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_64_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_64_K_1_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_64_K_1_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_64_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_64_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_64_K_32_batch_size0_backend_cublas_cuda, 
test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_64_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_64_K_32_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_64_K_32_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_64_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_64_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_64_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_64_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_64_K_64_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_64_K_64_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_64_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_64_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMixedDtypesLinearCudaCUDA::test_mixed_dtypes_linear_cuda_bfloat16, test/test_matmul_cuda.py::TestMixedDtypesLinearCudaCUDA::test_mixed_dtypes_linear_cuda_float16 2025-10-10T02:06:08.9242947Z 2025-10-10T02:06:09.3335675Z Running test_meta 1/1 ... 
[2025-10-10 02:06:09.333041] 2025-10-10T02:06:09.3336155Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:06:09.3337740Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_meta.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:06:09.333415] 2025-10-10T02:06:12.6933080Z Running test_license 1/1 ... [2025-10-10 02:06:12.692749] 2025-10-10T02:06:12.6933736Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:06:12.6937619Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_license.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:06:12.693154] 2025-10-10T02:06:16.5650318Z 2025-10-10T02:06:16.5659952Z test_license 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_license_1.1_b3d15cbbd76cd71b_.log 2025-10-10T02:06:16.5661038Z Running 2 items in this shard: test/test_license.py::TestLicense::test_distinfo_license, test/test_license.py::TestLicense::test_license_for_wheel 2025-10-10T02:06:16.5661619Z 2025-10-10T02:06:20.4158811Z Running test_utils_config_module 1/1 ... [2025-10-10 02:06:20.415349] 2025-10-10T02:06:20.4159268Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:06:20.4161144Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_utils_config_module.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:06:20.415716] 2025-10-10T02:06:24.3880484Z 2025-10-10T02:06:24.3881483Z test_utils_config_module 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_utils_config_module_1.1_86078f5a5cf42ecc_.log 2025-10-10T02:06:24.3888100Z Running 22 items in this shard: test/test_utils_config_module.py::TestConfigModule::test_alias, test/test_utils_config_module.py::TestConfigModule::test_bad_jk_type, test/test_utils_config_module.py::TestConfigModule::test_base_value_loading, test/test_utils_config_module.py::TestConfigModule::test_codegen_config, test/test_utils_config_module.py::TestConfigModule::test_codegen_config_function, test/test_utils_config_module.py::TestConfigModule::test_dict_copy_semantics, test/test_utils_config_module.py::TestConfigModule::test_env_name_semantics, test/test_utils_config_module.py::TestConfigModule::test_env_name_string_semantics, test/test_utils_config_module.py::TestConfigModule::test_get_hash, test/test_utils_config_module.py::TestConfigModule::test_invalid_config_float, test/test_utils_config_module.py::TestConfigModule::test_invalid_config_int, test/test_utils_config_module.py::TestConfigModule::test_make_closur_patcher, test/test_utils_config_module.py::TestConfigModule::test_multi_env, test/test_utils_config_module.py::TestConfigModule::test_none_override_semantics, test/test_utils_config_module.py::TestConfigModule::test_overrides, test/test_utils_config_module.py::TestConfigModule::test_patch, test/test_utils_config_module.py::TestConfigModule::test_reference_is_default, test/test_utils_config_module.py::TestConfigModule::test_reference_semantics, test/test_utils_config_module.py::TestConfigModule::test_save_config, test/test_utils_config_module.py::TestConfigModule::test_save_config_portable, test/test_utils_config_module.py::TestConfigModule::test_type_loading, test/test_utils_config_module.py::TestConfigModule::test_unittest_patch 2025-10-10T02:06:24.3894086Z 
2025-10-10T02:06:28.1920253Z Running test_decomp 1/16 ... [2025-10-10 02:06:28.191445] 2025-10-10T02:06:28.1921110Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:06:28.1922856Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_decomp.py', '-m', 'not serial', '--shard-id=1', '--num-shards=16', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:06:28.191836] 2025-10-10T02:06:32.5198104Z 2025-10-10T02:06:32.5199297Z test_transformers 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_transformers_1.1_1181ea920dbf7d81_.log 2025-10-10T02:06:33.4501879Z Running 12384 items in this shard: test/test_transformers.py::TestTransformersCUDA::test_bias_is_none_cuda, test/test_transformers.py::TestTransformersCUDA::test_decoder_only_layer_cuda, test/test_transformers.py::TestTransformersCUDA::test_decoder_padding_and_src_mask_bool_cuda, test/test_transformers.py::TestTransformersCUDA::test_disable_fastpath_cuda, test/test_transformers.py::TestTransformersCUDA::test_encoder_is_causal_cuda, test/test_transformers.py::TestTransformersCUDA::test_encoder_padding_and_src_mask_bool_cuda, test/test_transformers.py::TestTransformersCUDA::test_is_causal_gpu_cuda, test/test_transformers.py::TestTransformersCUDA::test_kpm_mask_trailing_column_with_nested_tensor_cuda, test/test_transformers.py::TestTransformersCUDA::test_mask_check_fastpath_cuda, test/test_transformers.py::TestTransformersCUDA::test_math_backend_high_precision_cuda, test/test_transformers.py::TestTransformersCUDA::test_mha_native_args_nb_heads_1_bias_False_cuda, test/test_transformers.py::TestTransformersCUDA::test_mha_native_args_nb_heads_1_bias_True_cuda, test/test_transformers.py::TestTransformersCUDA::test_mha_native_args_nb_heads_8_bias_False_cuda, test/test_transformers.py::TestTransformersCUDA::test_mha_native_args_nb_heads_8_bias_True_cuda, 
test/test_transformers.py::TestTransformersCUDA::test_multiheadattention_fastpath_attn_mask_attn_mask_dim2_key_padding_mask_dim1_bool_cuda, test/test_transformers.py::TestTransformersCUDA::test_multiheadattention_fastpath_attn_mask_attn_mask_dim2_key_padding_mask_dim1_float32_cuda, test/test_transformers.py::TestTransformersCUDA::test_multiheadattention_fastpath_attn_mask_attn_mask_dim2_key_padding_mask_dim_2_bool_cuda, test/test_transformers.py::TestTransformersCUDA::test_multiheadattention_fastpath_attn_mask_attn_mask_dim2_key_padding_mask_dim_2_float32_cuda, test/test_transformers.py::TestTransformersCUDA::test_multiheadattention_fastpath_attn_mask_attn_mask_dim_2_key_padding_mask_dim1_bool_cuda, test/test_transformers.py::TestTransformersCUDA::test_multiheadattention_fastpath_attn_mask_attn_mask_dim_2_key_padding_mask_dim1_float32_cuda, test/test_transformers.py::TestTransformersCUDA::test_multiheadattention_fastpath_attn_mask_attn_mask_dim_2_key_padding_mask_dim_2_bool_cuda, test/test_transformers.py::TestTransformersCUDA::test_multiheadattention_fastpath_attn_mask_attn_mask_dim_2_key_padding_mask_dim_2_float32_cuda, test/test_transformers.py::TestTransformersCUDA::test_multiheadattention_fastpath_attn_mask_attn_mask_dim_3_key_padding_mask_dim1_bool_cuda, test/test_transformers.py::TestTransformersCUDA::test_multiheadattention_fastpath_attn_mask_attn_mask_dim_3_key_padding_mask_dim1_float32_cuda, test/test_transformers.py::TestTransformersCUDA::test_multiheadattention_fastpath_attn_mask_attn_mask_dim_3_key_padding_mask_dim_2_bool_cuda, test/test_transformers.py::TestTransformersCUDA::test_multiheadattention_fastpath_attn_mask_attn_mask_dim_3_key_padding_mask_dim_2_float32_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_3D_input_dim_2D_attn_mask_dropout_p_0_0_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_3D_input_dim_2D_attn_mask_dropout_p_0_2_cuda, 
test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_3D_input_dim_2D_attn_mask_dropout_p_0_5_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_3D_input_dim_2D_causal_attn_mask_dropout_p_0_0_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_3D_input_dim_2D_causal_attn_mask_dropout_p_0_2_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_3D_input_dim_2D_causal_attn_mask_dropout_p_0_5_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_3D_input_dim_3D_attn_mask_dropout_p_0_0_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_3D_input_dim_3D_attn_mask_dropout_p_0_2_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_3D_input_dim_3D_attn_mask_dropout_p_0_5_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_3D_input_dim_3D_causal_attn_mask_dropout_p_0_0_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_3D_input_dim_3D_causal_attn_mask_dropout_p_0_2_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_3D_input_dim_3D_causal_attn_mask_dropout_p_0_5_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_3D_input_dim_no_attn_mask_dropout_p_0_0_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_3D_input_dim_no_attn_mask_dropout_p_0_2_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_3D_input_dim_no_attn_mask_dropout_p_0_5_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_4D_input_dim_2D_attn_mask_dropout_p_0_0_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_4D_input_dim_2D_attn_mask_dropout_p_0_2_cuda, 
test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_4D_input_dim_2D_attn_mask_dropout_p_0_5_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_4D_input_dim_2D_causal_attn_mask_dropout_p_0_0_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_4D_input_dim_2D_causal_attn_mask_dropout_p_0_2_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_4D_input_dim_2D_causal_attn_mask_dropout_p_0_5_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_4D_input_dim_4D_attn_mask_dropout_p_0_0_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_4D_input_dim_4D_attn_mask_dropout_p_0_2_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_4D_input_dim_4D_attn_mask_dropout_p_0_5_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_4D_input_dim_4D_causal_attn_mask_dropout_p_0_0_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_4D_input_dim_4D_causal_attn_mask_dropout_p_0_2_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_4D_input_dim_4D_causal_attn_mask_dropout_p_0_5_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_4D_input_dim_no_attn_mask_dropout_p_0_0_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_4D_input_dim_no_attn_mask_dropout_p_0_2_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_4D_input_dim_no_attn_mask_dropout_p_0_5_cuda, test/test_transformers.py::TestTransformersCUDA::test_script_encoder_subclass_cuda, test/test_transformers.py::TestTransformersCUDA::test_script_mha_in_proj_weight_none_cuda, test/test_transformers.py::TestTransformersCUDA::test_self_attn_TxT_attn_mask_cuda, 
test/test_transformers.py::TestTransformersCUDA::test_train_with_is_causal_cuda, test/test_transformers.py::TestTransformersCUDA::test_train_with_pad_and_catch_error_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformer_bias_is_none_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoder_batch_first_False_training_False_enable_nested_tensor_False_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoder_batch_first_False_training_False_enable_nested_tensor_True_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoder_batch_first_False_training_True_enable_nested_tensor_False_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoder_batch_first_False_training_True_enable_nested_tensor_True_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoder_batch_first_True_training_False_enable_nested_tensor_False_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoder_batch_first_True_training_False_enable_nested_tensor_True_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoder_batch_first_True_training_True_enable_nested_tensor_False_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoder_batch_first_True_training_True_enable_nested_tensor_True_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoder_fastpath_use_torchscript_False_enable_nested_tensor_False_use_autocast_False_d_model_12_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoder_fastpath_use_torchscript_False_enable_nested_tensor_False_use_autocast_False_d_model_256_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoder_fastpath_use_torchscript_False_enable_nested_tensor_False_use_autocast_True_d_model_12_cuda, 
test/test_transformers.py::TestTransformersCUDA::test_transformerencoder_fastpath_use_torchscript_False_enable_nested_tensor_False_use_autocast_True_d_model_256_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoder_fastpath_use_torchscript_False_enable_nested_tensor_True_use_autocast_False_d_model_12_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoder_fastpath_use_torchscript_False_enable_nested_tensor_True_use_autocast_False_d_model_256_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoder_fastpath_use_torchscript_False_enable_nested_tensor_True_use_autocast_True_d_model_12_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoder_fastpath_use_torchscript_False_enable_nested_tensor_True_use_autocast_True_d_model_256_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoder_square_input_with_no_grad_False_training_False_enable_nested_tensor_False_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoder_square_input_with_no_grad_False_training_True_enable_nested_tensor_False_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoder_square_input_with_no_grad_True_training_False_enable_nested_tensor_False_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoder_square_input_with_no_grad_True_training_True_enable_nested_tensor_False_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoderlayer_no_fastpath_with_hooks_nhead_3_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoderlayer_no_fastpath_with_hooks_nhead_4_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoderlayer_src_mask_nhead_1_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoderlayer_src_mask_nhead_4_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoderlayer_src_mask_nhead_8_cuda, 
test/test_transformers.py::TestTransformersCUDA::test_transformerencoderlayer_subclass_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoderlayer_subclass_model_cuda, test/test_transformers.py::TestTransformersCUDA::test_with_nested_tensor_input_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_dispatch_fails_no_backend_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_flash_atteention_large_bf16_nan_values_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_flash_attention_fail_with_non_square_causal_attention_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_flash_autocast_fp32_bfloat16_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_flash_autocast_fp32_float16_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_flash_backward_failure_sm86plus_head_dim_193_dropout_p_0_0_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_flash_backward_failure_sm86plus_head_dim_193_dropout_p_0_2_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_flash_backward_failure_sm86plus_head_dim_256_dropout_p_0_0_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_flash_backward_failure_sm86plus_head_dim_256_dropout_p_0_2_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_flash_fail_fp32_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_fused_kernels_nested_broadcasting_error_cases_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_fused_kernels_nested_broadcasting_requires_grad_failure_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_fused_kernels_seq_len_0_inputs_fused_kernel0_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_fused_kernels_seq_len_0_inputs_fused_kernel1_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_fused_inputs_attn_mask_present_kernel0_cuda, 
test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_fused_inputs_broadcast_kernel0_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_fused_inputs_broadcast_kernel1_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_fused_inputs_broadcast_kernel2_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_fused_inputs_dim_3_kernel0_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_fused_inputs_dim_3_kernel1_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_fused_inputs_dim_3_kernel2_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_fused_inputs_head_dim_kernel0_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_fused_inputs_head_dim_kernel1_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_fused_inputs_head_dim_kernel2_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_fused_inputs_invalid_dtype_kernel0_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_fused_inputs_invalid_dtype_kernel1_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_fused_inputs_invalid_dtype_kernel2_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_inputs_1_dimensional_inputs_kernel0_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_inputs_1_dimensional_inputs_kernel1_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_inputs_1_dimensional_inputs_kernel2_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_inputs_different_datatypes_kernel0_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_inputs_different_datatypes_kernel1_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_inputs_different_datatypes_kernel2_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_inputs_different_devices_kernel0_cuda, 
test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_inputs_different_devices_kernel1_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_inputs_different_devices_kernel2_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_last_dim_stride_kernel0_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_last_dim_stride_kernel1_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_last_dim_stride_kernel2_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_sdpa_kernel_grouped_query_attention_cuda_fused_kernel0_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_sequence_lengths_kernel0_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_sequence_lengths_kernel1_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_sequence_lengths_kernel2_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_mask_invalid_last_dim_stride_kernel0_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_mask_invalid_last_dim_stride_kernel1_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_mem_eff_attention_fail_with_batch_size_geq_65536_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_mem_eff_attention_fail_with_batch_size_geq_65536_error_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_mem_eff_attention_large_seq_len_uniform_attention_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_mem_efficient_fail_bfloat16_less_than_sm80_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_nested_fails_on_padding_head_dim_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_unaligned_tensors_cuda, test/test_transformers.py::TestSDPACUDA::test_scaled_dot_product_attention_fp16_overflow_cuda, test/test_transformers.py::TestSDPACUDA::test_scaled_dot_product_attention_math_with_negative_scale_kernel0_cuda, 
test/test_transformers.py::TestSDPACUDA::test_sdp_math_gradcheck_contiguous_inputs_False_cuda, test/test_transformers.py::TestSDPACUDA::test_sdp_math_gradcheck_contiguous_inputs_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_cudnn_attention_compiles_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_cudnn_attention_d256_heuristic_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_cudnn_attention_different_dk_dv_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_cudnn_attention_fail_d128_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_cudnn_attention_gqa_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_cudnn_attention_nonmodulo64seqlen_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_cudnn_attention_preserves_query_layout_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_cudnn_attention_seqlen1_dropout_heuristic_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_cudnn_attention_trivial_output_transpose_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_256_head_dim_64_dropout_p_0_0_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_256_head_dim_64_dropout_p_0_0_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_256_head_dim_64_dropout_p_0_0_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_256_head_dim_64_dropout_p_0_0_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_256_head_dim_64_dropout_p_0_1_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_256_head_dim_64_dropout_p_0_1_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_256_head_dim_64_dropout_p_0_1_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_256_head_dim_64_dropout_p_0_1_float16_scale_l1_is_causal_True_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_256_head_dim_8_dropout_p_0_0_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_256_head_dim_8_dropout_p_0_0_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_256_head_dim_8_dropout_p_0_0_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_256_head_dim_8_dropout_p_0_0_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_256_head_dim_8_dropout_p_0_1_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_256_head_dim_8_dropout_p_0_1_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_256_head_dim_8_dropout_p_0_1_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_256_head_dim_8_dropout_p_0_1_float16_scale_l1_is_causal_True_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_32_head_dim_64_dropout_p_0_0_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_32_head_dim_64_dropout_p_0_0_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_32_head_dim_64_dropout_p_0_0_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_32_head_dim_64_dropout_p_0_0_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_32_head_dim_64_dropout_p_0_1_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_32_head_dim_64_dropout_p_0_1_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_32_head_dim_64_dropout_p_0_1_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_32_head_dim_64_dropout_p_0_1_float16_scale_l1_is_causal_True_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_32_head_dim_8_dropout_p_0_0_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_32_head_dim_8_dropout_p_0_0_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_32_head_dim_8_dropout_p_0_0_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_32_head_dim_8_dropout_p_0_0_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_32_head_dim_8_dropout_p_0_1_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_32_head_dim_8_dropout_p_0_1_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_32_head_dim_8_dropout_p_0_1_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_32_head_dim_8_dropout_p_0_1_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_256_head_dim_64_dropout_p_0_0_float16_scale0_is_causal_False_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_256_head_dim_64_dropout_p_0_0_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_256_head_dim_64_dropout_p_0_0_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_256_head_dim_64_dropout_p_0_0_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_256_head_dim_64_dropout_p_0_1_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_256_head_dim_64_dropout_p_0_1_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_256_head_dim_64_dropout_p_0_1_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_256_head_dim_64_dropout_p_0_1_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_256_head_dim_8_dropout_p_0_0_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_256_head_dim_8_dropout_p_0_0_float16_scale0_is_causal_True_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_256_head_dim_8_dropout_p_0_0_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_256_head_dim_8_dropout_p_0_0_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_256_head_dim_8_dropout_p_0_1_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_256_head_dim_8_dropout_p_0_1_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_256_head_dim_8_dropout_p_0_1_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_256_head_dim_8_dropout_p_0_1_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_32_head_dim_64_dropout_p_0_0_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_32_head_dim_64_dropout_p_0_0_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_32_head_dim_64_dropout_p_0_0_float16_scale_l1_is_causal_False_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_32_head_dim_64_dropout_p_0_0_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_32_head_dim_64_dropout_p_0_1_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_32_head_dim_64_dropout_p_0_1_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_32_head_dim_64_dropout_p_0_1_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_32_head_dim_64_dropout_p_0_1_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_32_head_dim_8_dropout_p_0_0_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_32_head_dim_8_dropout_p_0_0_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_32_head_dim_8_dropout_p_0_0_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_32_head_dim_8_dropout_p_0_0_float16_scale_l1_is_causal_True_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_32_head_dim_8_dropout_p_0_1_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_32_head_dim_8_dropout_p_0_1_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_32_head_dim_8_dropout_p_0_1_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_32_head_dim_8_dropout_p_0_1_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_256_head_dim_64_dropout_p_0_0_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_256_head_dim_64_dropout_p_0_0_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_256_head_dim_64_dropout_p_0_0_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_256_head_dim_64_dropout_p_0_0_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_256_head_dim_64_dropout_p_0_1_float16_scale0_is_causal_False_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_256_head_dim_64_dropout_p_0_1_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_256_head_dim_64_dropout_p_0_1_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_256_head_dim_64_dropout_p_0_1_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_256_head_dim_8_dropout_p_0_0_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_256_head_dim_8_dropout_p_0_0_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_256_head_dim_8_dropout_p_0_0_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_256_head_dim_8_dropout_p_0_0_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_256_head_dim_8_dropout_p_0_1_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_256_head_dim_8_dropout_p_0_1_float16_scale0_is_causal_True_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_256_head_dim_8_dropout_p_0_1_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_256_head_dim_8_dropout_p_0_1_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_32_head_dim_64_dropout_p_0_0_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_32_head_dim_64_dropout_p_0_0_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_32_head_dim_64_dropout_p_0_0_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_32_head_dim_64_dropout_p_0_0_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_32_head_dim_64_dropout_p_0_1_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_32_head_dim_64_dropout_p_0_1_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_32_head_dim_64_dropout_p_0_1_float16_scale_l1_is_causal_False_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_32_head_dim_64_dropout_p_0_1_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_32_head_dim_8_dropout_p_0_0_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_32_head_dim_8_dropout_p_0_0_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_32_head_dim_8_dropout_p_0_0_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_32_head_dim_8_dropout_p_0_0_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_32_head_dim_8_dropout_p_0_1_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_32_head_dim_8_dropout_p_0_1_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_32_head_dim_8_dropout_p_0_1_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_32_head_dim_8_dropout_p_0_1_float16_scale_l1_is_causal_True_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_256_head_dim_64_dropout_p_0_0_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_256_head_dim_64_dropout_p_0_0_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_256_head_dim_64_dropout_p_0_0_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_256_head_dim_64_dropout_p_0_0_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_256_head_dim_64_dropout_p_0_1_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_256_head_dim_64_dropout_p_0_1_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_256_head_dim_64_dropout_p_0_1_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_256_head_dim_64_dropout_p_0_1_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_256_head_dim_8_dropout_p_0_0_float16_scale0_is_causal_False_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_256_head_dim_8_dropout_p_0_0_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_256_head_dim_8_dropout_p_0_0_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_256_head_dim_8_dropout_p_0_0_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_256_head_dim_8_dropout_p_0_1_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_256_head_dim_8_dropout_p_0_1_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_256_head_dim_8_dropout_p_0_1_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_256_head_dim_8_dropout_p_0_1_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_32_head_dim_64_dropout_p_0_0_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_32_head_dim_64_dropout_p_0_0_float16_scale0_is_causal_True_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_32_head_dim_64_dropout_p_0_0_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_32_head_dim_64_dropout_p_0_0_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_32_head_dim_64_dropout_p_0_1_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_32_head_dim_64_dropout_p_0_1_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_32_head_dim_64_dropout_p_0_1_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_32_head_dim_64_dropout_p_0_1_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_32_head_dim_8_dropout_p_0_0_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_32_head_dim_8_dropout_p_0_0_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_32_head_dim_8_dropout_p_0_0_float16_scale_l1_is_causal_False_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_32_head_dim_8_dropout_p_0_0_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_32_head_dim_8_dropout_p_0_1_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_32_head_dim_8_dropout_p_0_1_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_32_head_dim_8_dropout_p_0_1_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_32_head_dim_8_dropout_p_0_1_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_different_dk_dv_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_backwards_throws_determinism_warning_fused_kernel0_warn_only_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_backwards_throws_determinism_warning_fused_kernel0_warn_only_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_backwards_throws_determinism_warning_fused_kernel1_warn_only_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_backwards_throws_determinism_warning_fused_kernel1_warn_only_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_backwards_throws_determinism_warning_fused_kernel2_warn_only_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_backwards_throws_determinism_warning_fused_kernel2_warn_only_True_cuda, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_True_cuda, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_True_cuda, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_True_cuda, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_True_cuda, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_True_cuda, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_True_cuda, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_True_cuda, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_True_cuda, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_True_cuda, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_True_cuda, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_True_cuda, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_True_cuda, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_True_cuda, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_True_cuda, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_True_cuda, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_query_dense_cuda, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_seq_len_1_inputs_fused_kernel0_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_seq_len_1_inputs_fused_kernel1_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_sdp_choice_type_dense_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_sdp_choice_type_nested_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_sdp_priority_order_use_compile_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_sdp_priority_order_use_compile_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_eff_attention_long_sequence_mask_float16_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_eff_attention_long_sequence_mask_float32_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_eff_attention_non_contig_mask_bug_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_eff_attention_non_contiguous_mask_float16_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_eff_attention_non_contiguous_mask_float32_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_eff_backwards_determinism_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_mask_variants_mask_dim_1_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_mask_variants_mask_dim_2_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_mask_variants_mask_dim_3_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_mask_variants_mask_dim_4_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_scaled_dot_product_attention_cudnn_nested_type_nested_is_contiguous_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_scaled_dot_product_attention_cudnn_nested_type_nested_is_contiguous_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_scaled_dot_product_attention_fused_kernels_packed_accuracy_type_dense_fused_kernel0_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_scaled_dot_product_attention_fused_kernels_packed_accuracy_type_dense_fused_kernel1_cuda, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_scaled_dot_product_attention_fused_kernels_packed_accuracy_type_nested_fused_kernel0_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_scaled_dot_product_attention_fused_kernels_packed_accuracy_type_nested_fused_kernel1_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_scaled_dot_product_attention_fused_kernels_packed_type_dense_is_contiguous_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_scaled_dot_product_attention_fused_kernels_packed_type_dense_is_contiguous_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_scaled_dot_product_attention_fused_kernels_packed_type_nested_is_contiguous_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_scaled_dot_product_attention_fused_kernels_packed_type_nested_is_contiguous_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_sdp_choice_with_determinism_warn_only_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_sdp_choice_with_determinism_warn_only_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_sdp_flash_attention_grad_against_math_contiguous_inputs_False_is_causal_False_bfloat16_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_sdp_flash_attention_grad_against_math_contiguous_inputs_False_is_causal_False_float16_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_sdp_flash_attention_grad_against_math_contiguous_inputs_False_is_causal_True_bfloat16_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_sdp_flash_attention_grad_against_math_contiguous_inputs_False_is_causal_True_float16_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_sdp_flash_attention_grad_against_math_contiguous_inputs_True_is_causal_False_bfloat16_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_sdp_flash_attention_grad_against_math_contiguous_inputs_True_is_causal_False_float16_cuda_float16, 
test/test_transformers.py::TestSDPACudaOnlyCUDA::test_sdp_flash_attention_grad_against_math_contiguous_inputs_True_is_causal_True_bfloat16_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_sdp_flash_attention_grad_against_math_contiguous_inputs_True_is_causal_True_float16_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_sdp_mem_efficient_grad_against_math_contiguous_inputs_False_is_causal_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_sdp_mem_efficient_grad_against_math_contiguous_inputs_False_is_causal_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_sdp_mem_efficient_grad_against_math_contiguous_inputs_True_is_causal_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_sdp_mem_efficient_grad_against_math_contiguous_inputs_True_is_causal_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_singelton_head_dim_stride_ne_1_cuda, test/test_transformers.py::TestAttnBiasCUDA::test_causal_variants_causal_variant_CausalVariant_LOWER_RIGHT_shape0_cuda, test/test_transformers.py::TestAttnBiasCUDA::test_causal_variants_causal_variant_CausalVariant_LOWER_RIGHT_shape1_cuda, test/test_transformers.py::TestAttnBiasCUDA::test_causal_variants_causal_variant_CausalVariant_LOWER_RIGHT_shape2_cuda, test/test_transformers.py::TestAttnBiasCUDA::test_causal_variants_causal_variant_CausalVariant_LOWER_RIGHT_shape3_cuda, test/test_transformers.py::TestAttnBiasCUDA::test_causal_variants_causal_variant_CausalVariant_UPPER_LEFT_shape0_cuda, test/test_transformers.py::TestAttnBiasCUDA::test_causal_variants_causal_variant_CausalVariant_UPPER_LEFT_shape1_cuda, test/test_transformers.py::TestAttnBiasCUDA::test_causal_variants_causal_variant_CausalVariant_UPPER_LEFT_shape2_cuda, test/test_transformers.py::TestAttnBiasCUDA::test_causal_variants_causal_variant_CausalVariant_UPPER_LEFT_shape3_cuda, 
test/test_transformers.py::TestAttnBiasCUDA::test_causal_variants_compile_causal_variant_CausalVariant_LOWER_RIGHT_shape0_cuda, test/test_transformers.py::TestAttnBiasCUDA::test_causal_variants_compile_causal_variant_CausalVariant_LOWER_RIGHT_shape1_cuda, test/test_transformers.py::TestAttnBiasCUDA::test_causal_variants_compile_causal_variant_CausalVariant_LOWER_RIGHT_shape2_cuda, test/test_transformers.py::TestAttnBiasCUDA::test_causal_variants_compile_causal_variant_CausalVariant_LOWER_RIGHT_shape3_cuda, test/test_transformers.py::TestAttnBiasCUDA::test_causal_variants_compile_causal_variant_CausalVariant_UPPER_LEFT_shape0_cuda, test/test_transformers.py::TestAttnBiasCUDA::test_causal_variants_compile_causal_variant_CausalVariant_UPPER_LEFT_shape1_cuda, test/test_transformers.py::TestAttnBiasCUDA::test_causal_variants_compile_causal_variant_CausalVariant_UPPER_LEFT_shape2_cuda, test/test_transformers.py::TestAttnBiasCUDA::test_causal_variants_compile_causal_variant_CausalVariant_UPPER_LEFT_shape3_cuda, test/test_transformers.py::TestAttnBiasCUDA::test_is_causal_and_mask_fails_cuda, test/test_transformers.py::TestAttnBiasCUDA::test_is_causal_equals_upper_left_shape0_cuda, test/test_transformers.py::TestAttnBiasCUDA::test_is_causal_equals_upper_left_shape1_cuda, test/test_transformers.py::TestAttnBiasCUDA::test_is_causal_equals_upper_left_shape2_cuda, test/test_transformers.py::TestAttnBiasCUDA::test_is_causal_equals_upper_left_shape3_cuda 2025-10-10T02:06:34.3662230Z 2025-10-10T02:06:36.4898733Z Running test_decomp 6/16 ... [2025-10-10 02:06:36.489258] 2025-10-10T02:06:36.4899313Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:06:36.4900376Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_decomp.py', '-m', 'not serial', '--shard-id=6', '--num-shards=16', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:06:36.489624] 2025-10-10T02:09:08.5701390Z 2025-10-10T02:09:08.5702072Z test_meta 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_meta_1.1_310c899992f4fd8e_.log 2025-10-10T02:09:09.8597183Z Running 40699 items in this shard: test/test_meta.py::TestMetaConverter::test_channels_last, test/test_meta.py::TestMetaConverter::test_channels_last_leaf, test/test_meta.py::TestMetaConverter::test_channels_last_non_leaf, test/test_meta.py::TestMetaConverter::test_complex_noncontiguous_bug, test/test_meta.py::TestMetaConverter::test_empty_strided_non_dense_leaf, test/test_meta.py::TestMetaConverter::test_imag, test/test_meta.py::TestMetaConverter::test_inplace_set_storage, test/test_meta.py::TestMetaConverter::test_leaf, test/test_meta.py::TestMetaConverter::test_non_leaf, test/test_meta.py::TestMetaConverter::test_non_leaf_torture, test/test_meta.py::TestMetaConverter::test_requires_grad_false, test/test_meta.py::TestMetaConverter::test_tensor_outlives_converter, test/test_meta.py::TestMetaConverter::test_view_as_complex, test/test_meta.py::TestMetaConverter::test_view_as_real, test/test_meta.py::TestMetaConverter::test_view_dtype, test/test_meta.py::TestMetaConverter::test_view_mutate, test/test_meta.py::TestMetaConverter::test_view_of_leaf, test/test_meta.py::TestMetaConverter::test_view_of_non_leaf, test/test_meta.py::TestMetaConverter::test_view_of_view_of_leaf, test/test_meta.py::TestMetaConverter::test_weakref, test/test_meta.py::TestMetaCUDA::test_batch_norm_backward_output_mask0_cuda, test/test_meta.py::TestMetaCUDA::test_batch_norm_backward_output_mask1_cuda, test/test_meta.py::TestMetaCUDA::test_batch_norm_backward_output_mask2_cuda, test/test_meta.py::TestMetaCUDA::test_batch_norm_backward_output_mask3_cuda, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype___radd___cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype___rdiv___cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype___rmod___cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype___rmul___cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype___rpow___cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype___rsub___cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs__conversions_complex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs__conversions_polar_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_atan2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_clamp_max_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_clamp_min_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_copysign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_div_floor_rounding_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_div_no_rounding_mode_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_div_trunc_rounding_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_eq_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_float_power_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_floor_divide_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_fmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_fmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_fmod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_ge_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_gt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_heaviside_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_hypot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_igamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_igammac_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_isclose_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_le_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_logaddexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_logical_and_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_logical_or_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_logical_xor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_lt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_maximum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_minimum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_mul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_ne_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_nextafter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_pow_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_remainder_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_rsub_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_special_xlog1py_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_special_zeta_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_sub_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_true_divide_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_xlogy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_atan2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_clamp_max_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_clamp_min_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_complex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_copysign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_div_floor_rounding_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_div_no_rounding_mode_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_div_trunc_rounding_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_eq_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_float_power_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_floor_divide_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_fmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_fmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_fmod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_ge_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_gt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_heaviside_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_hypot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_igamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_igammac_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_isclose_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_jiterator_binary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_jiterator_binary_return_by_ref_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_ldexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_le_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_logaddexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_logical_and_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_logical_or_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_logical_xor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_lt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_max_binary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_maximum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_min_binary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_minimum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_mul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_ne_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_nextafter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_polar_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_pow_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_remainder_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_rsub_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_special_chebyshev_polynomial_t_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_special_chebyshev_polynomial_u_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_special_chebyshev_polynomial_v_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_special_chebyshev_polynomial_w_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_special_hermite_polynomial_h_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_special_hermite_polynomial_he_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_special_laguerre_polynomial_l_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_special_legendre_polynomial_p_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_special_xlog1py_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_special_zeta_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_sub_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_true_divide_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_xlogy_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_cdist_forward_cuda, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_H_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_H_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_H_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_H_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_H_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_H_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_H_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_H_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_H_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_H_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_H_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_H_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_H_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_T_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_T_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_T_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_T_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_T_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_T_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_T_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_T_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_T_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_T_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_T_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_T_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_T_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___getitem___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___getitem___cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___getitem___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___getitem___cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___getitem___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___getitem___cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___getitem___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___getitem___cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___getitem___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___getitem___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___getitem___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___getitem___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___getitem___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___radd___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___radd___cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___radd___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___radd___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___radd___cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___radd___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___radd___cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___radd___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___radd___cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___radd___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___radd___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___radd___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rand___cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rand___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rand___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rand___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rand___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rand___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rdiv___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rdiv___cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rdiv___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rdiv___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rdiv___cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rdiv___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rdiv___cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rdiv___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rdiv___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rdiv___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rdiv___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rdiv___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rmatmul___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rmatmul___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rmatmul___cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rmatmul___cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rmatmul___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rmatmul___cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rmod___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rmod___cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rmod___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rmod___cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rmod___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rmod___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rmod___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rmod___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rmod___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rmul___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rmul___cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rmul___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rmul___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rmul___cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rmul___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rmul___cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rmul___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rmul___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rmul___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rmul___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rmul___cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___ror___cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___ror___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___ror___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___ror___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___ror___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___ror___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rpow___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rpow___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rpow___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rpow___cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rpow___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rpow___cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rpow___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rpow___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rpow___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rpow___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rpow___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rsub___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rsub___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rsub___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rsub___cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rsub___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rsub___cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rsub___cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rsub___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rsub___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rsub___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rsub___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rxor___cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rxor___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rxor___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rxor___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rxor___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rxor___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__batch_norm_with_update_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__batch_norm_with_update_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__batch_norm_with_update_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__batch_norm_with_update_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__chunk_cat_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__chunk_cat_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__chunk_cat_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__chunk_cat_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__chunk_cat_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__chunk_cat_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__chunk_cat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__chunk_cat_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__chunk_cat_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__chunk_cat_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__chunk_cat_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__chunk_cat_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__chunk_cat_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_abs_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_abs_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_abs_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_abs_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_abs_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_abs_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_abs_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_abs_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_abs_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_abs_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_abs_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_abs_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_acos_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_acos_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_acos_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_acos_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_acos_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_acos_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_acos_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_acos_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_acos_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_acos_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_acos_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_acos_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_add_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_add_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_add_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_add_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_add_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_add_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_add_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_add_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_add_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_add_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_add_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_addcdiv_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_addcdiv_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_addcdiv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_addcdiv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_addcdiv_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_addcdiv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_addcdiv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_addcdiv_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_addcdiv_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_addcdiv_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_addcdiv_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_addcdiv_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_addcmul_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_addcmul_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_addcmul_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_addcmul_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_addcmul_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_addcmul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_addcmul_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_addcmul_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_addcmul_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_addcmul_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_addcmul_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_addcmul_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_asin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_asin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_asin_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_asin_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_asin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_asin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_asin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_asin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_asin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_asin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_asin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_asin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_atan_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_atan_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_atan_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_atan_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_atan_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_atan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_atan_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_atan_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_atan_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_atan_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_atan_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_atan_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_ceil_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_ceil_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_ceil_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_ceil_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_ceil_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_ceil_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_ceil_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_ceil_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_ceil_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_ceil_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_ceil_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_ceil_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_clamp_max_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_clamp_max_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_clamp_max_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_clamp_max_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_clamp_max_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_clamp_max_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_clamp_max_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_clamp_max_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_clamp_max_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_clamp_max_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_clamp_max_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_clamp_max_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_clamp_min_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_clamp_min_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_clamp_min_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_clamp_min_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_clamp_min_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_clamp_min_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_clamp_min_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_clamp_min_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_clamp_min_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_clamp_min_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_clamp_min_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_clamp_min_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_copy_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_cos_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_cos_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_cos_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_cos_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_cos_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_cos_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_cos_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_cos_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_cos_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_cos_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_cos_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_cos_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_cosh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_cosh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_cosh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_cosh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_cosh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_cosh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_cosh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_cosh_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_cosh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_cosh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_cosh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_cosh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_div_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_div_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_div_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_div_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_div_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_div_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_div_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_div_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_div_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_div_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_div_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_div_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_erf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_erf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_erf_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_erf_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_erf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_erf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_erf_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_erf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_erf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_erf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_erf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_erf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_erfc_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_erfc_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_erfc_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_erfc_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_erfc_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_erfc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_erfc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_erfc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_erfc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_erfc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_erfc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_erfc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_exp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_exp_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_exp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_exp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_exp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_exp_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_exp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_exp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_exp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_exp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_exp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_exp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_expm1_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_expm1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_expm1_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_expm1_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_expm1_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_expm1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_expm1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_expm1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_expm1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_expm1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_expm1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_expm1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_floor_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_floor_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_floor_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_floor_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_floor_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_floor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_floor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_floor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_floor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_floor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_floor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_floor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_frac_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_frac_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_frac_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_frac_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_frac_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_frac_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_frac_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_frac_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_frac_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_frac_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_frac_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_frac_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_lerp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_lerp_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_lerp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_lerp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_lerp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_lerp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_lerp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_lerp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_lerp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_lerp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_lerp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_lerp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_lgamma_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_lgamma_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_lgamma_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_lgamma_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_lgamma_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_lgamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_lgamma_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_lgamma_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_lgamma_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_lgamma_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_lgamma_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_lgamma_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log10_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log10_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log10_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log10_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log10_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log10_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log10_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log10_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log10_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log10_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log10_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log10_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log1p_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log1p_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log1p_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log1p_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log1p_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log1p_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log1p_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log1p_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log1p_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log1p_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log1p_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log1p_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_max_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_max_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_max_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_max_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_max_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_max_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_max_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_max_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_max_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_max_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_max_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_max_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_maximum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_maximum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_maximum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_maximum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_maximum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_maximum_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_maximum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_maximum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_maximum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_maximum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_maximum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_maximum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_minimum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_minimum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_minimum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_minimum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_minimum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_minimum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_minimum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_minimum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_minimum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_minimum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_minimum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_minimum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_mul_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_mul_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_mul_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_mul_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_mul_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_mul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_mul_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_mul_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_mul_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_mul_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_mul_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_mul_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_neg_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_neg_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_neg_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_neg_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_neg_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_neg_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_neg_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_neg_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_neg_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_neg_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_neg_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_neg_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_norm_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_norm_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_norm_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_norm_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_norm_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_norm_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_norm_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_pow_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_pow_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_pow_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_pow_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_pow_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_pow_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_pow_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_pow_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_pow_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_pow_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_pow_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_pow_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_reciprocal_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_reciprocal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_reciprocal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_reciprocal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_reciprocal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_reciprocal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_reciprocal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_reciprocal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_reciprocal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_reciprocal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_reciprocal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_reciprocal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_round_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_round_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_round_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_round_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_round_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_round_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_round_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_round_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_round_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_round_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_round_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_round_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_rsqrt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_rsqrt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_rsqrt_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_rsqrt_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_rsqrt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_rsqrt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_rsqrt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_rsqrt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_rsqrt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_rsqrt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_rsqrt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_rsqrt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sigmoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sigmoid_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sigmoid_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sigmoid_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sigmoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sigmoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sigmoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sigmoid_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sigmoid_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sigmoid_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sigmoid_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sigmoid_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sign_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sign_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sign_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sign_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sign_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sign_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sign_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sign_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sign_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sign_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sign_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sin_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sin_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sin_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sinh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sinh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sinh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sinh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sinh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sinh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sinh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sinh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sinh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sinh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sinh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sinh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sqrt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sqrt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sqrt_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sqrt_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sqrt_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sqrt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sqrt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sqrt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sqrt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sqrt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sqrt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sqrt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sub_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sub_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sub_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sub_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sub_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sub_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sub_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sub_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sub_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sub_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sub_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sub_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_tan_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_tan_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_tan_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_tan_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_tan_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_tan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_tan_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_tan_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_tan_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_tan_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_tan_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_tan_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_tanh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_tanh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_tanh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_tanh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_tanh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_tanh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_tanh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_tanh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_tanh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_tanh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_tanh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_tanh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_trunc_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_trunc_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_trunc_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_trunc_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_trunc_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_trunc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_trunc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_trunc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_trunc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_trunc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_trunc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_trunc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_zero_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_zero_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_zero_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_zero_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_zero_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_zero_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_zero_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_zero_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_zero_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_zero_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_zero_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_zero_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__native_batch_norm_legit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__native_batch_norm_legit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__native_batch_norm_legit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__native_batch_norm_legit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__segment_reduce_lengths_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__segment_reduce_lengths_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__segment_reduce_lengths_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__segment_reduce_lengths_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__segment_reduce_offsets_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__segment_reduce_offsets_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__segment_reduce_offsets_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__segment_reduce_offsets_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__softmax_backward_data_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__softmax_backward_data_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__softmax_backward_data_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__softmax_backward_data_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__unsafe_masked_index_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__unsafe_masked_index_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__unsafe_masked_index_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__unsafe_masked_index_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__unsafe_masked_index_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__unsafe_masked_index_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__unsafe_masked_index_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__unsafe_masked_index_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__unsafe_masked_index_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__unsafe_masked_index_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__unsafe_masked_index_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__unsafe_masked_index_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__unsafe_masked_index_put_accumulate_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__unsafe_masked_index_put_accumulate_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__unsafe_masked_index_put_accumulate_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__unsafe_masked_index_put_accumulate_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__unsafe_masked_index_put_accumulate_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__unsafe_masked_index_put_accumulate_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__unsafe_masked_index_put_accumulate_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__unsafe_masked_index_put_accumulate_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__unsafe_masked_index_put_accumulate_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__unsafe_masked_index_put_accumulate_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__unsafe_masked_index_put_accumulate_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__unsafe_masked_index_put_accumulate_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__upsample_bilinear2d_aa_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__upsample_bilinear2d_aa_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__upsample_bilinear2d_aa_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__upsample_bilinear2d_aa_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_abs_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_abs_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_abs_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_abs_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_abs_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_abs_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_abs_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_abs_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_abs_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_abs_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_abs_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_abs_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_abs_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_acos_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_acos_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_acos_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_acos_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_acos_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_acos_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_acos_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_acos_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_acos_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_acos_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_acos_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_acos_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_acos_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_acosh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_acosh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_acosh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_acosh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_acosh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_acosh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_acosh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_acosh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_acosh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_acosh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_acosh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_acosh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_acosh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_add_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_add_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_add_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_add_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_add_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_add_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_add_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_add_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_add_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_add_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_add_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_add_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addbmm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addbmm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addbmm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addbmm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addbmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addbmm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addcdiv_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addcdiv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addcdiv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addcdiv_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addcdiv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addcdiv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addcmul_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addcmul_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addcmul_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addcmul_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addcmul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addcmul_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addcmul_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addcmul_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addcmul_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addcmul_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addcmul_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addmm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addmm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addmm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addmm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addmm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addmm_decomposed_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addmm_decomposed_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addmm_decomposed_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addmm_decomposed_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addmm_decomposed_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addmm_decomposed_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addmv_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addmv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addmv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addmv_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addmv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addmv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addr_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addr_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addr_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addr_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addr_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addr_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addr_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addr_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addr_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addr_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_alias_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_alias_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_alias_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_alias_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_alias_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_alias_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_alias_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_alias_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_alias_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_alias_copy_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_alias_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_alias_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_alias_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_all_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_all_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_all_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_all_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_all_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_all_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_all_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_all_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_all_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_all_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_all_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_all_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_allclose_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_allclose_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_allclose_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_allclose_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_allclose_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_allclose_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_amax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_amax_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_amax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_amax_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_amax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_amax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_amax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_amax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_amax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_amax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_amin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_amin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_amin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_amin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_amin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_amin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_amin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_amin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_amin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_amin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_aminmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_aminmax_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_aminmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_aminmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_aminmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_aminmax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_aminmax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_aminmax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_aminmax_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_aminmax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_angle_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_angle_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_angle_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_angle_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_angle_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_angle_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_angle_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_angle_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_angle_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_angle_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_angle_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_any_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_any_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_any_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_any_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_any_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_any_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_any_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_any_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_any_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_any_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_any_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_any_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_arange_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_arange_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_arange_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_arange_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_arange_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_arange_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_arange_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_arange_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_arange_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argmax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argmax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argmax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argmax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argmax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argmin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argmin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argmin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argmin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argmin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argmin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argmin_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argmin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argsort_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argsort_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argsort_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argsort_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argsort_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argsort_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argsort_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argsort_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argsort_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argsort_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argwhere_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argwhere_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argwhere_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argwhere_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argwhere_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argwhere_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argwhere_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argwhere_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argwhere_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argwhere_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argwhere_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argwhere_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_copy_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_partial_views_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_partial_views_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_partial_views_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_partial_views_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_partial_views_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_partial_views_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_partial_views_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_partial_views_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_partial_views_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_partial_views_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_partial_views_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_partial_views_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_partial_views_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_scatter_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_scatter_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_scatter_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_asin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_asin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_asin_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_asin_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_asin_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_asin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_asin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_asin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_asin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_asin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_asin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_asin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_asin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_asinh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_asinh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_asinh_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_asinh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_asinh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_asinh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_asinh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_asinh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_asinh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_asinh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_asinh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_asinh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_asinh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atan2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atan2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atan2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atan2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atan2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atan2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atan2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atan2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atan2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atan2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atan_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atan_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atan_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atan_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atan_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atan_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atan_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atan_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atan_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atan_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atan_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atan_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atanh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atanh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atanh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atanh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atanh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atanh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atanh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atanh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atanh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atanh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atanh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atanh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atanh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_1d_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_1d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_1d_cuda_complex32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_1d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_1d_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_1d_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_1d_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_1d_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_1d_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_2d_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_2d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_2d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_2d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_2d_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_2d_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_2d_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_2d_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_2d_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_3d_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_3d_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_3d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_3d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_3d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_3d_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_3d_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_3d_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_3d_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_3d_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_baddbmm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_baddbmm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_baddbmm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_baddbmm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_baddbmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_baddbmm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bernoulli_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bernoulli_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bernoulli_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bernoulli_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bfloat16_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bfloat16_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bfloat16_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bfloat16_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bfloat16_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bfloat16_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bfloat16_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bfloat16_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bfloat16_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bfloat16_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bfloat16_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bfloat16_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bfloat16_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bincount_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bincount_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bincount_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bincount_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bincount_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_and_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_and_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_and_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_and_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_and_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_and_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_left_shift_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_left_shift_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_left_shift_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_left_shift_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_left_shift_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_not_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_not_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_not_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_not_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_not_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_not_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_or_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_or_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_or_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_or_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_or_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_or_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_right_shift_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_right_shift_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_right_shift_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_right_shift_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_right_shift_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_xor_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_xor_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_xor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_xor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_xor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_xor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_block_diag_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_block_diag_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_block_diag_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_block_diag_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_block_diag_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_block_diag_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_block_diag_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_block_diag_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_block_diag_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_block_diag_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_block_diag_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_block_diag_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_block_diag_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bmm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bmm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bmm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bmm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bmm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bool_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bool_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bool_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bool_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bool_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bool_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bool_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bool_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bool_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bool_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bool_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bool_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bool_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_broadcast_shapes_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_broadcast_tensors_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_broadcast_tensors_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_broadcast_tensors_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_broadcast_tensors_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_broadcast_tensors_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_broadcast_tensors_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_broadcast_tensors_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_broadcast_tensors_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_broadcast_tensors_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_broadcast_tensors_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_broadcast_tensors_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_broadcast_tensors_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_broadcast_to_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_broadcast_to_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_broadcast_to_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_broadcast_to_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_broadcast_to_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_broadcast_to_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_broadcast_to_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_broadcast_to_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_broadcast_to_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_broadcast_to_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_broadcast_to_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_broadcast_to_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bucketize_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bucketize_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bucketize_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bucketize_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bucketize_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bucketize_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bucketize_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bucketize_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bucketize_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_byte_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_byte_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_byte_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_byte_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_byte_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_byte_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_byte_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_byte_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_byte_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_byte_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_byte_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_byte_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cartesian_prod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cartesian_prod_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cartesian_prod_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cartesian_prod_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cartesian_prod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cartesian_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cartesian_prod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cartesian_prod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cartesian_prod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cartesian_prod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cartesian_prod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cartesian_prod_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cat_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cat_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cat_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cat_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cat_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cat_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cat_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cat_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cat_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cat_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cat_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cat_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cauchy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cauchy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cauchy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cauchy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cdist_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cdist_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cdouble_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cdouble_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cdouble_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cdouble_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cdouble_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cdouble_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cdouble_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cdouble_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cdouble_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cdouble_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cdouble_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cdouble_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cdouble_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ceil_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ceil_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ceil_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ceil_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ceil_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ceil_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ceil_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ceil_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ceil_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cfloat_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cfloat_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cfloat_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cfloat_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cfloat_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cfloat_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cfloat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cfloat_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cfloat_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cfloat_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cfloat_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cfloat_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cfloat_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_chalf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_chalf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_chalf_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_chalf_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_chalf_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_chalf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_chalf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_chalf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_chalf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_chalf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_chalf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_chalf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_chalf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_char_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_char_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_char_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_char_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_char_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_char_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_char_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_char_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_char_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_char_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_char_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_char_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_char_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cholesky_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cholesky_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cholesky_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cholesky_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cholesky_inverse_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cholesky_inverse_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cholesky_inverse_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cholesky_inverse_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cholesky_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cholesky_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cholesky_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cholesky_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_chunk_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_chunk_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_chunk_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_chunk_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_chunk_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_chunk_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_chunk_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_chunk_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_chunk_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_chunk_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_chunk_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_chunk_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_chunk_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clamp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clamp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clamp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clamp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clamp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clamp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clamp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clamp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clamp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clamp_max_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clamp_max_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clamp_max_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clamp_max_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clamp_max_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clamp_max_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clamp_max_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clamp_max_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clamp_max_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clamp_max_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clamp_min_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clamp_min_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clamp_min_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clamp_min_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clamp_min_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clamp_min_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clamp_min_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clamp_min_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clamp_min_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clamp_min_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clone_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clone_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clone_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clone_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clone_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clone_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clone_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clone_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clone_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clone_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clone_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clone_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clone_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_column_stack_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_column_stack_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_column_stack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_column_stack_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_column_stack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_column_stack_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_column_stack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_column_stack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_column_stack_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_column_stack_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_column_stack_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_column_stack_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_column_stack_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_combinations_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_combinations_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_combinations_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_combinations_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_combinations_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_combinations_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_combinations_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_combinations_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_combinations_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_combinations_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_combinations_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_combinations_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_complex_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_complex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_complex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_conj_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_conj_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_conj_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_conj_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_conj_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_conj_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_conj_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_conj_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_conj_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_conj_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_conj_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_conj_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_conj_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_conj_physical_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_conj_physical_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_conj_physical_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_conj_physical_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_conj_physical_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_conj_physical_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_conj_physical_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_conj_physical_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_conj_physical_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_conj_physical_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_conj_physical_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_conj_physical_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_conj_physical_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_constant_pad_nd_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_constant_pad_nd_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_constant_pad_nd_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_constant_pad_nd_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_constant_pad_nd_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_constant_pad_nd_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_constant_pad_nd_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_constant_pad_nd_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_constant_pad_nd_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_constant_pad_nd_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_constant_pad_nd_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_constant_pad_nd_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_contiguous_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_contiguous_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_contiguous_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_contiguous_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_contiguous_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_contiguous_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_contiguous_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_contiguous_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_contiguous_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_contiguous_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_contiguous_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_contiguous_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_contiguous_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_copysign_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_copysign_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_copysign_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_copysign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_copysign_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_copysign_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_copysign_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_copysign_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_copysign_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_copysign_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_corrcoef_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_corrcoef_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_corrcoef_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_corrcoef_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_corrcoef_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_corrcoef_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_corrcoef_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_corrcoef_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_corrcoef_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_corrcoef_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_corrcoef_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cos_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cos_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cos_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cos_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cos_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cos_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cos_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cos_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cos_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cos_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cos_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cos_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cos_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cosh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cosh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cosh_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cosh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cosh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cosh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cosh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cosh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cosh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cosh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cosh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cosh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cosh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_count_nonzero_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_count_nonzero_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_count_nonzero_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_count_nonzero_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_count_nonzero_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_count_nonzero_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_count_nonzero_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_count_nonzero_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_count_nonzero_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_count_nonzero_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_count_nonzero_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_count_nonzero_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cov_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cov_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cov_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cov_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cov_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cov_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cov_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cov_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cov_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cov_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cov_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cross_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cross_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cross_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cross_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cross_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cross_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cross_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cross_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cross_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cross_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cross_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cummax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cummax_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cummax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cummax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cummax_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cummax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cummax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cummax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cummax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cummax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cummin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cummin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cummin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cummin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cummin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cummin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cummin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cummin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cummin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cummin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumprod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumprod_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumprod_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumprod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumprod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumprod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumprod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumprod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumprod_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumprod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumprod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumsum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumsum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumsum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumsum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumsum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumsum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumsum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumsum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumsum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumsum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumsum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumulative_trapezoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumulative_trapezoid_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumulative_trapezoid_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumulative_trapezoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumulative_trapezoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumulative_trapezoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumulative_trapezoid_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumulative_trapezoid_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumulative_trapezoid_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumulative_trapezoid_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumulative_trapezoid_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_deg2rad_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_deg2rad_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_deg2rad_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_deg2rad_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_deg2rad_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_deg2rad_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_deg2rad_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_deg2rad_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_deg2rad_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_deg2rad_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diag_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diag_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diag_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diag_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diag_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diag_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diag_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diag_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diag_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diag_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diag_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diag_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diag_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diag_embed_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diag_embed_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diag_embed_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diag_embed_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diag_embed_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diag_embed_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diag_embed_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diag_embed_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diag_embed_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diag_embed_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diag_embed_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diag_embed_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diag_embed_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagflat_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagflat_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagflat_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagflat_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagflat_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagflat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagflat_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagflat_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagflat_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagflat_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagflat_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagflat_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_scatter_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_scatter_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diff_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diff_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diff_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diff_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diff_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diff_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diff_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diff_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diff_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diff_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diff_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diff_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_digamma_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_digamma_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_digamma_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_digamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_digamma_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_digamma_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_digamma_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_digamma_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_digamma_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_digamma_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dist_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dist_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dist_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dist_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dist_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dist_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_div_floor_rounding_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_div_floor_rounding_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_div_floor_rounding_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_div_floor_rounding_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_div_floor_rounding_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_div_floor_rounding_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_div_floor_rounding_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_div_floor_rounding_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_div_floor_rounding_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_div_no_rounding_mode_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_div_no_rounding_mode_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_div_no_rounding_mode_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_div_no_rounding_mode_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_div_no_rounding_mode_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_div_no_rounding_mode_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_div_no_rounding_mode_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_div_no_rounding_mode_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_div_no_rounding_mode_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_div_no_rounding_mode_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_div_no_rounding_mode_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_div_no_rounding_mode_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_div_no_rounding_mode_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_div_trunc_rounding_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_div_trunc_rounding_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_div_trunc_rounding_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_div_trunc_rounding_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_div_trunc_rounding_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_div_trunc_rounding_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_div_trunc_rounding_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_div_trunc_rounding_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_div_trunc_rounding_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dot_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dot_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_double_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_double_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_double_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_double_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_double_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_double_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_double_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_double_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_double_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_double_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_double_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_double_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_double_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dsplit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dsplit_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dsplit_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dsplit_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dsplit_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dsplit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dsplit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dsplit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dsplit_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dsplit_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dsplit_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dsplit_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dsplit_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dstack_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dstack_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dstack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dstack_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dstack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dstack_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dstack_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dstack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dstack_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dstack_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dstack_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dstack_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dstack_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_einsum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_einsum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_einsum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_einsum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_einsum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_einsum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_like_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_like_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_like_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_like_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_like_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_like_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_like_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_permuted_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_permuted_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_permuted_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_permuted_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_permuted_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_permuted_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_permuted_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_permuted_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_permuted_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_permuted_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_permuted_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_permuted_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_permuted_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_strided_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_strided_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_strided_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_strided_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_strided_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_strided_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_strided_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_strided_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_strided_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_strided_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_strided_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_strided_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_eq_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_eq_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_eq_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_eq_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_eq_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_eq_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_eq_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_eq_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_eq_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_eq_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_eq_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_eq_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_eq_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_equal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_equal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_equal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_equal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_equal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_equal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_equal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_equal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_equal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_equal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_equal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_equal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_erf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_erf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_erf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_erf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_erf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_erf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_erf_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_erf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_erf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_erf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_erfc_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_erfc_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_erfc_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_erfc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_erfc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_erfc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_erfc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_erfc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_erfc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_erfc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_erfinv_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_erfinv_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_erfinv_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_erfinv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_erfinv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_erfinv_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_erfinv_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_erfinv_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_erfinv_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_erfinv_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_exp2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_exp2_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_exp2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_exp2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_exp2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_exp2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_exp2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_exp2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_exp2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_exp2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_exp2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_exp2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_exp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_exp_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_exp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_exp_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_exp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_exp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_exp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_exp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_exp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_exp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_exp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_exp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_exp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_as_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_as_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_as_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_as_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_as_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_as_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_as_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_as_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_as_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_as_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_as_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_as_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expm1_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expm1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expm1_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expm1_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expm1_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expm1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expm1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expm1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expm1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expm1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expm1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expm1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_exponential_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_exponential_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_exponential_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_exponential_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_eye_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_eye_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_eye_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_eye_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_eye_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_eye_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_eye_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_eye_cuda_float8_e4m3fn, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_eye_cuda_float8_e4m3fnuz, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_eye_cuda_float8_e5m2, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_eye_cuda_float8_e5m2fnuz, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_eye_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_eye_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_eye_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_eye_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_eye_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fft2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fft2_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fft2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fft2_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fft_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fftn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fftn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fftn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fftn_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fftn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fftshift_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fftshift_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fftshift_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fftshift_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fftshift_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fftshift_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fftshift_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fftshift_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fftshift_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fftshift_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fftshift_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fftshift_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fftshift_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfft2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfft2_cuda_complex32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfft2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfft_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfftn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfftn_cuda_complex32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfftn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfftn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifft2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifft2_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifft2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifft_cuda_complex32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifftn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifftn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifftn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifftn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifftshift_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifftshift_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifftshift_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifftshift_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifftshift_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifftshift_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifftshift_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifftshift_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifftshift_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifftshift_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifftshift_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifftshift_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifftshift_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ihfft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ihfft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ihfft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ihfft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ihfft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ihfft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ihfft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ihfft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ihfft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ihfft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ihfft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ihfft_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ihfft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ihfft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ihfft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ihfft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ihfft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ihfft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ihfftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ihfftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ihfftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ihfftn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ihfftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ihfftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ihfftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ihfftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ihfftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfft2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfft2_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfft2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfft2_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfft_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfftn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfftn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfftn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfftn_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_rfft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_rfft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_rfft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_rfft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_rfft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_rfft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_rfft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_rfft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_rfft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_rfft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_rfft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_rfft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_rfft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_rfft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_rfft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_rfft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_rfft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_rfft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_rfftn_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_rfftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_rfftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_rfftn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_rfftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_rfftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_rfftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_rfftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_rfftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fill_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fill_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fill_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fill_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fill_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fill_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fill_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fill_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fill_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fill_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fill_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fill_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fill_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flatten_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flatten_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flatten_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flatten_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flatten_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flatten_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flatten_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flatten_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flatten_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flatten_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flatten_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flatten_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flatten_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flip_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flip_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flip_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flip_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flip_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flip_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flip_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flip_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flip_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flip_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flip_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flip_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fliplr_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fliplr_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fliplr_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fliplr_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fliplr_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fliplr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fliplr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fliplr_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fliplr_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fliplr_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fliplr_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fliplr_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flipud_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flipud_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flipud_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flipud_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flipud_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flipud_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flipud_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flipud_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flipud_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flipud_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flipud_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flipud_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_float_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_float_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_float_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_float_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_float_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_float_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_float_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_float_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_float_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_float_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_float_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_float_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_float_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_float_power_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_float_power_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_float_power_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_float_power_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_float_power_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_float_power_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_float_power_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_float_power_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_float_power_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_float_power_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_float_power_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_float_power_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_floor_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_floor_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_floor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_floor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_floor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_floor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_floor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_floor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_floor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_floor_divide_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_floor_divide_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_floor_divide_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_floor_divide_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_floor_divide_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_floor_divide_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_floor_divide_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_floor_divide_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_floor_divide_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fmax_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fmax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fmax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fmax_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fmax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fmax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fmin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fmin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fmin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fmin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fmin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fmin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fmin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fmin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fmin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fmod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fmod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fmod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fmod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fmod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fmod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fmod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fmod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fmod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_frac_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_frac_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_frac_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_frac_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_frexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_frexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_frexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_frexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_full_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_full_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_full_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_full_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_full_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_full_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_full_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_full_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_full_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_full_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_full_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_full_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_full_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_full_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_full_like_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_full_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_full_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_full_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_full_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_full_like_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_full_like_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_full_like_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_full_like_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_full_like_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_full_like_cuda_uint16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_full_like_cuda_uint32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_full_like_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gather_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gather_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gather_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gather_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gather_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gather_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gather_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gather_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gather_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gather_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gather_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gather_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gcd_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gcd_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gcd_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gcd_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gcd_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ge_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ge_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ge_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ge_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ge_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ge_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ge_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ge_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ge_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ge_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_geometric_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_geometric_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_geometric_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_geometric_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_geometric_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_geometric_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_geometric_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_geometric_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_geometric_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_geqrf_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_geqrf_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_geqrf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_geqrf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gradient_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gradient_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gradient_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gradient_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gradient_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gradient_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gradient_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gradient_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gradient_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gradient_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_grid_sampler_2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_grid_sampler_2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_grid_sampler_2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_grid_sampler_2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_grid_sampler_3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_grid_sampler_3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_grid_sampler_3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_grid_sampler_3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gt_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_half_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_half_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_half_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_half_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_half_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_half_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_half_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_half_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_half_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_half_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_half_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_half_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hash_tensor_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hash_tensor_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hash_tensor_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hash_tensor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hash_tensor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hash_tensor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hash_tensor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hash_tensor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hash_tensor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hash_tensor_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_heaviside_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_heaviside_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_heaviside_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_heaviside_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_heaviside_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_heaviside_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_heaviside_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_heaviside_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_heaviside_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_heaviside_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_histc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_histc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_histc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_histc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_histc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_histc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_histc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hsplit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hsplit_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hsplit_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hsplit_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hsplit_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hsplit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hsplit_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hsplit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hsplit_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hsplit_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hsplit_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hsplit_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hsplit_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hstack_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hstack_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hstack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hstack_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hstack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hstack_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hstack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hstack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hstack_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hstack_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hstack_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hstack_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hstack_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hypot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hypot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hypot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hypot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_i0_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_i0_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_i0_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_i0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_i0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_i0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_i0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_i0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_i0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_i0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_igamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_igamma_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_igammac_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_igammac_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_imag_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_imag_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_imag_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_add_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_add_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_add_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_add_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_add_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_add_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_add_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_add_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_add_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_add_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_add_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_add_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_fill_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_fill_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_fill_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_fill_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_fill_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_fill_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_fill_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_fill_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_fill_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_fill_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_fill_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_fill_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_fill_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_put_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_put_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_put_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_put_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_put_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_put_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_put_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_put_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_put_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_put_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_put_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_put_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_put_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_amax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_amax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_amax_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_amax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_amax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_amax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_amax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_amax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_amax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_amin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_amin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_amin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_amin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_amin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_amin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_amin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_amin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_amin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_mean_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_mean_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_mean_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_mean_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_mean_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_prod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_prod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_prod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_prod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_prod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_prod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_prod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_prod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_select_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_select_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_select_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_select_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_select_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_select_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_select_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_select_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_select_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_select_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_select_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_select_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_select_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_inner_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_inner_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_inner_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_inner_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_inner_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_inner_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_int_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_int_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_int_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_int_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_int_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_int_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_int_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_int_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_int_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_int_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_int_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_int_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isclose_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isclose_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isclose_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isclose_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isclose_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isclose_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isclose_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isclose_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isclose_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isclose_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isclose_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isclose_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isfinite_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isfinite_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isfinite_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isfinite_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isfinite_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isfinite_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isfinite_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isfinite_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isfinite_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isfinite_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isfinite_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isfinite_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isfinite_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isin_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isinf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isinf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isinf_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isinf_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isinf_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isinf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isinf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isinf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isinf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isinf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isinf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isinf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isinf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isnan_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isnan_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isnan_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isnan_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isnan_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isnan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isnan_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isnan_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isnan_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isnan_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isnan_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isnan_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isneginf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isneginf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isneginf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isneginf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isneginf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isneginf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isneginf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isneginf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isneginf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isneginf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isposinf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isposinf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isposinf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isposinf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isposinf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isposinf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isposinf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isposinf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isposinf_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isposinf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isreal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isreal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isreal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isreal_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isreal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isreal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isreal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isreal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isreal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isreal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isreal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isreal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isreal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_istft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_istft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_item_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_item_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_item_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_item_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_item_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_item_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_item_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_item_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_item_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_item_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_item_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_item_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_item_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_2inputs_2outputs_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_2inputs_2outputs_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_2inputs_2outputs_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_2inputs_2outputs_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_2inputs_2outputs_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_2inputs_2outputs_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_2inputs_2outputs_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_2inputs_2outputs_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_2inputs_2outputs_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_2inputs_2outputs_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_2inputs_2outputs_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_2inputs_2outputs_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_4inputs_with_extra_args_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_4inputs_with_extra_args_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_4inputs_with_extra_args_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_4inputs_with_extra_args_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_4inputs_with_extra_args_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_4inputs_with_extra_args_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_4inputs_with_extra_args_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_4inputs_with_extra_args_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_4inputs_with_extra_args_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_4inputs_with_extra_args_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_4inputs_with_extra_args_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_4inputs_with_extra_args_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_binary_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_binary_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_binary_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_binary_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_binary_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_binary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_binary_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_binary_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_binary_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_binary_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_binary_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_binary_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_binary_return_by_ref_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_binary_return_by_ref_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_binary_return_by_ref_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_binary_return_by_ref_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_binary_return_by_ref_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_binary_return_by_ref_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_binary_return_by_ref_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_binary_return_by_ref_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_binary_return_by_ref_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_binary_return_by_ref_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_binary_return_by_ref_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_binary_return_by_ref_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_unary_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_unary_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_unary_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_unary_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_unary_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_unary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_unary_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_unary_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_unary_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_unary_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_unary_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_unary_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_kron_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_kron_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_kron_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_kron_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_kron_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_kron_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_kron_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_kron_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_kron_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_kron_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_kron_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_kron_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_kthvalue_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_kthvalue_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_kthvalue_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_kthvalue_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_kthvalue_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_kthvalue_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_kthvalue_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_kthvalue_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_kthvalue_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lcm_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lcm_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lcm_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lcm_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lcm_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ldexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ldexp_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ldexp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ldexp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ldexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ldexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ldexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ldexp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ldexp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ldexp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ldexp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ldexp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_le_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_le_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_le_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_le_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_le_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_le_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_le_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_le_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_le_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_le_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lerp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lerp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lerp_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lerp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lerp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lerp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lerp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lgamma_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lgamma_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lgamma_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lgamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lgamma_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lgamma_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lgamma_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lgamma_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lgamma_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lgamma_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_cholesky_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_cholesky_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_cholesky_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_cholesky_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_cholesky_ex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_cholesky_ex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_cholesky_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_cholesky_ex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_cond_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_cond_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_cond_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_cond_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_cross_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_cross_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_cross_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_cross_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_cross_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_cross_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_cross_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_cross_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_cross_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_cross_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_cross_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_det_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_det_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_det_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_det_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_diagonal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_diagonal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_diagonal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_diagonal_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_diagonal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_diagonal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_diagonal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_diagonal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_diagonal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_diagonal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_diagonal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_diagonal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_diagonal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_eig_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_eig_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_eig_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_eig_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_eigh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_eigh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_eigh_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_eigh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_eigvals_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_eigvals_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_eigvals_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_eigvals_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_eigvalsh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_eigvalsh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_eigvalsh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_eigvalsh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_householder_product_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_householder_product_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_householder_product_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_householder_product_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_inv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_inv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_inv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_inv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_inv_ex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_inv_ex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_inv_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_inv_ex_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_ldl_factor_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_ldl_factor_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_ldl_factor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_ldl_factor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_ldl_factor_ex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_ldl_factor_ex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_ldl_factor_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_ldl_factor_ex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_ldl_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_ldl_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_ldl_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_ldl_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_lstsq_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_lstsq_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_lstsq_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_lstsq_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_lstsq_grad_oriented_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_lstsq_grad_oriented_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_lstsq_grad_oriented_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_lstsq_grad_oriented_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_lu_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_lu_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_lu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_lu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_lu_factor_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_lu_factor_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_lu_factor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_lu_factor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_lu_factor_ex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_lu_factor_ex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_lu_factor_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_lu_factor_ex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_lu_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_lu_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_lu_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_lu_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_matrix_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_matrix_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_matrix_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_matrix_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_matrix_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_matrix_norm_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_matrix_power_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_matrix_power_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_matrix_power_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_matrix_power_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_matrix_rank_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_matrix_rank_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_matrix_rank_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_matrix_rank_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_matrix_rank_hermitian_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_matrix_rank_hermitian_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_matrix_rank_hermitian_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_matrix_rank_hermitian_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_multi_dot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_multi_dot_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_multi_dot_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_multi_dot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_multi_dot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_multi_dot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_norm_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_norm_subgradients_at_zero_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_norm_subgradients_at_zero_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_norm_subgradients_at_zero_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_norm_subgradients_at_zero_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_norm_subgradients_at_zero_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_norm_subgradients_at_zero_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_pinv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_pinv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_pinv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_pinv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_pinv_hermitian_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_pinv_hermitian_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_pinv_hermitian_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_pinv_hermitian_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_pinv_singular_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_pinv_singular_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_pinv_singular_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_pinv_singular_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_qr_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_qr_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_qr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_qr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_slogdet_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_slogdet_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_slogdet_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_slogdet_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_solve_ex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_solve_ex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_solve_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_solve_ex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_solve_triangular_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_solve_triangular_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_solve_triangular_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_solve_triangular_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_svd_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_svd_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_svd_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_svd_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_svdvals_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_svdvals_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_svdvals_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_svdvals_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_tensorinv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_tensorinv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_tensorinv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_tensorinv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_tensorsolve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_tensorsolve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_tensorsolve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_tensorsolve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_vander_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_vander_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_vander_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_vander_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_vander_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_vander_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_vander_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_vander_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_vander_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_vecdot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_vecdot_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_vecdot_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_vecdot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_vecdot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_vecdot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_vector_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_vector_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_vector_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_vector_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_vector_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_vector_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linspace_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linspace_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linspace_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linspace_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linspace_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linspace_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linspace_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linspace_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linspace_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linspace_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linspace_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linspace_tensor_overload_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linspace_tensor_overload_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linspace_tensor_overload_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linspace_tensor_overload_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linspace_tensor_overload_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linspace_tensor_overload_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linspace_tensor_overload_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linspace_tensor_overload_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linspace_tensor_overload_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linspace_tensor_overload_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linspace_tensor_overload_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log10_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log10_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log10_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log10_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log10_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log10_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log10_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log10_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log10_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log10_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log10_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log10_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log1p_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log1p_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log1p_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log1p_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log1p_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log1p_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log1p_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log1p_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log1p_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log1p_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log1p_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log1p_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log2_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_normal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_normal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_normal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_normal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_softmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_softmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_softmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_softmax_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_softmax_with_dtype_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_softmax_with_dtype_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_softmax_with_dtype_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_softmax_with_dtype_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_softmax_with_dtype_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_softmax_with_dtype_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_softmax_with_dtype_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_softmax_with_dtype_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_softmax_with_dtype_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_softmax_with_dtype_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_softmax_with_dtype_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_softmax_with_dtype_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_softmax_with_dtype_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logaddexp2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logaddexp2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logaddexp2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logaddexp2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logaddexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logaddexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logaddexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logaddexp_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logcumsumexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logcumsumexp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logcumsumexp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logcumsumexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logcumsumexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logcumsumexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logdet_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logdet_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logdet_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logdet_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_and_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_and_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_and_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_and_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_and_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_and_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_and_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_and_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_and_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_and_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_and_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_and_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_not_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_not_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_not_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_not_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_not_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_not_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_not_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_not_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_not_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_not_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_not_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_not_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_or_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_or_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_or_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_or_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_or_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_or_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_or_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_or_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_or_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_or_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_or_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_or_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_xor_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_xor_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_xor_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_xor_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_xor_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_xor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_xor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_xor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_xor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_xor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_xor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_xor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logit_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logit_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logit_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logit_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logit_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logit_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logspace_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logspace_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logspace_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logspace_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logspace_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logspace_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logspace_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logspace_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logspace_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logspace_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logspace_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logspace_tensor_overload_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logspace_tensor_overload_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logspace_tensor_overload_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logspace_tensor_overload_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logspace_tensor_overload_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logspace_tensor_overload_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logspace_tensor_overload_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logspace_tensor_overload_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logspace_tensor_overload_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logspace_tensor_overload_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logspace_tensor_overload_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logsumexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logsumexp_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logsumexp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logsumexp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logsumexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logsumexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logsumexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logsumexp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logsumexp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logsumexp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logsumexp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logsumexp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_long_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_long_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_long_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_long_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_long_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_long_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_long_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_long_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_long_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_long_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_long_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_long_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_long_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lt_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lu_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lu_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lu_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lu_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lu_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lu_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lu_unpack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lu_unpack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lu_unpack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lu_unpack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mH_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mH_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mH_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mH_cuda_complex32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mH_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mH_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mH_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mH_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mH_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mH_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mH_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mH_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mH_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mT_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mT_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mT_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mT_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mT_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mT_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mT_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mT_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mT_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mT_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mT_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mT_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mT_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_amax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_amax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_amax_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_amax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_amax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_amax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_amax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_amax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_amax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_amin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_amin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_amin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_amin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_amin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_amin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_amin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_amin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_amin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_argmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_argmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_argmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_argmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_argmax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_argmax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_argmax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_argmax_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_argmax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_argmin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_argmin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_argmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_argmin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_argmin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_argmin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_argmin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_argmin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_argmin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_cumprod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_cumprod_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_cumprod_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_cumprod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_cumprod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_cumprod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_cumprod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_cumprod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_cumprod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_cumprod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_cumprod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_cumsum_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_cumsum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_cumsum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_cumsum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_cumsum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_cumsum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_cumsum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_cumsum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_cumsum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_cumsum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_cumsum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_fill_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_fill_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_fill_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_fill_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_fill_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_fill_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_fill_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_fill_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_fill_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_fill_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_fill_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_fill_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_fill_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_log_softmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_log_softmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_log_softmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_log_softmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_logaddexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_logaddexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_logaddexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_logaddexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_logsumexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_logsumexp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_logsumexp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_logsumexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_logsumexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_logsumexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_logsumexp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_logsumexp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_logsumexp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_logsumexp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_logsumexp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_mean_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_mean_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_median_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_median_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_median_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_median_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_normalize_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_normalize_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_normalize_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_normalize_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_normalize_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_normalize_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_prod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_prod_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_prod_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_prod_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_prod_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_prod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_prod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_prod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_prod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_prod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_prod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_scatter_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_scatter_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_select_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_select_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_select_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_select_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_select_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_select_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_select_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_select_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_select_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_select_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_select_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_select_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_softmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_softmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_softmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_softmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_softmin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_softmin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_softmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_softmin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_std_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_std_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_std_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_std_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_std_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_std_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_std_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_std_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_std_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_std_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_std_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_sum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_sum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_sum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_sum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_sum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_sum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_sum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_sum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_sum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_sum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_sum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_sum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_var_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_var_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_var_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_var_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_var_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_var_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_var_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_var_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_var_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_var_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_var_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_matmul_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_matmul_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_matmul_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_matmul_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_matmul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_matmul_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_matrix_exp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_matrix_exp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_matrix_exp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_matrix_exp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_matrix_exp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_matrix_exp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_binary_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_binary_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_binary_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_binary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_binary_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_binary_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_binary_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_binary_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_binary_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_binary_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_pool2d_with_indices_backward_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_pool2d_with_indices_backward_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_pool2d_with_indices_backward_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_pool2d_with_indices_backward_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_reduction_no_dim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_reduction_no_dim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_reduction_no_dim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_reduction_no_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_reduction_no_dim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_reduction_no_dim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_reduction_no_dim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_reduction_no_dim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_reduction_no_dim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_reduction_no_dim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_reduction_with_dim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_reduction_with_dim_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_reduction_with_dim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_reduction_with_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_reduction_with_dim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_reduction_with_dim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_reduction_with_dim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_reduction_with_dim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_reduction_with_dim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_reduction_with_dim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_maximum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_maximum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_maximum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_maximum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_maximum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_maximum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_maximum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_maximum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_maximum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_maximum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mean_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mean_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mean_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_median_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_median_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_median_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_median_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_median_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_median_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_median_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_median_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_median_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_meshgrid_list_of_tensors_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_meshgrid_list_of_tensors_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_meshgrid_list_of_tensors_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_meshgrid_list_of_tensors_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_meshgrid_list_of_tensors_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_meshgrid_list_of_tensors_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_meshgrid_list_of_tensors_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_meshgrid_list_of_tensors_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_meshgrid_list_of_tensors_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_meshgrid_list_of_tensors_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_meshgrid_list_of_tensors_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_meshgrid_list_of_tensors_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_meshgrid_variadic_tensors_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_meshgrid_variadic_tensors_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_meshgrid_variadic_tensors_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_meshgrid_variadic_tensors_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_meshgrid_variadic_tensors_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_meshgrid_variadic_tensors_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_meshgrid_variadic_tensors_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_meshgrid_variadic_tensors_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_meshgrid_variadic_tensors_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_meshgrid_variadic_tensors_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_meshgrid_variadic_tensors_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_meshgrid_variadic_tensors_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_min_binary_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_min_binary_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_min_binary_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_min_binary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_min_binary_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_min_binary_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_min_binary_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_min_binary_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_min_binary_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_min_binary_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_min_reduction_no_dim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_min_reduction_no_dim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_min_reduction_no_dim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_min_reduction_no_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_min_reduction_no_dim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_min_reduction_no_dim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_min_reduction_no_dim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_min_reduction_no_dim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_min_reduction_no_dim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_min_reduction_no_dim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_min_reduction_with_dim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_min_reduction_with_dim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_min_reduction_with_dim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_min_reduction_with_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_min_reduction_with_dim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_min_reduction_with_dim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_min_reduction_with_dim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_min_reduction_with_dim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_min_reduction_with_dim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_min_reduction_with_dim_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_minimum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_minimum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_minimum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_minimum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_minimum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_minimum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_minimum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_minimum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_minimum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_minimum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mode_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mode_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mode_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mode_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mode_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mode_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mode_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mode_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mode_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mode_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_movedim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_movedim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_movedim_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_movedim_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_movedim_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_movedim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_movedim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_movedim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_movedim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_movedim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_movedim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_movedim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_movedim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_msort_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_msort_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_msort_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_msort_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_msort_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_msort_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_msort_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_msort_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_msort_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_msort_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mul_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mul_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mul_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mul_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mul_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mul_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mul_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mul_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mul_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mul_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mul_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mul_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_multinomial_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_multinomial_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_multinomial_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_multinomial_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mv_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mv_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mvlgamma_mvlgamma_p_1_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mvlgamma_mvlgamma_p_1_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mvlgamma_mvlgamma_p_1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mvlgamma_mvlgamma_p_1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mvlgamma_mvlgamma_p_1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mvlgamma_mvlgamma_p_1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mvlgamma_mvlgamma_p_1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mvlgamma_mvlgamma_p_1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mvlgamma_mvlgamma_p_3_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mvlgamma_mvlgamma_p_3_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mvlgamma_mvlgamma_p_3_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mvlgamma_mvlgamma_p_3_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mvlgamma_mvlgamma_p_3_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mvlgamma_mvlgamma_p_3_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mvlgamma_mvlgamma_p_3_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mvlgamma_mvlgamma_p_3_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mvlgamma_mvlgamma_p_5_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mvlgamma_mvlgamma_p_5_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mvlgamma_mvlgamma_p_5_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mvlgamma_mvlgamma_p_5_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mvlgamma_mvlgamma_p_5_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mvlgamma_mvlgamma_p_5_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mvlgamma_mvlgamma_p_5_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mvlgamma_mvlgamma_p_5_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nan_to_num_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nan_to_num_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nan_to_num_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nan_to_num_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nan_to_num_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nan_to_num_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nan_to_num_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nan_to_num_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nan_to_num_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nan_to_num_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nanmean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nanmean_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nanmean_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nanmean_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nanmean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nanmean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nanmean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nanmedian_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nanmedian_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nanmedian_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nanmedian_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nanmedian_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nanmedian_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nanmedian_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nanmedian_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nanmedian_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nanquantile_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nanquantile_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nansum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nansum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nansum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nansum_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nansum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nansum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nansum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nansum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nansum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nansum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nansum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nansum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nansum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_narrow_copy_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_narrow_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_narrow_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_narrow_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_narrow_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_narrow_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_narrow_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_narrow_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_narrow_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_narrow_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_narrow_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_narrow_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_narrow_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_narrow_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_narrow_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_narrow_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_narrow_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_narrow_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_narrow_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_narrow_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_narrow_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_narrow_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_narrow_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_narrow_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_narrow_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_narrow_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_native_batch_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_native_batch_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_native_batch_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_native_batch_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_native_dropout_backward_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_native_dropout_backward_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_native_dropout_backward_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_native_dropout_backward_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_native_layer_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_native_layer_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_native_layer_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_native_layer_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ne_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ne_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ne_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ne_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ne_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ne_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ne_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ne_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ne_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ne_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ne_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ne_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_neg_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_neg_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_neg_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_neg_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_neg_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_neg_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_neg_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_neg_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_neg_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_neg_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_neg_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_neg_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_empty_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_empty_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_empty_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_empty_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_empty_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_empty_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_empty_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_empty_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_empty_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_empty_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_empty_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_empty_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_empty_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_empty_strided_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_empty_strided_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_empty_strided_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_empty_strided_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_empty_strided_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_empty_strided_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_empty_strided_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_empty_strided_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_empty_strided_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_empty_strided_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_empty_strided_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_empty_strided_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_empty_strided_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_full_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_full_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_full_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_full_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_full_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_full_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_full_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_full_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_full_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_full_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_full_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_full_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_full_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_ones_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_ones_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_ones_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_ones_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_ones_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_ones_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_ones_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_ones_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_ones_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_ones_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_ones_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_ones_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_ones_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_zeros_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_zeros_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_zeros_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_zeros_cuda_complex32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_zeros_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_zeros_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_zeros_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_zeros_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_zeros_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_zeros_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_zeros_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_zeros_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_zeros_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nextafter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nextafter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nextafter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nextafter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_adaptive_avg_pool1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_adaptive_avg_pool1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_adaptive_avg_pool1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_adaptive_avg_pool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_adaptive_avg_pool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_adaptive_avg_pool2d_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_adaptive_avg_pool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_adaptive_avg_pool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_adaptive_avg_pool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_adaptive_max_pool1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_adaptive_max_pool1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_adaptive_max_pool1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_adaptive_max_pool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_adaptive_max_pool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_adaptive_max_pool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_adaptive_max_pool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_adaptive_max_pool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_adaptive_max_pool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_alpha_dropout_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_alpha_dropout_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_alpha_dropout_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_alpha_dropout_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_avg_pool1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_avg_pool1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_avg_pool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_avg_pool1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_avg_pool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_avg_pool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_avg_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_avg_pool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_avg_pool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_avg_pool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_avg_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_avg_pool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_batch_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_batch_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_batch_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_batch_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_batch_norm_without_cudnn_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_batch_norm_without_cudnn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_batch_norm_without_cudnn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_batch_norm_without_cudnn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_bilinear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_bilinear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_bilinear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_bilinear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_binary_cross_entropy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_binary_cross_entropy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_binary_cross_entropy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_binary_cross_entropy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_binary_cross_entropy_with_logits_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_binary_cross_entropy_with_logits_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_binary_cross_entropy_with_logits_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_celu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_celu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_celu_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_celu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_channel_shuffle_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_channel_shuffle_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_channel_shuffle_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_channel_shuffle_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_channel_shuffle_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_channel_shuffle_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_channel_shuffle_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_channel_shuffle_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_channel_shuffle_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_channel_shuffle_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_channel_shuffle_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_channel_shuffle_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv1d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv1d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv1d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv1d_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv2d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv2d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv2d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv3d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv3d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv3d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv_transpose1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv_transpose1d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv_transpose1d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv_transpose1d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv_transpose1d_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv_transpose1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv_transpose1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv_transpose2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv_transpose2d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv_transpose2d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv_transpose2d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv_transpose2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv_transpose2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv_transpose2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv_transpose3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv_transpose3d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv_transpose3d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv_transpose3d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv_transpose3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv_transpose3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv_transpose3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_cosine_embedding_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_cosine_embedding_loss_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_cosine_embedding_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_cosine_embedding_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_cosine_embedding_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_cosine_embedding_loss_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_cosine_embedding_loss_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_cosine_embedding_loss_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_cosine_embedding_loss_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_cosine_embedding_loss_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_cosine_similarity_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_cosine_similarity_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_cosine_similarity_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_cosine_similarity_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_cross_entropy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_cross_entropy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_cross_entropy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_cross_entropy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_ctc_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_ctc_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_dropout2d_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_dropout2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_dropout2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_dropout2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_dropout3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_dropout3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_dropout3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_dropout3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_dropout_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_dropout_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_dropout_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_dropout_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_elu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_elu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_elu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_elu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_embedding_bag_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_embedding_bag_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_embedding_bag_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_embedding_bag_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_embedding_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_embedding_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_embedding_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_embedding_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_feature_alpha_dropout_with_train_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_feature_alpha_dropout_with_train_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_feature_alpha_dropout_with_train_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_fractional_max_pool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_fractional_max_pool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_fractional_max_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_fractional_max_pool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_fractional_max_pool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_fractional_max_pool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_fractional_max_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_fractional_max_pool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_gaussian_nll_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_gaussian_nll_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_gaussian_nll_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_gaussian_nll_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_gelu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_gelu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_gelu_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_gelu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_glu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_glu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_glu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_glu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_grid_sample_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_grid_sample_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_grid_sample_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_grid_sample_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_group_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_group_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_group_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_group_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_hardshrink_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_hardshrink_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_hardshrink_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_hardshrink_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_hardsigmoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_hardsigmoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_hardsigmoid_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_hardsigmoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_hardswish_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_hardswish_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_hardswish_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_hardswish_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_hardtanh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_hardtanh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_hardtanh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_hardtanh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_hardtanh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_hardtanh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_hardtanh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_hardtanh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_hinge_embedding_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_hinge_embedding_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_hinge_embedding_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_hinge_embedding_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_huber_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_huber_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_huber_loss_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_huber_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_instance_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_instance_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_instance_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_instance_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_interpolate_area_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_interpolate_area_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_interpolate_area_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_interpolate_area_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_interpolate_bicubic_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_interpolate_bicubic_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_interpolate_bicubic_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_interpolate_bicubic_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_interpolate_bilinear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_interpolate_bilinear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_interpolate_bilinear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_interpolate_bilinear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_interpolate_linear_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_interpolate_linear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_interpolate_linear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_interpolate_linear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_interpolate_nearest-exact_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_interpolate_nearest-exact_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_interpolate_nearest-exact_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_interpolate_nearest-exact_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_interpolate_nearest_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_interpolate_nearest_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_interpolate_nearest_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_interpolate_nearest_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_interpolate_nearest_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_interpolate_trilinear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_interpolate_trilinear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_interpolate_trilinear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_interpolate_trilinear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_kl_div_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_kl_div_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_kl_div_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_kl_div_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_l1_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_l1_loss_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_l1_loss_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_l1_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_l1_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_l1_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_layer_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_layer_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_layer_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_layer_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_leaky_relu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_leaky_relu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_leaky_relu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_leaky_relu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_linear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_linear_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_linear_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_linear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_linear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_linear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_local_response_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_local_response_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_local_response_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_local_response_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_logsigmoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_logsigmoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_logsigmoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_logsigmoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_margin_ranking_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_margin_ranking_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_margin_ranking_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_margin_ranking_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_margin_ranking_loss_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_margin_ranking_loss_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_margin_ranking_loss_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_margin_ranking_loss_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_margin_ranking_loss_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_pool1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_pool1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_pool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_pool1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_pool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_pool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_pool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_pool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_pool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_pool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_unpool1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_unpool1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_unpool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_unpool1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_unpool1d_grad_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_unpool1d_grad_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_unpool1d_grad_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_unpool1d_grad_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_unpool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_unpool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_unpool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_unpool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_unpool2d_grad_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_unpool2d_grad_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_unpool2d_grad_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_unpool2d_grad_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_unpool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_unpool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_unpool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_unpool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_unpool3d_grad_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_unpool3d_grad_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_unpool3d_grad_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_unpool3d_grad_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_mish_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_mish_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_mish_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_mish_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_mse_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_mse_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_mse_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_mse_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_multi_head_attention_forward_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_multi_head_attention_forward_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_multi_head_attention_forward_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_multi_head_attention_forward_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_multi_margin_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_multi_margin_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_multi_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_multi_margin_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_multilabel_margin_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_multilabel_margin_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_multilabel_margin_loss_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_multilabel_margin_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_multilabel_soft_margin_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_multilabel_soft_margin_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_multilabel_soft_margin_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_nll_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_nll_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_nll_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_nll_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_normalize_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_normalize_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_normalize_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_normalize_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_normalize_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_normalize_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_one_hot_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_circular_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_circular_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_circular_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_circular_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_circular_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_circular_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_circular_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_circular_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_circular_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_circular_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_circular_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_circular_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_constant_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_constant_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_constant_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_constant_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_constant_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_constant_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_constant_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_constant_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_constant_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_constant_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_constant_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_constant_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_reflect_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_reflect_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_reflect_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_reflect_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_reflect_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_reflect_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_reflect_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_reflect_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_reflect_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_reflect_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_reflect_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_replicate_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_replicate_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_replicate_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_replicate_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_replicate_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_replicate_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_replicate_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_replicate_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_replicate_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_replicate_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_replicate_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_replicate_negative_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_replicate_negative_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_replicate_negative_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_replicate_negative_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_replicate_negative_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_replicate_negative_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_replicate_negative_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_replicate_negative_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_replicate_negative_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_replicate_negative_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_replicate_negative_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pairwise_distance_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pairwise_distance_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pairwise_distance_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pairwise_distance_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pairwise_distance_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pairwise_distance_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pairwise_distance_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pairwise_distance_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pairwise_distance_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pairwise_distance_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pairwise_distance_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pdist_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pdist_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pixel_shuffle_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pixel_shuffle_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pixel_shuffle_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pixel_shuffle_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pixel_shuffle_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pixel_shuffle_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pixel_shuffle_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pixel_shuffle_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pixel_shuffle_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pixel_shuffle_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pixel_shuffle_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pixel_shuffle_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pixel_unshuffle_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pixel_unshuffle_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pixel_unshuffle_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pixel_unshuffle_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pixel_unshuffle_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pixel_unshuffle_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pixel_unshuffle_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pixel_unshuffle_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pixel_unshuffle_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pixel_unshuffle_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pixel_unshuffle_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pixel_unshuffle_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_poisson_nll_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_poisson_nll_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_poisson_nll_loss_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_poisson_nll_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_poisson_nll_loss_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_poisson_nll_loss_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_poisson_nll_loss_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_poisson_nll_loss_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_poisson_nll_loss_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_prelu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_prelu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_prelu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_prelu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_relu6_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_relu6_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_relu6_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_relu6_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_relu6_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_relu6_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_relu6_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_relu6_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_relu6_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_relu_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_relu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_relu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_relu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_relu_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_relu_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_relu_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_relu_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_relu_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_rms_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_rms_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_rms_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_rms_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_rms_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_rms_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_rrelu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_rrelu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_rrelu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_rrelu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_scaled_dot_product_attention_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_scaled_dot_product_attention_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_scaled_dot_product_attention_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_selu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_selu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_selu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_selu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_silu_complex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_silu_complex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_silu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_silu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_silu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_silu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_smooth_l1_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_smooth_l1_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_smooth_l1_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_smooth_l1_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_soft_margin_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_soft_margin_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_soft_margin_loss_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_soft_margin_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softmin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softmin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softmin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softmin_with_dtype_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softmin_with_dtype_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softmin_with_dtype_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softmin_with_dtype_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softmin_with_dtype_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softmin_with_dtype_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softmin_with_dtype_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softmin_with_dtype_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softmin_with_dtype_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softmin_with_dtype_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softmin_with_dtype_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softplus_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softplus_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softplus_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softplus_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softshrink_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softshrink_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softshrink_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softshrink_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softsign_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softsign_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softsign_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softsign_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softsign_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softsign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softsign_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softsign_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softsign_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softsign_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softsign_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softsign_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_tanhshrink_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_tanhshrink_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_tanhshrink_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_tanhshrink_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_tanhshrink_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_tanhshrink_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_tanhshrink_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_tanhshrink_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_tanhshrink_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_tanhshrink_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_tanhshrink_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_threshold_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_threshold_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_threshold_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_threshold_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_threshold_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_threshold_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_threshold_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_threshold_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_threshold_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_triplet_margin_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_triplet_margin_loss_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_triplet_margin_loss_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_triplet_margin_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_triplet_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_triplet_margin_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_triplet_margin_loss_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_triplet_margin_loss_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_triplet_margin_loss_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_triplet_margin_loss_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_triplet_margin_loss_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_unfold_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_unfold_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_unfold_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_unfold_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_unfold_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_unfold_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_unfold_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_upsample_bilinear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_upsample_bilinear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_upsample_bilinear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_upsample_bilinear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_upsample_nearest_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_upsample_nearest_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_upsample_nearest_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_upsample_nearest_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_upsample_nearest_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nonzero_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nonzero_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nonzero_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nonzero_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nonzero_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nonzero_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nonzero_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nonzero_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nonzero_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nonzero_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nonzero_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nonzero_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nonzero_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nonzero_static_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nonzero_static_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nonzero_static_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nonzero_static_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nonzero_static_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nonzero_static_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nonzero_static_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nonzero_static_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nonzero_static_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nonzero_static_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nonzero_static_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nonzero_static_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nonzero_static_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_norm_fro_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_norm_fro_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_norm_fro_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_norm_fro_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_norm_fro_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_norm_fro_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_norm_inf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_norm_inf_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_norm_inf_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_norm_inf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_norm_inf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_norm_inf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_norm_nuc_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_norm_nuc_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_norm_nuc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_norm_nuc_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_normal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_normal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_normal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_normal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_normal_in_place_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_normal_in_place_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_normal_in_place_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_normal_in_place_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_normal_in_place_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_normal_in_place_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_normal_number_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_normal_number_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_normal_number_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_normal_number_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ones_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ones_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ones_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ones_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ones_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ones_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ones_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ones_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ones_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ones_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ones_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ones_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ones_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ones_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ones_like_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ones_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ones_like_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ones_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ones_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ones_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ones_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ones_like_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ones_like_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ones_like_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ones_like_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ones_like_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ormqr_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ormqr_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ormqr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ormqr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_outer_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_outer_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_outer_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_outer_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_outer_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_outer_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_outer_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_outer_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_outer_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_outer_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_outer_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_outer_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_pca_lowrank_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_pca_lowrank_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_pca_lowrank_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_pca_lowrank_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_permute_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_permute_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_permute_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_permute_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_permute_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_permute_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_permute_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_permute_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_permute_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_permute_copy_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_permute_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_permute_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_permute_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_permute_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_permute_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_permute_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_permute_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_permute_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_permute_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_permute_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_permute_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_permute_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_permute_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_permute_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_permute_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_permute_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_pinverse_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_pinverse_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_pinverse_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_pinverse_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polar_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polar_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_0_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_0_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_1_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_1_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_2_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_3_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_3_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_3_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_3_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_3_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_3_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_3_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_3_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_3_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_3_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_4_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_4_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_4_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_4_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_4_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_4_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_4_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_4_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_4_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_4_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_positive_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_positive_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_positive_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_positive_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_positive_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_positive_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_positive_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_positive_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_positive_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_positive_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_positive_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_positive_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_pow_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_pow_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_pow_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_pow_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_pow_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_pow_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_pow_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_pow_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_pow_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_pow_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_pow_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_pow_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_prod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_prod_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_prod_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_prod_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_prod_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_prod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_prod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_prod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_prod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_prod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_prod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_prod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_put_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_put_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_put_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_put_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_put_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_put_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_put_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_put_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_put_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_put_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_put_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_put_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_qr_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_qr_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_qr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_qr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_quantile_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_quantile_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rad2deg_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rad2deg_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rad2deg_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rad2deg_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rad2deg_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rad2deg_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rad2deg_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rad2deg_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rad2deg_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rad2deg_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rand_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rand_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rand_like_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rand_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rand_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rand_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rand_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_randint_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_randint_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_randint_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_randint_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_randint_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_randint_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_randint_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_randint_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_randint_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_randint_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_randint_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_randint_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_randint_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_randint_like_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_randint_like_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_randint_like_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_randint_like_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_randint_like_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_randn_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_randn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_randn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_randn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_randn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_randn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_randn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_randn_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_randn_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_randn_like_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_randn_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_randn_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_randn_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_randn_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ravel_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ravel_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ravel_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ravel_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ravel_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ravel_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ravel_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ravel_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ravel_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ravel_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ravel_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ravel_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ravel_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_real_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_real_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_real_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_real_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_real_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_real_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_real_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_real_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_real_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_real_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_real_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_real_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_real_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reciprocal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reciprocal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reciprocal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reciprocal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reciprocal_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reciprocal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reciprocal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reciprocal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reciprocal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reciprocal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reciprocal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reciprocal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_remainder_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_remainder_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_remainder_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_remainder_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_remainder_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_remainder_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_remainder_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_remainder_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_remainder_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_renorm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_renorm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_renorm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_renorm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_renorm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_renorm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_repeat_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_repeat_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_repeat_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_repeat_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_repeat_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_repeat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_repeat_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_repeat_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_repeat_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_repeat_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_repeat_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_repeat_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_repeat_interleave_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_repeat_interleave_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_repeat_interleave_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_repeat_interleave_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_repeat_interleave_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_repeat_interleave_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_repeat_interleave_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_repeat_interleave_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_repeat_interleave_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_repeat_interleave_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_repeat_interleave_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_repeat_interleave_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_repeat_interleave_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reshape_as_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reshape_as_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reshape_as_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reshape_as_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reshape_as_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reshape_as_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reshape_as_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reshape_as_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reshape_as_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reshape_as_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reshape_as_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reshape_as_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reshape_as_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reshape_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reshape_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reshape_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reshape_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reshape_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reshape_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reshape_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reshape_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reshape_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reshape_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reshape_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reshape_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reshape_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resize__cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resize__cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resize__cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resize__cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resize__cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resize__cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resize__cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resize__cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resize__cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resize__cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resize__cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resize__cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resize_as__cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resize_as__cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resize_as__cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resize_as__cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resize_as__cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resize_as__cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resize_as__cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resize_as__cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resize_as__cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resize_as__cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resize_as__cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resize_as__cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resolve_conj_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resolve_conj_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resolve_conj_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resolve_conj_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resolve_conj_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resolve_conj_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resolve_conj_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resolve_conj_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resolve_conj_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resolve_conj_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resolve_conj_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resolve_conj_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resolve_neg_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resolve_neg_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resolve_neg_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resolve_neg_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resolve_neg_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resolve_neg_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resolve_neg_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resolve_neg_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resolve_neg_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resolve_neg_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resolve_neg_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resolve_neg_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resolve_neg_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_roll_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_roll_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_roll_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_roll_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_roll_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_roll_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_roll_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_roll_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_roll_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_roll_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_roll_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_roll_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_roll_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rot90_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rot90_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rot90_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rot90_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rot90_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rot90_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rot90_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rot90_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rot90_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rot90_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rot90_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rot90_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_round_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_round_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_round_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_round_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_round_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_round_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_round_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_round_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_round_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_round_decimals_0_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_round_decimals_0_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_round_decimals_0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_round_decimals_0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_round_decimals_3_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_round_decimals_3_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_round_decimals_3_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_round_decimals_3_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_round_decimals_neg_3_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_round_decimals_neg_3_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_round_decimals_neg_3_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_round_decimals_neg_3_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rsqrt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rsqrt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rsqrt_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rsqrt_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rsqrt_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rsqrt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rsqrt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rsqrt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rsqrt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rsqrt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rsqrt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rsqrt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rsqrt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rsub_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rsub_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rsub_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rsub_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rsub_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rsub_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rsub_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rsub_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rsub_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rsub_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rsub_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scalar_tensor_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scalar_tensor_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scalar_tensor_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scalar_tensor_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scalar_tensor_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scalar_tensor_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scalar_tensor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scalar_tensor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scalar_tensor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scalar_tensor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scalar_tensor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scalar_tensor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scalar_tensor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_add_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_add_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_add_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_add_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_add_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_add_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_add_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_add_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_add_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_add_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_add_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_amax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_amax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_amax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_amax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_amax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_amax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_amax_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_amax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_amax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_amin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_amin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_amin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_amin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_amin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_amin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_amin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_amin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_amin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_mean_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_mean_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_mean_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_mean_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_mean_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_prod_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_prod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_prod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_prod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_prod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_prod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_prod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_prod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_sum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_sum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_sum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_sum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_sum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_sum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_sum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_sum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_sum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_sum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_searchsorted_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_searchsorted_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_searchsorted_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_searchsorted_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_searchsorted_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_searchsorted_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_searchsorted_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_searchsorted_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_searchsorted_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_select_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_select_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_select_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_select_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_select_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_select_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_select_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_select_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_select_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_select_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_select_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_select_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_select_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_select_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_select_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_select_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_select_scatter_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_select_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_select_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_select_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_select_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_select_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_select_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sgn_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sgn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sgn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sgn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sgn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sgn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sgn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sgn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sgn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sgn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sgn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sgn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sgn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_short_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_short_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_short_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_short_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_short_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_short_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_short_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_short_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_short_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_short_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_short_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_short_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sigmoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sigmoid_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sigmoid_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sigmoid_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sigmoid_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sigmoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sigmoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sigmoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sigmoid_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sigmoid_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sigmoid_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sigmoid_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sigmoid_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sign_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sign_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sign_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sign_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sign_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sign_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sign_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sign_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sign_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sign_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_signal_windows_bartlett_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_signal_windows_bartlett_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_signal_windows_blackman_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_signal_windows_blackman_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_signal_windows_cosine_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_signal_windows_cosine_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_signal_windows_exponential_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_signal_windows_exponential_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_signal_windows_gaussian_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_signal_windows_gaussian_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_signal_windows_general_cosine_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_signal_windows_general_cosine_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_signal_windows_general_hamming_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_signal_windows_general_hamming_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_signal_windows_hamming_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_signal_windows_hamming_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_signal_windows_hann_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_signal_windows_hann_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_signal_windows_kaiser_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_signal_windows_kaiser_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_signal_windows_nuttall_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_signal_windows_nuttall_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_signbit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_signbit_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_signbit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_signbit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_signbit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_signbit_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_signbit_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_signbit_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_signbit_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_signbit_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sin_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sin_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sin_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sin_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sinc_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sinc_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sinc_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sinc_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sinc_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sinc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sinc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sinc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sinc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sinc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sinc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sinc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sinh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sinh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sinh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sinh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sinh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sinh_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sinh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sinh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sinh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sinh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sinh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sinh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sinh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_slice_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_slice_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_slice_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_slice_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_slice_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_slice_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_slice_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_slice_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_slice_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_slice_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_slice_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_slice_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_slice_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_slice_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_slice_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_slice_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_slice_scatter_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_slice_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_slice_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_slice_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_slice_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_slice_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_slice_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_softmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_softmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_softmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_softmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_softmax_with_dtype_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_softmax_with_dtype_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_softmax_with_dtype_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_softmax_with_dtype_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_softmax_with_dtype_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_softmax_with_dtype_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_softmax_with_dtype_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_softmax_with_dtype_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_softmax_with_dtype_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_softmax_with_dtype_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_softmax_with_dtype_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_softmax_with_dtype_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sort_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sort_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sort_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sort_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sort_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sort_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sort_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sort_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sort_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sort_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sparse_mm_reduce_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sparse_mm_reduce_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sparse_mm_reduce_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sparse_mm_reduce_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sparse_sampled_addmm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sparse_sampled_addmm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sparse_sampled_addmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sparse_sampled_addmm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_airy_ai_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_airy_ai_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_airy_ai_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_airy_ai_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_airy_ai_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_airy_ai_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_airy_ai_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_airy_ai_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_bessel_j0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_bessel_j0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_bessel_j0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_bessel_j0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_bessel_j0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_bessel_j0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_bessel_j0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_bessel_j0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_bessel_j1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_bessel_j1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_bessel_j1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_bessel_j1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_bessel_j1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_bessel_j1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_bessel_j1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_bessel_j1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_bessel_y0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_bessel_y0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_bessel_y0_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_bessel_y0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_bessel_y0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_bessel_y0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_bessel_y0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_bessel_y0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_bessel_y1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_bessel_y1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_bessel_y1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_bessel_y1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_bessel_y1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_bessel_y1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_bessel_y1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_bessel_y1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_chebyshev_polynomial_t_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_chebyshev_polynomial_t_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_chebyshev_polynomial_t_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_chebyshev_polynomial_t_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_chebyshev_polynomial_t_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_chebyshev_polynomial_t_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_chebyshev_polynomial_t_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_chebyshev_polynomial_t_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_chebyshev_polynomial_u_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_chebyshev_polynomial_u_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_chebyshev_polynomial_u_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_chebyshev_polynomial_u_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_chebyshev_polynomial_u_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_chebyshev_polynomial_u_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_chebyshev_polynomial_u_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_chebyshev_polynomial_u_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_chebyshev_polynomial_v_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_chebyshev_polynomial_v_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_chebyshev_polynomial_v_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_chebyshev_polynomial_v_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_chebyshev_polynomial_v_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_chebyshev_polynomial_v_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_chebyshev_polynomial_v_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_chebyshev_polynomial_v_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_chebyshev_polynomial_w_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_chebyshev_polynomial_w_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_chebyshev_polynomial_w_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_chebyshev_polynomial_w_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_chebyshev_polynomial_w_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_chebyshev_polynomial_w_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_chebyshev_polynomial_w_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_chebyshev_polynomial_w_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_entr_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_entr_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_entr_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_entr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_entr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_entr_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_entr_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_entr_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_entr_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_entr_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_erfcx_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_erfcx_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_erfcx_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_erfcx_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_erfcx_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_erfcx_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_erfcx_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_erfcx_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_hermite_polynomial_h_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_hermite_polynomial_h_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_hermite_polynomial_h_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_hermite_polynomial_h_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_hermite_polynomial_h_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_hermite_polynomial_h_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_hermite_polynomial_h_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_hermite_polynomial_h_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_hermite_polynomial_he_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_hermite_polynomial_he_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_hermite_polynomial_he_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_hermite_polynomial_he_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_hermite_polynomial_he_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_hermite_polynomial_he_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_hermite_polynomial_he_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_hermite_polynomial_he_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_i0e_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_i0e_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_i0e_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_i0e_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_i0e_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_i0e_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_i0e_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_i0e_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_i0e_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_i0e_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_i1_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_i1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_i1_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_i1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_i1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_i1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_i1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_i1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_i1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_i1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_i1e_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_i1e_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_i1e_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_i1e_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_i1e_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_i1e_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_i1e_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_i1e_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_i1e_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_i1e_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_laguerre_polynomial_l_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_laguerre_polynomial_l_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_laguerre_polynomial_l_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_laguerre_polynomial_l_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_laguerre_polynomial_l_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_laguerre_polynomial_l_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_laguerre_polynomial_l_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_laguerre_polynomial_l_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_legendre_polynomial_p_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_legendre_polynomial_p_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_legendre_polynomial_p_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_legendre_polynomial_p_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_legendre_polynomial_p_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_legendre_polynomial_p_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_legendre_polynomial_p_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_legendre_polynomial_p_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_log_ndtr_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_log_ndtr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_log_ndtr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_log_ndtr_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_log_ndtr_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_log_ndtr_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_log_ndtr_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_log_ndtr_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_modified_bessel_i0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_modified_bessel_i0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_modified_bessel_i0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_modified_bessel_i0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_modified_bessel_i0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_modified_bessel_i0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_modified_bessel_i0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_modified_bessel_i0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_modified_bessel_i1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_modified_bessel_i1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_modified_bessel_i1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_modified_bessel_i1_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_modified_bessel_i1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_modified_bessel_i1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_modified_bessel_i1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_modified_bessel_i1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_modified_bessel_k0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_modified_bessel_k0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_modified_bessel_k0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_modified_bessel_k0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_modified_bessel_k0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_modified_bessel_k0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_modified_bessel_k0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_modified_bessel_k0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_modified_bessel_k1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_modified_bessel_k1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_modified_bessel_k1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_modified_bessel_k1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_modified_bessel_k1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_modified_bessel_k1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_modified_bessel_k1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_modified_bessel_k1_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_ndtr_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_ndtr_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_ndtr_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_ndtr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_ndtr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_ndtr_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_ndtr_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_ndtr_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_ndtr_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_ndtr_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_ndtri_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_ndtri_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_ndtri_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_ndtri_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_ndtri_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_ndtri_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_ndtri_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_ndtri_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_polygamma_special_polygamma_n_0_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_polygamma_special_polygamma_n_0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_polygamma_special_polygamma_n_0_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_polygamma_special_polygamma_n_0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_polygamma_special_polygamma_n_0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_polygamma_special_polygamma_n_0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_polygamma_special_polygamma_n_0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_polygamma_special_polygamma_n_0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_polygamma_special_polygamma_n_0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_scaled_modified_bessel_k0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_scaled_modified_bessel_k0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_scaled_modified_bessel_k0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_scaled_modified_bessel_k0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_scaled_modified_bessel_k0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_scaled_modified_bessel_k0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_scaled_modified_bessel_k0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_scaled_modified_bessel_k0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_scaled_modified_bessel_k1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_scaled_modified_bessel_k1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_scaled_modified_bessel_k1_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_scaled_modified_bessel_k1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_scaled_modified_bessel_k1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_scaled_modified_bessel_k1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_scaled_modified_bessel_k1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_scaled_modified_bessel_k1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_shifted_chebyshev_polynomial_t_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_shifted_chebyshev_polynomial_t_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_shifted_chebyshev_polynomial_t_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_shifted_chebyshev_polynomial_t_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_shifted_chebyshev_polynomial_t_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_shifted_chebyshev_polynomial_t_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_shifted_chebyshev_polynomial_t_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_shifted_chebyshev_polynomial_u_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_shifted_chebyshev_polynomial_u_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_shifted_chebyshev_polynomial_u_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_shifted_chebyshev_polynomial_u_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_shifted_chebyshev_polynomial_u_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_shifted_chebyshev_polynomial_u_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_shifted_chebyshev_polynomial_u_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_shifted_chebyshev_polynomial_v_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_shifted_chebyshev_polynomial_v_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_shifted_chebyshev_polynomial_v_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_shifted_chebyshev_polynomial_v_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_shifted_chebyshev_polynomial_v_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_shifted_chebyshev_polynomial_v_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_shifted_chebyshev_polynomial_v_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_shifted_chebyshev_polynomial_w_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_shifted_chebyshev_polynomial_w_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_shifted_chebyshev_polynomial_w_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_shifted_chebyshev_polynomial_w_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_shifted_chebyshev_polynomial_w_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_shifted_chebyshev_polynomial_w_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_shifted_chebyshev_polynomial_w_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_spherical_bessel_j0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_spherical_bessel_j0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_spherical_bessel_j0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_spherical_bessel_j0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_spherical_bessel_j0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_spherical_bessel_j0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_spherical_bessel_j0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_spherical_bessel_j0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_xlog1py_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_xlog1py_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_xlog1py_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_xlog1py_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_xlog1py_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_xlog1py_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_xlog1py_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_xlog1py_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_xlog1py_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_xlog1py_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_zeta_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_zeta_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_zeta_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_zeta_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_zeta_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_zeta_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_zeta_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_zeta_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_list_args_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_list_args_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_list_args_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_list_args_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_list_args_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_list_args_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_list_args_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_list_args_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_list_args_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_list_args_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_list_args_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_list_args_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_with_sizes_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_with_sizes_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_with_sizes_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_with_sizes_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_with_sizes_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_with_sizes_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_with_sizes_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_with_sizes_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_with_sizes_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_with_sizes_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_with_sizes_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_with_sizes_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_with_sizes_copy_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_with_sizes_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_with_sizes_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_with_sizes_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_with_sizes_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_with_sizes_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_with_sizes_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_with_sizes_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_with_sizes_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_with_sizes_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_with_sizes_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_with_sizes_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_with_sizes_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_with_sizes_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sqrt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sqrt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sqrt_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sqrt_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sqrt_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sqrt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sqrt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sqrt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sqrt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sqrt_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sqrt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sqrt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sqrt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_square_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_square_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_square_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_square_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_square_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_square_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_square_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_square_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_square_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_square_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_square_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_square_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_copy_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_multiple_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_multiple_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_multiple_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_multiple_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_multiple_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_multiple_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_multiple_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_multiple_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_multiple_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_multiple_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_multiple_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_multiple_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_multiple_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_stack_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_stack_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_stack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_stack_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_stack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_stack_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_stack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_stack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_stack_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_stack_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_stack_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_stack_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_stack_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_std_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_std_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_std_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_std_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_std_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_std_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_std_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_std_mean_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_std_mean_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_std_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_std_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_std_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_std_mean_unbiased_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_std_mean_unbiased_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_std_mean_unbiased_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_std_mean_unbiased_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_std_mean_unbiased_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_std_mean_unbiased_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_std_unbiased_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_std_unbiased_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_std_unbiased_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_std_unbiased_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_std_unbiased_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_std_unbiased_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_stft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_stft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_stft_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_stft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sub_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sub_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sub_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sub_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sub_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sub_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sub_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sub_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sub_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sub_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sub_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sub_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sum_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sum_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sum_to_size_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sum_to_size_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sum_to_size_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sum_to_size_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sum_to_size_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sum_to_size_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sum_to_size_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sum_to_size_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sum_to_size_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sum_to_size_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sum_to_size_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sum_to_size_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_svd_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_svd_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_svd_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_svd_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_svd_lowrank_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_svd_lowrank_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_svd_lowrank_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_svd_lowrank_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_t_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_t_copy_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_t_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_t_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_t_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_t_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_t_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_t_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_t_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_t_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_t_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_t_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_t_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_t_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_t_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_t_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_t_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_t_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_t_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_t_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_t_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_t_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_t_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_t_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_take_along_dim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_take_along_dim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_take_along_dim_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_take_along_dim_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_take_along_dim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_take_along_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_take_along_dim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_take_along_dim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_take_along_dim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_take_along_dim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_take_along_dim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_take_along_dim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_take_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_take_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_take_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_take_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_take_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_take_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_take_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_take_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_take_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_take_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_take_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_take_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tan_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tan_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tan_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tan_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tan_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tan_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tan_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tan_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tan_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tan_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tan_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tan_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tanh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tanh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tanh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tanh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tanh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tanh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tanh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tanh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tanh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tanh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tanh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tanh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tanh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tensor_split_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tensor_split_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tensor_split_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tensor_split_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tensor_split_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tensor_split_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tensor_split_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tensor_split_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tensor_split_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tensor_split_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tensor_split_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tensor_split_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tensordot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tensordot_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tensordot_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tensordot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tensordot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tensordot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tile_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tile_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tile_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tile_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tile_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tile_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tile_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tile_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tile_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tile_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tile_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tile_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_to_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_to_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_to_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_to_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_to_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_to_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_to_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_to_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_to_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_to_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_to_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_to_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_to_sparse_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_to_sparse_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_to_sparse_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_to_sparse_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_to_sparse_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_to_sparse_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_to_sparse_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_to_sparse_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_to_sparse_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_to_sparse_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_to_sparse_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_to_sparse_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_topk_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_topk_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_topk_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_topk_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_topk_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_topk_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_topk_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_topk_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_topk_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_torch__scaled_mm_cuda_float8_e4m3fn, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_torch_ops_aten__efficient_attention_forward_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_torch_ops_aten__efficient_attention_forward_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_torch_ops_aten__efficient_attention_forward_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_torch_ops_aten__flash_attention_forward_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_torch_ops_aten__flash_attention_forward_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_torch_ops_aten__safe_softmax_default_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_torch_ops_aten__safe_softmax_default_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_torch_ops_aten__safe_softmax_default_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_torch_ops_aten__safe_softmax_default_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_torch_ops_aten__safe_softmax_default_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_torch_ops_aten__safe_softmax_default_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_torch_ops_aten__safe_softmax_default_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_torch_ops_aten__safe_softmax_default_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_torch_ops_aten__safe_softmax_default_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trace_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trace_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trace_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trace_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trace_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trace_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trace_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trace_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trace_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trace_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trace_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trace_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trace_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_transpose_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_transpose_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_transpose_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_transpose_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_transpose_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_transpose_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_transpose_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_transpose_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_transpose_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_transpose_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_transpose_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_transpose_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_transpose_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_transpose_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_transpose_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_transpose_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_transpose_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_transpose_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_transpose_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_transpose_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_transpose_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_transpose_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_transpose_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_transpose_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_transpose_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_transpose_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trapezoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trapezoid_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trapezoid_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trapezoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trapezoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trapezoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trapezoid_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trapezoid_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trapezoid_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trapezoid_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trapezoid_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trapz_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trapz_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trapz_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trapz_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trapz_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trapz_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trapz_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trapz_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trapz_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trapz_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trapz_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_triangular_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_triangular_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_triangular_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_triangular_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tril_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tril_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tril_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tril_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tril_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tril_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tril_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tril_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tril_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tril_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tril_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tril_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tril_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tril_indices_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tril_indices_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_triu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_triu_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_triu_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_triu_cuda_complex32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_triu_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_triu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_triu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_triu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_triu_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_triu_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_triu_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_triu_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_triu_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_triu_indices_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_triu_indices_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_true_divide_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_true_divide_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_true_divide_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_true_divide_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_true_divide_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_true_divide_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_true_divide_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_true_divide_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_true_divide_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_true_divide_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_true_divide_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_true_divide_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_true_divide_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trunc_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trunc_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trunc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trunc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trunc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trunc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trunc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trunc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trunc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unbind_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unbind_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unbind_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unbind_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unbind_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unbind_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unbind_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unbind_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unbind_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unbind_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unbind_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unbind_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unbind_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unbind_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unbind_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unbind_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unbind_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unbind_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unbind_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unbind_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unbind_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unbind_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unbind_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unbind_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unbind_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unbind_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unflatten_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unflatten_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unflatten_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unflatten_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unflatten_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unflatten_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unflatten_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unflatten_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unflatten_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unflatten_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unflatten_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unflatten_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unflatten_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unfold_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unfold_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unfold_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unfold_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unfold_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unfold_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unfold_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unfold_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unfold_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unfold_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unfold_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unfold_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unfold_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unfold_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unfold_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unfold_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unfold_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unfold_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unfold_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unfold_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unfold_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unfold_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unfold_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unfold_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unfold_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unfold_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_uniform_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_uniform_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_uniform_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_uniform_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_uniform_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_uniform_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unique_consecutive_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unique_consecutive_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unique_consecutive_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unique_consecutive_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unique_consecutive_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unique_consecutive_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unique_consecutive_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unique_consecutive_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unique_consecutive_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unique_consecutive_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unique_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unique_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unique_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unique_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unique_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unique_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unique_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unique_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unique_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unique_cuda_uint16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unique_cuda_uint32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unique_cuda_uint64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unique_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unravel_index_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unravel_index_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unravel_index_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unravel_index_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unravel_index_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsafe_chunk_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsafe_chunk_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsafe_chunk_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsafe_chunk_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsafe_chunk_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsafe_chunk_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsafe_chunk_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsafe_chunk_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsafe_chunk_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsafe_chunk_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsafe_chunk_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsafe_chunk_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsafe_chunk_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsafe_split_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsafe_split_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsafe_split_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsafe_split_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsafe_split_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsafe_split_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsafe_split_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsafe_split_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsafe_split_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsafe_split_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsafe_split_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsafe_split_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsafe_split_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsqueeze_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsqueeze_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsqueeze_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsqueeze_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsqueeze_copy_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsqueeze_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsqueeze_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsqueeze_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsqueeze_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsqueeze_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsqueeze_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsqueeze_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsqueeze_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsqueeze_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsqueeze_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsqueeze_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsqueeze_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsqueeze_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsqueeze_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsqueeze_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsqueeze_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsqueeze_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsqueeze_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsqueeze_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsqueeze_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsqueeze_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_var_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_var_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_var_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_var_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_var_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_var_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_var_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_var_mean_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_var_mean_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_var_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_var_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_var_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_var_mean_unbiased_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_var_mean_unbiased_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_var_mean_unbiased_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_var_mean_unbiased_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_var_mean_unbiased_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_var_mean_unbiased_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_var_unbiased_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_var_unbiased_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_var_unbiased_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_var_unbiased_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_var_unbiased_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_var_unbiased_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_vdot_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_vdot_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_vdot_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_vdot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_vdot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_vdot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_as_complex_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_as_complex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_as_complex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_as_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_as_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_as_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_as_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_as_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_as_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_as_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_as_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_as_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_as_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_as_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_as_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_as_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_as_real_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_as_real_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_copy_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_vsplit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_vsplit_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_vsplit_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_vsplit_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_vsplit_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_vsplit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_vsplit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_vsplit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_vsplit_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_vsplit_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_vsplit_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_vsplit_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_vsplit_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_vstack_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_vstack_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_vstack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_vstack_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_vstack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_vstack_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_vstack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_vstack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_vstack_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_vstack_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_vstack_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_vstack_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_vstack_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_where_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_where_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_where_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_where_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_where_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_where_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_where_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_where_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_where_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_where_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_where_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_where_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_where_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_xlogy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_xlogy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_xlogy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_xlogy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_xlogy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_xlogy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_xlogy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_xlogy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_xlogy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_xlogy_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zero__cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zero__cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zero__cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zero__cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zero__cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zero__cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zero__cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zero__cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zero__cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zero__cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zero__cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zero__cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zeros_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zeros_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zeros_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zeros_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zeros_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zeros_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zeros_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zeros_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zeros_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zeros_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zeros_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zeros_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zeros_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zeros_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zeros_like_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zeros_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zeros_like_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zeros_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zeros_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zeros_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zeros_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zeros_like_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zeros_like_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zeros_like_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zeros_like_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zeros_like_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_H_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_H_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_H_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_H_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_H_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_H_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_H_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_H_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_H_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_H_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_H_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_H_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_H_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_T_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_T_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_T_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_T_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_T_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_T_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_T_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_T_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_T_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_T_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_T_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_T_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_T_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___getitem___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___getitem___cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___getitem___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___getitem___cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___getitem___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___getitem___cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___getitem___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___getitem___cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___getitem___cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___getitem___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___getitem___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___getitem___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___getitem___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___radd___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___radd___cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___radd___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___radd___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___radd___cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___radd___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___radd___cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___radd___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___radd___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___radd___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___radd___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___radd___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rand___cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rand___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rand___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rand___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rand___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rand___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rdiv___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rdiv___cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rdiv___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rdiv___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rdiv___cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rdiv___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rdiv___cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rdiv___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rdiv___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rdiv___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rdiv___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rdiv___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rmatmul___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rmatmul___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rmatmul___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rmatmul___cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rmatmul___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rmatmul___cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rmod___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rmod___cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rmod___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rmod___cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rmod___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rmod___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rmod___cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rmod___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rmod___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rmul___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rmul___cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rmul___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rmul___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rmul___cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rmul___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rmul___cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rmul___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rmul___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rmul___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rmul___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rmul___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___ror___cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___ror___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___ror___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___ror___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___ror___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___ror___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rpow___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rpow___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rpow___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rpow___cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rpow___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rpow___cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rpow___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rpow___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rpow___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rpow___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rpow___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rsub___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rsub___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rsub___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rsub___cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rsub___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rsub___cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rsub___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rsub___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rsub___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rsub___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rsub___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rxor___cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rxor___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rxor___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rxor___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rxor___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rxor___cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__batch_norm_with_update_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__batch_norm_with_update_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__batch_norm_with_update_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__batch_norm_with_update_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__chunk_cat_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__chunk_cat_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__chunk_cat_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__chunk_cat_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__chunk_cat_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__chunk_cat_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__chunk_cat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__chunk_cat_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__chunk_cat_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__chunk_cat_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__chunk_cat_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__chunk_cat_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__chunk_cat_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_abs_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_abs_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_abs_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_abs_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_abs_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_abs_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_abs_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_abs_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_abs_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_abs_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_abs_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_abs_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_acos_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_acos_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_acos_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_acos_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_acos_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_acos_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_acos_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_acos_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_acos_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_acos_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_acos_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_acos_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_add_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_add_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_add_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_add_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_add_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_add_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_add_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_add_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_add_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_add_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_add_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_addcdiv_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_addcdiv_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_addcdiv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_addcdiv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_addcdiv_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_addcdiv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_addcdiv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_addcdiv_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_addcdiv_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_addcdiv_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_addcdiv_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_addcdiv_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_addcmul_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_addcmul_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_addcmul_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_addcmul_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_addcmul_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_addcmul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_addcmul_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_addcmul_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_addcmul_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_addcmul_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_addcmul_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_addcmul_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_asin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_asin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_asin_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_asin_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_asin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_asin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_asin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_asin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_asin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_asin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_asin_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_asin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_atan_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_atan_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_atan_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_atan_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_atan_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_atan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_atan_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_atan_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_atan_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_atan_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_atan_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_atan_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_ceil_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_ceil_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_ceil_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_ceil_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_ceil_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_ceil_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_ceil_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_ceil_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_ceil_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_ceil_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_ceil_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_ceil_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_clamp_max_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_clamp_max_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_clamp_max_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_clamp_max_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_clamp_max_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_clamp_max_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_clamp_max_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_clamp_max_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_clamp_max_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_clamp_max_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_clamp_max_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_clamp_max_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_clamp_min_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_clamp_min_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_clamp_min_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_clamp_min_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_clamp_min_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_clamp_min_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_clamp_min_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_clamp_min_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_clamp_min_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_clamp_min_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_clamp_min_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_clamp_min_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_cos_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_cos_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_cos_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_cos_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_cos_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_cos_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_cos_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_cos_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_cos_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_cos_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_cos_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_cos_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_cosh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_cosh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_cosh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_cosh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_cosh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_cosh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_cosh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_cosh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_cosh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_cosh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_cosh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_cosh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_div_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_div_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_div_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_div_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_div_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_div_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_div_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_div_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_div_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_div_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_div_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_div_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_erf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_erf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_erf_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_erf_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_erf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_erf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_erf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_erf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_erf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_erf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_erf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_erf_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_erfc_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_erfc_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_erfc_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_erfc_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_erfc_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_erfc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_erfc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_erfc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_erfc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_erfc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_erfc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_erfc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_exp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_exp_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_exp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_exp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_exp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_exp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_exp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_exp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_exp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_exp_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_exp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_exp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_expm1_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_expm1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_expm1_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_expm1_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_expm1_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_expm1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_expm1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_expm1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_expm1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_expm1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_expm1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_expm1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_floor_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_floor_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_floor_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_floor_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_floor_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_floor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_floor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_floor_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_floor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_floor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_floor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_floor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_frac_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_frac_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_frac_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_frac_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_frac_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_frac_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_frac_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_frac_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_frac_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_frac_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_frac_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_frac_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_lerp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_lerp_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_lerp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_lerp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_lerp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_lerp_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_lerp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_lerp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_lerp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_lerp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_lerp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_lerp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_lgamma_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_lgamma_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_lgamma_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_lgamma_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_lgamma_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_lgamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_lgamma_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_lgamma_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_lgamma_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_lgamma_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_lgamma_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_lgamma_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log10_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log10_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log10_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log10_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log10_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log10_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log10_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log10_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log10_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log10_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log10_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log10_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log1p_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log1p_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log1p_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log1p_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log1p_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log1p_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log1p_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log1p_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log1p_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log1p_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log1p_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log1p_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log2_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_max_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_max_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_max_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_max_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_max_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_max_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_max_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_max_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_max_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_max_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_max_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_max_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_maximum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_maximum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_maximum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_maximum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_maximum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_maximum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_maximum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_maximum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_maximum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_maximum_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_maximum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_maximum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_minimum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_minimum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_minimum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_minimum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_minimum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_minimum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_minimum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_minimum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_minimum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_minimum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_minimum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_minimum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_mul_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_mul_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_mul_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_mul_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_mul_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_mul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_mul_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_mul_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_mul_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_mul_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_mul_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_mul_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_neg_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_neg_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_neg_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_neg_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_neg_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_neg_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_neg_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_neg_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_neg_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_neg_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_neg_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_neg_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_norm_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_norm_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_norm_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_norm_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_norm_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_norm_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_norm_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_pow_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_pow_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_pow_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_pow_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_pow_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_pow_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_pow_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_pow_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_pow_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_pow_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_pow_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_pow_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_reciprocal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_reciprocal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_reciprocal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_reciprocal_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_reciprocal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_reciprocal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_reciprocal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_reciprocal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_reciprocal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_reciprocal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_reciprocal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_reciprocal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_round_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_round_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_round_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_round_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_round_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_round_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_round_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_round_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_round_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_round_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_round_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_round_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_rsqrt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_rsqrt_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_rsqrt_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_rsqrt_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_rsqrt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_rsqrt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_rsqrt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_rsqrt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_rsqrt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_rsqrt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_rsqrt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_rsqrt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sigmoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sigmoid_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sigmoid_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sigmoid_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sigmoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sigmoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sigmoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sigmoid_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sigmoid_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sigmoid_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sigmoid_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sigmoid_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sign_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sign_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sign_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sign_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sign_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sign_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sign_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sign_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sign_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sign_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sign_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sin_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sin_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sin_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sinh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sinh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sinh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sinh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sinh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sinh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sinh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sinh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sinh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sinh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sinh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sinh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sqrt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sqrt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sqrt_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sqrt_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sqrt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sqrt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sqrt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sqrt_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sqrt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sqrt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sqrt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sqrt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sub_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sub_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sub_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sub_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sub_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sub_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sub_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sub_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sub_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sub_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sub_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sub_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_tan_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_tan_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_tan_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_tan_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_tan_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_tan_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_tan_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_tan_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_tan_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_tan_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_tan_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_tan_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_tanh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_tanh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_tanh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_tanh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_tanh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_tanh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_tanh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_tanh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_tanh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_tanh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_tanh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_tanh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_trunc_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_trunc_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_trunc_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_trunc_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_trunc_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_trunc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_trunc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_trunc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_trunc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_trunc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_trunc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_trunc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_zero_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_zero_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_zero_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_zero_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_zero_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_zero_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_zero_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_zero_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_zero_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_zero_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_zero_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_zero_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__native_batch_norm_legit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__native_batch_norm_legit_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__native_batch_norm_legit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__native_batch_norm_legit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__segment_reduce_lengths_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__segment_reduce_lengths_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__segment_reduce_lengths_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__segment_reduce_lengths_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__segment_reduce_offsets_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__segment_reduce_offsets_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__segment_reduce_offsets_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__segment_reduce_offsets_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__softmax_backward_data_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__softmax_backward_data_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__softmax_backward_data_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__softmax_backward_data_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__unsafe_masked_index_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__unsafe_masked_index_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__unsafe_masked_index_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__unsafe_masked_index_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__unsafe_masked_index_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__unsafe_masked_index_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__unsafe_masked_index_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__unsafe_masked_index_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__unsafe_masked_index_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__unsafe_masked_index_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__unsafe_masked_index_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__unsafe_masked_index_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__unsafe_masked_index_put_accumulate_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__unsafe_masked_index_put_accumulate_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__unsafe_masked_index_put_accumulate_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__unsafe_masked_index_put_accumulate_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__unsafe_masked_index_put_accumulate_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__unsafe_masked_index_put_accumulate_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__unsafe_masked_index_put_accumulate_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__unsafe_masked_index_put_accumulate_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__unsafe_masked_index_put_accumulate_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__unsafe_masked_index_put_accumulate_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__unsafe_masked_index_put_accumulate_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__unsafe_masked_index_put_accumulate_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__upsample_bilinear2d_aa_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__upsample_bilinear2d_aa_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__upsample_bilinear2d_aa_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__upsample_bilinear2d_aa_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_abs_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_abs_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_abs_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_abs_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_abs_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_abs_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_abs_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_abs_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_abs_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_abs_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_abs_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_abs_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_abs_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_acos_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_acos_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_acos_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_acos_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_acos_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_acos_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_acos_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_acos_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_acos_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_acos_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_acos_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_acos_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_acos_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_acosh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_acosh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_acosh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_acosh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_acosh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_acosh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_acosh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_acosh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_acosh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_acosh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_acosh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_acosh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_acosh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_add_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_add_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_add_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_add_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_add_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_add_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_add_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_add_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_add_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_add_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_add_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_add_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_add_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addbmm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addbmm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addbmm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addbmm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addbmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addbmm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addcdiv_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addcdiv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addcdiv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addcdiv_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addcdiv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addcdiv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addcmul_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addcmul_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addcmul_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addcmul_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addcmul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addcmul_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addcmul_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addcmul_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addcmul_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addcmul_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addcmul_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addmm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addmm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addmm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addmm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addmm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addmm_decomposed_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addmm_decomposed_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addmm_decomposed_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addmm_decomposed_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addmm_decomposed_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addmm_decomposed_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addmv_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addmv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addmv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addmv_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addmv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addmv_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addr_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addr_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addr_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addr_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addr_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addr_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addr_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addr_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addr_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addr_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_alias_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_alias_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_alias_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_alias_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_alias_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_alias_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_alias_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_alias_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_alias_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_alias_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_alias_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_alias_copy_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_alias_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_all_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_all_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_all_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_all_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_all_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_all_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_all_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_all_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_all_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_all_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_all_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_all_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_allclose_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_allclose_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_allclose_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_allclose_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_allclose_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_allclose_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_amax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_amax_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_amax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_amax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_amax_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_amax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_amax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_amax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_amax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_amax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_amin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_amin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_amin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_amin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_amin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_amin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_amin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_amin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_amin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_amin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_aminmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_aminmax_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_aminmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_aminmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_aminmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_aminmax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_aminmax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_aminmax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_aminmax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_aminmax_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_angle_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_angle_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_angle_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_angle_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_angle_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_angle_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_angle_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_angle_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_angle_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_angle_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_angle_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_any_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_any_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_any_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_any_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_any_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_any_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_any_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_any_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_any_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_any_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_any_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_any_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_arange_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_arange_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_arange_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_arange_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_arange_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_arange_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_arange_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_arange_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_arange_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argmax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argmax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argmax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argmax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argmax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argmin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argmin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argmin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argmin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argmin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argmin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argmin_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argmin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argsort_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argsort_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argsort_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argsort_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argsort_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argsort_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argsort_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argsort_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argsort_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argsort_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argwhere_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argwhere_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argwhere_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argwhere_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argwhere_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argwhere_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argwhere_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argwhere_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argwhere_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argwhere_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argwhere_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argwhere_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_copy_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_partial_views_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_partial_views_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_partial_views_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_partial_views_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_partial_views_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_partial_views_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_partial_views_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_partial_views_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_partial_views_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_partial_views_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_partial_views_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_partial_views_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_partial_views_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_scatter_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_scatter_cuda_complex32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_scatter_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_asin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_asin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_asin_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_asin_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_asin_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_asin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_asin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_asin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_asin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_asin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_asin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_asin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_asin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_asinh_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_asinh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_asinh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_asinh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_asinh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_asinh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_asinh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_asinh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_asinh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_asinh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_asinh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_asinh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_asinh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atan2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atan2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atan2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atan2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atan2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atan2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atan2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atan2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atan2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atan2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atan_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atan_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atan_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atan_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atan_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atan_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atan_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atan_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atan_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atan_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atan_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atan_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atanh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atanh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atanh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atanh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atanh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atanh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atanh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atanh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atanh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atanh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atanh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atanh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atanh_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_1d_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_1d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_1d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_1d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_1d_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_1d_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_1d_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_1d_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_1d_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_2d_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_2d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_2d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_2d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_2d_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_2d_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_2d_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_2d_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_2d_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_3d_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_3d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_3d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_3d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_3d_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_3d_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_3d_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_3d_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_3d_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_baddbmm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_baddbmm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_baddbmm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_baddbmm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_baddbmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_baddbmm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bernoulli_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bernoulli_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bernoulli_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bernoulli_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bfloat16_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bfloat16_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bfloat16_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bfloat16_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bfloat16_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bfloat16_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bfloat16_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bfloat16_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bfloat16_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bfloat16_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bfloat16_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bfloat16_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bfloat16_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bincount_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bincount_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bincount_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bincount_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bincount_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_and_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_and_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_and_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_and_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_and_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_and_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_left_shift_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_left_shift_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_left_shift_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_left_shift_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_left_shift_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_not_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_not_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_not_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_not_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_not_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_not_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_or_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_or_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_or_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_or_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_or_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_or_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_right_shift_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_right_shift_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_right_shift_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_right_shift_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_right_shift_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_xor_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_xor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_xor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_xor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_xor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_xor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_block_diag_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_block_diag_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_block_diag_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_block_diag_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_block_diag_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_block_diag_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_block_diag_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_block_diag_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_block_diag_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_block_diag_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_block_diag_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_block_diag_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_block_diag_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bmm_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bmm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bmm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bmm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bmm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bool_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bool_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bool_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bool_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bool_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bool_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bool_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bool_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bool_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bool_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bool_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bool_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bool_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_broadcast_shapes_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_broadcast_tensors_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_broadcast_tensors_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_broadcast_tensors_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_broadcast_tensors_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_broadcast_tensors_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_broadcast_tensors_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_broadcast_tensors_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_broadcast_tensors_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_broadcast_tensors_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_broadcast_tensors_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_broadcast_tensors_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_broadcast_tensors_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_broadcast_to_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_broadcast_to_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_broadcast_to_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_broadcast_to_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_broadcast_to_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_broadcast_to_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_broadcast_to_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_broadcast_to_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_broadcast_to_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_broadcast_to_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_broadcast_to_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_broadcast_to_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bucketize_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bucketize_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bucketize_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bucketize_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bucketize_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bucketize_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bucketize_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bucketize_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bucketize_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_byte_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_byte_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_byte_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_byte_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_byte_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_byte_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_byte_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_byte_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_byte_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_byte_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_byte_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_byte_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cartesian_prod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cartesian_prod_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cartesian_prod_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cartesian_prod_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cartesian_prod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cartesian_prod_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cartesian_prod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cartesian_prod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cartesian_prod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cartesian_prod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cartesian_prod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cartesian_prod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cat_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cat_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cat_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cat_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cat_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cat_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cat_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cat_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cat_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cat_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cat_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cat_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cauchy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cauchy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cauchy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cauchy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cdist_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cdist_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cdouble_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cdouble_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cdouble_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cdouble_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cdouble_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cdouble_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cdouble_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cdouble_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cdouble_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cdouble_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cdouble_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cdouble_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cdouble_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ceil_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ceil_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ceil_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ceil_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ceil_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ceil_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ceil_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ceil_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ceil_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cfloat_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cfloat_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cfloat_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cfloat_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cfloat_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cfloat_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cfloat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cfloat_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cfloat_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cfloat_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cfloat_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cfloat_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cfloat_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_chalf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_chalf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_chalf_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_chalf_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_chalf_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_chalf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_chalf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_chalf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_chalf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_chalf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_chalf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_chalf_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_chalf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_char_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_char_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_char_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_char_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_char_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_char_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_char_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_char_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_char_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_char_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_char_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_char_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_char_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cholesky_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cholesky_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cholesky_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cholesky_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cholesky_inverse_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cholesky_inverse_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cholesky_inverse_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cholesky_inverse_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cholesky_solve_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cholesky_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cholesky_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cholesky_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_chunk_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_chunk_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_chunk_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_chunk_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_chunk_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_chunk_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_chunk_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_chunk_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_chunk_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_chunk_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_chunk_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_chunk_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_chunk_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clamp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clamp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clamp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clamp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clamp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clamp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clamp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clamp_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clamp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clamp_max_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clamp_max_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clamp_max_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clamp_max_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clamp_max_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clamp_max_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clamp_max_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clamp_max_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clamp_max_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clamp_max_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clamp_min_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clamp_min_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clamp_min_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clamp_min_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clamp_min_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clamp_min_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clamp_min_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clamp_min_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clamp_min_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clamp_min_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clone_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clone_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clone_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clone_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clone_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clone_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clone_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clone_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clone_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clone_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clone_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clone_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clone_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_column_stack_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_column_stack_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_column_stack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_column_stack_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_column_stack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_column_stack_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_column_stack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_column_stack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_column_stack_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_column_stack_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_column_stack_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_column_stack_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_column_stack_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_combinations_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_combinations_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_combinations_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_combinations_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_combinations_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_combinations_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_combinations_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_combinations_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_combinations_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_combinations_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_combinations_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_combinations_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_complex_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_complex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_complex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_conj_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_conj_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_conj_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_conj_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_conj_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_conj_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_conj_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_conj_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_conj_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_conj_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_conj_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_conj_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_conj_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_conj_physical_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_conj_physical_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_conj_physical_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_conj_physical_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_conj_physical_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_conj_physical_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_conj_physical_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_conj_physical_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_conj_physical_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_conj_physical_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_conj_physical_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_conj_physical_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_conj_physical_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_constant_pad_nd_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_constant_pad_nd_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_constant_pad_nd_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_constant_pad_nd_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_constant_pad_nd_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_constant_pad_nd_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_constant_pad_nd_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_constant_pad_nd_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_constant_pad_nd_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_constant_pad_nd_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_constant_pad_nd_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_constant_pad_nd_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_contiguous_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_contiguous_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_contiguous_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_contiguous_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_contiguous_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_contiguous_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_contiguous_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_contiguous_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_contiguous_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_contiguous_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_contiguous_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_contiguous_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_contiguous_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_copysign_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_copysign_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_copysign_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_copysign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_copysign_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_copysign_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_copysign_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_copysign_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_copysign_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_copysign_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_corrcoef_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_corrcoef_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_corrcoef_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_corrcoef_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_corrcoef_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_corrcoef_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_corrcoef_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_corrcoef_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_corrcoef_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_corrcoef_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_corrcoef_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cos_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cos_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cos_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cos_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cos_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cos_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cos_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cos_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cos_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cos_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cos_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cos_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cos_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cosh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cosh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cosh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cosh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cosh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cosh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cosh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cosh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cosh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cosh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cosh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cosh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cosh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_count_nonzero_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_count_nonzero_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_count_nonzero_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_count_nonzero_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_count_nonzero_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_count_nonzero_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_count_nonzero_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_count_nonzero_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_count_nonzero_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_count_nonzero_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_count_nonzero_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_count_nonzero_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cov_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cov_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cov_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cov_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cov_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cov_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cov_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cov_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cov_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cov_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cov_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cross_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cross_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cross_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cross_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cross_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cross_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cross_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cross_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cross_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cross_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cross_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cummax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cummax_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cummax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cummax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cummax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cummax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cummax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cummax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cummax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cummax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cummin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cummin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cummin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cummin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cummin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cummin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cummin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cummin_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cummin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cummin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumprod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumprod_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumprod_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumprod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumprod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumprod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumprod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumprod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumprod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumprod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumprod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumsum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumsum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumsum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumsum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumsum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumsum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumsum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumsum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumsum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumsum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumsum_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumulative_trapezoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumulative_trapezoid_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumulative_trapezoid_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumulative_trapezoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumulative_trapezoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumulative_trapezoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumulative_trapezoid_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumulative_trapezoid_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumulative_trapezoid_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumulative_trapezoid_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumulative_trapezoid_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_deg2rad_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_deg2rad_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_deg2rad_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_deg2rad_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_deg2rad_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_deg2rad_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_deg2rad_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_deg2rad_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_deg2rad_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_deg2rad_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diag_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diag_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diag_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diag_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diag_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diag_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diag_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diag_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diag_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diag_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diag_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diag_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diag_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diag_embed_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diag_embed_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diag_embed_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diag_embed_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diag_embed_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diag_embed_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diag_embed_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diag_embed_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diag_embed_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diag_embed_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diag_embed_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diag_embed_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diag_embed_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagflat_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagflat_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagflat_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagflat_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagflat_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagflat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagflat_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagflat_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagflat_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagflat_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagflat_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagflat_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_copy_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_scatter_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_scatter_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_scatter_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diff_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diff_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diff_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diff_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diff_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diff_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diff_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diff_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diff_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diff_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diff_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diff_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_digamma_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_digamma_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_digamma_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_digamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_digamma_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_digamma_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_digamma_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_digamma_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_digamma_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_digamma_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dist_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dist_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dist_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dist_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dist_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dist_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_div_floor_rounding_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_div_floor_rounding_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_div_floor_rounding_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_div_floor_rounding_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_div_floor_rounding_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_div_floor_rounding_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_div_floor_rounding_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_div_floor_rounding_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_div_floor_rounding_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_div_no_rounding_mode_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_div_no_rounding_mode_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_div_no_rounding_mode_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_div_no_rounding_mode_cuda_complex32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_div_no_rounding_mode_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_div_no_rounding_mode_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_div_no_rounding_mode_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_div_no_rounding_mode_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_div_no_rounding_mode_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_div_no_rounding_mode_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_div_no_rounding_mode_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_div_no_rounding_mode_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_div_no_rounding_mode_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_div_trunc_rounding_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_div_trunc_rounding_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_div_trunc_rounding_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_div_trunc_rounding_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_div_trunc_rounding_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_div_trunc_rounding_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_div_trunc_rounding_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_div_trunc_rounding_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_div_trunc_rounding_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dot_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dot_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_double_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_double_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_double_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_double_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_double_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_double_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_double_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_double_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_double_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_double_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_double_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_double_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_double_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dsplit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dsplit_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dsplit_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dsplit_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dsplit_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dsplit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dsplit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dsplit_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dsplit_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dsplit_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dsplit_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dsplit_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dsplit_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dstack_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dstack_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dstack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dstack_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dstack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dstack_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dstack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dstack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dstack_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dstack_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dstack_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dstack_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dstack_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_einsum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_einsum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_einsum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_einsum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_einsum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_einsum_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_like_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_like_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_like_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_like_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_like_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_like_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_like_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_permuted_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_permuted_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_permuted_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_permuted_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_permuted_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_permuted_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_permuted_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_permuted_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_permuted_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_permuted_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_permuted_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_permuted_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_permuted_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_strided_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_strided_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_strided_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_strided_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_strided_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_strided_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_strided_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_strided_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_strided_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_strided_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_strided_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_strided_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_eq_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_eq_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_eq_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_eq_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_eq_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_eq_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_eq_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_eq_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_eq_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_eq_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_eq_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_eq_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_eq_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_equal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_equal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_equal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_equal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_equal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_equal_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_equal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_equal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_equal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_equal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_equal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_equal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_erf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_erf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_erf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_erf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_erf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_erf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_erf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_erf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_erf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_erf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_erfc_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_erfc_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_erfc_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_erfc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_erfc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_erfc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_erfc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_erfc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_erfc_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_erfc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_erfinv_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_erfinv_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_erfinv_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_erfinv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_erfinv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_erfinv_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_erfinv_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_erfinv_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_erfinv_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_erfinv_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_exp2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_exp2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_exp2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_exp2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_exp2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_exp2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_exp2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_exp2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_exp2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_exp2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_exp2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_exp2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_exp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_exp_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_exp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_exp_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_exp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_exp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_exp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_exp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_exp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_exp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_exp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_exp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_exp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_as_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_as_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_as_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_as_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_as_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_as_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_as_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_as_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_as_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_as_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_as_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_as_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_copy_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expm1_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expm1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expm1_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expm1_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expm1_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expm1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expm1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expm1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expm1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expm1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expm1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expm1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_exponential_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_exponential_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_exponential_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_exponential_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_eye_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_eye_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_eye_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_eye_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_eye_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_eye_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_eye_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_eye_cuda_float8_e4m3fn, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_eye_cuda_float8_e4m3fnuz, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_eye_cuda_float8_e5m2, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_eye_cuda_float8_e5m2fnuz, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_eye_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_eye_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_eye_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_eye_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_eye_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fft2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fft2_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fft2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fft_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fft_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fftn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fftn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fftn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fftn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fftshift_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fftshift_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fftshift_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fftshift_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fftshift_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fftshift_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fftshift_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fftshift_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fftshift_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fftshift_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fftshift_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fftshift_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fftshift_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfft2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfft2_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfft2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfft_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfft_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfftn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfftn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfftn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfftn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfftn_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifft2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifft2_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifft2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifft_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifft_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifftn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifftn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifftn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifftn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifftshift_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifftshift_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifftshift_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifftshift_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifftshift_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifftshift_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifftshift_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifftshift_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifftshift_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifftshift_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifftshift_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifftshift_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifftshift_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ihfft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ihfft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ihfft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ihfft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ihfft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ihfft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ihfft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ihfft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ihfft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ihfft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ihfft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ihfft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ihfft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ihfft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ihfft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ihfft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ihfft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ihfft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ihfftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ihfftn_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ihfftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ihfftn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ihfftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ihfftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ihfftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ihfftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ihfftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfft2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfft2_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfft2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfft_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfft_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfftn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfftn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfftn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfftn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_rfft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_rfft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_rfft2_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_rfft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_rfft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_rfft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_rfft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_rfft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_rfft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_rfft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_rfft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_rfft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_rfft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_rfft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_rfft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_rfft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_rfft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_rfft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_rfftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_rfftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_rfftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_rfftn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_rfftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_rfftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_rfftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_rfftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_rfftn_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fill_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fill_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fill_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fill_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fill_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fill_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fill_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fill_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fill_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fill_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fill_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fill_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fill_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flatten_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flatten_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flatten_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flatten_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flatten_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flatten_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flatten_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flatten_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flatten_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flatten_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flatten_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flatten_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flatten_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flip_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flip_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flip_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flip_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flip_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flip_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flip_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flip_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flip_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flip_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flip_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flip_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fliplr_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fliplr_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fliplr_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fliplr_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fliplr_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fliplr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fliplr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fliplr_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fliplr_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fliplr_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fliplr_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fliplr_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flipud_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flipud_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flipud_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flipud_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flipud_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flipud_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flipud_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flipud_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flipud_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flipud_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flipud_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flipud_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_float_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_float_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_float_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_float_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_float_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_float_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_float_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_float_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_float_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_float_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_float_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_float_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_float_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_float_power_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_float_power_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_float_power_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_float_power_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_float_power_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_float_power_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_float_power_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_float_power_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_float_power_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_float_power_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_float_power_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_float_power_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_floor_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_floor_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_floor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_floor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_floor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_floor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_floor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_floor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_floor_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_floor_divide_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_floor_divide_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_floor_divide_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_floor_divide_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_floor_divide_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_floor_divide_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_floor_divide_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_floor_divide_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_floor_divide_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fmax_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fmax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fmax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fmax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fmax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fmax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fmin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fmin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fmin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fmin_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fmin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fmin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fmin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fmin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fmin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fmod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fmod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fmod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fmod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fmod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fmod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fmod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fmod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fmod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_frac_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_frac_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_frac_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_frac_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_frexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_frexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_frexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_frexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_full_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_full_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_full_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_full_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_full_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_full_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_full_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_full_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_full_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_full_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_full_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_full_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_full_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_full_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_full_like_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_full_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_full_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_full_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_full_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_full_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_full_like_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_full_like_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_full_like_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_full_like_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_full_like_cuda_uint16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_full_like_cuda_uint32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_full_like_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gather_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gather_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gather_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gather_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gather_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gather_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gather_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gather_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gather_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gather_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gather_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gather_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gcd_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gcd_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gcd_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gcd_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gcd_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ge_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ge_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ge_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ge_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ge_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ge_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ge_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ge_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ge_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ge_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_geometric_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_geometric_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_geometric_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_geometric_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_geometric_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_geometric_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_geometric_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_geometric_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_geometric_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_geqrf_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_geqrf_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_geqrf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_geqrf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gradient_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gradient_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gradient_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gradient_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gradient_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gradient_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gradient_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gradient_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gradient_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gradient_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_grid_sampler_2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_grid_sampler_2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_grid_sampler_2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_grid_sampler_2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_grid_sampler_3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_grid_sampler_3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_grid_sampler_3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_grid_sampler_3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_half_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_half_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_half_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_half_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_half_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_half_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_half_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_half_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_half_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_half_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_half_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_half_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hash_tensor_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hash_tensor_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hash_tensor_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hash_tensor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hash_tensor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hash_tensor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hash_tensor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hash_tensor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hash_tensor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hash_tensor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_heaviside_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_heaviside_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_heaviside_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_heaviside_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_heaviside_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_heaviside_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_heaviside_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_heaviside_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_heaviside_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_heaviside_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_histc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_histc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_histc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_histc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_histc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_histc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_histc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hsplit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hsplit_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hsplit_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hsplit_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hsplit_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hsplit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hsplit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hsplit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hsplit_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hsplit_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hsplit_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hsplit_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hsplit_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hstack_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hstack_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hstack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hstack_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hstack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hstack_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hstack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hstack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hstack_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hstack_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hstack_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hstack_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hstack_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hypot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hypot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hypot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hypot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_i0_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_i0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_i0_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_i0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_i0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_i0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_i0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_i0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_i0_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_i0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_igamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_igamma_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_igammac_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_igammac_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_imag_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_imag_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_imag_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_add_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_add_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_add_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_add_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_add_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_add_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_add_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_add_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_add_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_add_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_add_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_add_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_copy_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_fill_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_fill_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_fill_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_fill_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_fill_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_fill_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_fill_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_fill_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_fill_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_fill_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_fill_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_fill_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_fill_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_put_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_put_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_put_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_put_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_put_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_put_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_put_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_put_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_put_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_put_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_put_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_put_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_put_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_amax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_amax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_amax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_amax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_amax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_amax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_amax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_amax_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_amax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_amin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_amin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_amin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_amin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_amin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_amin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_amin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_amin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_amin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_mean_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_mean_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_mean_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_mean_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_mean_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_prod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_prod_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_prod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_prod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_prod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_prod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_prod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_prod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_select_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_select_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_select_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_select_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_select_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_select_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_select_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_select_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_select_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_select_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_select_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_select_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_select_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_inner_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_inner_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_inner_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_inner_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_inner_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_inner_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_int_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_int_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_int_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_int_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_int_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_int_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_int_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_int_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_int_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_int_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_int_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_int_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isclose_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isclose_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isclose_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isclose_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isclose_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isclose_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isclose_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isclose_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isclose_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isclose_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isclose_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isclose_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isfinite_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isfinite_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isfinite_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isfinite_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isfinite_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isfinite_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isfinite_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isfinite_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isfinite_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isfinite_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isfinite_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isfinite_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isfinite_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isin_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isinf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isinf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isinf_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isinf_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isinf_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isinf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isinf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isinf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isinf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isinf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isinf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isinf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isinf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isnan_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isnan_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isnan_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isnan_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isnan_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isnan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isnan_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isnan_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isnan_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isnan_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isnan_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isnan_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isneginf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isneginf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isneginf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isneginf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isneginf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isneginf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isneginf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isneginf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isneginf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isneginf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isposinf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isposinf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isposinf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isposinf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isposinf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isposinf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isposinf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isposinf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isposinf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isposinf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isreal_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isreal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isreal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isreal_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isreal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isreal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isreal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isreal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isreal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isreal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isreal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isreal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isreal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_istft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_istft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_item_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_item_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_item_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_item_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_item_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_item_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_item_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_item_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_item_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_item_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_item_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_item_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_item_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_2inputs_2outputs_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_2inputs_2outputs_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_2inputs_2outputs_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_2inputs_2outputs_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_2inputs_2outputs_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_2inputs_2outputs_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_2inputs_2outputs_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_2inputs_2outputs_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_2inputs_2outputs_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_2inputs_2outputs_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_2inputs_2outputs_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_2inputs_2outputs_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_4inputs_with_extra_args_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_4inputs_with_extra_args_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_4inputs_with_extra_args_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_4inputs_with_extra_args_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_4inputs_with_extra_args_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_4inputs_with_extra_args_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_4inputs_with_extra_args_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_4inputs_with_extra_args_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_4inputs_with_extra_args_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_4inputs_with_extra_args_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_4inputs_with_extra_args_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_4inputs_with_extra_args_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_binary_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_binary_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_binary_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_binary_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_binary_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_binary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_binary_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_binary_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_binary_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_binary_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_binary_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_binary_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_binary_return_by_ref_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_binary_return_by_ref_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_binary_return_by_ref_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_binary_return_by_ref_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_binary_return_by_ref_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_binary_return_by_ref_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_binary_return_by_ref_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_binary_return_by_ref_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_binary_return_by_ref_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_binary_return_by_ref_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_binary_return_by_ref_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_binary_return_by_ref_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_unary_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_unary_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_unary_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_unary_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_unary_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_unary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_unary_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_unary_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_unary_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_unary_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_unary_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_unary_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_kron_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_kron_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_kron_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_kron_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_kron_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_kron_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_kron_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_kron_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_kron_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_kron_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_kron_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_kron_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_kthvalue_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_kthvalue_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_kthvalue_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_kthvalue_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_kthvalue_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_kthvalue_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_kthvalue_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_kthvalue_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_kthvalue_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lcm_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lcm_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lcm_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lcm_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lcm_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ldexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ldexp_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ldexp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ldexp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ldexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ldexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ldexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ldexp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ldexp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ldexp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ldexp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ldexp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_le_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_le_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_le_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_le_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_le_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_le_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_le_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_le_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_le_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_le_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lerp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lerp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lerp_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lerp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lerp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lerp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lerp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lgamma_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lgamma_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lgamma_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lgamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lgamma_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lgamma_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lgamma_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lgamma_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lgamma_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lgamma_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_cholesky_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_cholesky_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_cholesky_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_cholesky_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_cholesky_ex_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_cholesky_ex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_cholesky_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_cholesky_ex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_cond_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_cond_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_cond_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_cond_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_cross_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_cross_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_cross_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_cross_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_cross_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_cross_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_cross_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_cross_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_cross_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_cross_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_cross_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_det_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_det_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_det_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_det_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_diagonal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_diagonal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_diagonal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_diagonal_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_diagonal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_diagonal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_diagonal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_diagonal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_diagonal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_diagonal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_diagonal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_diagonal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_diagonal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_eig_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_eig_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_eig_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_eig_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_eigh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_eigh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_eigh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_eigh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_eigvals_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_eigvals_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_eigvals_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_eigvals_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_eigvalsh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_eigvalsh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_eigvalsh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_eigvalsh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_householder_product_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_householder_product_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_householder_product_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_householder_product_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_inv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_inv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_inv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_inv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_inv_ex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_inv_ex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_inv_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_inv_ex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_ldl_factor_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_ldl_factor_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_ldl_factor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_ldl_factor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_ldl_factor_ex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_ldl_factor_ex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_ldl_factor_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_ldl_factor_ex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_ldl_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_ldl_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_ldl_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_ldl_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_lstsq_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_lstsq_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_lstsq_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_lstsq_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_lstsq_grad_oriented_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_lstsq_grad_oriented_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_lstsq_grad_oriented_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_lstsq_grad_oriented_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_lu_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_lu_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_lu_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_lu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_lu_factor_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_lu_factor_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_lu_factor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_lu_factor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_lu_factor_ex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_lu_factor_ex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_lu_factor_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_lu_factor_ex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_lu_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_lu_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_lu_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_lu_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_matrix_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_matrix_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_matrix_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_matrix_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_matrix_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_matrix_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_matrix_power_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_matrix_power_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_matrix_power_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_matrix_power_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_matrix_rank_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_matrix_rank_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_matrix_rank_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_matrix_rank_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_matrix_rank_hermitian_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_matrix_rank_hermitian_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_matrix_rank_hermitian_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_matrix_rank_hermitian_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_multi_dot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_multi_dot_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_multi_dot_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_multi_dot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_multi_dot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_multi_dot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_norm_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_norm_subgradients_at_zero_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_norm_subgradients_at_zero_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_norm_subgradients_at_zero_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_norm_subgradients_at_zero_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_norm_subgradients_at_zero_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_norm_subgradients_at_zero_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_pinv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_pinv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_pinv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_pinv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_pinv_hermitian_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_pinv_hermitian_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_pinv_hermitian_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_pinv_hermitian_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_pinv_singular_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_pinv_singular_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_pinv_singular_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_pinv_singular_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_qr_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_qr_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_qr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_qr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_slogdet_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_slogdet_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_slogdet_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_slogdet_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_solve_ex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_solve_ex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_solve_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_solve_ex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_solve_triangular_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_solve_triangular_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_solve_triangular_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_solve_triangular_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_svd_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_svd_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_svd_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_svd_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_svdvals_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_svdvals_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_svdvals_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_svdvals_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_tensorinv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_tensorinv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_tensorinv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_tensorinv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_tensorsolve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_tensorsolve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_tensorsolve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_tensorsolve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_vander_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_vander_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_vander_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_vander_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_vander_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_vander_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_vander_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_vander_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_vander_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_vecdot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_vecdot_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_vecdot_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_vecdot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_vecdot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_vecdot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_vector_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_vector_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_vector_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_vector_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_vector_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_vector_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linspace_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linspace_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linspace_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linspace_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linspace_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linspace_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linspace_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linspace_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linspace_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linspace_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linspace_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linspace_tensor_overload_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linspace_tensor_overload_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linspace_tensor_overload_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linspace_tensor_overload_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linspace_tensor_overload_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linspace_tensor_overload_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linspace_tensor_overload_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linspace_tensor_overload_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linspace_tensor_overload_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linspace_tensor_overload_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linspace_tensor_overload_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log10_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log10_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log10_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log10_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log10_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log10_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log10_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log10_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log10_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log10_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log10_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log10_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log1p_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log1p_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log1p_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log1p_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log1p_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log1p_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log1p_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log1p_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log1p_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log1p_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log1p_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log1p_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log2_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_normal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_normal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_normal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_normal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_softmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_softmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_softmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_softmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_softmax_with_dtype_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_softmax_with_dtype_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_softmax_with_dtype_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_softmax_with_dtype_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_softmax_with_dtype_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_softmax_with_dtype_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_softmax_with_dtype_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_softmax_with_dtype_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_softmax_with_dtype_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_softmax_with_dtype_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_softmax_with_dtype_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_softmax_with_dtype_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_softmax_with_dtype_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logaddexp2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logaddexp2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logaddexp2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logaddexp2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logaddexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logaddexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logaddexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logaddexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logcumsumexp_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logcumsumexp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logcumsumexp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logcumsumexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logcumsumexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logcumsumexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logdet_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logdet_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logdet_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logdet_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_and_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_and_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_and_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_and_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_and_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_and_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_and_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_and_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_and_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_and_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_and_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_and_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_not_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_not_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_not_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_not_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_not_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_not_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_not_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_not_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_not_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_not_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_not_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_not_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_or_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_or_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_or_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_or_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_or_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_or_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_or_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_or_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_or_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_or_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_or_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_or_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_xor_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_xor_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_xor_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_xor_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_xor_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_xor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_xor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_xor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_xor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_xor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_xor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_xor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logit_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logit_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logit_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logit_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logit_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logit_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logspace_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logspace_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logspace_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logspace_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logspace_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logspace_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logspace_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logspace_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logspace_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logspace_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logspace_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logspace_tensor_overload_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logspace_tensor_overload_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logspace_tensor_overload_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logspace_tensor_overload_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logspace_tensor_overload_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logspace_tensor_overload_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logspace_tensor_overload_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logspace_tensor_overload_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logspace_tensor_overload_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logspace_tensor_overload_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logspace_tensor_overload_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logsumexp_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logsumexp_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logsumexp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logsumexp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logsumexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logsumexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logsumexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logsumexp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logsumexp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logsumexp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logsumexp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logsumexp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_long_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_long_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_long_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_long_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_long_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_long_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_long_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_long_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_long_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_long_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_long_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_long_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_long_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lu_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lu_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lu_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lu_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lu_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lu_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lu_unpack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lu_unpack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lu_unpack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lu_unpack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mH_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mH_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mH_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mH_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mH_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mH_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mH_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mH_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mH_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mH_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mH_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mH_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mH_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mT_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mT_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mT_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mT_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mT_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mT_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mT_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mT_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mT_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mT_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mT_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mT_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mT_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_amax_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_amax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_amax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_amax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_amax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_amax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_amax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_amax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_amax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_amin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_amin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_amin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_amin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_amin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_amin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_amin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_amin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_amin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_argmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_argmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_argmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_argmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_argmax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_argmax_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_argmax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_argmax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_argmax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_argmin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_argmin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_argmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_argmin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_argmin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_argmin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_argmin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_argmin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_argmin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_cumprod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_cumprod_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_cumprod_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_cumprod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_cumprod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_cumprod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_cumprod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_cumprod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_cumprod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_cumprod_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_cumprod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_cumsum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_cumsum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_cumsum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_cumsum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_cumsum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_cumsum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_cumsum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_cumsum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_cumsum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_cumsum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_cumsum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_fill_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_fill_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_fill_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_fill_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_fill_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_fill_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_fill_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_fill_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_fill_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_fill_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_fill_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_fill_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_fill_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_log_softmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_log_softmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_log_softmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_log_softmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_logaddexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_logaddexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_logaddexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_logaddexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_logsumexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_logsumexp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_logsumexp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_logsumexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_logsumexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_logsumexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_logsumexp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_logsumexp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_logsumexp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_logsumexp_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_logsumexp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_mean_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_mean_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_median_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_median_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_median_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_median_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_normalize_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_normalize_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_normalize_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_normalize_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_normalize_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_normalize_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_prod_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_prod_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_prod_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_prod_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_prod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_prod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_prod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_prod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_prod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_prod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_prod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_scatter_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_scatter_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_scatter_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_select_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_select_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_select_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_select_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_select_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_select_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_select_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_select_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_select_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_select_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_select_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_select_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_softmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_softmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_softmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_softmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_softmin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_softmin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_softmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_softmin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_std_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_std_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_std_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_std_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_std_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_std_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_std_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_std_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_std_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_std_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_std_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_sum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_sum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_sum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_sum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_sum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_sum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_sum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_sum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_sum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_sum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_sum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_sum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_var_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_var_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_var_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_var_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_var_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_var_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_var_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_var_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_var_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_var_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_var_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_matmul_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_matmul_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_matmul_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_matmul_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_matmul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_matmul_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_matrix_exp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_matrix_exp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_matrix_exp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_matrix_exp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_matrix_exp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_matrix_exp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_binary_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_binary_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_binary_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_binary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_binary_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_binary_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_binary_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_binary_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_binary_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_binary_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_pool2d_with_indices_backward_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_pool2d_with_indices_backward_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_pool2d_with_indices_backward_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_pool2d_with_indices_backward_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_reduction_no_dim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_reduction_no_dim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_reduction_no_dim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_reduction_no_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_reduction_no_dim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_reduction_no_dim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_reduction_no_dim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_reduction_no_dim_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_reduction_no_dim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_reduction_no_dim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_reduction_with_dim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_reduction_with_dim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_reduction_with_dim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_reduction_with_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_reduction_with_dim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_reduction_with_dim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_reduction_with_dim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_reduction_with_dim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_reduction_with_dim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_reduction_with_dim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_maximum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_maximum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_maximum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_maximum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_maximum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_maximum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_maximum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_maximum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_maximum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_maximum_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mean_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mean_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_median_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_median_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_median_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_median_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_median_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_median_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_median_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_median_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_median_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_meshgrid_list_of_tensors_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_meshgrid_list_of_tensors_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_meshgrid_list_of_tensors_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_meshgrid_list_of_tensors_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_meshgrid_list_of_tensors_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_meshgrid_list_of_tensors_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_meshgrid_list_of_tensors_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_meshgrid_list_of_tensors_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_meshgrid_list_of_tensors_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_meshgrid_list_of_tensors_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_meshgrid_list_of_tensors_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_meshgrid_list_of_tensors_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_meshgrid_variadic_tensors_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_meshgrid_variadic_tensors_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_meshgrid_variadic_tensors_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_meshgrid_variadic_tensors_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_meshgrid_variadic_tensors_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_meshgrid_variadic_tensors_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_meshgrid_variadic_tensors_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_meshgrid_variadic_tensors_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_meshgrid_variadic_tensors_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_meshgrid_variadic_tensors_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_meshgrid_variadic_tensors_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_meshgrid_variadic_tensors_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_min_binary_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_min_binary_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_min_binary_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_min_binary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_min_binary_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_min_binary_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_min_binary_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_min_binary_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_min_binary_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_min_binary_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_min_reduction_no_dim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_min_reduction_no_dim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_min_reduction_no_dim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_min_reduction_no_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_min_reduction_no_dim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_min_reduction_no_dim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_min_reduction_no_dim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_min_reduction_no_dim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_min_reduction_no_dim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_min_reduction_no_dim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_min_reduction_with_dim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_min_reduction_with_dim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_min_reduction_with_dim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_min_reduction_with_dim_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_min_reduction_with_dim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_min_reduction_with_dim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_min_reduction_with_dim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_min_reduction_with_dim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_min_reduction_with_dim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_min_reduction_with_dim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_minimum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_minimum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_minimum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_minimum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_minimum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_minimum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_minimum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_minimum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_minimum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_minimum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mode_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mode_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mode_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mode_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mode_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mode_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mode_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mode_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mode_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mode_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_movedim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_movedim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_movedim_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_movedim_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_movedim_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_movedim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_movedim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_movedim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_movedim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_movedim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_movedim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_movedim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_movedim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_msort_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_msort_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_msort_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_msort_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_msort_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_msort_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_msort_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_msort_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_msort_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_msort_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mul_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mul_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mul_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mul_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mul_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mul_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mul_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mul_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mul_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mul_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mul_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mul_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_multinomial_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_multinomial_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_multinomial_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_multinomial_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mv_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mv_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mvlgamma_mvlgamma_p_1_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mvlgamma_mvlgamma_p_1_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mvlgamma_mvlgamma_p_1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mvlgamma_mvlgamma_p_1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mvlgamma_mvlgamma_p_1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mvlgamma_mvlgamma_p_1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mvlgamma_mvlgamma_p_1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mvlgamma_mvlgamma_p_1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mvlgamma_mvlgamma_p_3_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mvlgamma_mvlgamma_p_3_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mvlgamma_mvlgamma_p_3_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mvlgamma_mvlgamma_p_3_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mvlgamma_mvlgamma_p_3_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mvlgamma_mvlgamma_p_3_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mvlgamma_mvlgamma_p_3_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mvlgamma_mvlgamma_p_3_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mvlgamma_mvlgamma_p_5_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mvlgamma_mvlgamma_p_5_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mvlgamma_mvlgamma_p_5_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mvlgamma_mvlgamma_p_5_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mvlgamma_mvlgamma_p_5_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mvlgamma_mvlgamma_p_5_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mvlgamma_mvlgamma_p_5_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mvlgamma_mvlgamma_p_5_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nan_to_num_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nan_to_num_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nan_to_num_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nan_to_num_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nan_to_num_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nan_to_num_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nan_to_num_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nan_to_num_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nan_to_num_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nan_to_num_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nanmean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nanmean_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nanmean_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nanmean_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nanmean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nanmean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nanmean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nanmedian_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nanmedian_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nanmedian_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nanmedian_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nanmedian_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nanmedian_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nanmedian_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nanmedian_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nanmedian_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nanquantile_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nanquantile_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nansum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nansum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nansum_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nansum_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nansum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nansum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nansum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nansum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nansum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nansum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nansum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nansum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nansum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_narrow_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_narrow_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_narrow_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_narrow_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_narrow_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_narrow_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_narrow_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_narrow_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_narrow_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_narrow_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_narrow_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_narrow_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_narrow_copy_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_narrow_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_narrow_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_narrow_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_narrow_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_narrow_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_narrow_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_narrow_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_narrow_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_narrow_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_narrow_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_narrow_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_narrow_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_narrow_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_native_batch_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_native_batch_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_native_batch_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_native_batch_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_native_dropout_backward_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_native_dropout_backward_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_native_dropout_backward_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_native_dropout_backward_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_native_layer_norm_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_native_layer_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_native_layer_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_native_layer_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ne_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ne_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ne_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ne_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ne_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ne_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ne_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ne_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ne_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ne_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ne_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ne_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_neg_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_neg_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_neg_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_neg_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_neg_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_neg_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_neg_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_neg_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_neg_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_neg_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_neg_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_neg_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_empty_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_empty_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_empty_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_empty_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_empty_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_empty_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_empty_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_empty_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_empty_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_empty_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_empty_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_empty_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_empty_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_empty_strided_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_empty_strided_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_empty_strided_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_empty_strided_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_empty_strided_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_empty_strided_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_empty_strided_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_empty_strided_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_empty_strided_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_empty_strided_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_empty_strided_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_empty_strided_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_empty_strided_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_full_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_full_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_full_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_full_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_full_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_full_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_full_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_full_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_full_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_full_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_full_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_full_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_full_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_ones_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_ones_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_ones_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_ones_cuda_complex32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_ones_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_ones_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_ones_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_ones_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_ones_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_ones_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_ones_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_ones_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_ones_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_zeros_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_zeros_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_zeros_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_zeros_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_zeros_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_zeros_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_zeros_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_zeros_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_zeros_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_zeros_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_zeros_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_zeros_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_zeros_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nextafter_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nextafter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nextafter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nextafter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_adaptive_avg_pool1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_adaptive_avg_pool1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_adaptive_avg_pool1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_adaptive_avg_pool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_adaptive_avg_pool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_adaptive_avg_pool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_adaptive_avg_pool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_adaptive_avg_pool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_adaptive_avg_pool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_adaptive_max_pool1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_adaptive_max_pool1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_adaptive_max_pool1d_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_adaptive_max_pool1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_adaptive_max_pool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_adaptive_max_pool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_adaptive_max_pool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_adaptive_max_pool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_adaptive_max_pool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_adaptive_max_pool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_alpha_dropout_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_alpha_dropout_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_alpha_dropout_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_alpha_dropout_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_avg_pool1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_avg_pool1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_avg_pool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_avg_pool1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_avg_pool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_avg_pool2d_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_avg_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_avg_pool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_avg_pool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_avg_pool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_avg_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_avg_pool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_batch_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_batch_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_batch_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_batch_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_batch_norm_without_cudnn_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_batch_norm_without_cudnn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_batch_norm_without_cudnn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_batch_norm_without_cudnn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_bilinear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_bilinear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_bilinear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_bilinear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_binary_cross_entropy_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_binary_cross_entropy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_binary_cross_entropy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_binary_cross_entropy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_binary_cross_entropy_with_logits_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_binary_cross_entropy_with_logits_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_binary_cross_entropy_with_logits_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_celu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_celu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_celu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_celu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_channel_shuffle_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_channel_shuffle_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_channel_shuffle_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_channel_shuffle_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_channel_shuffle_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_channel_shuffle_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_channel_shuffle_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_channel_shuffle_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_channel_shuffle_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_channel_shuffle_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_channel_shuffle_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_channel_shuffle_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv1d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv1d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv1d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv2d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv2d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv2d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv3d_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv3d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv3d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv3d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv_transpose1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv_transpose1d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv_transpose1d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv_transpose1d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv_transpose1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv_transpose1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv_transpose1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv_transpose2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv_transpose2d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv_transpose2d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv_transpose2d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv_transpose2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv_transpose2d_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv_transpose2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv_transpose3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv_transpose3d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv_transpose3d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv_transpose3d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv_transpose3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv_transpose3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv_transpose3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_cosine_embedding_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_cosine_embedding_loss_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_cosine_embedding_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_cosine_embedding_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_cosine_embedding_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_cosine_embedding_loss_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_cosine_embedding_loss_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_cosine_embedding_loss_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_cosine_embedding_loss_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_cosine_embedding_loss_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_cosine_similarity_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_cosine_similarity_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_cosine_similarity_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_cosine_similarity_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_cross_entropy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_cross_entropy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_cross_entropy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_cross_entropy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_ctc_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_ctc_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_dropout2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_dropout2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_dropout2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_dropout2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_dropout3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_dropout3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_dropout3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_dropout3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_dropout_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_dropout_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_dropout_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_dropout_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_elu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_elu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_elu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_elu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_embedding_bag_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_embedding_bag_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_embedding_bag_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_embedding_bag_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_embedding_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_embedding_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_embedding_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_embedding_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_feature_alpha_dropout_with_train_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_feature_alpha_dropout_with_train_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_feature_alpha_dropout_with_train_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_fractional_max_pool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_fractional_max_pool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_fractional_max_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_fractional_max_pool2d_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_fractional_max_pool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_fractional_max_pool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_fractional_max_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_fractional_max_pool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_gaussian_nll_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_gaussian_nll_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_gaussian_nll_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_gaussian_nll_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_gelu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_gelu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_gelu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_gelu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_glu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_glu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_glu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_glu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_grid_sample_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_grid_sample_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_grid_sample_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_grid_sample_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_group_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_group_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_group_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_group_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_hardshrink_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_hardshrink_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_hardshrink_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_hardshrink_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_hardsigmoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_hardsigmoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_hardsigmoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_hardsigmoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_hardswish_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_hardswish_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_hardswish_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_hardswish_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_hardtanh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_hardtanh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_hardtanh_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_hardtanh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_hardtanh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_hardtanh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_hardtanh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_hardtanh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_hinge_embedding_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_hinge_embedding_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_hinge_embedding_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_hinge_embedding_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_huber_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_huber_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_huber_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_huber_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_instance_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_instance_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_instance_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_instance_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_interpolate_area_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_interpolate_area_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_interpolate_area_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_interpolate_area_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_interpolate_bicubic_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_interpolate_bicubic_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_interpolate_bicubic_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_interpolate_bicubic_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_interpolate_bilinear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_interpolate_bilinear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_interpolate_bilinear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_interpolate_bilinear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_interpolate_linear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_interpolate_linear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_interpolate_linear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_interpolate_linear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_interpolate_nearest-exact_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_interpolate_nearest-exact_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_interpolate_nearest-exact_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_interpolate_nearest-exact_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_interpolate_nearest_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_interpolate_nearest_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_interpolate_nearest_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_interpolate_nearest_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_interpolate_nearest_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_interpolate_trilinear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_interpolate_trilinear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_interpolate_trilinear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_interpolate_trilinear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_kl_div_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_kl_div_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_kl_div_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_kl_div_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_l1_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_l1_loss_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_l1_loss_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_l1_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_l1_loss_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_l1_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_layer_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_layer_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_layer_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_layer_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_leaky_relu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_leaky_relu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_leaky_relu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_leaky_relu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_linear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_linear_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_linear_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_linear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_linear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_linear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_local_response_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_local_response_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_local_response_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_local_response_norm_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_logsigmoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_logsigmoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_logsigmoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_logsigmoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_margin_ranking_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_margin_ranking_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_margin_ranking_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_margin_ranking_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_margin_ranking_loss_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_margin_ranking_loss_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_margin_ranking_loss_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_margin_ranking_loss_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_margin_ranking_loss_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_pool1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_pool1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_pool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_pool1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_pool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_pool2d_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_pool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_pool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_pool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_pool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_unpool1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_unpool1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_unpool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_unpool1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_unpool1d_grad_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_unpool1d_grad_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_unpool1d_grad_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_unpool1d_grad_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_unpool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_unpool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_unpool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_unpool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_unpool2d_grad_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_unpool2d_grad_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_unpool2d_grad_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_unpool2d_grad_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_unpool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_unpool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_unpool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_unpool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_unpool3d_grad_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_unpool3d_grad_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_unpool3d_grad_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_unpool3d_grad_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_mish_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_mish_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_mish_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_mish_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_mse_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_mse_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_mse_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_mse_loss_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_multi_head_attention_forward_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_multi_head_attention_forward_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_multi_head_attention_forward_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_multi_head_attention_forward_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_multi_margin_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_multi_margin_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_multi_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_multi_margin_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_multilabel_margin_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_multilabel_margin_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_multilabel_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_multilabel_margin_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_multilabel_soft_margin_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_multilabel_soft_margin_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_multilabel_soft_margin_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_nll_loss_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_nll_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_nll_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_nll_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_normalize_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_normalize_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_normalize_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_normalize_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_normalize_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_normalize_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_one_hot_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_circular_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_circular_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_circular_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_circular_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_circular_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_circular_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_circular_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_circular_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_circular_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_circular_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_circular_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_circular_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_constant_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_constant_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_constant_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_constant_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_constant_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_constant_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_constant_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_constant_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_constant_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_constant_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_constant_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_constant_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_reflect_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_reflect_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_reflect_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_reflect_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_reflect_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_reflect_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_reflect_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_reflect_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_reflect_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_reflect_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_reflect_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_replicate_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_replicate_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_replicate_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_replicate_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_replicate_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_replicate_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_replicate_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_replicate_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_replicate_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_replicate_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_replicate_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_replicate_negative_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_replicate_negative_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_replicate_negative_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_replicate_negative_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_replicate_negative_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_replicate_negative_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_replicate_negative_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_replicate_negative_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_replicate_negative_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_replicate_negative_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_replicate_negative_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pairwise_distance_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pairwise_distance_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pairwise_distance_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pairwise_distance_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pairwise_distance_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pairwise_distance_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pairwise_distance_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pairwise_distance_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pairwise_distance_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pairwise_distance_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pairwise_distance_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pdist_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pdist_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pixel_shuffle_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pixel_shuffle_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pixel_shuffle_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pixel_shuffle_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pixel_shuffle_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pixel_shuffle_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pixel_shuffle_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pixel_shuffle_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pixel_shuffle_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pixel_shuffle_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pixel_shuffle_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pixel_shuffle_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pixel_unshuffle_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pixel_unshuffle_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pixel_unshuffle_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pixel_unshuffle_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pixel_unshuffle_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pixel_unshuffle_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pixel_unshuffle_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pixel_unshuffle_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pixel_unshuffle_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pixel_unshuffle_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pixel_unshuffle_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pixel_unshuffle_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_poisson_nll_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_poisson_nll_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_poisson_nll_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_poisson_nll_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_poisson_nll_loss_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_poisson_nll_loss_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_poisson_nll_loss_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_poisson_nll_loss_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_poisson_nll_loss_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_prelu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_prelu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_prelu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_prelu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_relu6_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_relu6_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_relu6_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_relu6_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_relu6_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_relu6_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_relu6_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_relu6_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_relu6_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_relu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_relu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_relu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_relu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_relu_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_relu_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_relu_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_relu_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_relu_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_rms_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_rms_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_rms_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_rms_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_rms_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_rms_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_rrelu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_rrelu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_rrelu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_rrelu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_scaled_dot_product_attention_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_scaled_dot_product_attention_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_scaled_dot_product_attention_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_selu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_selu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_selu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_selu_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_silu_complex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_silu_complex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_silu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_silu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_silu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_silu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_smooth_l1_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_smooth_l1_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_smooth_l1_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_smooth_l1_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_soft_margin_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_soft_margin_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_soft_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_soft_margin_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softmin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softmin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softmin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softmin_with_dtype_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softmin_with_dtype_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softmin_with_dtype_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softmin_with_dtype_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softmin_with_dtype_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softmin_with_dtype_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softmin_with_dtype_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softmin_with_dtype_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softmin_with_dtype_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softmin_with_dtype_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softmin_with_dtype_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softplus_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softplus_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softplus_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softplus_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softshrink_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softshrink_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softshrink_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softshrink_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softsign_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softsign_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softsign_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softsign_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softsign_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softsign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softsign_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softsign_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softsign_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softsign_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softsign_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softsign_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_tanhshrink_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_tanhshrink_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_tanhshrink_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_tanhshrink_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_tanhshrink_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_tanhshrink_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_tanhshrink_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_tanhshrink_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_tanhshrink_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_tanhshrink_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_tanhshrink_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_threshold_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_threshold_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_threshold_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_threshold_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_threshold_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_threshold_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_threshold_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_threshold_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_threshold_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_triplet_margin_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_triplet_margin_loss_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_triplet_margin_loss_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_triplet_margin_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_triplet_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_triplet_margin_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_triplet_margin_loss_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_triplet_margin_loss_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_triplet_margin_loss_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_triplet_margin_loss_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_triplet_margin_loss_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_unfold_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_unfold_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_unfold_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_unfold_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_unfold_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_unfold_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_unfold_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_upsample_bilinear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_upsample_bilinear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_upsample_bilinear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_upsample_bilinear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_upsample_nearest_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_upsample_nearest_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_upsample_nearest_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_upsample_nearest_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_upsample_nearest_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nonzero_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nonzero_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nonzero_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nonzero_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nonzero_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nonzero_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nonzero_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nonzero_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nonzero_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nonzero_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nonzero_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nonzero_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nonzero_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nonzero_static_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nonzero_static_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nonzero_static_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nonzero_static_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nonzero_static_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nonzero_static_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nonzero_static_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nonzero_static_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nonzero_static_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nonzero_static_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nonzero_static_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nonzero_static_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nonzero_static_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_norm_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_norm_fro_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_norm_fro_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_norm_fro_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_norm_fro_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_norm_fro_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_norm_fro_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_norm_inf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_norm_inf_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_norm_inf_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_norm_inf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_norm_inf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_norm_inf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_norm_nuc_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_norm_nuc_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_norm_nuc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_norm_nuc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_normal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_normal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_normal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_normal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_normal_in_place_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_normal_in_place_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_normal_in_place_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_normal_in_place_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_normal_in_place_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_normal_in_place_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_normal_number_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_normal_number_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_normal_number_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_normal_number_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ones_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ones_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ones_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ones_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ones_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ones_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ones_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ones_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ones_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ones_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ones_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ones_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ones_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ones_like_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ones_like_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ones_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ones_like_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ones_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ones_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ones_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ones_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ones_like_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ones_like_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ones_like_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ones_like_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ones_like_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ormqr_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ormqr_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ormqr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ormqr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_outer_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_outer_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_outer_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_outer_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_outer_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_outer_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_outer_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_outer_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_outer_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_outer_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_outer_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_outer_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_pca_lowrank_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_pca_lowrank_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_pca_lowrank_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_pca_lowrank_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_permute_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_permute_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_permute_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_permute_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_permute_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_permute_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_permute_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_permute_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_permute_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_permute_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_permute_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_permute_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_permute_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_permute_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_permute_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_permute_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_permute_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_permute_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_permute_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_permute_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_permute_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_permute_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_permute_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_permute_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_permute_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_permute_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_pinverse_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_pinverse_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_pinverse_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_pinverse_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polar_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polar_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_0_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_0_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_0_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_1_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_1_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_2_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_3_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_3_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_3_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_3_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_3_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_3_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_3_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_3_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_3_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_3_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_4_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_4_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_4_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_4_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_4_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_4_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_4_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_4_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_4_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_4_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_positive_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_positive_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_positive_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_positive_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_positive_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_positive_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_positive_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_positive_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_positive_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_positive_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_positive_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_positive_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_pow_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_pow_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_pow_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_pow_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_pow_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_pow_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_pow_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_pow_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_pow_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_pow_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_pow_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_pow_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_prod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_prod_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_prod_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_prod_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_prod_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_prod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_prod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_prod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_prod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_prod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_prod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_prod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_put_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_put_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_put_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_put_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_put_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_put_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_put_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_put_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_put_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_put_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_put_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_put_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_qr_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_qr_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_qr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_qr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_quantile_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_quantile_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rad2deg_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rad2deg_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rad2deg_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rad2deg_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rad2deg_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rad2deg_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rad2deg_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rad2deg_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rad2deg_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rad2deg_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rand_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rand_like_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rand_like_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rand_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rand_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rand_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rand_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_randint_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_randint_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_randint_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_randint_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_randint_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_randint_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_randint_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_randint_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_randint_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_randint_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_randint_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_randint_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_randint_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_randint_like_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_randint_like_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_randint_like_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_randint_like_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_randint_like_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_randn_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_randn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_randn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_randn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_randn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_randn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_randn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_randn_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_randn_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_randn_like_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_randn_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_randn_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_randn_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_randn_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ravel_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ravel_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ravel_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ravel_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ravel_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ravel_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ravel_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ravel_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ravel_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ravel_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ravel_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ravel_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ravel_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_real_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_real_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_real_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_real_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_real_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_real_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_real_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_real_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_real_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_real_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_real_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_real_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_real_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reciprocal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reciprocal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reciprocal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reciprocal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reciprocal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reciprocal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reciprocal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reciprocal_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reciprocal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reciprocal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reciprocal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reciprocal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_remainder_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_remainder_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_remainder_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_remainder_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_remainder_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_remainder_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_remainder_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_remainder_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_remainder_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_renorm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_renorm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_renorm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_renorm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_renorm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_renorm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_repeat_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_repeat_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_repeat_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_repeat_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_repeat_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_repeat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_repeat_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_repeat_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_repeat_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_repeat_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_repeat_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_repeat_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_repeat_interleave_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_repeat_interleave_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_repeat_interleave_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_repeat_interleave_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_repeat_interleave_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_repeat_interleave_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_repeat_interleave_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_repeat_interleave_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_repeat_interleave_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_repeat_interleave_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_repeat_interleave_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_repeat_interleave_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_repeat_interleave_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reshape_as_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reshape_as_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reshape_as_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reshape_as_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reshape_as_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reshape_as_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reshape_as_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reshape_as_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reshape_as_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reshape_as_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reshape_as_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reshape_as_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reshape_as_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reshape_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reshape_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reshape_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reshape_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reshape_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reshape_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reshape_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reshape_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reshape_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reshape_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reshape_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reshape_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reshape_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resize__cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resize__cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resize__cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resize__cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resize__cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resize__cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resize__cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resize__cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resize__cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resize__cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resize__cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resize__cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resize_as__cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resize_as__cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resize_as__cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resize_as__cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resize_as__cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resize_as__cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resize_as__cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resize_as__cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resize_as__cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resize_as__cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resize_as__cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resize_as__cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resolve_conj_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resolve_conj_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resolve_conj_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resolve_conj_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resolve_conj_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resolve_conj_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resolve_conj_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resolve_conj_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resolve_conj_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resolve_conj_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resolve_conj_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resolve_conj_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resolve_neg_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resolve_neg_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resolve_neg_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resolve_neg_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resolve_neg_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resolve_neg_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resolve_neg_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resolve_neg_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resolve_neg_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resolve_neg_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resolve_neg_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resolve_neg_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resolve_neg_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_roll_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_roll_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_roll_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_roll_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_roll_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_roll_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_roll_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_roll_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_roll_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_roll_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_roll_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_roll_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_roll_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rot90_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rot90_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rot90_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rot90_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rot90_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rot90_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rot90_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rot90_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rot90_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rot90_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rot90_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rot90_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_round_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_round_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_round_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_round_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_round_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_round_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_round_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_round_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_round_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_round_decimals_0_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_round_decimals_0_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_round_decimals_0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_round_decimals_0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_round_decimals_3_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_round_decimals_3_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_round_decimals_3_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_round_decimals_3_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_round_decimals_neg_3_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_round_decimals_neg_3_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_round_decimals_neg_3_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_round_decimals_neg_3_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rsqrt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rsqrt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rsqrt_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rsqrt_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rsqrt_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rsqrt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rsqrt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rsqrt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rsqrt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rsqrt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rsqrt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rsqrt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rsqrt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rsub_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rsub_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rsub_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rsub_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rsub_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rsub_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rsub_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rsub_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rsub_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rsub_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rsub_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scalar_tensor_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scalar_tensor_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scalar_tensor_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scalar_tensor_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scalar_tensor_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scalar_tensor_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scalar_tensor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scalar_tensor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scalar_tensor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scalar_tensor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scalar_tensor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scalar_tensor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scalar_tensor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_add_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_add_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_add_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_add_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_add_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_add_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_add_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_add_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_add_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_add_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_add_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_add_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_amax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_amax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_amax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_amax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_amax_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_amax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_amax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_amax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_amax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_amin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_amin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_amin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_amin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_amin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_amin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_amin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_amin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_amin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_mean_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_mean_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_mean_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_mean_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_mean_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_prod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_prod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_prod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_prod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_prod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_prod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_prod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_prod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_sum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_sum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_sum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_sum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_sum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_sum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_sum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_sum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_sum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_sum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_searchsorted_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_searchsorted_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_searchsorted_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_searchsorted_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_searchsorted_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_searchsorted_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_searchsorted_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_searchsorted_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_searchsorted_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_select_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_select_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_select_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_select_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_select_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_select_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_select_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_select_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_select_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_select_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_select_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_select_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_select_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_select_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_select_scatter_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_select_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_select_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_select_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_select_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_select_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_select_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_select_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_select_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sgn_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sgn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sgn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sgn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sgn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sgn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sgn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sgn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sgn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sgn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sgn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sgn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sgn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_short_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_short_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_short_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_short_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_short_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_short_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_short_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_short_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_short_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_short_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_short_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_short_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sigmoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sigmoid_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sigmoid_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sigmoid_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sigmoid_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sigmoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sigmoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sigmoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sigmoid_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sigmoid_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sigmoid_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sigmoid_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sigmoid_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sign_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sign_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sign_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sign_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sign_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sign_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sign_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sign_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sign_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_signal_windows_bartlett_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_signal_windows_bartlett_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_signal_windows_blackman_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_signal_windows_blackman_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_signal_windows_cosine_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_signal_windows_cosine_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_signal_windows_exponential_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_signal_windows_exponential_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_signal_windows_gaussian_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_signal_windows_gaussian_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_signal_windows_general_cosine_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_signal_windows_general_cosine_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_signal_windows_general_hamming_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_signal_windows_general_hamming_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_signal_windows_hamming_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_signal_windows_hamming_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_signal_windows_hann_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_signal_windows_hann_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_signal_windows_kaiser_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_signal_windows_kaiser_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_signal_windows_nuttall_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_signal_windows_nuttall_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_signbit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_signbit_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_signbit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_signbit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_signbit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_signbit_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_signbit_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_signbit_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_signbit_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_signbit_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sin_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sin_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sin_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sinc_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sinc_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sinc_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sinc_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sinc_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sinc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sinc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sinc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sinc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sinc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sinc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sinc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sinh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sinh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sinh_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sinh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sinh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sinh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sinh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sinh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sinh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sinh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sinh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sinh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sinh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_slice_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_slice_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_slice_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_slice_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_slice_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_slice_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_slice_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_slice_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_slice_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_slice_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_slice_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_slice_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_slice_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_slice_scatter_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_slice_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_slice_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_slice_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_slice_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_slice_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_slice_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_slice_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_slice_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_slice_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_softmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_softmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_softmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_softmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_softmax_with_dtype_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_softmax_with_dtype_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_softmax_with_dtype_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_softmax_with_dtype_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_softmax_with_dtype_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_softmax_with_dtype_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_softmax_with_dtype_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_softmax_with_dtype_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_softmax_with_dtype_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_softmax_with_dtype_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_softmax_with_dtype_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_softmax_with_dtype_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sort_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sort_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sort_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sort_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sort_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sort_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sort_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sort_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sort_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sort_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sparse_mm_reduce_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sparse_mm_reduce_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sparse_mm_reduce_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sparse_mm_reduce_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sparse_sampled_addmm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sparse_sampled_addmm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sparse_sampled_addmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sparse_sampled_addmm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_airy_ai_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_airy_ai_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_airy_ai_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_airy_ai_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_airy_ai_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_airy_ai_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_airy_ai_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_airy_ai_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_bessel_j0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_bessel_j0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_bessel_j0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_bessel_j0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_bessel_j0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_bessel_j0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_bessel_j0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_bessel_j0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_bessel_j1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_bessel_j1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_bessel_j1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_bessel_j1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_bessel_j1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_bessel_j1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_bessel_j1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_bessel_j1_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_bessel_y0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_bessel_y0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_bessel_y0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_bessel_y0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_bessel_y0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_bessel_y0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_bessel_y0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_bessel_y0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_bessel_y1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_bessel_y1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_bessel_y1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_bessel_y1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_bessel_y1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_bessel_y1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_bessel_y1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_bessel_y1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_chebyshev_polynomial_t_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_chebyshev_polynomial_t_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_chebyshev_polynomial_t_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_chebyshev_polynomial_t_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_chebyshev_polynomial_t_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_chebyshev_polynomial_t_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_chebyshev_polynomial_t_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_chebyshev_polynomial_t_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_chebyshev_polynomial_u_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_chebyshev_polynomial_u_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_chebyshev_polynomial_u_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_chebyshev_polynomial_u_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_chebyshev_polynomial_u_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_chebyshev_polynomial_u_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_chebyshev_polynomial_u_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_chebyshev_polynomial_u_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_chebyshev_polynomial_v_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_chebyshev_polynomial_v_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_chebyshev_polynomial_v_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_chebyshev_polynomial_v_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_chebyshev_polynomial_v_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_chebyshev_polynomial_v_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_chebyshev_polynomial_v_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_chebyshev_polynomial_v_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_chebyshev_polynomial_w_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_chebyshev_polynomial_w_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_chebyshev_polynomial_w_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_chebyshev_polynomial_w_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_chebyshev_polynomial_w_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_chebyshev_polynomial_w_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_chebyshev_polynomial_w_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_chebyshev_polynomial_w_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_entr_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_entr_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_entr_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_entr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_entr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_entr_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_entr_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_entr_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_entr_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_entr_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_erfcx_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_erfcx_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_erfcx_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_erfcx_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_erfcx_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_erfcx_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_erfcx_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_erfcx_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_hermite_polynomial_h_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_hermite_polynomial_h_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_hermite_polynomial_h_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_hermite_polynomial_h_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_hermite_polynomial_h_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_hermite_polynomial_h_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_hermite_polynomial_h_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_hermite_polynomial_h_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_hermite_polynomial_he_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_hermite_polynomial_he_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_hermite_polynomial_he_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_hermite_polynomial_he_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_hermite_polynomial_he_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_hermite_polynomial_he_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_hermite_polynomial_he_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_hermite_polynomial_he_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_i0e_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_i0e_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_i0e_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_i0e_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_i0e_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_i0e_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_i0e_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_i0e_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_i0e_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_i0e_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_i1_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_i1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_i1_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_i1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_i1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_i1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_i1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_i1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_i1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_i1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_i1e_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_i1e_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_i1e_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_i1e_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_i1e_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_i1e_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_i1e_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_i1e_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_i1e_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_i1e_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_laguerre_polynomial_l_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_laguerre_polynomial_l_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_laguerre_polynomial_l_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_laguerre_polynomial_l_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_laguerre_polynomial_l_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_laguerre_polynomial_l_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_laguerre_polynomial_l_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_laguerre_polynomial_l_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_legendre_polynomial_p_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_legendre_polynomial_p_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_legendre_polynomial_p_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_legendre_polynomial_p_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_legendre_polynomial_p_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_legendre_polynomial_p_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_legendre_polynomial_p_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_legendre_polynomial_p_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_log_ndtr_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_log_ndtr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_log_ndtr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_log_ndtr_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_log_ndtr_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_log_ndtr_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_log_ndtr_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_log_ndtr_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_modified_bessel_i0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_modified_bessel_i0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_modified_bessel_i0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_modified_bessel_i0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_modified_bessel_i0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_modified_bessel_i0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_modified_bessel_i0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_modified_bessel_i0_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_modified_bessel_i1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_modified_bessel_i1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_modified_bessel_i1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_modified_bessel_i1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_modified_bessel_i1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_modified_bessel_i1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_modified_bessel_i1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_modified_bessel_i1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_modified_bessel_k0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_modified_bessel_k0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_modified_bessel_k0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_modified_bessel_k0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_modified_bessel_k0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_modified_bessel_k0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_modified_bessel_k0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_modified_bessel_k0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_modified_bessel_k1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_modified_bessel_k1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_modified_bessel_k1_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_modified_bessel_k1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_modified_bessel_k1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_modified_bessel_k1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_modified_bessel_k1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_modified_bessel_k1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_ndtr_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_ndtr_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_ndtr_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_ndtr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_ndtr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_ndtr_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_ndtr_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_ndtr_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_ndtr_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_ndtr_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_ndtri_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_ndtri_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_ndtri_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_ndtri_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_ndtri_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_ndtri_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_ndtri_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_ndtri_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_polygamma_special_polygamma_n_0_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_polygamma_special_polygamma_n_0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_polygamma_special_polygamma_n_0_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_polygamma_special_polygamma_n_0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_polygamma_special_polygamma_n_0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_polygamma_special_polygamma_n_0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_polygamma_special_polygamma_n_0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_polygamma_special_polygamma_n_0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_polygamma_special_polygamma_n_0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_scaled_modified_bessel_k0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_scaled_modified_bessel_k0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_scaled_modified_bessel_k0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_scaled_modified_bessel_k0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_scaled_modified_bessel_k0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_scaled_modified_bessel_k0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_scaled_modified_bessel_k0_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_scaled_modified_bessel_k0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_scaled_modified_bessel_k1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_scaled_modified_bessel_k1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_scaled_modified_bessel_k1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_scaled_modified_bessel_k1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_scaled_modified_bessel_k1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_scaled_modified_bessel_k1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_scaled_modified_bessel_k1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_scaled_modified_bessel_k1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_shifted_chebyshev_polynomial_t_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_shifted_chebyshev_polynomial_t_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_shifted_chebyshev_polynomial_t_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_shifted_chebyshev_polynomial_t_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_shifted_chebyshev_polynomial_t_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_shifted_chebyshev_polynomial_t_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_shifted_chebyshev_polynomial_t_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_shifted_chebyshev_polynomial_u_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_shifted_chebyshev_polynomial_u_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_shifted_chebyshev_polynomial_u_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_shifted_chebyshev_polynomial_u_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_shifted_chebyshev_polynomial_u_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_shifted_chebyshev_polynomial_u_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_shifted_chebyshev_polynomial_u_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_shifted_chebyshev_polynomial_v_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_shifted_chebyshev_polynomial_v_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_shifted_chebyshev_polynomial_v_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_shifted_chebyshev_polynomial_v_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_shifted_chebyshev_polynomial_v_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_shifted_chebyshev_polynomial_v_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_shifted_chebyshev_polynomial_v_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_shifted_chebyshev_polynomial_w_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_shifted_chebyshev_polynomial_w_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_shifted_chebyshev_polynomial_w_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_shifted_chebyshev_polynomial_w_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_shifted_chebyshev_polynomial_w_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_shifted_chebyshev_polynomial_w_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_shifted_chebyshev_polynomial_w_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_shifted_chebyshev_polynomial_w_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_spherical_bessel_j0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_spherical_bessel_j0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_spherical_bessel_j0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_spherical_bessel_j0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_spherical_bessel_j0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_spherical_bessel_j0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_spherical_bessel_j0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_spherical_bessel_j0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_xlog1py_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_xlog1py_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_xlog1py_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_xlog1py_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_xlog1py_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_xlog1py_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_xlog1py_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_xlog1py_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_xlog1py_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_xlog1py_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_zeta_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_zeta_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_zeta_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_zeta_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_zeta_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_zeta_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_zeta_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_zeta_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_list_args_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_list_args_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_list_args_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_list_args_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_list_args_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_list_args_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_list_args_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_list_args_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_list_args_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_list_args_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_list_args_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_list_args_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_with_sizes_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_with_sizes_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_with_sizes_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_with_sizes_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_with_sizes_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_with_sizes_copy_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_with_sizes_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_with_sizes_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_with_sizes_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_with_sizes_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_with_sizes_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_with_sizes_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_with_sizes_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_with_sizes_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_with_sizes_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_with_sizes_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_with_sizes_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_with_sizes_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_with_sizes_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_with_sizes_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_with_sizes_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_with_sizes_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_with_sizes_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_with_sizes_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_with_sizes_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_with_sizes_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sqrt_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sqrt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sqrt_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sqrt_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sqrt_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sqrt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sqrt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sqrt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sqrt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sqrt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sqrt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sqrt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sqrt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_square_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_square_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_square_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_square_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_square_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_square_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_square_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_square_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_square_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_square_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_square_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_square_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_multiple_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_multiple_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_multiple_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_multiple_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_multiple_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_multiple_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_multiple_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_multiple_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_multiple_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_multiple_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_multiple_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_multiple_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_multiple_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_stack_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_stack_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_stack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_stack_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_stack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_stack_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_stack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_stack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_stack_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_stack_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_stack_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_stack_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_stack_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_std_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_std_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_std_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_std_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_std_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_std_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_std_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_std_mean_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_std_mean_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_std_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_std_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_std_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_std_mean_unbiased_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_std_mean_unbiased_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_std_mean_unbiased_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_std_mean_unbiased_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_std_mean_unbiased_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_std_mean_unbiased_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_std_unbiased_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_std_unbiased_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_std_unbiased_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_std_unbiased_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_std_unbiased_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_std_unbiased_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_stft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_stft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_stft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_stft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sub_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sub_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sub_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sub_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sub_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sub_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sub_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sub_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sub_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sub_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sub_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sub_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sum_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sum_to_size_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sum_to_size_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sum_to_size_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sum_to_size_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sum_to_size_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sum_to_size_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sum_to_size_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sum_to_size_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sum_to_size_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sum_to_size_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sum_to_size_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sum_to_size_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_svd_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_svd_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_svd_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_svd_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_svd_lowrank_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_svd_lowrank_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_svd_lowrank_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_svd_lowrank_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_t_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_t_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_t_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_t_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_t_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_t_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_t_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_t_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_t_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_t_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_t_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_t_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_t_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_t_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_t_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_t_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_t_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_t_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_t_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_t_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_t_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_t_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_t_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_t_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_take_along_dim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_take_along_dim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_take_along_dim_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_take_along_dim_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_take_along_dim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_take_along_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_take_along_dim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_take_along_dim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_take_along_dim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_take_along_dim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_take_along_dim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_take_along_dim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_take_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_take_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_take_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_take_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_take_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_take_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_take_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_take_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_take_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_take_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_take_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_take_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tan_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tan_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tan_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tan_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tan_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tan_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tan_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tan_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tan_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tan_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tan_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tan_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tanh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tanh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tanh_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tanh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tanh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tanh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tanh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tanh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tanh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tanh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tanh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tanh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tanh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tensor_split_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tensor_split_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tensor_split_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tensor_split_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tensor_split_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tensor_split_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tensor_split_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tensor_split_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tensor_split_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tensor_split_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tensor_split_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tensor_split_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tensordot_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tensordot_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tensordot_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tensordot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tensordot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tensordot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tile_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tile_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tile_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tile_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tile_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tile_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tile_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tile_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tile_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tile_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tile_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tile_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_to_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_to_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_to_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_to_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_to_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_to_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_to_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_to_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_to_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_to_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_to_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_to_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_to_sparse_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_to_sparse_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_to_sparse_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_to_sparse_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_to_sparse_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_to_sparse_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_to_sparse_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_to_sparse_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_to_sparse_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_to_sparse_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_to_sparse_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_to_sparse_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_topk_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_topk_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_topk_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_topk_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_topk_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_topk_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_topk_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_topk_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_topk_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_torch__scaled_mm_cuda_float8_e4m3fn, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_torch_ops_aten__efficient_attention_forward_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_torch_ops_aten__efficient_attention_forward_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_torch_ops_aten__efficient_attention_forward_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_torch_ops_aten__flash_attention_forward_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_torch_ops_aten__flash_attention_forward_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_torch_ops_aten__safe_softmax_default_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_torch_ops_aten__safe_softmax_default_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_torch_ops_aten__safe_softmax_default_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_torch_ops_aten__safe_softmax_default_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_torch_ops_aten__safe_softmax_default_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_torch_ops_aten__safe_softmax_default_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_torch_ops_aten__safe_softmax_default_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_torch_ops_aten__safe_softmax_default_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_torch_ops_aten__safe_softmax_default_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trace_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trace_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trace_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trace_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trace_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trace_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trace_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trace_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trace_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trace_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trace_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trace_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trace_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_transpose_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_transpose_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_transpose_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_transpose_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_transpose_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_transpose_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_transpose_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_transpose_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_transpose_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_transpose_copy_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_transpose_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_transpose_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_transpose_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_transpose_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_transpose_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_transpose_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_transpose_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_transpose_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_transpose_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_transpose_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_transpose_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_transpose_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_transpose_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_transpose_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_transpose_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_transpose_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trapezoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trapezoid_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trapezoid_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trapezoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trapezoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trapezoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trapezoid_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trapezoid_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trapezoid_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trapezoid_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trapezoid_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trapz_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trapz_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trapz_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trapz_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trapz_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trapz_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trapz_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trapz_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trapz_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trapz_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trapz_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_triangular_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_triangular_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_triangular_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_triangular_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tril_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tril_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tril_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tril_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tril_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tril_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tril_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tril_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tril_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tril_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tril_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tril_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tril_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tril_indices_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tril_indices_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_triu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_triu_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_triu_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_triu_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_triu_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_triu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_triu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_triu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_triu_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_triu_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_triu_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_triu_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_triu_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_triu_indices_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_triu_indices_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_true_divide_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_true_divide_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_true_divide_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_true_divide_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_true_divide_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_true_divide_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_true_divide_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_true_divide_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_true_divide_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_true_divide_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_true_divide_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_true_divide_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_true_divide_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trunc_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trunc_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trunc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trunc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trunc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trunc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trunc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trunc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trunc_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unbind_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unbind_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unbind_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unbind_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unbind_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unbind_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unbind_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unbind_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unbind_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unbind_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unbind_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unbind_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unbind_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unbind_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unbind_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unbind_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unbind_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unbind_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unbind_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unbind_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unbind_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unbind_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unbind_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unbind_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unbind_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unbind_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unflatten_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unflatten_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unflatten_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unflatten_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unflatten_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unflatten_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unflatten_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unflatten_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unflatten_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unflatten_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unflatten_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unflatten_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unflatten_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unfold_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unfold_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unfold_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unfold_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unfold_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unfold_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unfold_copy_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unfold_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unfold_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unfold_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unfold_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unfold_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unfold_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unfold_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unfold_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unfold_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unfold_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unfold_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unfold_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unfold_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unfold_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unfold_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unfold_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unfold_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unfold_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unfold_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_uniform_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_uniform_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_uniform_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_uniform_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_uniform_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_uniform_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unique_consecutive_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unique_consecutive_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unique_consecutive_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unique_consecutive_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unique_consecutive_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unique_consecutive_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unique_consecutive_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unique_consecutive_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unique_consecutive_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unique_consecutive_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unique_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unique_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unique_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unique_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unique_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unique_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unique_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unique_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unique_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unique_cuda_uint16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unique_cuda_uint32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unique_cuda_uint64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unique_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unravel_index_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unravel_index_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unravel_index_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unravel_index_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unravel_index_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsafe_chunk_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsafe_chunk_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsafe_chunk_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsafe_chunk_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsafe_chunk_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsafe_chunk_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsafe_chunk_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsafe_chunk_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsafe_chunk_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsafe_chunk_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsafe_chunk_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsafe_chunk_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsafe_chunk_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsafe_split_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsafe_split_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsafe_split_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsafe_split_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsafe_split_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsafe_split_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsafe_split_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsafe_split_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsafe_split_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsafe_split_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsafe_split_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsafe_split_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsafe_split_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsqueeze_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsqueeze_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsqueeze_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsqueeze_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsqueeze_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsqueeze_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsqueeze_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsqueeze_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsqueeze_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsqueeze_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsqueeze_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsqueeze_copy_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsqueeze_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsqueeze_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsqueeze_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsqueeze_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsqueeze_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsqueeze_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsqueeze_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsqueeze_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsqueeze_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsqueeze_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsqueeze_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsqueeze_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsqueeze_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsqueeze_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_var_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_var_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_var_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_var_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_var_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_var_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_var_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_var_mean_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_var_mean_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_var_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_var_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_var_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_var_mean_unbiased_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_var_mean_unbiased_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_var_mean_unbiased_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_var_mean_unbiased_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_var_mean_unbiased_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_var_mean_unbiased_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_var_unbiased_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_var_unbiased_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_var_unbiased_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_var_unbiased_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_var_unbiased_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_var_unbiased_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_vdot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_vdot_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_vdot_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_vdot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_vdot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_vdot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_as_complex_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_as_complex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_as_complex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_as_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_as_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_as_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_as_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_as_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_as_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_as_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_as_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_as_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_as_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_as_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_as_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_as_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_as_real_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_as_real_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_copy_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_vsplit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_vsplit_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_vsplit_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_vsplit_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_vsplit_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_vsplit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_vsplit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_vsplit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_vsplit_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_vsplit_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_vsplit_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_vsplit_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_vsplit_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_vstack_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_vstack_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_vstack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_vstack_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_vstack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_vstack_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_vstack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_vstack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_vstack_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_vstack_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_vstack_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_vstack_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_vstack_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_where_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_where_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_where_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_where_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_where_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_where_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_where_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_where_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_where_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_where_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_where_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_where_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_where_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_xlogy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_xlogy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_xlogy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_xlogy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_xlogy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_xlogy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_xlogy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_xlogy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_xlogy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_xlogy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zero__cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zero__cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zero__cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zero__cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zero__cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zero__cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zero__cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zero__cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zero__cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zero__cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zero__cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zero__cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zeros_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zeros_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zeros_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zeros_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zeros_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zeros_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zeros_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zeros_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zeros_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zeros_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zeros_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zeros_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zeros_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zeros_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zeros_like_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zeros_like_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zeros_like_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zeros_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zeros_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zeros_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zeros_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zeros_like_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zeros_like_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zeros_like_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zeros_like_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zeros_like_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_H_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_H_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_H_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_H_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_H_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_H_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_H_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_H_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_H_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_H_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_H_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_H_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_H_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_T_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_T_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_T_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_T_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_T_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_T_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_T_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_T_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_T_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_T_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_T_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_T_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_T_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___getitem___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___getitem___cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___getitem___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___getitem___cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___getitem___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___getitem___cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___getitem___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___getitem___cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___getitem___cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___getitem___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___getitem___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___getitem___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___getitem___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___radd___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___radd___cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___radd___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___radd___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___radd___cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___radd___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___radd___cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___radd___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___radd___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___radd___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___radd___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___radd___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rand___cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rand___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rand___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rand___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rand___cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rand___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rdiv___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rdiv___cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rdiv___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rdiv___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rdiv___cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rdiv___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rdiv___cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rdiv___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rdiv___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rdiv___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rdiv___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rdiv___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rmatmul___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rmatmul___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rmatmul___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rmatmul___cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rmatmul___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rmatmul___cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rmod___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rmod___cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rmod___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rmod___cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rmod___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rmod___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rmod___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rmod___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rmod___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rmul___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rmul___cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rmul___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rmul___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rmul___cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rmul___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rmul___cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rmul___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rmul___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rmul___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rmul___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rmul___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___ror___cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___ror___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___ror___cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___ror___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___ror___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___ror___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rpow___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rpow___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rpow___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rpow___cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rpow___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rpow___cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rpow___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rpow___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rpow___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rpow___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rpow___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rsub___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rsub___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rsub___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rsub___cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rsub___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rsub___cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rsub___cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rsub___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rsub___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rsub___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rsub___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rxor___cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rxor___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rxor___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rxor___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rxor___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rxor___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__batch_norm_with_update_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__batch_norm_with_update_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__batch_norm_with_update_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__batch_norm_with_update_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__chunk_cat_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__chunk_cat_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__chunk_cat_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__chunk_cat_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__chunk_cat_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__chunk_cat_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__chunk_cat_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__chunk_cat_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__chunk_cat_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__chunk_cat_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__chunk_cat_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__chunk_cat_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__chunk_cat_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_abs_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_abs_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_abs_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_abs_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_abs_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_abs_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_abs_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_abs_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_abs_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_abs_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_abs_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_abs_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_acos_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_acos_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_acos_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_acos_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_acos_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_acos_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_acos_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_acos_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_acos_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_acos_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_acos_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_acos_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_add_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_add_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_add_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_add_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_add_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_add_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_add_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_add_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_add_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_add_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_add_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_addcdiv_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_addcdiv_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_addcdiv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_addcdiv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_addcdiv_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_addcdiv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_addcdiv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_addcdiv_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_addcdiv_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_addcdiv_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_addcdiv_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_addcdiv_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_addcmul_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_addcmul_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_addcmul_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_addcmul_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_addcmul_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_addcmul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_addcmul_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_addcmul_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_addcmul_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_addcmul_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_addcmul_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_addcmul_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_asin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_asin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_asin_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_asin_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_asin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_asin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_asin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_asin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_asin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_asin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_asin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_asin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_atan_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_atan_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_atan_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_atan_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_atan_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_atan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_atan_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_atan_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_atan_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_atan_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_atan_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_atan_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_ceil_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_ceil_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_ceil_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_ceil_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_ceil_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_ceil_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_ceil_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_ceil_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_ceil_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_ceil_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_ceil_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_ceil_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_clamp_max_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_clamp_max_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_clamp_max_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_clamp_max_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_clamp_max_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_clamp_max_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_clamp_max_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_clamp_max_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_clamp_max_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_clamp_max_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_clamp_max_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_clamp_max_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_clamp_min_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_clamp_min_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_clamp_min_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_clamp_min_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_clamp_min_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_clamp_min_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_clamp_min_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_clamp_min_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_clamp_min_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_clamp_min_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_clamp_min_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_clamp_min_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_cos_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_cos_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_cos_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_cos_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_cos_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_cos_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_cos_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_cos_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_cos_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_cos_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_cos_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_cos_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_cosh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_cosh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_cosh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_cosh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_cosh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_cosh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_cosh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_cosh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_cosh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_cosh_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_cosh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_cosh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_div_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_div_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_div_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_div_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_div_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_div_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_div_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_div_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_div_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_div_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_div_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_div_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_erf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_erf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_erf_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_erf_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_erf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_erf_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_erf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_erf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_erf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_erf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_erf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_erf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_erfc_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_erfc_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_erfc_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_erfc_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_erfc_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_erfc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_erfc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_erfc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_erfc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_erfc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_erfc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_erfc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_exp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_exp_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_exp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_exp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_exp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_exp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_exp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_exp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_exp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_exp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_exp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_exp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_expm1_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_expm1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_expm1_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_expm1_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_expm1_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_expm1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_expm1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_expm1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_expm1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_expm1_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_expm1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_expm1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_floor_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_floor_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_floor_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_floor_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_floor_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_floor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_floor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_floor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_floor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_floor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_floor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_floor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_frac_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_frac_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_frac_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_frac_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_frac_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_frac_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_frac_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_frac_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_frac_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_frac_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_frac_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_frac_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_lerp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_lerp_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_lerp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_lerp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_lerp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_lerp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_lerp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_lerp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_lerp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_lerp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_lerp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_lerp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_lgamma_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_lgamma_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_lgamma_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_lgamma_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_lgamma_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_lgamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_lgamma_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_lgamma_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_lgamma_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_lgamma_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_lgamma_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_lgamma_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log10_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log10_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log10_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log10_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log10_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log10_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log10_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log10_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log10_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log10_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log10_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log10_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log1p_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log1p_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log1p_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log1p_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log1p_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log1p_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log1p_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log1p_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log1p_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log1p_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log1p_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log1p_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log2_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_max_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_max_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_max_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_max_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_max_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_max_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_max_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_max_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_max_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_max_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_max_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_max_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_maximum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_maximum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_maximum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_maximum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_maximum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_maximum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_maximum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_maximum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_maximum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_maximum_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_maximum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_maximum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_minimum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_minimum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_minimum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_minimum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_minimum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_minimum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_minimum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_minimum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_minimum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_minimum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_minimum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_minimum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_mul_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_mul_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_mul_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_mul_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_mul_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_mul_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_mul_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_mul_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_mul_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_mul_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_mul_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_mul_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_neg_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_neg_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_neg_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_neg_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_neg_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_neg_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_neg_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_neg_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_neg_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_neg_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_neg_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_neg_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_norm_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_norm_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_norm_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_norm_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_norm_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_norm_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_pow_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_pow_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_pow_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_pow_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_pow_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_pow_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_pow_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_pow_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_pow_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_pow_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_pow_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_pow_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_reciprocal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_reciprocal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_reciprocal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_reciprocal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_reciprocal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_reciprocal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_reciprocal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_reciprocal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_reciprocal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_reciprocal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_reciprocal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_reciprocal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_round_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_round_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_round_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_round_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_round_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_round_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_round_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_round_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_round_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_round_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_round_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_round_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_rsqrt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_rsqrt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_rsqrt_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_rsqrt_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_rsqrt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_rsqrt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_rsqrt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_rsqrt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_rsqrt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_rsqrt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_rsqrt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_rsqrt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sigmoid_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sigmoid_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sigmoid_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sigmoid_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sigmoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sigmoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sigmoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sigmoid_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sigmoid_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sigmoid_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sigmoid_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sigmoid_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sign_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sign_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sign_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sign_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sign_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sign_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sign_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sign_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sign_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sign_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sign_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sin_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sin_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sinh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sinh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sinh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sinh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sinh_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sinh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sinh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sinh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sinh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sinh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sinh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sinh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sqrt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sqrt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sqrt_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sqrt_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sqrt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sqrt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sqrt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sqrt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sqrt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sqrt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sqrt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sqrt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sub_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sub_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sub_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sub_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sub_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sub_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sub_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sub_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sub_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sub_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sub_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sub_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_tan_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_tan_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_tan_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_tan_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_tan_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_tan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_tan_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_tan_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_tan_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_tan_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_tan_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_tan_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_tanh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_tanh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_tanh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_tanh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_tanh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_tanh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_tanh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_tanh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_tanh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_tanh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_tanh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_tanh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_trunc_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_trunc_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_trunc_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_trunc_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_trunc_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_trunc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_trunc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_trunc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_trunc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_trunc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_trunc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_trunc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_zero_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_zero_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_zero_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_zero_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_zero_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_zero_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_zero_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_zero_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_zero_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_zero_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_zero_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_zero_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__native_batch_norm_legit_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__native_batch_norm_legit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__native_batch_norm_legit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__native_batch_norm_legit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__segment_reduce_lengths_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__segment_reduce_lengths_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__segment_reduce_lengths_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__segment_reduce_lengths_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__segment_reduce_offsets_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__segment_reduce_offsets_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__segment_reduce_offsets_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__segment_reduce_offsets_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__softmax_backward_data_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__softmax_backward_data_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__softmax_backward_data_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__softmax_backward_data_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__unsafe_masked_index_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__unsafe_masked_index_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__unsafe_masked_index_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__unsafe_masked_index_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__unsafe_masked_index_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__unsafe_masked_index_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__unsafe_masked_index_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__unsafe_masked_index_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__unsafe_masked_index_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__unsafe_masked_index_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__unsafe_masked_index_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__unsafe_masked_index_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__unsafe_masked_index_put_accumulate_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__unsafe_masked_index_put_accumulate_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__unsafe_masked_index_put_accumulate_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__unsafe_masked_index_put_accumulate_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__unsafe_masked_index_put_accumulate_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__unsafe_masked_index_put_accumulate_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__unsafe_masked_index_put_accumulate_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__unsafe_masked_index_put_accumulate_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__unsafe_masked_index_put_accumulate_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__unsafe_masked_index_put_accumulate_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__unsafe_masked_index_put_accumulate_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__unsafe_masked_index_put_accumulate_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__upsample_bilinear2d_aa_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__upsample_bilinear2d_aa_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__upsample_bilinear2d_aa_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__upsample_bilinear2d_aa_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_abs_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_abs_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_abs_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_abs_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_abs_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_abs_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_abs_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_abs_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_abs_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_abs_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_abs_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_abs_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_abs_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_acos_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_acos_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_acos_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_acos_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_acos_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_acos_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_acos_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_acos_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_acos_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_acos_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_acos_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_acos_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_acos_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_acosh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_acosh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_acosh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_acosh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_acosh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_acosh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_acosh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_acosh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_acosh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_acosh_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_acosh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_acosh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_acosh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_add_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_add_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_add_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_add_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_add_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_add_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_add_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_add_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_add_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_add_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_add_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_add_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addbmm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addbmm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addbmm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addbmm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addbmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addbmm_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addcdiv_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addcdiv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addcdiv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addcdiv_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addcdiv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addcdiv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addcmul_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addcmul_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addcmul_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addcmul_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addcmul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addcmul_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addcmul_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addcmul_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addcmul_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addcmul_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addcmul_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addmm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addmm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addmm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addmm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addmm_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addmm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addmm_decomposed_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addmm_decomposed_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addmm_decomposed_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addmm_decomposed_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addmm_decomposed_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addmm_decomposed_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addmv_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addmv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addmv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addmv_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addmv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addmv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addr_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addr_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addr_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addr_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addr_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addr_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addr_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addr_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addr_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addr_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_alias_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_alias_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_alias_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_alias_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_alias_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_alias_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_alias_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_alias_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_alias_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_alias_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_alias_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_alias_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_alias_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_H_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_T_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides___getitem___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides___radd___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides___rand___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides___rdiv___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides___rmatmul___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides___rmod___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides___rmul___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides___ror___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides___rpow___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides___rsub___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides___rxor___cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__batch_norm_with_update_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__chunk_cat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_abs_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_acos_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_addcdiv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_addcmul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_asin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_atan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_ceil_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_clamp_max_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_clamp_min_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_cos_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_cosh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_div_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_erf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_erfc_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_exp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_expm1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_floor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_frac_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_lerp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_lgamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_log10_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_log1p_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_log2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_log_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_max_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_maximum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_minimum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_mul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_neg_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_pow_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_reciprocal_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_round_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_rsqrt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_sigmoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_sign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_sin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_sinh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_sqrt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_sub_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_tan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_tanh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_trunc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_zero_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__native_batch_norm_legit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__segment_reduce_lengths_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__segment_reduce_offsets_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__softmax_backward_data_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__unsafe_masked_index_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__unsafe_masked_index_put_accumulate_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__upsample_bilinear2d_aa_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_abs_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_acos_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_acosh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_addbmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_addcdiv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_addcmul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_addmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_addmm_decomposed_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_addmv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_addr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_alias_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_all_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_allclose_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_amax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_amin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_aminmax_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_angle_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_any_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_arange_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_argmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_argmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_argsort_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_argwhere_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_as_strided_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_as_strided_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_as_strided_partial_views_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_as_strided_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_asin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_asinh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_atan2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_atan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_atanh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_atleast_1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_atleast_2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_atleast_3d_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_baddbmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_bernoulli_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_bfloat16_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_bincount_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_bitwise_and_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_bitwise_left_shift_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_bitwise_not_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_bitwise_or_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_bitwise_right_shift_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_bitwise_xor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_block_diag_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_bmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_bool_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_broadcast_shapes_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_broadcast_tensors_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_broadcast_to_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_bucketize_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_byte_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_cartesian_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_cat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_cauchy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_cdist_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_cdouble_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_ceil_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_cfloat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_chalf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_char_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_cholesky_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_cholesky_inverse_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_cholesky_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_chunk_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_clamp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_clamp_max_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_clamp_min_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_clone_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_column_stack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_combinations_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_complex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_conj_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_conj_physical_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_constant_pad_nd_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_contiguous_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_copysign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_corrcoef_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_cos_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_cosh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_count_nonzero_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_cov_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_cross_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_cummax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_cummin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_cumprod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_cumsum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_cumulative_trapezoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_deg2rad_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_diag_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_diag_embed_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_diagflat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_diagonal_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_diagonal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_diagonal_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_diff_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_digamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_dist_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_div_floor_rounding_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_div_no_rounding_mode_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_div_trunc_rounding_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_dot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_double_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_dsplit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_dstack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_einsum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_empty_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_empty_like_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_empty_permuted_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_empty_strided_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_eq_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_equal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_erf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_erfc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_erfinv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_exp2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_exp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_expand_as_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_expand_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_expand_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_expm1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_exponential_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_eye_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_fft_fft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_fft_fft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_fft_fftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_fft_fftshift_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_fft_hfft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_fft_hfft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_fft_hfftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_fft_ifft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_fft_ifft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_fft_ifftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_fft_ifftshift_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_fft_ihfft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_fft_ihfft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_fft_ihfftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_fft_irfft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_fft_irfft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_fft_irfftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_fft_rfft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_fft_rfft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_fft_rfftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_fill_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_flatten_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_flip_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_fliplr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_flipud_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_float_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_float_power_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_floor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_floor_divide_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_fmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_fmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_fmod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_frac_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_frexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_full_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_full_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_gather_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_gcd_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_ge_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_geometric_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_geqrf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_gradient_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_grid_sampler_2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_grid_sampler_3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_gt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_half_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_hash_tensor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_heaviside_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_histc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_hsplit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_hstack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_hypot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_i0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_igamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_igammac_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_imag_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_index_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_index_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_index_fill_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_index_put_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_index_reduce_amax_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_index_reduce_amin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_index_reduce_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_index_reduce_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_index_select_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_inner_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_int_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_isclose_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_isfinite_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_isin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_isinf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_isnan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_isneginf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_isposinf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_isreal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_istft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_item_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_jiterator_2inputs_2outputs_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_jiterator_4inputs_with_extra_args_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_jiterator_binary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_jiterator_binary_return_by_ref_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_jiterator_unary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_kron_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_kthvalue_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_lcm_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_ldexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_le_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_lerp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_lgamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_cholesky_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_cholesky_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_cond_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_cross_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_det_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_diagonal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_eig_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_eigh_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_eigvals_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_eigvalsh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_householder_product_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_inv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_inv_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_ldl_factor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_ldl_factor_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_ldl_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_lstsq_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_lstsq_grad_oriented_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_lu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_lu_factor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_lu_factor_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_lu_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_matrix_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_matrix_power_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_matrix_rank_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_matrix_rank_hermitian_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_multi_dot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_norm_subgradients_at_zero_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_pinv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_pinv_hermitian_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_pinv_singular_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_qr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_slogdet_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_solve_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_solve_triangular_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_svd_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_svdvals_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_tensorinv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_tensorsolve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_vander_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_vecdot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_vector_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linspace_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linspace_tensor_overload_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_log10_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_log1p_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_log2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_log_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_log_normal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_log_softmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_log_softmax_with_dtype_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_logaddexp2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_logaddexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_logcumsumexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_logdet_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_logical_and_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_logical_not_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_logical_or_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_logical_xor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_logit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_logspace_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_logspace_tensor_overload_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_logsumexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_long_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_lt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_lu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_lu_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_lu_unpack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_mH_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_mT_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_masked_amax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_masked_amin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_masked_argmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_masked_argmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_masked_cumprod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_masked_cumsum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_masked_fill_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_masked_log_softmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_masked_logaddexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_masked_logsumexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_masked_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_masked_median_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_masked_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_masked_normalize_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_masked_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_masked_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_masked_select_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_masked_softmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_masked_softmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_masked_std_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_masked_sum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_masked_var_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_matmul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_matrix_exp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_max_binary_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_max_pool2d_with_indices_backward_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_max_reduction_no_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_max_reduction_with_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_maximum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_median_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_meshgrid_list_of_tensors_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_meshgrid_variadic_tensors_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_min_binary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_min_reduction_no_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_min_reduction_with_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_minimum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_mm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_mode_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_movedim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_msort_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_mul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_multinomial_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_mv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nan_to_num_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nanmean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nanmedian_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nanquantile_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nansum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_narrow_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_narrow_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_native_batch_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_native_dropout_backward_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_native_layer_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_ne_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_neg_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_new_empty_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_new_empty_strided_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_new_full_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_new_ones_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_new_zeros_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nextafter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_alpha_dropout_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_avg_pool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_avg_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_avg_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_batch_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_bilinear_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_binary_cross_entropy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_celu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_channel_shuffle_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_conv1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_conv2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_conv3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_conv_transpose1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_conv_transpose2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_conv_transpose3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_cosine_embedding_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_cosine_similarity_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_cross_entropy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_ctc_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_dropout2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_dropout3d_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_dropout_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_elu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_embedding_bag_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_embedding_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_fractional_max_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_fractional_max_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_gaussian_nll_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_gelu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_glu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_grid_sample_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_group_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_hardshrink_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_hardsigmoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_hardswish_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_hardtanh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_hinge_embedding_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_huber_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_instance_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_interpolate_area_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_interpolate_bicubic_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_interpolate_bilinear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_interpolate_linear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_interpolate_nearest_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_interpolate_trilinear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_kl_div_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_l1_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_layer_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_leaky_relu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_linear_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_local_response_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_logsigmoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_margin_ranking_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_max_pool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_max_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_max_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_max_unpool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_max_unpool1d_grad_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_max_unpool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_max_unpool2d_grad_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_max_unpool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_max_unpool3d_grad_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_mish_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_mse_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_multi_head_attention_forward_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_multi_margin_loss_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_multilabel_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_nll_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_normalize_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_one_hot_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_pad_circular_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_pad_constant_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_pad_reflect_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_pad_replicate_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_pad_replicate_negative_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_pairwise_distance_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_pdist_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_pixel_shuffle_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_pixel_unshuffle_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_poisson_nll_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_prelu_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_relu6_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_relu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_rms_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_rrelu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_selu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_silu_complex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_silu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_smooth_l1_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_soft_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_softmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_softmin_with_dtype_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_softplus_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_softshrink_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_softsign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_tanhshrink_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_threshold_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_triplet_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_unfold_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_upsample_bilinear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_upsample_nearest_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nonzero_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nonzero_static_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_norm_fro_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_norm_inf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_norm_nuc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_normal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_normal_in_place_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_normal_number_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_ones_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_ones_like_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_ormqr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_outer_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_pca_lowrank_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_permute_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_permute_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_pinverse_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_polar_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_polygamma_polygamma_n_0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_polygamma_polygamma_n_1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_polygamma_polygamma_n_2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_polygamma_polygamma_n_3_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_polygamma_polygamma_n_4_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_positive_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_pow_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_put_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_qr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_quantile_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_rad2deg_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_rand_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_randint_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_randint_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_randn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_randn_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_ravel_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_real_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_reciprocal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_remainder_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_renorm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_repeat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_repeat_interleave_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_reshape_as_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_reshape_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_resize__cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_resize_as__cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_resolve_conj_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_resolve_neg_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_roll_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_rot90_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_round_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_round_decimals_0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_round_decimals_3_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_round_decimals_neg_3_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_rsqrt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_rsub_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_scalar_tensor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_scatter_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_scatter_reduce_amax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_scatter_reduce_amin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_scatter_reduce_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_scatter_reduce_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_scatter_reduce_sum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_searchsorted_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_select_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_select_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_sgn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_short_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_sigmoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_sign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_signal_windows_bartlett_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_signal_windows_blackman_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_signal_windows_cosine_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_signal_windows_exponential_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_signal_windows_gaussian_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_signal_windows_general_cosine_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_signal_windows_general_hamming_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_signal_windows_hamming_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_signal_windows_hann_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_signal_windows_kaiser_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_signal_windows_nuttall_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_signbit_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_sin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_sinc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_sinh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_slice_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_slice_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_softmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_softmax_with_dtype_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_sort_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_sparse_mm_reduce_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_sparse_sampled_addmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_airy_ai_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_bessel_j0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_bessel_j1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_bessel_y0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_bessel_y1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_chebyshev_polynomial_t_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_chebyshev_polynomial_u_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_chebyshev_polynomial_v_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_chebyshev_polynomial_w_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_entr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_erfcx_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_hermite_polynomial_h_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_hermite_polynomial_he_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_i0e_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_i1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_i1e_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_laguerre_polynomial_l_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_legendre_polynomial_p_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_log_ndtr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_modified_bessel_i0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_modified_bessel_i1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_modified_bessel_k0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_modified_bessel_k1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_ndtr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_ndtri_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_scaled_modified_bessel_k0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_scaled_modified_bessel_k1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_spherical_bessel_j0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_xlog1py_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_zeta_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_split_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_split_list_args_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_split_with_sizes_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_split_with_sizes_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_sqrt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_square_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_squeeze_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_squeeze_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_squeeze_multiple_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_stack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_std_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_std_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_std_mean_unbiased_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_std_unbiased_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_stft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_sub_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_sum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_sum_to_size_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_svd_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_svd_lowrank_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_t_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_t_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_take_along_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_take_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_tan_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_tanh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_tensor_split_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_tensordot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_tile_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_to_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_to_sparse_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_topk_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_torch__scaled_mm_cuda_float8_e4m3fn, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_trace_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_transpose_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_transpose_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_trapezoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_trapz_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_triangular_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_tril_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_tril_indices_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_triu_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_triu_indices_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_true_divide_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_trunc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_unbind_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_unbind_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_unflatten_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_unfold_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_unfold_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_uniform_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_unique_consecutive_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_unique_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_unravel_index_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_unsafe_chunk_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_unsafe_split_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_unsqueeze_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_unsqueeze_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_var_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_var_mean_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_var_mean_unbiased_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_var_unbiased_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_vdot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_view_as_complex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_view_as_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_view_as_real_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_view_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_view_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_vsplit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_vstack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_where_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_xlogy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_zero__cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_zeros_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_zeros_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_allclose_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_allclose_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_allclose_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_allclose_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_allclose_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_allclose_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_amax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_amax_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_amax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_amax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_amax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_amax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_amax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_amax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_amax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_amax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_amin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_amin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_amin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_amin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_amin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_amin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_amin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_amin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_amin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_amin_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_aminmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_aminmax_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_aminmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_aminmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_aminmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_aminmax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_aminmax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_aminmax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_aminmax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_aminmax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_angle_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_angle_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_angle_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_angle_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_angle_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_angle_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_angle_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_angle_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_angle_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_angle_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_angle_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_any_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_any_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_any_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_any_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_any_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_any_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_any_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_any_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_any_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_any_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_any_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_any_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_arange_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_arange_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_arange_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_arange_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_arange_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_arange_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_arange_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_arange_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_arange_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argmax_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argmax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argmax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argmax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argmax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argmax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argmin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argmin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argmin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argmin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argmin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argmin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argmin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argmin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argsort_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argsort_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argsort_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argsort_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argsort_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argsort_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argsort_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argsort_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argsort_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argsort_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argwhere_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argwhere_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argwhere_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argwhere_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argwhere_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argwhere_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argwhere_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argwhere_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argwhere_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argwhere_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argwhere_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argwhere_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_copy_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_partial_views_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_partial_views_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_partial_views_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_partial_views_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_partial_views_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_partial_views_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_partial_views_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_partial_views_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_partial_views_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_partial_views_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_partial_views_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_partial_views_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_partial_views_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_scatter_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_scatter_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_scatter_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_asin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_asin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_asin_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_asin_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_asin_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_asin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_asin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_asin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_asin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_asin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_asin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_asin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_asin_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_asinh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_asinh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_asinh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_asinh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_asinh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_asinh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_asinh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_asinh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_asinh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_asinh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_asinh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_asinh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_asinh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atan2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atan2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atan2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atan2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atan2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atan2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atan2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atan2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atan2_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atan2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atan_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atan_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atan_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atan_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atan_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atan_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atan_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atan_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atan_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atan_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atan_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atan_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atanh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atanh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atanh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atanh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atanh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atanh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atanh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atanh_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atanh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atanh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atanh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atanh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atanh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_1d_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_1d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_1d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_1d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_1d_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_1d_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_1d_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_1d_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_1d_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_2d_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_2d_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_2d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_2d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_2d_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_2d_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_2d_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_2d_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_2d_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_3d_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_3d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_3d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_3d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_3d_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_3d_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_3d_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_3d_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_3d_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_baddbmm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_baddbmm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_baddbmm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_baddbmm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_baddbmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_baddbmm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bernoulli_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bernoulli_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bernoulli_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bernoulli_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bfloat16_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bfloat16_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bfloat16_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bfloat16_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bfloat16_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bfloat16_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bfloat16_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bfloat16_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bfloat16_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bfloat16_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bfloat16_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bfloat16_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bfloat16_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bincount_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bincount_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bincount_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bincount_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bincount_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_and_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_and_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_and_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_and_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_and_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_and_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_left_shift_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_left_shift_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_left_shift_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_left_shift_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_left_shift_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_not_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_not_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_not_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_not_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_not_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_not_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_or_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_or_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_or_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_or_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_or_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_or_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_right_shift_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_right_shift_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_right_shift_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_right_shift_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_right_shift_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_xor_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_xor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_xor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_xor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_xor_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_xor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_block_diag_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_block_diag_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_block_diag_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_block_diag_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_block_diag_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_block_diag_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_block_diag_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_block_diag_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_block_diag_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_block_diag_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_block_diag_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_block_diag_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_block_diag_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bmm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bmm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bmm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bmm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bmm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bool_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bool_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bool_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bool_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bool_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bool_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bool_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bool_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bool_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bool_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bool_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bool_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bool_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_broadcast_shapes_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_broadcast_tensors_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_broadcast_tensors_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_broadcast_tensors_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_broadcast_tensors_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_broadcast_tensors_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_broadcast_tensors_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_broadcast_tensors_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_broadcast_tensors_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_broadcast_tensors_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_broadcast_tensors_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_broadcast_tensors_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_broadcast_tensors_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_broadcast_to_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_broadcast_to_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_broadcast_to_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_broadcast_to_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_broadcast_to_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_broadcast_to_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_broadcast_to_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_broadcast_to_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_broadcast_to_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_broadcast_to_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_broadcast_to_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_broadcast_to_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bucketize_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bucketize_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bucketize_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bucketize_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bucketize_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bucketize_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bucketize_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bucketize_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bucketize_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_byte_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_byte_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_byte_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_byte_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_byte_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_byte_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_byte_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_byte_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_byte_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_byte_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_byte_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_byte_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cartesian_prod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cartesian_prod_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cartesian_prod_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cartesian_prod_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cartesian_prod_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cartesian_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cartesian_prod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cartesian_prod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cartesian_prod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cartesian_prod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cartesian_prod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cartesian_prod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cat_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cat_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cat_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cat_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cat_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cat_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cat_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cat_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cat_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cat_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cat_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cat_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cauchy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cauchy_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cauchy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cauchy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cdist_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cdist_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cdouble_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cdouble_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cdouble_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cdouble_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cdouble_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cdouble_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cdouble_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cdouble_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cdouble_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cdouble_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cdouble_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cdouble_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cdouble_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ceil_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ceil_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ceil_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ceil_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ceil_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ceil_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ceil_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ceil_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ceil_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cfloat_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cfloat_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cfloat_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cfloat_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cfloat_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cfloat_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cfloat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cfloat_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cfloat_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cfloat_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cfloat_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cfloat_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cfloat_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_chalf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_chalf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_chalf_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_chalf_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_chalf_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_chalf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_chalf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_chalf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_chalf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_chalf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_chalf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_chalf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_chalf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_char_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_char_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_char_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_char_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_char_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_char_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_char_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_char_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_char_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_char_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_char_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_char_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_char_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cholesky_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cholesky_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cholesky_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cholesky_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cholesky_inverse_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cholesky_inverse_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cholesky_inverse_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cholesky_inverse_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cholesky_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cholesky_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cholesky_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cholesky_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_chunk_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_chunk_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_chunk_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_chunk_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_chunk_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_chunk_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_chunk_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_chunk_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_chunk_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_chunk_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_chunk_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_chunk_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_chunk_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clamp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clamp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clamp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clamp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clamp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clamp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clamp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clamp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clamp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clamp_max_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clamp_max_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clamp_max_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clamp_max_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clamp_max_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clamp_max_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clamp_max_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clamp_max_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clamp_max_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clamp_max_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clamp_min_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clamp_min_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clamp_min_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clamp_min_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clamp_min_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clamp_min_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clamp_min_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clamp_min_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clamp_min_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clamp_min_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clone_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clone_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clone_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clone_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clone_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clone_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clone_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clone_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clone_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clone_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clone_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clone_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clone_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_column_stack_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_column_stack_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_column_stack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_column_stack_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_column_stack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_column_stack_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_column_stack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_column_stack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_column_stack_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_column_stack_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_column_stack_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_column_stack_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_column_stack_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_combinations_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_combinations_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_combinations_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_combinations_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_combinations_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_combinations_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_combinations_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_combinations_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_combinations_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_combinations_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_combinations_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_combinations_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_complex_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_complex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_complex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_conj_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_conj_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_conj_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_conj_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_conj_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_conj_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_conj_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_conj_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_conj_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_conj_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_conj_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_conj_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_conj_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_conj_physical_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_conj_physical_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_conj_physical_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_conj_physical_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_conj_physical_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_conj_physical_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_conj_physical_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_conj_physical_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_conj_physical_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_conj_physical_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_conj_physical_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_conj_physical_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_conj_physical_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_constant_pad_nd_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_constant_pad_nd_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_constant_pad_nd_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_constant_pad_nd_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_constant_pad_nd_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_constant_pad_nd_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_constant_pad_nd_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_constant_pad_nd_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_constant_pad_nd_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_constant_pad_nd_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_constant_pad_nd_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_constant_pad_nd_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_contiguous_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_contiguous_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_contiguous_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_contiguous_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_contiguous_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_contiguous_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_contiguous_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_contiguous_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_contiguous_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_contiguous_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_contiguous_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_contiguous_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_contiguous_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_copysign_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_copysign_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_copysign_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_copysign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_copysign_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_copysign_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_copysign_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_copysign_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_copysign_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_copysign_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_corrcoef_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_corrcoef_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_corrcoef_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_corrcoef_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_corrcoef_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_corrcoef_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_corrcoef_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_corrcoef_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_corrcoef_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_corrcoef_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_corrcoef_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cos_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cos_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cos_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cos_cuda_complex32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cos_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cos_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cos_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cos_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cos_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cos_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cos_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cos_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cos_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cosh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cosh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cosh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cosh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cosh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cosh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cosh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cosh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cosh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cosh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cosh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cosh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cosh_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_count_nonzero_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_count_nonzero_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_count_nonzero_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_count_nonzero_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_count_nonzero_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_count_nonzero_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_count_nonzero_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_count_nonzero_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_count_nonzero_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_count_nonzero_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_count_nonzero_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_count_nonzero_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cov_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cov_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cov_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cov_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cov_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cov_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cov_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cov_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cov_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cov_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cov_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cross_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cross_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cross_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cross_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cross_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cross_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cross_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cross_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cross_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cross_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cross_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cummax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cummax_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cummax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cummax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cummax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cummax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cummax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cummax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cummax_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cummax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cummin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cummin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cummin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cummin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cummin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cummin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cummin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cummin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cummin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cummin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumprod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumprod_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumprod_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumprod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumprod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumprod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumprod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumprod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumprod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumprod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumprod_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumsum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumsum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumsum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumsum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumsum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumsum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumsum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumsum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumsum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumsum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumsum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumulative_trapezoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumulative_trapezoid_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumulative_trapezoid_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumulative_trapezoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumulative_trapezoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumulative_trapezoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumulative_trapezoid_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumulative_trapezoid_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumulative_trapezoid_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumulative_trapezoid_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumulative_trapezoid_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_deg2rad_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_deg2rad_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_deg2rad_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_deg2rad_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_deg2rad_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_deg2rad_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_deg2rad_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_deg2rad_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_deg2rad_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_deg2rad_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diag_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diag_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diag_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diag_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diag_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diag_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diag_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diag_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diag_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diag_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diag_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diag_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diag_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diag_embed_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diag_embed_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diag_embed_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diag_embed_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diag_embed_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diag_embed_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diag_embed_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diag_embed_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diag_embed_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diag_embed_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diag_embed_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diag_embed_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diag_embed_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagflat_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagflat_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagflat_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagflat_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagflat_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagflat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagflat_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagflat_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagflat_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagflat_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagflat_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagflat_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_scatter_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_scatter_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_scatter_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diff_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diff_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diff_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diff_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diff_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diff_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diff_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diff_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diff_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diff_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diff_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diff_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_digamma_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_digamma_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_digamma_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_digamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_digamma_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_digamma_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_digamma_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_digamma_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_digamma_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_digamma_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dist_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dist_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dist_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dist_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dist_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dist_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_div_floor_rounding_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_div_floor_rounding_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_div_floor_rounding_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_div_floor_rounding_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_div_floor_rounding_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_div_floor_rounding_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_div_floor_rounding_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_div_floor_rounding_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_div_floor_rounding_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_div_no_rounding_mode_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_div_no_rounding_mode_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_div_no_rounding_mode_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_div_no_rounding_mode_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_div_no_rounding_mode_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_div_no_rounding_mode_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_div_no_rounding_mode_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_div_no_rounding_mode_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_div_no_rounding_mode_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_div_no_rounding_mode_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_div_no_rounding_mode_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_div_no_rounding_mode_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_div_no_rounding_mode_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_div_trunc_rounding_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_div_trunc_rounding_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_div_trunc_rounding_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_div_trunc_rounding_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_div_trunc_rounding_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_div_trunc_rounding_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_div_trunc_rounding_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_div_trunc_rounding_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_div_trunc_rounding_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dot_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dot_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_double_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_double_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_double_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_double_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_double_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_double_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_double_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_double_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_double_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_double_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_double_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_double_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_double_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dsplit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dsplit_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dsplit_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dsplit_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dsplit_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dsplit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dsplit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dsplit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dsplit_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dsplit_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dsplit_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dsplit_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dsplit_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dstack_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dstack_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dstack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dstack_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dstack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dstack_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dstack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dstack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dstack_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dstack_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dstack_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dstack_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dstack_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_einsum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_einsum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_einsum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_einsum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_einsum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_einsum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_like_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_like_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_like_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_like_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_like_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_like_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_like_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_like_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_permuted_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_permuted_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_permuted_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_permuted_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_permuted_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_permuted_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_permuted_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_permuted_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_permuted_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_permuted_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_permuted_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_permuted_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_permuted_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_strided_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_strided_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_strided_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_strided_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_strided_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_strided_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_strided_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_strided_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_strided_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_strided_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_strided_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_strided_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_eq_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_eq_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_eq_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_eq_cuda_complex32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_eq_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_eq_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_eq_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_eq_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_eq_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_eq_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_eq_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_eq_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_eq_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_equal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_equal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_equal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_equal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_equal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_equal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_equal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_equal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_equal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_equal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_equal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_equal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_erf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_erf_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_erf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_erf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_erf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_erf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_erf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_erf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_erf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_erf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_erfc_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_erfc_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_erfc_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_erfc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_erfc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_erfc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_erfc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_erfc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_erfc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_erfc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_erfinv_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_erfinv_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_erfinv_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_erfinv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_erfinv_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_erfinv_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_erfinv_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_erfinv_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_erfinv_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_erfinv_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_exp2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_exp2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_exp2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_exp2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_exp2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_exp2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_exp2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_exp2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_exp2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_exp2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_exp2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_exp2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_exp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_exp_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_exp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_exp_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_exp_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_exp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_exp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_exp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_exp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_exp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_exp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_exp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_exp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_as_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_as_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_as_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_as_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_as_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_as_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_as_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_as_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_as_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_as_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_as_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_as_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_copy_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expm1_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expm1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expm1_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expm1_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expm1_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expm1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expm1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expm1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expm1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expm1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expm1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expm1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_exponential_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_exponential_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_exponential_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_exponential_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_eye_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_eye_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_eye_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_eye_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_eye_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_eye_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_eye_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_eye_cuda_float8_e4m3fn, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_eye_cuda_float8_e4m3fnuz, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_eye_cuda_float8_e5m2, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_eye_cuda_float8_e5m2fnuz, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_eye_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_eye_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_eye_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_eye_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_eye_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fft2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fft2_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fft2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fft2_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fft_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fftn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fftn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fftn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fftn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fftn_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fftshift_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fftshift_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fftshift_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fftshift_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fftshift_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fftshift_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fftshift_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fftshift_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fftshift_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fftshift_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fftshift_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fftshift_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fftshift_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfft2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfft2_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfft2_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfft_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfftn_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfftn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfftn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfftn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfftn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifft2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifft2_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifft2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifft2_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifft_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifftn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifftn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifftn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifftn_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifftshift_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifftshift_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifftshift_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifftshift_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifftshift_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifftshift_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifftshift_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifftshift_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifftshift_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifftshift_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifftshift_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifftshift_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifftshift_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ihfft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ihfft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ihfft2_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ihfft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ihfft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ihfft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ihfft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ihfft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ihfft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ihfft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ihfft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ihfft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ihfft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ihfft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ihfft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ihfft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ihfft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ihfft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ihfftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ihfftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ihfftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ihfftn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ihfftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ihfftn_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ihfftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ihfftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ihfftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfft2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfft2_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfft2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfft_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfft_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfftn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfftn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfftn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfftn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_rfft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_rfft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_rfft2_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_rfft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_rfft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_rfft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_rfft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_rfft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_rfft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_rfft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_rfft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_rfft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_rfft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_rfft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_rfft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_rfft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_rfft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_rfft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_rfftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_rfftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_rfftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_rfftn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_rfftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_rfftn_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_rfftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_rfftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_rfftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fill_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fill_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fill_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fill_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fill_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fill_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fill_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fill_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fill_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fill_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fill_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fill_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fill_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flatten_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flatten_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flatten_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flatten_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flatten_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flatten_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flatten_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flatten_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flatten_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flatten_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flatten_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flatten_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flatten_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flip_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flip_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flip_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flip_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flip_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flip_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flip_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flip_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flip_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flip_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flip_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flip_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fliplr_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fliplr_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fliplr_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fliplr_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fliplr_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fliplr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fliplr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fliplr_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fliplr_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fliplr_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fliplr_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fliplr_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flipud_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flipud_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flipud_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flipud_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flipud_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flipud_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flipud_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flipud_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flipud_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flipud_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flipud_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flipud_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_float_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_float_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_float_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_float_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_float_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_float_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_float_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_float_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_float_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_float_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_float_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_float_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_float_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_float_power_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_float_power_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_float_power_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_float_power_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_float_power_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_float_power_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_float_power_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_float_power_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_float_power_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_float_power_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_float_power_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_float_power_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_floor_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_floor_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_floor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_floor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_floor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_floor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_floor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_floor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_floor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_floor_divide_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_floor_divide_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_floor_divide_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_floor_divide_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_floor_divide_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_floor_divide_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_floor_divide_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_floor_divide_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_floor_divide_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fmax_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fmax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fmax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fmax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fmax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fmax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fmin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fmin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fmin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fmin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fmin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fmin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fmin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fmin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fmin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fmod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fmod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fmod_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fmod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fmod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fmod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fmod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fmod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fmod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_frac_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_frac_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_frac_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_frac_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_frexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_frexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_frexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_frexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_full_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_full_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_full_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_full_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_full_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_full_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_full_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_full_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_full_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_full_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_full_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_full_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_full_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_full_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_full_like_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_full_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_full_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_full_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_full_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_full_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_full_like_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_full_like_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_full_like_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_full_like_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_full_like_cuda_uint16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_full_like_cuda_uint32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_full_like_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gather_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gather_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gather_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gather_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gather_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gather_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gather_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gather_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gather_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gather_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gather_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gather_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gcd_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gcd_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gcd_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gcd_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gcd_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ge_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ge_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ge_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ge_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ge_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ge_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ge_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ge_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ge_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ge_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_geometric_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_geometric_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_geometric_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_geometric_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_geometric_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_geometric_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_geometric_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_geometric_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_geometric_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_geqrf_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_geqrf_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_geqrf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_geqrf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gradient_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gradient_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gradient_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gradient_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gradient_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gradient_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gradient_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gradient_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gradient_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gradient_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_grid_sampler_2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_grid_sampler_2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_grid_sampler_2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_grid_sampler_2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_grid_sampler_3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_grid_sampler_3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_grid_sampler_3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_grid_sampler_3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_half_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_half_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_half_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_half_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_half_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_half_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_half_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_half_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_half_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_half_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_half_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_half_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hash_tensor_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hash_tensor_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hash_tensor_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hash_tensor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hash_tensor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hash_tensor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hash_tensor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hash_tensor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hash_tensor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hash_tensor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_heaviside_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_heaviside_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_heaviside_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_heaviside_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_heaviside_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_heaviside_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_heaviside_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_heaviside_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_heaviside_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_heaviside_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_histc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_histc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_histc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_histc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_histc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_histc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_histc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hsplit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hsplit_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hsplit_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hsplit_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hsplit_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hsplit_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hsplit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hsplit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hsplit_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hsplit_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hsplit_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hsplit_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hsplit_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hstack_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hstack_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hstack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hstack_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hstack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hstack_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hstack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hstack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hstack_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hstack_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hstack_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hstack_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hstack_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hypot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hypot_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hypot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hypot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_i0_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_i0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_i0_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_i0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_i0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_i0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_i0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_i0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_i0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_i0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_igamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_igamma_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_igammac_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_igammac_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_imag_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_imag_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_imag_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_add_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_add_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_add_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_add_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_add_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_add_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_add_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_add_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_add_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_add_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_add_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_add_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_copy_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_fill_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_fill_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_fill_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_fill_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_fill_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_fill_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_fill_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_fill_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_fill_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_fill_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_fill_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_fill_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_fill_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_put_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_put_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_put_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_put_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_put_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_put_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_put_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_put_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_put_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_put_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_put_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_put_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_put_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_amax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_amax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_amax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_amax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_amax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_amax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_amax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_amax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_amax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_amin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_amin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_amin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_amin_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_amin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_amin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_amin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_amin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_amin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_mean_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_mean_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_mean_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_mean_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_mean_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_prod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_prod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_prod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_prod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_prod_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_prod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_prod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_prod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_select_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_select_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_select_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_select_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_select_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_select_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_select_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_select_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_select_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_select_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_select_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_select_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_select_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_inner_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_inner_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_inner_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_inner_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_inner_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_inner_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_int_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_int_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_int_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_int_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_int_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_int_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_int_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_int_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_int_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_int_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_int_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_int_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isclose_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isclose_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isclose_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isclose_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isclose_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isclose_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isclose_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isclose_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isclose_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isclose_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isclose_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isclose_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isfinite_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isfinite_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isfinite_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isfinite_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isfinite_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isfinite_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isfinite_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isfinite_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isfinite_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isfinite_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isfinite_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isfinite_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isfinite_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isin_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isinf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isinf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isinf_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isinf_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isinf_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isinf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isinf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isinf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isinf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isinf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isinf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isinf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isinf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isnan_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isnan_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isnan_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isnan_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isnan_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isnan_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isnan_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isnan_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isnan_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isnan_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isnan_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isnan_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isneginf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isneginf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isneginf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isneginf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isneginf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isneginf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isneginf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isneginf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isneginf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isneginf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isposinf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isposinf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isposinf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isposinf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isposinf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isposinf_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isposinf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isposinf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isposinf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isposinf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isreal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isreal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isreal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isreal_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isreal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isreal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isreal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isreal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isreal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isreal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isreal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isreal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isreal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_istft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_istft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_item_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_item_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_item_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_item_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_item_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_item_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_item_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_item_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_item_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_item_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_item_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_item_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_item_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_2inputs_2outputs_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_2inputs_2outputs_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_2inputs_2outputs_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_2inputs_2outputs_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_2inputs_2outputs_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_2inputs_2outputs_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_2inputs_2outputs_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_2inputs_2outputs_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_2inputs_2outputs_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_2inputs_2outputs_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_2inputs_2outputs_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_2inputs_2outputs_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_4inputs_with_extra_args_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_4inputs_with_extra_args_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_4inputs_with_extra_args_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_4inputs_with_extra_args_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_4inputs_with_extra_args_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_4inputs_with_extra_args_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_4inputs_with_extra_args_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_4inputs_with_extra_args_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_4inputs_with_extra_args_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_4inputs_with_extra_args_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_4inputs_with_extra_args_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_4inputs_with_extra_args_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_binary_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_binary_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_binary_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_binary_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_binary_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_binary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_binary_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_binary_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_binary_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_binary_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_binary_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_binary_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_binary_return_by_ref_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_binary_return_by_ref_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_binary_return_by_ref_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_binary_return_by_ref_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_binary_return_by_ref_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_binary_return_by_ref_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_binary_return_by_ref_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_binary_return_by_ref_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_binary_return_by_ref_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_binary_return_by_ref_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_binary_return_by_ref_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_binary_return_by_ref_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_unary_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_unary_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_unary_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_unary_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_unary_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_unary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_unary_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_unary_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_unary_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_unary_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_unary_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_unary_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_kron_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_kron_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_kron_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_kron_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_kron_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_kron_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_kron_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_kron_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_kron_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_kron_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_kron_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_kron_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_kthvalue_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_kthvalue_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_kthvalue_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_kthvalue_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_kthvalue_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_kthvalue_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_kthvalue_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_kthvalue_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_kthvalue_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lcm_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lcm_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lcm_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lcm_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lcm_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ldexp_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ldexp_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ldexp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ldexp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ldexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ldexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ldexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ldexp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ldexp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ldexp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ldexp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ldexp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_le_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_le_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_le_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_le_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_le_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_le_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_le_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_le_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_le_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_le_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lerp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lerp_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lerp_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lerp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lerp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lerp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lerp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lgamma_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lgamma_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lgamma_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lgamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lgamma_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lgamma_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lgamma_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lgamma_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lgamma_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lgamma_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_cholesky_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_cholesky_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_cholesky_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_cholesky_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_cholesky_ex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_cholesky_ex_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_cholesky_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_cholesky_ex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_cond_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_cond_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_cond_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_cond_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_cross_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_cross_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_cross_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_cross_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_cross_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_cross_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_cross_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_cross_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_cross_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_cross_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_cross_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_det_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_det_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_det_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_det_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_diagonal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_diagonal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_diagonal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_diagonal_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_diagonal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_diagonal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_diagonal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_diagonal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_diagonal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_diagonal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_diagonal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_diagonal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_diagonal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_eig_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_eig_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_eig_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_eig_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_eigh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_eigh_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_eigh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_eigh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_eigvals_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_eigvals_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_eigvals_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_eigvals_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_eigvalsh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_eigvalsh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_eigvalsh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_eigvalsh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_householder_product_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_householder_product_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_householder_product_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_householder_product_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_inv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_inv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_inv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_inv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_inv_ex_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_inv_ex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_inv_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_inv_ex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_ldl_factor_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_ldl_factor_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_ldl_factor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_ldl_factor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_ldl_factor_ex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_ldl_factor_ex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_ldl_factor_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_ldl_factor_ex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_ldl_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_ldl_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_ldl_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_ldl_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_lstsq_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_lstsq_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_lstsq_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_lstsq_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_lstsq_grad_oriented_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_lstsq_grad_oriented_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_lstsq_grad_oriented_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_lstsq_grad_oriented_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_lu_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_lu_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_lu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_lu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_lu_factor_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_lu_factor_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_lu_factor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_lu_factor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_lu_factor_ex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_lu_factor_ex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_lu_factor_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_lu_factor_ex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_lu_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_lu_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_lu_solve_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_lu_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_matrix_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_matrix_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_matrix_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_matrix_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_matrix_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_matrix_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_matrix_power_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_matrix_power_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_matrix_power_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_matrix_power_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_matrix_rank_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_matrix_rank_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_matrix_rank_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_matrix_rank_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_matrix_rank_hermitian_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_matrix_rank_hermitian_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_matrix_rank_hermitian_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_matrix_rank_hermitian_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_multi_dot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_multi_dot_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_multi_dot_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_multi_dot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_multi_dot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_multi_dot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_norm_subgradients_at_zero_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_norm_subgradients_at_zero_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_norm_subgradients_at_zero_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_norm_subgradients_at_zero_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_norm_subgradients_at_zero_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_norm_subgradients_at_zero_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_pinv_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_pinv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_pinv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_pinv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_pinv_hermitian_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_pinv_hermitian_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_pinv_hermitian_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_pinv_hermitian_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_pinv_singular_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_pinv_singular_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_pinv_singular_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_pinv_singular_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_qr_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_qr_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_qr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_qr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_slogdet_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_slogdet_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_slogdet_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_slogdet_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_solve_ex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_solve_ex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_solve_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_solve_ex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_solve_triangular_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_solve_triangular_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_solve_triangular_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_solve_triangular_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_svd_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_svd_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_svd_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_svd_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_svdvals_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_svdvals_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_svdvals_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_svdvals_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_tensorinv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_tensorinv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_tensorinv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_tensorinv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_tensorsolve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_tensorsolve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_tensorsolve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_tensorsolve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_vander_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_vander_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_vander_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_vander_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_vander_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_vander_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_vander_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_vander_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_vander_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_vecdot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_vecdot_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_vecdot_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_vecdot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_vecdot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_vecdot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_vector_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_vector_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_vector_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_vector_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_vector_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_vector_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linspace_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linspace_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linspace_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linspace_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linspace_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linspace_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linspace_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linspace_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linspace_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linspace_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linspace_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linspace_tensor_overload_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linspace_tensor_overload_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linspace_tensor_overload_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linspace_tensor_overload_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linspace_tensor_overload_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linspace_tensor_overload_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linspace_tensor_overload_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linspace_tensor_overload_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linspace_tensor_overload_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linspace_tensor_overload_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linspace_tensor_overload_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log10_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log10_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log10_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log10_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log10_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log10_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log10_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log10_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log10_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log10_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log10_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log10_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log1p_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log1p_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log1p_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log1p_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log1p_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log1p_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log1p_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log1p_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log1p_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log1p_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log1p_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log1p_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log2_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_normal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_normal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_normal_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_normal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_softmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_softmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_softmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_softmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_softmax_with_dtype_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_softmax_with_dtype_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_softmax_with_dtype_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_softmax_with_dtype_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_softmax_with_dtype_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_softmax_with_dtype_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_softmax_with_dtype_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_softmax_with_dtype_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_softmax_with_dtype_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_softmax_with_dtype_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_softmax_with_dtype_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_softmax_with_dtype_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_softmax_with_dtype_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logaddexp2_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logaddexp2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logaddexp2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logaddexp2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logaddexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logaddexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logaddexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logaddexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logcumsumexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logcumsumexp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logcumsumexp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logcumsumexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logcumsumexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logcumsumexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logdet_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logdet_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logdet_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logdet_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_and_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_and_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_and_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_and_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_and_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_and_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_and_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_and_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_and_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_and_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_and_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_and_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_not_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_not_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_not_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_not_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_not_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_not_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_not_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_not_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_not_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_not_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_not_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_not_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_or_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_or_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_or_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_or_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_or_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_or_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_or_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_or_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_or_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_or_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_or_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_or_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_xor_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_xor_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_xor_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_xor_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_xor_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_xor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_xor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_xor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_xor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_xor_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_xor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_xor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logit_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logit_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logit_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logit_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logit_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logit_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logspace_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logspace_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logspace_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logspace_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logspace_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logspace_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logspace_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logspace_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logspace_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logspace_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logspace_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logspace_tensor_overload_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logspace_tensor_overload_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logspace_tensor_overload_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logspace_tensor_overload_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logspace_tensor_overload_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logspace_tensor_overload_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logspace_tensor_overload_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logspace_tensor_overload_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logspace_tensor_overload_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logspace_tensor_overload_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logspace_tensor_overload_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logsumexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logsumexp_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logsumexp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logsumexp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logsumexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logsumexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logsumexp_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logsumexp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logsumexp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logsumexp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logsumexp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logsumexp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_long_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_long_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_long_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_long_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_long_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_long_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_long_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_long_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_long_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_long_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_long_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_long_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_long_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lt_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lu_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lu_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lu_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lu_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lu_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lu_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lu_unpack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lu_unpack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lu_unpack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lu_unpack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mH_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mH_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mH_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mH_cuda_complex32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mH_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mH_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mH_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mH_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mH_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mH_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mH_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mH_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mH_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mT_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mT_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mT_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mT_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mT_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mT_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mT_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mT_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mT_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mT_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mT_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mT_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mT_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_amax_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_amax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_amax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_amax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_amax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_amax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_amax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_amax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_amax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_amin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_amin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_amin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_amin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_amin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_amin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_amin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_amin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_amin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_argmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_argmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_argmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_argmax_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_argmax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_argmax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_argmax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_argmax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_argmax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_argmin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_argmin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_argmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_argmin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_argmin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_argmin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_argmin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_argmin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_argmin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_cumprod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_cumprod_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_cumprod_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_cumprod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_cumprod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_cumprod_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_cumprod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_cumprod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_cumprod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_cumprod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_cumprod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_cumsum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_cumsum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_cumsum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_cumsum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_cumsum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_cumsum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_cumsum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_cumsum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_cumsum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_cumsum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_cumsum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_fill_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_fill_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_fill_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_fill_cuda_complex32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_fill_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_fill_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_fill_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_fill_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_fill_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_fill_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_fill_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_fill_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_fill_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_log_softmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_log_softmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_log_softmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_log_softmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_logaddexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_logaddexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_logaddexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_logaddexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_logsumexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_logsumexp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_logsumexp_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_logsumexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_logsumexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_logsumexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_logsumexp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_logsumexp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_logsumexp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_logsumexp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_logsumexp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_mean_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_mean_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_median_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_median_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_median_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_median_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_norm_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_normalize_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_normalize_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_normalize_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_normalize_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_normalize_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_normalize_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_prod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_prod_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_prod_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_prod_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_prod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_prod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_prod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_prod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_prod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_prod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_prod_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_scatter_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_scatter_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_select_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_select_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_select_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_select_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_select_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_select_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_select_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_select_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_select_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_select_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_select_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_select_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_softmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_softmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_softmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_softmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_softmin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_softmin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_softmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_softmin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_std_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_std_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_std_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_std_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_std_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_std_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_std_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_std_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_std_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_std_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_std_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_sum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_sum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_sum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_sum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_sum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_sum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_sum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_sum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_sum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_sum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_sum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_sum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_var_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_var_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_var_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_var_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_var_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_var_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_var_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_var_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_var_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_var_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_var_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_matmul_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_matmul_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_matmul_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_matmul_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_matmul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_matmul_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_matrix_exp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_matrix_exp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_matrix_exp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_matrix_exp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_matrix_exp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_matrix_exp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_binary_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_binary_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_binary_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_binary_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_binary_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_binary_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_binary_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_binary_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_binary_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_binary_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_pool2d_with_indices_backward_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_pool2d_with_indices_backward_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_pool2d_with_indices_backward_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_pool2d_with_indices_backward_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_reduction_no_dim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_reduction_no_dim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_reduction_no_dim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_reduction_no_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_reduction_no_dim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_reduction_no_dim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_reduction_no_dim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_reduction_no_dim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_reduction_no_dim_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_reduction_no_dim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_reduction_with_dim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_reduction_with_dim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_reduction_with_dim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_reduction_with_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_reduction_with_dim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_reduction_with_dim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_reduction_with_dim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_reduction_with_dim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_reduction_with_dim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_reduction_with_dim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_maximum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_maximum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_maximum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_maximum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_maximum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_maximum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_maximum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_maximum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_maximum_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_maximum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mean_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mean_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_median_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_median_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_median_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_median_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_median_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_median_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_median_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_median_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_median_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_meshgrid_list_of_tensors_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_meshgrid_list_of_tensors_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_meshgrid_list_of_tensors_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_meshgrid_list_of_tensors_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_meshgrid_list_of_tensors_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_meshgrid_list_of_tensors_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_meshgrid_list_of_tensors_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_meshgrid_list_of_tensors_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_meshgrid_list_of_tensors_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_meshgrid_list_of_tensors_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_meshgrid_list_of_tensors_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_meshgrid_list_of_tensors_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_meshgrid_variadic_tensors_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_meshgrid_variadic_tensors_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_meshgrid_variadic_tensors_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_meshgrid_variadic_tensors_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_meshgrid_variadic_tensors_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_meshgrid_variadic_tensors_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_meshgrid_variadic_tensors_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_meshgrid_variadic_tensors_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_meshgrid_variadic_tensors_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_meshgrid_variadic_tensors_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_meshgrid_variadic_tensors_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_meshgrid_variadic_tensors_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_min_binary_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_min_binary_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_min_binary_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_min_binary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_min_binary_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_min_binary_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_min_binary_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_min_binary_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_min_binary_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_min_binary_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_min_reduction_no_dim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_min_reduction_no_dim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_min_reduction_no_dim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_min_reduction_no_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_min_reduction_no_dim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_min_reduction_no_dim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_min_reduction_no_dim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_min_reduction_no_dim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_min_reduction_no_dim_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_min_reduction_no_dim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_min_reduction_with_dim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_min_reduction_with_dim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_min_reduction_with_dim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_min_reduction_with_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_min_reduction_with_dim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_min_reduction_with_dim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_min_reduction_with_dim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_min_reduction_with_dim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_min_reduction_with_dim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_min_reduction_with_dim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_minimum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_minimum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_minimum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_minimum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_minimum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_minimum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_minimum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_minimum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_minimum_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_minimum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mode_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mode_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mode_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mode_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mode_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mode_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mode_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mode_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mode_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mode_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_movedim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_movedim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_movedim_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_movedim_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_movedim_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_movedim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_movedim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_movedim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_movedim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_movedim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_movedim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_movedim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_movedim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_msort_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_msort_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_msort_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_msort_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_msort_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_msort_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_msort_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_msort_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_msort_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_msort_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mul_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mul_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mul_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mul_cuda_complex32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mul_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mul_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mul_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mul_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mul_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mul_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mul_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mul_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_multinomial_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_multinomial_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_multinomial_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_multinomial_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mv_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mv_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mvlgamma_mvlgamma_p_1_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mvlgamma_mvlgamma_p_1_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mvlgamma_mvlgamma_p_1_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mvlgamma_mvlgamma_p_1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mvlgamma_mvlgamma_p_1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mvlgamma_mvlgamma_p_1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mvlgamma_mvlgamma_p_1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mvlgamma_mvlgamma_p_1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mvlgamma_mvlgamma_p_1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mvlgamma_mvlgamma_p_3_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mvlgamma_mvlgamma_p_3_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mvlgamma_mvlgamma_p_3_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mvlgamma_mvlgamma_p_3_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mvlgamma_mvlgamma_p_3_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mvlgamma_mvlgamma_p_3_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mvlgamma_mvlgamma_p_3_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mvlgamma_mvlgamma_p_3_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mvlgamma_mvlgamma_p_5_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mvlgamma_mvlgamma_p_5_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mvlgamma_mvlgamma_p_5_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mvlgamma_mvlgamma_p_5_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mvlgamma_mvlgamma_p_5_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mvlgamma_mvlgamma_p_5_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mvlgamma_mvlgamma_p_5_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mvlgamma_mvlgamma_p_5_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nan_to_num_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nan_to_num_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nan_to_num_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nan_to_num_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nan_to_num_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nan_to_num_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nan_to_num_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nan_to_num_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nan_to_num_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nan_to_num_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nanmean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nanmean_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nanmean_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nanmean_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nanmean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nanmean_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nanmean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nanmedian_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nanmedian_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nanmedian_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nanmedian_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nanmedian_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nanmedian_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nanmedian_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nanmedian_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nanmedian_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nanquantile_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nanquantile_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nansum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nansum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nansum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nansum_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nansum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nansum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nansum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nansum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nansum_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nansum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nansum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nansum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nansum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_narrow_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_narrow_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_narrow_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_narrow_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_narrow_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_narrow_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_narrow_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_narrow_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_narrow_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_narrow_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_narrow_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_narrow_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_narrow_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_narrow_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_narrow_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_narrow_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_narrow_cuda_complex32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_narrow_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_narrow_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_narrow_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_narrow_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_narrow_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_narrow_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_narrow_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_narrow_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_narrow_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_native_batch_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_native_batch_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_native_batch_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_native_batch_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_native_dropout_backward_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_native_dropout_backward_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_native_dropout_backward_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_native_dropout_backward_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_native_layer_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_native_layer_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_native_layer_norm_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_native_layer_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ne_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ne_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ne_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ne_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ne_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ne_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ne_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ne_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ne_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ne_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ne_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ne_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_neg_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_neg_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_neg_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_neg_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_neg_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_neg_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_neg_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_neg_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_neg_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_neg_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_neg_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_neg_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_empty_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_empty_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_empty_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_empty_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_empty_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_empty_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_empty_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_empty_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_empty_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_empty_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_empty_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_empty_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_empty_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_empty_strided_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_empty_strided_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_empty_strided_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_empty_strided_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_empty_strided_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_empty_strided_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_empty_strided_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_empty_strided_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_empty_strided_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_empty_strided_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_empty_strided_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_empty_strided_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_empty_strided_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_full_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_full_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_full_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_full_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_full_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_full_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_full_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_full_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_full_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_full_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_full_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_full_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_full_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_ones_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_ones_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_ones_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_ones_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_ones_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_ones_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_ones_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_ones_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_ones_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_ones_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_ones_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_ones_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_ones_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_zeros_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_zeros_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_zeros_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_zeros_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_zeros_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_zeros_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_zeros_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_zeros_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_zeros_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_zeros_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_zeros_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_zeros_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_zeros_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nextafter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nextafter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nextafter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nextafter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_adaptive_avg_pool1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_adaptive_avg_pool1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_adaptive_avg_pool1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_adaptive_avg_pool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_adaptive_avg_pool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_adaptive_avg_pool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_adaptive_avg_pool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_adaptive_avg_pool3d_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_adaptive_avg_pool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_adaptive_max_pool1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_adaptive_max_pool1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_adaptive_max_pool1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_adaptive_max_pool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_adaptive_max_pool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_adaptive_max_pool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_adaptive_max_pool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_adaptive_max_pool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_adaptive_max_pool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_alpha_dropout_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_alpha_dropout_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_alpha_dropout_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_alpha_dropout_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_avg_pool1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_avg_pool1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_avg_pool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_avg_pool1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_avg_pool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_avg_pool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_avg_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_avg_pool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_avg_pool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_avg_pool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_avg_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_avg_pool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_batch_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_batch_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_batch_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_batch_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_batch_norm_without_cudnn_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_batch_norm_without_cudnn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_batch_norm_without_cudnn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_batch_norm_without_cudnn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_bilinear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_bilinear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_bilinear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_bilinear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_binary_cross_entropy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_binary_cross_entropy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_binary_cross_entropy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_binary_cross_entropy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_binary_cross_entropy_with_logits_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_binary_cross_entropy_with_logits_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_binary_cross_entropy_with_logits_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_celu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_celu_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_celu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_celu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_channel_shuffle_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_channel_shuffle_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_channel_shuffle_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_channel_shuffle_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_channel_shuffle_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_channel_shuffle_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_channel_shuffle_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_channel_shuffle_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_channel_shuffle_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_channel_shuffle_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_channel_shuffle_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_channel_shuffle_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv1d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv1d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv1d_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv2d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv2d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv2d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv3d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv3d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv3d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv_transpose1d_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv_transpose1d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv_transpose1d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv_transpose1d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv_transpose1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv_transpose1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv_transpose1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv_transpose2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv_transpose2d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv_transpose2d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv_transpose2d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv_transpose2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv_transpose2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv_transpose2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv_transpose3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv_transpose3d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv_transpose3d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv_transpose3d_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv_transpose3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv_transpose3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv_transpose3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_cosine_embedding_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_cosine_embedding_loss_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_cosine_embedding_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_cosine_embedding_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_cosine_embedding_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_cosine_embedding_loss_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_cosine_embedding_loss_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_cosine_embedding_loss_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_cosine_embedding_loss_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_cosine_embedding_loss_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_cosine_similarity_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_cosine_similarity_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_cosine_similarity_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_cosine_similarity_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_cross_entropy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_cross_entropy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_cross_entropy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_cross_entropy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_ctc_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_ctc_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_dropout2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_dropout2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_dropout2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_dropout2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_dropout3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_dropout3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_dropout3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_dropout3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_dropout_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_dropout_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_dropout_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_dropout_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_elu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_elu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_elu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_elu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_embedding_bag_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_embedding_bag_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_embedding_bag_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_embedding_bag_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_embedding_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_embedding_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_embedding_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_embedding_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_feature_alpha_dropout_with_train_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_feature_alpha_dropout_with_train_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_feature_alpha_dropout_with_train_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_fractional_max_pool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_fractional_max_pool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_fractional_max_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_fractional_max_pool2d_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_fractional_max_pool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_fractional_max_pool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_fractional_max_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_fractional_max_pool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_gaussian_nll_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_gaussian_nll_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_gaussian_nll_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_gaussian_nll_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_gelu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_gelu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_gelu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_gelu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_glu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_glu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_glu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_glu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_grid_sample_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_grid_sample_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_grid_sample_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_grid_sample_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_group_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_group_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_group_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_group_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_hardshrink_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_hardshrink_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_hardshrink_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_hardshrink_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_hardsigmoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_hardsigmoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_hardsigmoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_hardsigmoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_hardswish_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_hardswish_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_hardswish_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_hardswish_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_hardtanh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_hardtanh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_hardtanh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_hardtanh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_hardtanh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_hardtanh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_hardtanh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_hardtanh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_hinge_embedding_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_hinge_embedding_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_hinge_embedding_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_hinge_embedding_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_huber_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_huber_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_huber_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_huber_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_instance_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_instance_norm_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_instance_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_instance_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_interpolate_area_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_interpolate_area_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_interpolate_area_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_interpolate_area_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_interpolate_bicubic_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_interpolate_bicubic_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_interpolate_bicubic_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_interpolate_bicubic_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_interpolate_bilinear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_interpolate_bilinear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_interpolate_bilinear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_interpolate_bilinear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_interpolate_linear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_interpolate_linear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_interpolate_linear_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_interpolate_linear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_interpolate_nearest-exact_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_interpolate_nearest-exact_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_interpolate_nearest-exact_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_interpolate_nearest-exact_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_interpolate_nearest_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_interpolate_nearest_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_interpolate_nearest_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_interpolate_nearest_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_interpolate_nearest_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_interpolate_trilinear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_interpolate_trilinear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_interpolate_trilinear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_interpolate_trilinear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_kl_div_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_kl_div_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_kl_div_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_kl_div_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_l1_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_l1_loss_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_l1_loss_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_l1_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_l1_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_l1_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_layer_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_layer_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_layer_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_layer_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_leaky_relu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_leaky_relu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_leaky_relu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_leaky_relu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_linear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_linear_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_linear_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_linear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_linear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_linear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_local_response_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_local_response_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_local_response_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_local_response_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_logsigmoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_logsigmoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_logsigmoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_logsigmoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_margin_ranking_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_margin_ranking_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_margin_ranking_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_margin_ranking_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_margin_ranking_loss_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_margin_ranking_loss_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_margin_ranking_loss_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_margin_ranking_loss_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_margin_ranking_loss_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_pool1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_pool1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_pool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_pool1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_pool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_pool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_pool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_pool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_pool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_pool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_unpool1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_unpool1d_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_unpool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_unpool1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_unpool1d_grad_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_unpool1d_grad_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_unpool1d_grad_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_unpool1d_grad_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_unpool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_unpool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_unpool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_unpool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_unpool2d_grad_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_unpool2d_grad_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_unpool2d_grad_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_unpool2d_grad_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_unpool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_unpool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_unpool3d_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_unpool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_unpool3d_grad_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_unpool3d_grad_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_unpool3d_grad_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_unpool3d_grad_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_mish_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_mish_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_mish_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_mish_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_mse_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_mse_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_mse_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_mse_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_multi_head_attention_forward_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_multi_head_attention_forward_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_multi_head_attention_forward_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_multi_head_attention_forward_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_multi_margin_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_multi_margin_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_multi_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_multi_margin_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_multilabel_margin_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_multilabel_margin_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_multilabel_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_multilabel_margin_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_multilabel_soft_margin_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_multilabel_soft_margin_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_multilabel_soft_margin_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_nll_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_nll_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_nll_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_nll_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_normalize_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_normalize_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_normalize_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_normalize_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_normalize_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_normalize_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_one_hot_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_circular_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_circular_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_circular_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_circular_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_circular_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_circular_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_circular_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_circular_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_circular_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_circular_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_circular_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_circular_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_constant_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_constant_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_constant_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_constant_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_constant_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_constant_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_constant_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_constant_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_constant_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_constant_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_constant_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_constant_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_reflect_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_reflect_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_reflect_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_reflect_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_reflect_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_reflect_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_reflect_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_reflect_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_reflect_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_reflect_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_reflect_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_replicate_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_replicate_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_replicate_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_replicate_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_replicate_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_replicate_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_replicate_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_replicate_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_replicate_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_replicate_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_replicate_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_replicate_negative_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_replicate_negative_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_replicate_negative_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_replicate_negative_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_replicate_negative_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_replicate_negative_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_replicate_negative_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_replicate_negative_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_replicate_negative_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_replicate_negative_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_replicate_negative_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pairwise_distance_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pairwise_distance_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pairwise_distance_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pairwise_distance_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pairwise_distance_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pairwise_distance_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pairwise_distance_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pairwise_distance_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pairwise_distance_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pairwise_distance_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pairwise_distance_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pdist_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pdist_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pixel_shuffle_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pixel_shuffle_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pixel_shuffle_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pixel_shuffle_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pixel_shuffle_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pixel_shuffle_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pixel_shuffle_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pixel_shuffle_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pixel_shuffle_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pixel_shuffle_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pixel_shuffle_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pixel_shuffle_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pixel_unshuffle_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pixel_unshuffle_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pixel_unshuffle_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pixel_unshuffle_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pixel_unshuffle_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pixel_unshuffle_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pixel_unshuffle_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pixel_unshuffle_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pixel_unshuffle_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pixel_unshuffle_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pixel_unshuffle_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pixel_unshuffle_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_poisson_nll_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_poisson_nll_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_poisson_nll_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_poisson_nll_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_poisson_nll_loss_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_poisson_nll_loss_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_poisson_nll_loss_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_poisson_nll_loss_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_poisson_nll_loss_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_prelu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_prelu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_prelu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_prelu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_relu6_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_relu6_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_relu6_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_relu6_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_relu6_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_relu6_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_relu6_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_relu6_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_relu6_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_relu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_relu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_relu_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_relu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_relu_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_relu_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_relu_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_relu_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_relu_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_rms_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_rms_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_rms_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_rms_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_rms_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_rms_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_rrelu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_rrelu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_rrelu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_rrelu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_scaled_dot_product_attention_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_scaled_dot_product_attention_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_scaled_dot_product_attention_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_selu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_selu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_selu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_selu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_silu_complex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_silu_complex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_silu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_silu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_silu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_silu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_smooth_l1_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_smooth_l1_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_smooth_l1_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_smooth_l1_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_soft_margin_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_soft_margin_loss_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_soft_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_soft_margin_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softmin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softmin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softmin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softmin_with_dtype_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softmin_with_dtype_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softmin_with_dtype_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softmin_with_dtype_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softmin_with_dtype_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softmin_with_dtype_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softmin_with_dtype_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softmin_with_dtype_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softmin_with_dtype_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softmin_with_dtype_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softmin_with_dtype_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softplus_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softplus_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softplus_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softplus_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softshrink_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softshrink_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softshrink_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softshrink_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softsign_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softsign_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softsign_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softsign_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softsign_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softsign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softsign_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softsign_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softsign_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softsign_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softsign_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softsign_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_tanhshrink_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_tanhshrink_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_tanhshrink_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_tanhshrink_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_tanhshrink_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_tanhshrink_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_tanhshrink_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_tanhshrink_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_tanhshrink_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_tanhshrink_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_tanhshrink_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_threshold_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_threshold_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_threshold_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_threshold_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_threshold_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_threshold_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_threshold_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_threshold_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_threshold_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_triplet_margin_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_triplet_margin_loss_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_triplet_margin_loss_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_triplet_margin_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_triplet_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_triplet_margin_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_triplet_margin_loss_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_triplet_margin_loss_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_triplet_margin_loss_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_triplet_margin_loss_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_triplet_margin_loss_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_unfold_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_unfold_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_unfold_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_unfold_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_unfold_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_unfold_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_unfold_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_upsample_bilinear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_upsample_bilinear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_upsample_bilinear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_upsample_bilinear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_upsample_nearest_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_upsample_nearest_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_upsample_nearest_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_upsample_nearest_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_upsample_nearest_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nonzero_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nonzero_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nonzero_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nonzero_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nonzero_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nonzero_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nonzero_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nonzero_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nonzero_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nonzero_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nonzero_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nonzero_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nonzero_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nonzero_static_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nonzero_static_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nonzero_static_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nonzero_static_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nonzero_static_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nonzero_static_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nonzero_static_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nonzero_static_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nonzero_static_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nonzero_static_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nonzero_static_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nonzero_static_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nonzero_static_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_norm_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_norm_fro_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_norm_fro_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_norm_fro_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_norm_fro_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_norm_fro_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_norm_fro_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_norm_inf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_norm_inf_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_norm_inf_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_norm_inf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_norm_inf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_norm_inf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_norm_nuc_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_norm_nuc_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_norm_nuc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_norm_nuc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_normal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_normal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_normal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_normal_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_normal_in_place_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_normal_in_place_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_normal_in_place_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_normal_in_place_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_normal_in_place_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_normal_in_place_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_normal_number_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_normal_number_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_normal_number_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_normal_number_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ones_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ones_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ones_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ones_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ones_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ones_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ones_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ones_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ones_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ones_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ones_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ones_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ones_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ones_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ones_like_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ones_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ones_like_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ones_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ones_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ones_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ones_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ones_like_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ones_like_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ones_like_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ones_like_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ones_like_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ormqr_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ormqr_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ormqr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ormqr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_outer_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_outer_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_outer_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_outer_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_outer_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_outer_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_outer_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_outer_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_outer_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_outer_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_outer_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_outer_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_pca_lowrank_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_pca_lowrank_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_pca_lowrank_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_pca_lowrank_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_permute_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_permute_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_permute_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_permute_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_permute_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_permute_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_permute_copy_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_permute_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_permute_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_permute_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_permute_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_permute_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_permute_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_permute_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_permute_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_permute_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_permute_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_permute_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_permute_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_permute_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_permute_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_permute_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_permute_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_permute_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_permute_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_permute_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_pinverse_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_pinverse_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_pinverse_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_pinverse_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polar_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polar_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_0_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_0_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_1_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_1_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_1_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_3_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_3_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_3_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_3_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_3_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_3_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_3_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_3_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_3_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_3_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_4_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_4_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_4_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_4_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_4_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_4_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_4_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_4_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_4_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_4_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_positive_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_positive_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_positive_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_positive_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_positive_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_positive_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_positive_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_positive_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_positive_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_positive_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_positive_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_positive_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_pow_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_pow_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_pow_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_pow_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_pow_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_pow_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_pow_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_pow_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_pow_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_pow_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_pow_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_pow_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_prod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_prod_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_prod_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_prod_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_prod_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_prod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_prod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_prod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_prod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_prod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_prod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_prod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_put_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_put_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_put_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_put_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_put_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_put_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_put_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_put_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_put_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_put_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_put_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_put_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_qr_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_qr_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_qr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_qr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_quantile_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_quantile_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rad2deg_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rad2deg_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rad2deg_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rad2deg_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rad2deg_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rad2deg_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rad2deg_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rad2deg_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rad2deg_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rad2deg_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rand_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rand_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rand_like_cuda_complex32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rand_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rand_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rand_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rand_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_randint_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_randint_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_randint_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_randint_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_randint_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_randint_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_randint_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_randint_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_randint_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_randint_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_randint_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_randint_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_randint_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_randint_like_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_randint_like_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_randint_like_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_randint_like_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_randint_like_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_randn_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_randn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_randn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_randn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_randn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_randn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_randn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_randn_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_randn_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_randn_like_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_randn_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_randn_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_randn_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_randn_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ravel_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ravel_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ravel_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ravel_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ravel_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ravel_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ravel_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ravel_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ravel_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ravel_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ravel_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ravel_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ravel_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_real_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_real_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_real_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_real_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_real_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_real_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_real_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_real_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_real_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_real_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_real_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_real_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_real_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reciprocal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reciprocal_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reciprocal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reciprocal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reciprocal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reciprocal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reciprocal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reciprocal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reciprocal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reciprocal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reciprocal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reciprocal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_remainder_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_remainder_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_remainder_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_remainder_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_remainder_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_remainder_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_remainder_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_remainder_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_remainder_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_renorm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_renorm_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_renorm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_renorm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_renorm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_renorm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_repeat_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_repeat_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_repeat_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_repeat_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_repeat_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_repeat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_repeat_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_repeat_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_repeat_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_repeat_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_repeat_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_repeat_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_repeat_interleave_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_repeat_interleave_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_repeat_interleave_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_repeat_interleave_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_repeat_interleave_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_repeat_interleave_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_repeat_interleave_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_repeat_interleave_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_repeat_interleave_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_repeat_interleave_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_repeat_interleave_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_repeat_interleave_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_repeat_interleave_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reshape_as_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reshape_as_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reshape_as_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reshape_as_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reshape_as_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reshape_as_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reshape_as_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reshape_as_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reshape_as_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reshape_as_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reshape_as_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reshape_as_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reshape_as_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reshape_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reshape_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reshape_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reshape_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reshape_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reshape_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reshape_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reshape_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reshape_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reshape_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reshape_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reshape_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reshape_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resize__cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resize__cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resize__cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resize__cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resize__cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resize__cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resize__cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resize__cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resize__cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resize__cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resize__cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resize__cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resize_as__cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resize_as__cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resize_as__cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resize_as__cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resize_as__cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resize_as__cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resize_as__cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resize_as__cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resize_as__cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resize_as__cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resize_as__cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resize_as__cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resolve_conj_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resolve_conj_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resolve_conj_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resolve_conj_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resolve_conj_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resolve_conj_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resolve_conj_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resolve_conj_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resolve_conj_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resolve_conj_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resolve_conj_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resolve_conj_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resolve_neg_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resolve_neg_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resolve_neg_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resolve_neg_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resolve_neg_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resolve_neg_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resolve_neg_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resolve_neg_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resolve_neg_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resolve_neg_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resolve_neg_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resolve_neg_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resolve_neg_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_roll_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_roll_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_roll_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_roll_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_roll_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_roll_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_roll_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_roll_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_roll_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_roll_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_roll_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_roll_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_roll_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rot90_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rot90_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rot90_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rot90_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rot90_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rot90_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rot90_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rot90_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rot90_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rot90_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rot90_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rot90_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_round_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_round_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_round_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_round_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_round_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_round_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_round_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_round_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_round_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_round_decimals_0_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_round_decimals_0_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_round_decimals_0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_round_decimals_0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_round_decimals_3_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_round_decimals_3_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_round_decimals_3_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_round_decimals_3_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_round_decimals_neg_3_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_round_decimals_neg_3_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_round_decimals_neg_3_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_round_decimals_neg_3_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rsqrt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rsqrt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rsqrt_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rsqrt_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rsqrt_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rsqrt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rsqrt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rsqrt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rsqrt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rsqrt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rsqrt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rsqrt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rsqrt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rsub_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rsub_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rsub_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rsub_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rsub_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rsub_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rsub_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rsub_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rsub_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rsub_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rsub_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scalar_tensor_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scalar_tensor_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scalar_tensor_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scalar_tensor_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scalar_tensor_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scalar_tensor_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scalar_tensor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scalar_tensor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scalar_tensor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scalar_tensor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scalar_tensor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scalar_tensor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scalar_tensor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_add_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_add_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_add_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_add_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_add_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_add_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_add_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_add_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_add_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_add_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_add_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_amax_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_amax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_amax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_amax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_amax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_amax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_amax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_amax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_amax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_amin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_amin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_amin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_amin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_amin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_amin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_amin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_amin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_amin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_mean_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_mean_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_mean_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_mean_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_mean_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_mean_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_prod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_prod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_prod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_prod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_prod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_prod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_prod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_prod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_sum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_sum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_sum_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_sum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_sum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_sum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_sum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_sum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_sum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_sum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_searchsorted_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_searchsorted_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_searchsorted_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_searchsorted_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_searchsorted_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_searchsorted_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_searchsorted_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_searchsorted_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_searchsorted_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_select_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_select_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_select_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_select_cuda_complex32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_select_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_select_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_select_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_select_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_select_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_select_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_select_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_select_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_select_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_select_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_select_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_select_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_select_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_select_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_select_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_select_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_select_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_select_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_select_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sgn_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sgn_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sgn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sgn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sgn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sgn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sgn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sgn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sgn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sgn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sgn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sgn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sgn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_short_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_short_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_short_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_short_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_short_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_short_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_short_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_short_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_short_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_short_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_short_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_short_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sigmoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sigmoid_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sigmoid_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sigmoid_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sigmoid_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sigmoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sigmoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sigmoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sigmoid_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sigmoid_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sigmoid_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sigmoid_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sigmoid_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sign_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sign_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sign_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sign_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sign_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sign_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sign_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sign_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sign_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_signal_windows_bartlett_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_signal_windows_bartlett_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_signal_windows_blackman_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_signal_windows_blackman_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_signal_windows_cosine_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_signal_windows_cosine_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_signal_windows_exponential_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_signal_windows_exponential_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_signal_windows_gaussian_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_signal_windows_gaussian_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_signal_windows_general_cosine_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_signal_windows_general_cosine_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_signal_windows_general_hamming_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_signal_windows_general_hamming_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_signal_windows_hamming_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_signal_windows_hamming_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_signal_windows_hann_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_signal_windows_hann_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_signal_windows_kaiser_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_signal_windows_kaiser_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_signal_windows_nuttall_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_signal_windows_nuttall_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_signbit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_signbit_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_signbit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_signbit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_signbit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_signbit_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_signbit_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_signbit_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_signbit_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_signbit_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sin_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sin_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sin_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sinc_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sinc_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sinc_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sinc_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sinc_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sinc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sinc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sinc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sinc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sinc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sinc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sinc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sinh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sinh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sinh_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sinh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sinh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sinh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sinh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sinh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sinh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sinh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sinh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sinh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sinh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_slice_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_slice_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_slice_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_slice_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_slice_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_slice_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_slice_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_slice_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_slice_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_slice_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_slice_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_slice_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_slice_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_slice_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_slice_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_slice_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_slice_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_slice_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_slice_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_slice_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_slice_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_slice_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_slice_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_softmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_softmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_softmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_softmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_softmax_with_dtype_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_softmax_with_dtype_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_softmax_with_dtype_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_softmax_with_dtype_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_softmax_with_dtype_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_softmax_with_dtype_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_softmax_with_dtype_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_softmax_with_dtype_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_softmax_with_dtype_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_softmax_with_dtype_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_softmax_with_dtype_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_softmax_with_dtype_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sort_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sort_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sort_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sort_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sort_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sort_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sort_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sort_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sort_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sort_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sparse_mm_reduce_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sparse_mm_reduce_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sparse_mm_reduce_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sparse_mm_reduce_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sparse_sampled_addmm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sparse_sampled_addmm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sparse_sampled_addmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sparse_sampled_addmm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_airy_ai_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_airy_ai_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_airy_ai_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_airy_ai_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_airy_ai_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_airy_ai_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_airy_ai_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_airy_ai_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_bessel_j0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_bessel_j0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_bessel_j0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_bessel_j0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_bessel_j0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_bessel_j0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_bessel_j0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_bessel_j0_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_bessel_j1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_bessel_j1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_bessel_j1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_bessel_j1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_bessel_j1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_bessel_j1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_bessel_j1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_bessel_j1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_bessel_y0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_bessel_y0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_bessel_y0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_bessel_y0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_bessel_y0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_bessel_y0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_bessel_y0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_bessel_y0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_bessel_y1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_bessel_y1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_bessel_y1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_bessel_y1_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_bessel_y1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_bessel_y1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_bessel_y1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_bessel_y1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_chebyshev_polynomial_t_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_chebyshev_polynomial_t_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_chebyshev_polynomial_t_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_chebyshev_polynomial_t_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_chebyshev_polynomial_t_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_chebyshev_polynomial_t_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_chebyshev_polynomial_t_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_chebyshev_polynomial_t_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_chebyshev_polynomial_u_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_chebyshev_polynomial_u_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_chebyshev_polynomial_u_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_chebyshev_polynomial_u_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_chebyshev_polynomial_u_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_chebyshev_polynomial_u_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_chebyshev_polynomial_u_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_chebyshev_polynomial_u_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_chebyshev_polynomial_v_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_chebyshev_polynomial_v_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_chebyshev_polynomial_v_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_chebyshev_polynomial_v_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_chebyshev_polynomial_v_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_chebyshev_polynomial_v_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_chebyshev_polynomial_v_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_chebyshev_polynomial_v_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_chebyshev_polynomial_w_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_chebyshev_polynomial_w_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_chebyshev_polynomial_w_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_chebyshev_polynomial_w_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_chebyshev_polynomial_w_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_chebyshev_polynomial_w_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_chebyshev_polynomial_w_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_chebyshev_polynomial_w_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_entr_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_entr_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_entr_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_entr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_entr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_entr_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_entr_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_entr_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_entr_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_entr_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_erfcx_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_erfcx_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_erfcx_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_erfcx_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_erfcx_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_erfcx_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_erfcx_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_erfcx_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_hermite_polynomial_h_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_hermite_polynomial_h_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_hermite_polynomial_h_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_hermite_polynomial_h_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_hermite_polynomial_h_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_hermite_polynomial_h_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_hermite_polynomial_h_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_hermite_polynomial_h_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_hermite_polynomial_he_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_hermite_polynomial_he_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_hermite_polynomial_he_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_hermite_polynomial_he_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_hermite_polynomial_he_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_hermite_polynomial_he_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_hermite_polynomial_he_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_hermite_polynomial_he_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_i0e_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_i0e_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_i0e_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_i0e_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_i0e_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_i0e_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_i0e_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_i0e_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_i0e_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_i0e_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_i1_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_i1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_i1_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_i1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_i1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_i1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_i1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_i1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_i1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_i1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_i1e_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_i1e_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_i1e_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_i1e_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_i1e_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_i1e_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_i1e_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_i1e_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_i1e_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_i1e_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_laguerre_polynomial_l_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_laguerre_polynomial_l_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_laguerre_polynomial_l_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_laguerre_polynomial_l_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_laguerre_polynomial_l_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_laguerre_polynomial_l_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_laguerre_polynomial_l_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_laguerre_polynomial_l_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_legendre_polynomial_p_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_legendre_polynomial_p_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_legendre_polynomial_p_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_legendre_polynomial_p_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_legendre_polynomial_p_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_legendre_polynomial_p_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_legendre_polynomial_p_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_legendre_polynomial_p_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_log_ndtr_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_log_ndtr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_log_ndtr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_log_ndtr_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_log_ndtr_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_log_ndtr_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_log_ndtr_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_log_ndtr_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_modified_bessel_i0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_modified_bessel_i0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_modified_bessel_i0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_modified_bessel_i0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_modified_bessel_i0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_modified_bessel_i0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_modified_bessel_i0_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_modified_bessel_i0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_modified_bessel_i1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_modified_bessel_i1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_modified_bessel_i1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_modified_bessel_i1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_modified_bessel_i1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_modified_bessel_i1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_modified_bessel_i1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_modified_bessel_i1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_modified_bessel_k0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_modified_bessel_k0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_modified_bessel_k0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_modified_bessel_k0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_modified_bessel_k0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_modified_bessel_k0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_modified_bessel_k0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_modified_bessel_k0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_modified_bessel_k1_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_modified_bessel_k1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_modified_bessel_k1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_modified_bessel_k1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_modified_bessel_k1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_modified_bessel_k1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_modified_bessel_k1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_modified_bessel_k1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_ndtr_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_ndtr_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_ndtr_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_ndtr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_ndtr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_ndtr_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_ndtr_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_ndtr_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_ndtr_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_ndtr_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_ndtri_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_ndtri_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_ndtri_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_ndtri_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_ndtri_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_ndtri_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_ndtri_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_ndtri_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_polygamma_special_polygamma_n_0_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_polygamma_special_polygamma_n_0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_polygamma_special_polygamma_n_0_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_polygamma_special_polygamma_n_0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_polygamma_special_polygamma_n_0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_polygamma_special_polygamma_n_0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_polygamma_special_polygamma_n_0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_polygamma_special_polygamma_n_0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_polygamma_special_polygamma_n_0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_scaled_modified_bessel_k0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_scaled_modified_bessel_k0_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_scaled_modified_bessel_k0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_scaled_modified_bessel_k0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_scaled_modified_bessel_k0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_scaled_modified_bessel_k0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_scaled_modified_bessel_k0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_scaled_modified_bessel_k0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_scaled_modified_bessel_k1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_scaled_modified_bessel_k1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_scaled_modified_bessel_k1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_scaled_modified_bessel_k1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_scaled_modified_bessel_k1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_scaled_modified_bessel_k1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_scaled_modified_bessel_k1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_scaled_modified_bessel_k1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_shifted_chebyshev_polynomial_t_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_shifted_chebyshev_polynomial_t_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_shifted_chebyshev_polynomial_t_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_shifted_chebyshev_polynomial_t_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_shifted_chebyshev_polynomial_t_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_shifted_chebyshev_polynomial_t_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_shifted_chebyshev_polynomial_t_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_shifted_chebyshev_polynomial_u_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_shifted_chebyshev_polynomial_u_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_shifted_chebyshev_polynomial_u_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_shifted_chebyshev_polynomial_u_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_shifted_chebyshev_polynomial_u_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_shifted_chebyshev_polynomial_u_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_shifted_chebyshev_polynomial_u_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_shifted_chebyshev_polynomial_v_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_shifted_chebyshev_polynomial_v_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_shifted_chebyshev_polynomial_v_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_shifted_chebyshev_polynomial_v_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_shifted_chebyshev_polynomial_v_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_shifted_chebyshev_polynomial_v_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_shifted_chebyshev_polynomial_v_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_shifted_chebyshev_polynomial_w_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_shifted_chebyshev_polynomial_w_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_shifted_chebyshev_polynomial_w_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_shifted_chebyshev_polynomial_w_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_shifted_chebyshev_polynomial_w_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_shifted_chebyshev_polynomial_w_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_shifted_chebyshev_polynomial_w_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_spherical_bessel_j0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_spherical_bessel_j0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_spherical_bessel_j0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_spherical_bessel_j0_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_spherical_bessel_j0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_spherical_bessel_j0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_spherical_bessel_j0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_spherical_bessel_j0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_xlog1py_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_xlog1py_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_xlog1py_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_xlog1py_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_xlog1py_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_xlog1py_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_xlog1py_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_xlog1py_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_xlog1py_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_xlog1py_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_zeta_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_zeta_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_zeta_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_zeta_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_zeta_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_zeta_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_zeta_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_zeta_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_list_args_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_list_args_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_list_args_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_list_args_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_list_args_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_list_args_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_list_args_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_list_args_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_list_args_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_list_args_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_list_args_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_list_args_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_with_sizes_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_with_sizes_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_with_sizes_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_with_sizes_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_with_sizes_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_with_sizes_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_with_sizes_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_with_sizes_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_with_sizes_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_with_sizes_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_with_sizes_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_with_sizes_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_with_sizes_copy_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_with_sizes_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_with_sizes_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_with_sizes_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_with_sizes_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_with_sizes_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_with_sizes_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_with_sizes_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_with_sizes_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_with_sizes_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_with_sizes_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_with_sizes_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_with_sizes_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_with_sizes_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sqrt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sqrt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sqrt_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sqrt_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sqrt_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sqrt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sqrt_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sqrt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sqrt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sqrt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sqrt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sqrt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sqrt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_square_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_square_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_square_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_square_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_square_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_square_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_square_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_square_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_square_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_square_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_square_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_square_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_copy_cuda_complex32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_multiple_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_multiple_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_multiple_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_multiple_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_multiple_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_multiple_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_multiple_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_multiple_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_multiple_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_multiple_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_multiple_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_multiple_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_multiple_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_stack_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_stack_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_stack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_stack_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_stack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_stack_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_stack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_stack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_stack_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_stack_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_stack_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_stack_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_stack_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_std_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_std_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_std_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_std_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_std_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_std_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_std_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_std_mean_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_std_mean_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_std_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_std_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_std_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_std_mean_unbiased_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_std_mean_unbiased_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_std_mean_unbiased_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_std_mean_unbiased_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_std_mean_unbiased_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_std_mean_unbiased_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_std_unbiased_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_std_unbiased_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_std_unbiased_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_std_unbiased_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_std_unbiased_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_std_unbiased_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_stft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_stft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_stft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_stft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sub_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sub_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sub_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sub_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sub_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sub_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sub_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sub_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sub_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sub_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sub_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sub_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sum_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sum_to_size_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sum_to_size_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sum_to_size_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sum_to_size_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sum_to_size_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sum_to_size_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sum_to_size_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sum_to_size_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sum_to_size_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sum_to_size_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sum_to_size_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sum_to_size_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_svd_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_svd_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_svd_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_svd_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_svd_lowrank_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_svd_lowrank_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_svd_lowrank_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_svd_lowrank_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_t_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_t_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_t_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_t_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_t_copy_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_t_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_t_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_t_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_t_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_t_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_t_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_t_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_t_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_t_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_t_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_t_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_t_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_t_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_t_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_t_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_t_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_t_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_t_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_t_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_take_along_dim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_take_along_dim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_take_along_dim_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_take_along_dim_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_take_along_dim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_take_along_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_take_along_dim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_take_along_dim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_take_along_dim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_take_along_dim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_take_along_dim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_take_along_dim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_take_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_take_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_take_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_take_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_take_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_take_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_take_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_take_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_take_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_take_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_take_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_take_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tan_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tan_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tan_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tan_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tan_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tan_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tan_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tan_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tan_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tan_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tan_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tan_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tanh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tanh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tanh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tanh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tanh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tanh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tanh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tanh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tanh_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tanh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tanh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tanh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tanh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tensor_split_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tensor_split_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tensor_split_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tensor_split_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tensor_split_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tensor_split_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tensor_split_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tensor_split_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tensor_split_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tensor_split_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tensor_split_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tensor_split_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tensordot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tensordot_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tensordot_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tensordot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tensordot_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tensordot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tile_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tile_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tile_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tile_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tile_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tile_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tile_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tile_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tile_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tile_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tile_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tile_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_to_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_to_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_to_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_to_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_to_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_to_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_to_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_to_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_to_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_to_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_to_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_to_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_to_sparse_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_to_sparse_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_to_sparse_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_to_sparse_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_to_sparse_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_to_sparse_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_to_sparse_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_to_sparse_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_to_sparse_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_to_sparse_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_to_sparse_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_to_sparse_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_topk_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_topk_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_topk_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_topk_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_topk_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_topk_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_topk_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_topk_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_topk_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_torch__scaled_mm_cuda_float8_e4m3fn, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_torch_ops_aten__efficient_attention_forward_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_torch_ops_aten__efficient_attention_forward_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_torch_ops_aten__efficient_attention_forward_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_torch_ops_aten__flash_attention_forward_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_torch_ops_aten__flash_attention_forward_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_torch_ops_aten__safe_softmax_default_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_torch_ops_aten__safe_softmax_default_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_torch_ops_aten__safe_softmax_default_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_torch_ops_aten__safe_softmax_default_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_torch_ops_aten__safe_softmax_default_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_torch_ops_aten__safe_softmax_default_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_torch_ops_aten__safe_softmax_default_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_torch_ops_aten__safe_softmax_default_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_torch_ops_aten__safe_softmax_default_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trace_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trace_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trace_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trace_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trace_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trace_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trace_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trace_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trace_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trace_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trace_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trace_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trace_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_transpose_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_transpose_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_transpose_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_transpose_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_transpose_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_transpose_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_transpose_copy_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_transpose_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_transpose_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_transpose_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_transpose_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_transpose_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_transpose_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_transpose_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_transpose_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_transpose_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_transpose_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_transpose_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_transpose_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_transpose_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_transpose_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_transpose_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_transpose_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_transpose_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_transpose_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_transpose_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trapezoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trapezoid_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trapezoid_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trapezoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trapezoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trapezoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trapezoid_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trapezoid_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trapezoid_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trapezoid_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trapezoid_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trapz_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trapz_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trapz_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trapz_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trapz_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trapz_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trapz_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trapz_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trapz_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trapz_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trapz_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_triangular_solve_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_triangular_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_triangular_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_triangular_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tril_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tril_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tril_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tril_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tril_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tril_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tril_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tril_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tril_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tril_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tril_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tril_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tril_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tril_indices_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tril_indices_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_triu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_triu_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_triu_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_triu_cuda_complex32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_triu_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_triu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_triu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_triu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_triu_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_triu_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_triu_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_triu_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_triu_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_triu_indices_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_triu_indices_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_true_divide_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_true_divide_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_true_divide_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_true_divide_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_true_divide_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_true_divide_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_true_divide_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_true_divide_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_true_divide_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_true_divide_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_true_divide_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_true_divide_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_true_divide_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trunc_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trunc_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trunc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trunc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trunc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trunc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trunc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trunc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trunc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unbind_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unbind_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unbind_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unbind_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unbind_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unbind_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unbind_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unbind_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unbind_copy_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unbind_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unbind_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unbind_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unbind_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unbind_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unbind_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unbind_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unbind_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unbind_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unbind_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unbind_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unbind_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unbind_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unbind_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unbind_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unbind_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unbind_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unflatten_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unflatten_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unflatten_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unflatten_cuda_complex32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unflatten_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unflatten_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unflatten_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unflatten_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unflatten_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unflatten_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unflatten_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unflatten_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unflatten_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unfold_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unfold_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unfold_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unfold_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unfold_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unfold_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unfold_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unfold_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unfold_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unfold_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unfold_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unfold_copy_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unfold_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unfold_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unfold_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unfold_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unfold_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unfold_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unfold_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unfold_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unfold_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unfold_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unfold_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unfold_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unfold_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unfold_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_uniform_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_uniform_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_uniform_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_uniform_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_uniform_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_uniform_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unique_consecutive_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unique_consecutive_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unique_consecutive_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unique_consecutive_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unique_consecutive_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unique_consecutive_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unique_consecutive_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unique_consecutive_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unique_consecutive_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unique_consecutive_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unique_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unique_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unique_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unique_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unique_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unique_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unique_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unique_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unique_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unique_cuda_uint16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unique_cuda_uint32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unique_cuda_uint64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unique_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unravel_index_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unravel_index_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unravel_index_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unravel_index_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unravel_index_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsafe_chunk_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsafe_chunk_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsafe_chunk_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsafe_chunk_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsafe_chunk_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsafe_chunk_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsafe_chunk_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsafe_chunk_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsafe_chunk_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsafe_chunk_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsafe_chunk_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsafe_chunk_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsafe_chunk_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsafe_split_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsafe_split_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsafe_split_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsafe_split_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsafe_split_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsafe_split_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsafe_split_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsafe_split_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsafe_split_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsafe_split_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsafe_split_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsafe_split_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsafe_split_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsqueeze_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsqueeze_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsqueeze_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsqueeze_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsqueeze_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsqueeze_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsqueeze_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsqueeze_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsqueeze_copy_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsqueeze_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsqueeze_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsqueeze_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsqueeze_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsqueeze_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsqueeze_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsqueeze_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsqueeze_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsqueeze_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsqueeze_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsqueeze_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsqueeze_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsqueeze_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsqueeze_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsqueeze_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsqueeze_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsqueeze_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_var_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_var_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_var_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_var_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_var_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_var_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_var_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_var_mean_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_var_mean_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_var_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_var_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_var_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_var_mean_unbiased_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_var_mean_unbiased_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_var_mean_unbiased_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_var_mean_unbiased_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_var_mean_unbiased_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_var_mean_unbiased_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_var_unbiased_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_var_unbiased_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_var_unbiased_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_var_unbiased_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_var_unbiased_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_var_unbiased_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_vdot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_vdot_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_vdot_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_vdot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_vdot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_vdot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_as_complex_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_as_complex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_as_complex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_as_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_as_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_as_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_as_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_as_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_as_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_as_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_as_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_as_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_as_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_as_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_as_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_as_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_as_real_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_as_real_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_vsplit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_vsplit_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_vsplit_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_vsplit_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_vsplit_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_vsplit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_vsplit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_vsplit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_vsplit_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_vsplit_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_vsplit_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_vsplit_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_vsplit_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_vstack_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_vstack_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_vstack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_vstack_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_vstack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_vstack_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_vstack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_vstack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_vstack_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_vstack_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_vstack_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_vstack_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_vstack_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_where_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_where_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_where_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_where_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_where_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_where_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_where_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_where_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_where_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_where_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_where_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_where_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_where_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_xlogy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_xlogy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_xlogy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_xlogy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_xlogy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_xlogy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_xlogy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_xlogy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_xlogy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_xlogy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zero__cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zero__cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zero__cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zero__cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zero__cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zero__cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zero__cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zero__cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zero__cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zero__cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zero__cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zero__cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zeros_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zeros_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zeros_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zeros_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zeros_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zeros_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zeros_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zeros_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zeros_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zeros_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zeros_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zeros_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zeros_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zeros_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zeros_like_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zeros_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zeros_like_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zeros_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zeros_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zeros_like_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zeros_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zeros_like_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zeros_like_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zeros_like_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zeros_like_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zeros_like_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_H_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_H_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_H_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_H_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_H_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_H_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_H_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_H_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_H_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_H_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_H_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_H_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_H_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_T_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_T_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_T_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_T_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_T_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_T_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_T_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_T_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_T_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_T_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_T_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_T_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_T_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___getitem___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___getitem___cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___getitem___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___getitem___cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___getitem___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___getitem___cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___getitem___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___getitem___cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___getitem___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___getitem___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___getitem___cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___getitem___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___getitem___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___radd___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___radd___cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___radd___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___radd___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___radd___cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___radd___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___radd___cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___radd___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___radd___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___radd___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___radd___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___radd___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rand___cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rand___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rand___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rand___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rand___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rand___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rdiv___cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rdiv___cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rdiv___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rdiv___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rdiv___cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rdiv___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rdiv___cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rdiv___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rdiv___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rdiv___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rdiv___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rdiv___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rmatmul___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rmatmul___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rmatmul___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rmatmul___cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rmatmul___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rmatmul___cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rmod___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rmod___cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rmod___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rmod___cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rmod___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rmod___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rmod___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rmod___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rmod___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rmul___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rmul___cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rmul___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rmul___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rmul___cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rmul___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rmul___cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rmul___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rmul___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rmul___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rmul___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rmul___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___ror___cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___ror___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___ror___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___ror___cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___ror___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___ror___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rpow___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rpow___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rpow___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rpow___cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rpow___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rpow___cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rpow___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rpow___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rpow___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rpow___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rpow___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rsub___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rsub___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rsub___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rsub___cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rsub___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rsub___cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rsub___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rsub___cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rsub___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rsub___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rsub___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rxor___cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rxor___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rxor___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rxor___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rxor___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rxor___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__batch_norm_with_update_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__batch_norm_with_update_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__batch_norm_with_update_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__batch_norm_with_update_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__chunk_cat_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__chunk_cat_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__chunk_cat_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__chunk_cat_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__chunk_cat_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__chunk_cat_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__chunk_cat_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__chunk_cat_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__chunk_cat_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__chunk_cat_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__chunk_cat_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__chunk_cat_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__chunk_cat_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_abs_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_abs_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_abs_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_abs_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_abs_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_abs_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_abs_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_abs_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_abs_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_abs_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_abs_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_abs_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_acos_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_acos_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_acos_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_acos_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_acos_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_acos_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_acos_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_acos_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_acos_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_acos_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_acos_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_acos_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_add_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_add_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_add_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_add_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_add_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_add_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_add_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_add_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_add_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_add_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_add_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_addcdiv_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_addcdiv_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_addcdiv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_addcdiv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_addcdiv_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_addcdiv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_addcdiv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_addcdiv_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_addcdiv_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_addcdiv_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_addcdiv_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_addcdiv_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_addcmul_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_addcmul_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_addcmul_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_addcmul_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_addcmul_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_addcmul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_addcmul_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_addcmul_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_addcmul_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_addcmul_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_addcmul_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_addcmul_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_asin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_asin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_asin_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_asin_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_asin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_asin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_asin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_asin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_asin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_asin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_asin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_asin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_atan_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_atan_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_atan_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_atan_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_atan_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_atan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_atan_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_atan_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_atan_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_atan_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_atan_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_atan_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_ceil_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_ceil_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_ceil_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_ceil_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_ceil_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_ceil_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_ceil_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_ceil_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_ceil_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_ceil_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_ceil_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_ceil_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_clamp_max_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_clamp_max_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_clamp_max_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_clamp_max_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_clamp_max_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_clamp_max_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_clamp_max_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_clamp_max_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_clamp_max_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_clamp_max_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_clamp_max_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_clamp_max_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_clamp_min_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_clamp_min_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_clamp_min_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_clamp_min_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_clamp_min_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_clamp_min_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_clamp_min_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_clamp_min_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_clamp_min_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_clamp_min_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_clamp_min_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_clamp_min_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_copy_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_cos_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_cos_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_cos_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_cos_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_cos_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_cos_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_cos_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_cos_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_cos_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_cos_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_cos_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_cos_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_cosh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_cosh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_cosh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_cosh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_cosh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_cosh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_cosh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_cosh_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_cosh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_cosh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_cosh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_cosh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_div_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_div_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_div_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_div_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_div_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_div_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_div_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_div_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_div_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_div_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_div_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_div_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_erf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_erf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_erf_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_erf_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_erf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_erf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_erf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_erf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_erf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_erf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_erf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_erf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_erfc_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_erfc_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_erfc_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_erfc_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_erfc_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_erfc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_erfc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_erfc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_erfc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_erfc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_erfc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_erfc_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_exp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_exp_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_exp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_exp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_exp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_exp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_exp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_exp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_exp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_exp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_exp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_exp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_expm1_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_expm1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_expm1_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_expm1_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_expm1_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_expm1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_expm1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_expm1_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_expm1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_expm1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_expm1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_expm1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_floor_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_floor_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_floor_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_floor_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_floor_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_floor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_floor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_floor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_floor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_floor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_floor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_floor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_frac_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_frac_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_frac_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_frac_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_frac_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_frac_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_frac_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_frac_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_frac_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_frac_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_frac_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_frac_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_lerp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_lerp_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_lerp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_lerp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_lerp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_lerp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_lerp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_lerp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_lerp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_lerp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_lerp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_lerp_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_lgamma_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_lgamma_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_lgamma_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_lgamma_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_lgamma_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_lgamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_lgamma_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_lgamma_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_lgamma_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_lgamma_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_lgamma_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_lgamma_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log10_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log10_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log10_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log10_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log10_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log10_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log10_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log10_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log10_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log10_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log10_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log10_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log1p_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log1p_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log1p_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log1p_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log1p_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log1p_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log1p_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log1p_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log1p_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log1p_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log1p_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log1p_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log2_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_max_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_max_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_max_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_max_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_max_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_max_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_max_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_max_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_max_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_max_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_max_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_max_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_maximum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_maximum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_maximum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_maximum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_maximum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_maximum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_maximum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_maximum_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_maximum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_maximum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_maximum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_maximum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_minimum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_minimum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_minimum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_minimum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_minimum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_minimum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_minimum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_minimum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_minimum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_minimum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_minimum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_minimum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_mul_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_mul_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_mul_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_mul_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_mul_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_mul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_mul_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_mul_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_mul_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_mul_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_mul_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_mul_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_neg_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_neg_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_neg_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_neg_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_neg_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_neg_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_neg_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_neg_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_neg_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_neg_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_neg_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_neg_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_norm_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_norm_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_norm_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_norm_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_norm_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_norm_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_pow_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_pow_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_pow_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_pow_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_pow_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_pow_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_pow_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_pow_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_pow_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_pow_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_pow_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_pow_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_reciprocal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_reciprocal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_reciprocal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_reciprocal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_reciprocal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_reciprocal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_reciprocal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_reciprocal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_reciprocal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_reciprocal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_reciprocal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_reciprocal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_round_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_round_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_round_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_round_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_round_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_round_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_round_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_round_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_round_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_round_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_round_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_round_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_rsqrt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_rsqrt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_rsqrt_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_rsqrt_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_rsqrt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_rsqrt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_rsqrt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_rsqrt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_rsqrt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_rsqrt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_rsqrt_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_rsqrt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sigmoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sigmoid_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sigmoid_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sigmoid_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sigmoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sigmoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sigmoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sigmoid_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sigmoid_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sigmoid_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sigmoid_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sigmoid_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sign_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sign_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sign_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sign_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sign_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sign_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sign_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sign_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sign_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sign_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sign_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sin_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sin_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sinh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sinh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sinh_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sinh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sinh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sinh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sinh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sinh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sinh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sinh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sinh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sinh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sqrt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sqrt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sqrt_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sqrt_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sqrt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sqrt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sqrt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sqrt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sqrt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sqrt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sqrt_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sqrt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sub_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sub_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sub_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sub_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sub_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sub_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sub_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sub_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sub_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sub_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sub_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sub_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_tan_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_tan_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_tan_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_tan_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_tan_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_tan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_tan_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_tan_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_tan_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_tan_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_tan_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_tan_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_tanh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_tanh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_tanh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_tanh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_tanh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_tanh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_tanh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_tanh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_tanh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_tanh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_tanh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_tanh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_trunc_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_trunc_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_trunc_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_trunc_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_trunc_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_trunc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_trunc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_trunc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_trunc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_trunc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_trunc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_trunc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_zero_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_zero_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_zero_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_zero_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_zero_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_zero_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_zero_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_zero_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_zero_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_zero_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_zero_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_zero_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__native_batch_norm_legit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__native_batch_norm_legit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__native_batch_norm_legit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__native_batch_norm_legit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__segment_reduce_lengths_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__segment_reduce_lengths_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__segment_reduce_lengths_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__segment_reduce_lengths_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__segment_reduce_offsets_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__segment_reduce_offsets_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__segment_reduce_offsets_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__segment_reduce_offsets_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__softmax_backward_data_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__softmax_backward_data_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__softmax_backward_data_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__softmax_backward_data_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__unsafe_masked_index_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__unsafe_masked_index_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__unsafe_masked_index_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__unsafe_masked_index_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__unsafe_masked_index_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__unsafe_masked_index_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__unsafe_masked_index_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__unsafe_masked_index_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__unsafe_masked_index_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__unsafe_masked_index_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__unsafe_masked_index_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__unsafe_masked_index_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__unsafe_masked_index_put_accumulate_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__unsafe_masked_index_put_accumulate_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__unsafe_masked_index_put_accumulate_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__unsafe_masked_index_put_accumulate_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__unsafe_masked_index_put_accumulate_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__unsafe_masked_index_put_accumulate_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__unsafe_masked_index_put_accumulate_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__unsafe_masked_index_put_accumulate_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__unsafe_masked_index_put_accumulate_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__unsafe_masked_index_put_accumulate_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__unsafe_masked_index_put_accumulate_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__unsafe_masked_index_put_accumulate_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__upsample_bilinear2d_aa_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__upsample_bilinear2d_aa_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__upsample_bilinear2d_aa_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__upsample_bilinear2d_aa_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_abs_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_abs_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_abs_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_abs_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_abs_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_abs_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_abs_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_abs_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_abs_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_abs_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_abs_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_abs_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_abs_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_acos_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_acos_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_acos_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_acos_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_acos_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_acos_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_acos_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_acos_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_acos_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_acos_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_acos_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_acos_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_acos_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_acosh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_acosh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_acosh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_acosh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_acosh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_acosh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_acosh_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_acosh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_acosh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_acosh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_acosh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_acosh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_acosh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_add_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_add_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_add_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_add_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_add_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_add_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_add_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_add_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_add_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_add_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_add_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_add_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addbmm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addbmm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addbmm_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addbmm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addbmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addbmm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addcdiv_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addcdiv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addcdiv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addcdiv_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addcdiv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addcdiv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addcmul_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addcmul_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addcmul_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addcmul_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addcmul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addcmul_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addcmul_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addcmul_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addcmul_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addcmul_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addcmul_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addmm_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addmm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addmm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addmm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addmm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addmm_decomposed_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addmm_decomposed_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addmm_decomposed_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addmm_decomposed_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addmm_decomposed_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addmm_decomposed_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addmv_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addmv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addmv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addmv_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addmv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addmv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addr_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addr_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addr_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addr_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addr_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addr_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addr_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addr_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addr_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addr_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_alias_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_alias_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_alias_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_alias_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_alias_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_alias_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_alias_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_alias_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_alias_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_alias_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_alias_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_alias_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_alias_copy_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_H_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_T_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides___getitem___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides___radd___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides___rand___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides___rdiv___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides___rmatmul___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides___rmod___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides___rmul___cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides___ror___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides___rpow___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides___rsub___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides___rxor___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__batch_norm_with_update_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__chunk_cat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_abs_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_acos_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_addcdiv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_addcmul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_asin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_atan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_ceil_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_clamp_max_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_clamp_min_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_cos_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_cosh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_div_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_erf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_erfc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_exp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_expm1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_floor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_frac_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_lerp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_lgamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_log10_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_log1p_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_log2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_log_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_max_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_maximum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_minimum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_mul_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_neg_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_pow_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_reciprocal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_round_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_rsqrt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_sigmoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_sign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_sin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_sinh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_sqrt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_sub_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_tan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_tanh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_trunc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_zero_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__native_batch_norm_legit_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__segment_reduce_lengths_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__segment_reduce_offsets_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__softmax_backward_data_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__unsafe_masked_index_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__unsafe_masked_index_put_accumulate_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__upsample_bilinear2d_aa_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_abs_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_acos_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_acosh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_addbmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_addcdiv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_addcmul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_addmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_addmm_decomposed_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_addmv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_addr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_alias_copy_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_all_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_allclose_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_amax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_amin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_aminmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_angle_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_any_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_arange_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_argmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_argmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_argsort_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_argwhere_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_as_strided_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_as_strided_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_as_strided_partial_views_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_as_strided_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_asin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_asinh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_atan2_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_atan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_atanh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_atleast_1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_atleast_2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_atleast_3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_baddbmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_bernoulli_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_bfloat16_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_bincount_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_bitwise_and_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_bitwise_left_shift_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_bitwise_not_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_bitwise_or_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_bitwise_right_shift_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_bitwise_xor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_block_diag_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_bmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_bool_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_broadcast_shapes_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_broadcast_tensors_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_broadcast_to_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_bucketize_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_byte_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_cartesian_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_cat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_cauchy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_cdist_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_cdouble_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_ceil_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_cfloat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_chalf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_char_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_cholesky_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_cholesky_inverse_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_cholesky_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_chunk_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_clamp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_clamp_max_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_clamp_min_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_clone_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_column_stack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_combinations_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_complex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_conj_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_conj_physical_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_constant_pad_nd_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_contiguous_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_copysign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_corrcoef_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_cos_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_cosh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_count_nonzero_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_cov_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_cross_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_cummax_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_cummin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_cumprod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_cumsum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_cumulative_trapezoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_deg2rad_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_diag_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_diag_embed_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_diagflat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_diagonal_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_diagonal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_diagonal_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_diff_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_digamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_dist_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_div_floor_rounding_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_div_no_rounding_mode_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_div_trunc_rounding_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_dot_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_double_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_dsplit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_dstack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_einsum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_empty_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_empty_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_empty_permuted_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_empty_strided_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_eq_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_equal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_erf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_erfc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_erfinv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_exp2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_exp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_expand_as_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_expand_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_expand_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_expm1_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_exponential_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_eye_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_fft_fft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_fft_fft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_fft_fftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_fft_fftshift_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_fft_hfft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_fft_hfft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_fft_hfftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_fft_ifft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_fft_ifft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_fft_ifftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_fft_ifftshift_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_fft_ihfft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_fft_ihfft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_fft_ihfftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_fft_irfft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_fft_irfft_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_fft_irfftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_fft_rfft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_fft_rfft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_fft_rfftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_fill_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_flatten_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_flip_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_fliplr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_flipud_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_float_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_float_power_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_floor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_floor_divide_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_fmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_fmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_fmod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_frac_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_frexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_full_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_full_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_gather_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_gcd_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_ge_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_geometric_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_geqrf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_gradient_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_grid_sampler_2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_grid_sampler_3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_gt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_half_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_hash_tensor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_heaviside_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_histc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_hsplit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_hstack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_hypot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_i0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_igamma_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_igammac_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_imag_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_index_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_index_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_index_fill_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_index_put_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_index_reduce_amax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_index_reduce_amin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_index_reduce_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_index_reduce_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_index_select_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_inner_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_int_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_isclose_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_isfinite_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_isin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_isinf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_isnan_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_isneginf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_isposinf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_isreal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_istft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_item_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_jiterator_2inputs_2outputs_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_jiterator_4inputs_with_extra_args_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_jiterator_binary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_jiterator_binary_return_by_ref_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_jiterator_unary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_kron_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_kthvalue_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_lcm_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_ldexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_le_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_lerp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_lgamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_cholesky_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_cholesky_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_cond_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_cross_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_det_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_diagonal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_eig_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_eigh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_eigvals_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_eigvalsh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_householder_product_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_inv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_inv_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_ldl_factor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_ldl_factor_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_ldl_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_lstsq_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_lstsq_grad_oriented_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_lu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_lu_factor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_lu_factor_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_lu_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_matrix_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_matrix_power_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_matrix_rank_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_matrix_rank_hermitian_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_multi_dot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_norm_subgradients_at_zero_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_pinv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_pinv_hermitian_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_pinv_singular_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_qr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_slogdet_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_solve_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_solve_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_solve_triangular_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_svd_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_svdvals_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_tensorinv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_tensorsolve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_vander_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_vecdot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_vector_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linspace_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linspace_tensor_overload_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_log10_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_log1p_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_log2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_log_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_log_normal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_log_softmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_log_softmax_with_dtype_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_logaddexp2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_logaddexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_logcumsumexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_logdet_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_logical_and_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_logical_not_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_logical_or_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_logical_xor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_logit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_logspace_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_logspace_tensor_overload_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_logsumexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_long_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_lt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_lu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_lu_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_lu_unpack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_mH_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_mT_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_masked_amax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_masked_amin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_masked_argmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_masked_argmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_masked_cumprod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_masked_cumsum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_masked_fill_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_masked_log_softmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_masked_logaddexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_masked_logsumexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_masked_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_masked_median_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_masked_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_masked_normalize_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_masked_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_masked_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_masked_select_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_masked_softmax_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_masked_softmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_masked_std_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_masked_sum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_masked_var_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_matmul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_matrix_exp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_max_binary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_max_pool2d_with_indices_backward_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_max_reduction_no_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_max_reduction_with_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_maximum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_median_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_meshgrid_list_of_tensors_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_meshgrid_variadic_tensors_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_min_binary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_min_reduction_no_dim_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_min_reduction_with_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_minimum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_mm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_mode_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_movedim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_msort_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_mul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_multinomial_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_mv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nan_to_num_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nanmean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nanmedian_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nanquantile_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nansum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_narrow_copy_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_narrow_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_native_batch_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_native_dropout_backward_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_native_layer_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_ne_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_neg_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_new_empty_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_new_empty_strided_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_new_full_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_new_ones_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_new_zeros_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nextafter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_adaptive_max_pool2d_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_alpha_dropout_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_avg_pool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_avg_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_avg_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_batch_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_bilinear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_binary_cross_entropy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_celu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_channel_shuffle_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_conv1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_conv2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_conv3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_conv_transpose1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_conv_transpose2d_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_conv_transpose3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_cosine_embedding_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_cosine_similarity_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_cross_entropy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_ctc_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_dropout2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_dropout3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_dropout_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_elu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_embedding_bag_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_embedding_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_fractional_max_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_fractional_max_pool3d_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_gaussian_nll_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_gelu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_glu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_grid_sample_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_group_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_hardshrink_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_hardsigmoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_hardswish_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_hardtanh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_hinge_embedding_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_huber_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_instance_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_interpolate_area_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_interpolate_bicubic_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_interpolate_bilinear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_interpolate_linear_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_interpolate_nearest_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_interpolate_trilinear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_kl_div_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_l1_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_layer_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_leaky_relu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_linear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_local_response_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_logsigmoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_margin_ranking_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_max_pool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_max_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_max_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_max_unpool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_max_unpool1d_grad_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_max_unpool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_max_unpool2d_grad_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_max_unpool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_max_unpool3d_grad_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_mish_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_mse_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_multi_head_attention_forward_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_multi_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_multilabel_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_nll_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_normalize_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_one_hot_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_pad_circular_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_pad_constant_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_pad_reflect_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_pad_replicate_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_pad_replicate_negative_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_pairwise_distance_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_pdist_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_pixel_shuffle_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_pixel_unshuffle_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_poisson_nll_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_prelu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_relu6_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_relu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_rms_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_rrelu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_selu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_silu_complex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_silu_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_smooth_l1_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_soft_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_softmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_softmin_with_dtype_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_softplus_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_softshrink_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_softsign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_tanhshrink_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_threshold_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_triplet_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_unfold_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_upsample_bilinear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_upsample_nearest_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nonzero_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nonzero_static_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_norm_fro_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_norm_inf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_norm_nuc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_normal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_normal_in_place_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_normal_number_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_ones_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_ones_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_ormqr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_outer_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_pca_lowrank_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_permute_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_permute_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_pinverse_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_polar_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_polygamma_polygamma_n_0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_polygamma_polygamma_n_1_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_polygamma_polygamma_n_2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_polygamma_polygamma_n_3_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_polygamma_polygamma_n_4_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_positive_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_pow_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_put_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_qr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_quantile_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_rad2deg_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_rand_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_randint_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_randint_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_randn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_randn_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_ravel_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_real_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_reciprocal_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_remainder_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_renorm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_repeat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_repeat_interleave_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_reshape_as_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_reshape_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_resize__cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_resize_as__cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_resolve_conj_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_resolve_neg_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_roll_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_rot90_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_round_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_round_decimals_0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_round_decimals_3_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_round_decimals_neg_3_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_rsqrt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_rsub_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_scalar_tensor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_scatter_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_scatter_reduce_amax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_scatter_reduce_amin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_scatter_reduce_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_scatter_reduce_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_scatter_reduce_sum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_searchsorted_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_select_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_select_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_sgn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_short_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_sigmoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_sign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_signal_windows_bartlett_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_signal_windows_blackman_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_signal_windows_cosine_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_signal_windows_exponential_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_signal_windows_gaussian_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_signal_windows_general_cosine_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_signal_windows_general_hamming_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_signal_windows_hamming_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_signal_windows_hann_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_signal_windows_kaiser_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_signal_windows_nuttall_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_signbit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_sin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_sinc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_sinh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_slice_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_slice_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_softmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_softmax_with_dtype_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_sort_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_sparse_mm_reduce_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_sparse_sampled_addmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_airy_ai_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_bessel_j0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_bessel_j1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_bessel_y0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_bessel_y1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_chebyshev_polynomial_t_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_chebyshev_polynomial_u_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_chebyshev_polynomial_v_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_chebyshev_polynomial_w_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_entr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_erfcx_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_hermite_polynomial_h_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_hermite_polynomial_he_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_i0e_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_i1_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_i1e_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_laguerre_polynomial_l_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_legendre_polynomial_p_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_log_ndtr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_modified_bessel_i0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_modified_bessel_i1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_modified_bessel_k0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_modified_bessel_k1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_ndtr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_ndtri_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_scaled_modified_bessel_k0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_scaled_modified_bessel_k1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_shifted_chebyshev_polynomial_v_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_spherical_bessel_j0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_xlog1py_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_zeta_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_split_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_split_list_args_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_split_with_sizes_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_split_with_sizes_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_sqrt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_square_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_squeeze_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_squeeze_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_squeeze_multiple_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_stack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_std_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_std_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_std_mean_unbiased_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_std_unbiased_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_stft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_sub_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_sum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_sum_to_size_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_svd_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_svd_lowrank_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_t_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_t_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_take_along_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_take_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_tan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_tanh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_tensor_split_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_tensordot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_tile_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_to_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_to_sparse_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_topk_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_torch__scaled_mm_cuda_float8_e4m3fn, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_trace_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_transpose_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_transpose_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_trapezoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_trapz_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_triangular_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_tril_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_tril_indices_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_triu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_triu_indices_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_true_divide_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_trunc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_unbind_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_unbind_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_unflatten_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_unfold_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_unfold_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_uniform_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_unique_consecutive_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_unique_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_unravel_index_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_unsafe_chunk_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_unsafe_split_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_unsqueeze_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_unsqueeze_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_var_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_var_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_var_mean_unbiased_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_var_unbiased_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_vdot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_view_as_complex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_view_as_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_view_as_real_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_view_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_view_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_vsplit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_vstack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_where_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_xlogy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_zero__cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_zeros_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_zeros_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_allclose_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_allclose_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_allclose_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_allclose_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_allclose_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_allclose_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_amax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_amax_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_amax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_amax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_amax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_amax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_amax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_amax_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_amax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_amax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_amin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_amin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_amin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_amin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_amin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_amin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_amin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_amin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_amin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_amin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_aminmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_aminmax_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_aminmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_aminmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_aminmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_aminmax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_aminmax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_aminmax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_aminmax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_aminmax_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_angle_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_angle_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_angle_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_angle_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_angle_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_angle_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_angle_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_angle_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_angle_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_angle_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_angle_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_any_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_any_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_any_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_any_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_any_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_any_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_any_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_any_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_any_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_any_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_any_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_any_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_arange_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_arange_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_arange_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_arange_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_arange_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_arange_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_arange_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_arange_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_arange_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argmax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argmax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argmax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argmax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argmax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argmin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argmin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argmin_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argmin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argmin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argmin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argmin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argmin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argmin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argsort_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argsort_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argsort_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argsort_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argsort_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argsort_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argsort_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argsort_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argsort_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argsort_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argwhere_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argwhere_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argwhere_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argwhere_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argwhere_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argwhere_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argwhere_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argwhere_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argwhere_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argwhere_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argwhere_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argwhere_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_copy_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_partial_views_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_partial_views_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_partial_views_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_partial_views_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_partial_views_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_partial_views_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_partial_views_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_partial_views_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_partial_views_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_partial_views_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_partial_views_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_partial_views_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_partial_views_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_scatter_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_scatter_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_scatter_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_scatter_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_asin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_asin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_asin_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_asin_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_asin_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_asin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_asin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_asin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_asin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_asin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_asin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_asin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_asin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_asinh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_asinh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_asinh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_asinh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_asinh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_asinh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_asinh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_asinh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_asinh_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_asinh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_asinh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_asinh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_asinh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atan2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atan2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atan2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atan2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atan2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atan2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atan2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atan2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atan2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atan2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atan_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atan_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atan_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atan_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atan_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atan_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atan_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atan_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atan_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atan_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atan_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atan_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atanh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atanh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atanh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atanh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atanh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atanh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atanh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atanh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atanh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atanh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atanh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atanh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atanh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_1d_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_1d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_1d_cuda_complex32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_1d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_1d_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_1d_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_1d_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_1d_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_1d_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_2d_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_2d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_2d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_2d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_2d_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_2d_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_2d_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_2d_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_2d_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_3d_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_3d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_3d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_3d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_3d_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_3d_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_3d_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_3d_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_3d_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_baddbmm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_baddbmm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_baddbmm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_baddbmm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_baddbmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_baddbmm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bernoulli_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bernoulli_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bernoulli_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bernoulli_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bfloat16_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bfloat16_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bfloat16_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bfloat16_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bfloat16_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bfloat16_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bfloat16_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bfloat16_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bfloat16_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bfloat16_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bfloat16_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bfloat16_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bfloat16_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bincount_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bincount_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bincount_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bincount_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bincount_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_and_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_and_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_and_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_and_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_and_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_and_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_left_shift_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_left_shift_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_left_shift_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_left_shift_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_left_shift_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_not_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_not_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_not_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_not_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_not_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_not_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_or_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_or_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_or_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_or_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_or_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_or_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_right_shift_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_right_shift_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_right_shift_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_right_shift_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_right_shift_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_xor_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_xor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_xor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_xor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_xor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_xor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_block_diag_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_block_diag_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_block_diag_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_block_diag_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_block_diag_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_block_diag_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_block_diag_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_block_diag_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_block_diag_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_block_diag_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_block_diag_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_block_diag_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_block_diag_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bmm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bmm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bmm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bmm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bmm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bool_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bool_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bool_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bool_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bool_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bool_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bool_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bool_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bool_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bool_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bool_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bool_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bool_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_broadcast_shapes_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_broadcast_tensors_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_broadcast_tensors_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_broadcast_tensors_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_broadcast_tensors_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_broadcast_tensors_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_broadcast_tensors_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_broadcast_tensors_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_broadcast_tensors_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_broadcast_tensors_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_broadcast_tensors_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_broadcast_tensors_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_broadcast_tensors_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_broadcast_to_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_broadcast_to_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_broadcast_to_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_broadcast_to_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_broadcast_to_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_broadcast_to_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_broadcast_to_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_broadcast_to_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_broadcast_to_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_broadcast_to_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_broadcast_to_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_broadcast_to_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bucketize_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bucketize_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bucketize_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bucketize_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bucketize_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bucketize_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bucketize_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bucketize_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bucketize_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_byte_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_byte_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_byte_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_byte_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_byte_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_byte_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_byte_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_byte_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_byte_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_byte_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_byte_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_byte_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cartesian_prod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cartesian_prod_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cartesian_prod_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cartesian_prod_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cartesian_prod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cartesian_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cartesian_prod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cartesian_prod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cartesian_prod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cartesian_prod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cartesian_prod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cartesian_prod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cat_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cat_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cat_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cat_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cat_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cat_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cat_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cat_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cat_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cat_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cat_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cat_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cauchy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cauchy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cauchy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cauchy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cdist_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cdist_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cdouble_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cdouble_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cdouble_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cdouble_cuda_complex32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cdouble_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cdouble_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cdouble_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cdouble_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cdouble_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cdouble_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cdouble_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cdouble_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cdouble_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ceil_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ceil_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ceil_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ceil_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ceil_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ceil_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ceil_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ceil_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ceil_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cfloat_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cfloat_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cfloat_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cfloat_cuda_complex32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cfloat_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cfloat_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cfloat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cfloat_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cfloat_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cfloat_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cfloat_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cfloat_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cfloat_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_chalf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_chalf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_chalf_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_chalf_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_chalf_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_chalf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_chalf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_chalf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_chalf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_chalf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_chalf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_chalf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_chalf_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_char_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_char_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_char_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_char_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_char_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_char_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_char_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_char_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_char_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_char_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_char_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_char_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_char_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cholesky_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cholesky_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cholesky_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cholesky_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cholesky_inverse_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cholesky_inverse_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cholesky_inverse_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cholesky_inverse_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cholesky_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cholesky_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cholesky_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cholesky_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_chunk_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_chunk_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_chunk_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_chunk_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_chunk_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_chunk_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_chunk_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_chunk_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_chunk_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_chunk_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_chunk_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_chunk_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_chunk_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clamp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clamp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clamp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clamp_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clamp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clamp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clamp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clamp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clamp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clamp_max_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clamp_max_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clamp_max_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clamp_max_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clamp_max_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clamp_max_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clamp_max_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clamp_max_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clamp_max_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clamp_max_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clamp_min_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clamp_min_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clamp_min_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clamp_min_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clamp_min_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clamp_min_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clamp_min_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clamp_min_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clamp_min_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clamp_min_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clone_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clone_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clone_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clone_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clone_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clone_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clone_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clone_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clone_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clone_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clone_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clone_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clone_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_column_stack_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_column_stack_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_column_stack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_column_stack_cuda_complex32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_column_stack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_column_stack_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_column_stack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_column_stack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_column_stack_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_column_stack_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_column_stack_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_column_stack_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_column_stack_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_combinations_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_combinations_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_combinations_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_combinations_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_combinations_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_combinations_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_combinations_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_combinations_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_combinations_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_combinations_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_combinations_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_combinations_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_complex_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_complex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_complex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_conj_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_conj_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_conj_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_conj_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_conj_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_conj_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_conj_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_conj_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_conj_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_conj_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_conj_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_conj_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_conj_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_conj_physical_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_conj_physical_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_conj_physical_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_conj_physical_cuda_complex32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_conj_physical_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_conj_physical_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_conj_physical_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_conj_physical_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_conj_physical_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_conj_physical_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_conj_physical_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_conj_physical_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_conj_physical_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_constant_pad_nd_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_constant_pad_nd_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_constant_pad_nd_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_constant_pad_nd_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_constant_pad_nd_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_constant_pad_nd_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_constant_pad_nd_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_constant_pad_nd_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_constant_pad_nd_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_constant_pad_nd_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_constant_pad_nd_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_constant_pad_nd_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_contiguous_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_contiguous_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_contiguous_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_contiguous_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_contiguous_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_contiguous_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_contiguous_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_contiguous_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_contiguous_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_contiguous_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_contiguous_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_contiguous_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_contiguous_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_copysign_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_copysign_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_copysign_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_copysign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_copysign_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_copysign_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_copysign_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_copysign_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_copysign_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_copysign_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_corrcoef_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_corrcoef_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_corrcoef_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_corrcoef_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_corrcoef_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_corrcoef_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_corrcoef_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_corrcoef_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_corrcoef_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_corrcoef_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_corrcoef_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cos_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cos_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cos_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cos_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cos_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cos_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cos_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cos_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cos_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cos_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cos_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cos_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cos_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cosh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cosh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cosh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cosh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cosh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cosh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cosh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cosh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cosh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cosh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cosh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cosh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cosh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_count_nonzero_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_count_nonzero_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_count_nonzero_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_count_nonzero_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_count_nonzero_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_count_nonzero_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_count_nonzero_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_count_nonzero_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_count_nonzero_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_count_nonzero_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_count_nonzero_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_count_nonzero_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cov_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cov_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cov_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cov_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cov_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cov_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cov_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cov_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cov_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cov_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cov_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cross_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cross_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cross_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cross_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cross_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cross_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cross_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cross_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cross_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cross_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cross_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cummax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cummax_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cummax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cummax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cummax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cummax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cummax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cummax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cummax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cummax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cummin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cummin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cummin_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cummin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cummin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cummin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cummin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cummin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cummin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cummin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumprod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumprod_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumprod_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumprod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumprod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumprod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumprod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumprod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumprod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumprod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumprod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumsum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumsum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumsum_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumsum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumsum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumsum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumsum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumsum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumsum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumsum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumsum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumulative_trapezoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumulative_trapezoid_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumulative_trapezoid_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumulative_trapezoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumulative_trapezoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumulative_trapezoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumulative_trapezoid_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumulative_trapezoid_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumulative_trapezoid_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumulative_trapezoid_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumulative_trapezoid_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_deg2rad_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_deg2rad_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_deg2rad_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_deg2rad_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_deg2rad_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_deg2rad_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_deg2rad_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_deg2rad_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_deg2rad_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_deg2rad_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diag_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diag_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diag_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diag_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diag_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diag_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diag_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diag_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diag_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diag_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diag_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diag_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diag_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diag_embed_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diag_embed_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diag_embed_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diag_embed_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diag_embed_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diag_embed_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diag_embed_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diag_embed_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diag_embed_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diag_embed_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diag_embed_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diag_embed_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diag_embed_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagflat_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagflat_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagflat_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagflat_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagflat_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagflat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagflat_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagflat_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagflat_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagflat_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagflat_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagflat_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_scatter_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_scatter_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_scatter_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diff_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diff_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diff_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diff_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diff_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diff_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diff_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diff_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diff_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diff_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diff_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diff_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_digamma_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_digamma_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_digamma_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_digamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_digamma_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_digamma_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_digamma_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_digamma_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_digamma_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_digamma_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dist_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dist_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dist_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dist_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dist_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dist_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_div_floor_rounding_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_div_floor_rounding_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_div_floor_rounding_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_div_floor_rounding_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_div_floor_rounding_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_div_floor_rounding_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_div_floor_rounding_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_div_floor_rounding_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_div_floor_rounding_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_div_no_rounding_mode_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_div_no_rounding_mode_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_div_no_rounding_mode_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_div_no_rounding_mode_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_div_no_rounding_mode_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_div_no_rounding_mode_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_div_no_rounding_mode_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_div_no_rounding_mode_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_div_no_rounding_mode_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_div_no_rounding_mode_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_div_no_rounding_mode_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_div_no_rounding_mode_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_div_no_rounding_mode_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_div_trunc_rounding_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_div_trunc_rounding_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_div_trunc_rounding_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_div_trunc_rounding_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_div_trunc_rounding_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_div_trunc_rounding_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_div_trunc_rounding_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_div_trunc_rounding_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_div_trunc_rounding_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dot_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dot_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_double_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_double_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_double_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_double_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_double_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_double_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_double_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_double_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_double_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_double_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_double_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_double_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_double_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dsplit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dsplit_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dsplit_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dsplit_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dsplit_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dsplit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dsplit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dsplit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dsplit_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dsplit_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dsplit_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dsplit_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dsplit_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dstack_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dstack_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dstack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dstack_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dstack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dstack_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dstack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dstack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dstack_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dstack_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dstack_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dstack_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dstack_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_einsum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_einsum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_einsum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_einsum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_einsum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_einsum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_like_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_like_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_like_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_like_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_like_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_like_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_like_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_permuted_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_permuted_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_permuted_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_permuted_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_permuted_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_permuted_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_permuted_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_permuted_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_permuted_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_permuted_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_permuted_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_permuted_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_permuted_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_strided_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_strided_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_strided_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_strided_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_strided_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_strided_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_strided_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_strided_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_strided_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_strided_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_strided_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_strided_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_eq_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_eq_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_eq_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_eq_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_eq_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_eq_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_eq_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_eq_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_eq_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_eq_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_eq_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_eq_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_eq_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_equal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_equal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_equal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_equal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_equal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_equal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_equal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_equal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_equal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_equal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_equal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_equal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_erf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_erf_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_erf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_erf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_erf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_erf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_erf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_erf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_erf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_erf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_erfc_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_erfc_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_erfc_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_erfc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_erfc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_erfc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_erfc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_erfc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_erfc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_erfc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_erfinv_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_erfinv_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_erfinv_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_erfinv_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_erfinv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_erfinv_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_erfinv_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_erfinv_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_erfinv_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_erfinv_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_exp2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_exp2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_exp2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_exp2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_exp2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_exp2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_exp2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_exp2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_exp2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_exp2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_exp2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_exp2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_exp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_exp_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_exp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_exp_cuda_complex32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_exp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_exp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_exp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_exp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_exp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_exp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_exp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_exp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_exp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_as_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_as_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_as_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_as_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_as_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_as_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_as_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_as_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_as_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_as_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_as_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_as_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_copy_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expm1_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expm1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expm1_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expm1_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expm1_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expm1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expm1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expm1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expm1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expm1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expm1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expm1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_exponential_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_exponential_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_exponential_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_exponential_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_eye_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_eye_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_eye_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_eye_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_eye_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_eye_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_eye_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_eye_cuda_float8_e4m3fn, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_eye_cuda_float8_e4m3fnuz, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_eye_cuda_float8_e5m2, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_eye_cuda_float8_e5m2fnuz, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_eye_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_eye_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_eye_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_eye_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_eye_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fft2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fft2_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fft2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fft2_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fft_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fftn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fftn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fftn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fftn_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fftshift_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fftshift_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fftshift_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fftshift_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fftshift_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fftshift_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fftshift_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fftshift_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fftshift_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fftshift_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fftshift_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fftshift_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fftshift_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfft2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfft2_cuda_complex32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfft2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfft_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfft_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfftn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfftn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfftn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfftn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifft2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifft2_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifft2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifft2_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifft_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifftn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifftn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifftn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifftn_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifftn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifftshift_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifftshift_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifftshift_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifftshift_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifftshift_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifftshift_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifftshift_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifftshift_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifftshift_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifftshift_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifftshift_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifftshift_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifftshift_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ihfft2_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ihfft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ihfft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ihfft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ihfft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ihfft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ihfft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ihfft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ihfft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ihfft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ihfft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ihfft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ihfft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ihfft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ihfft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ihfft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ihfft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ihfft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ihfftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ihfftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ihfftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ihfftn_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ihfftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ihfftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ihfftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ihfftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ihfftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfft2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfft2_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfft2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfft_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfft_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfftn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfftn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfftn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfftn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_rfft2_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_rfft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_rfft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_rfft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_rfft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_rfft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_rfft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_rfft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_rfft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_rfft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_rfft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_rfft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_rfft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_rfft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_rfft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_rfft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_rfft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_rfft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_rfftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_rfftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_rfftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_rfftn_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_rfftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_rfftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_rfftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_rfftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_rfftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fill_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fill_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fill_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fill_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fill_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fill_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fill_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fill_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fill_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fill_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fill_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fill_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fill_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flatten_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flatten_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flatten_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flatten_cuda_complex32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flatten_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flatten_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flatten_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flatten_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flatten_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flatten_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flatten_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flatten_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flatten_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flip_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flip_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flip_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flip_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flip_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flip_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flip_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flip_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flip_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flip_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flip_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flip_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fliplr_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fliplr_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fliplr_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fliplr_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fliplr_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fliplr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fliplr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fliplr_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fliplr_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fliplr_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fliplr_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fliplr_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flipud_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flipud_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flipud_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flipud_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flipud_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flipud_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flipud_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flipud_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flipud_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flipud_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flipud_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flipud_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_float_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_float_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_float_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_float_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_float_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_float_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_float_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_float_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_float_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_float_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_float_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_float_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_float_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_float_power_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_float_power_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_float_power_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_float_power_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_float_power_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_float_power_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_float_power_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_float_power_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_float_power_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_float_power_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_float_power_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_float_power_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_floor_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_floor_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_floor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_floor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_floor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_floor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_floor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_floor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_floor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_floor_divide_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_floor_divide_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_floor_divide_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_floor_divide_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_floor_divide_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_floor_divide_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_floor_divide_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_floor_divide_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_floor_divide_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fmax_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fmax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fmax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fmax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fmax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fmax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fmin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fmin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fmin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fmin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fmin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fmin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fmin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fmin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fmin_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fmod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fmod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fmod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fmod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fmod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fmod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fmod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fmod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fmod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_frac_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_frac_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_frac_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_frac_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_frexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_frexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_frexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_frexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_full_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_full_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_full_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_full_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_full_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_full_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_full_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_full_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_full_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_full_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_full_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_full_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_full_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_full_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_full_like_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_full_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_full_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_full_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_full_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_full_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_full_like_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_full_like_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_full_like_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_full_like_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_full_like_cuda_uint16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_full_like_cuda_uint32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_full_like_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gather_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gather_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gather_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gather_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gather_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gather_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gather_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gather_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gather_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gather_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gather_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gather_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gcd_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gcd_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gcd_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gcd_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gcd_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ge_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ge_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ge_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ge_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ge_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ge_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ge_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ge_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ge_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ge_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_geometric_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_geometric_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_geometric_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_geometric_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_geometric_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_geometric_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_geometric_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_geometric_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_geometric_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_geqrf_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_geqrf_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_geqrf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_geqrf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gradient_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gradient_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gradient_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gradient_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gradient_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gradient_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gradient_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gradient_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gradient_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gradient_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_grid_sampler_2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_grid_sampler_2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_grid_sampler_2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_grid_sampler_2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_grid_sampler_3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_grid_sampler_3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_grid_sampler_3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_grid_sampler_3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gt_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_half_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_half_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_half_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_half_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_half_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_half_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_half_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_half_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_half_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_half_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_half_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_half_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hash_tensor_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hash_tensor_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hash_tensor_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hash_tensor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hash_tensor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hash_tensor_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hash_tensor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hash_tensor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hash_tensor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hash_tensor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_heaviside_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_heaviside_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_heaviside_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_heaviside_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_heaviside_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_heaviside_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_heaviside_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_heaviside_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_heaviside_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_heaviside_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_histc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_histc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_histc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_histc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_histc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_histc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_histc_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hsplit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hsplit_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hsplit_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hsplit_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hsplit_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hsplit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hsplit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hsplit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hsplit_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hsplit_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hsplit_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hsplit_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hsplit_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hstack_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hstack_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hstack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hstack_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hstack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hstack_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hstack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hstack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hstack_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hstack_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hstack_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hstack_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hstack_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hypot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hypot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hypot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hypot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_i0_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_i0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_i0_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_i0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_i0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_i0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_i0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_i0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_i0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_i0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_igamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_igamma_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_igammac_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_igammac_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_imag_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_imag_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_imag_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_add_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_add_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_add_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_add_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_add_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_add_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_add_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_add_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_add_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_add_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_add_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_add_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_copy_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_fill_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_fill_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_fill_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_fill_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_fill_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_fill_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_fill_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_fill_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_fill_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_fill_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_fill_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_fill_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_fill_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_put_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_put_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_put_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_put_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_put_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_put_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_put_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_put_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_put_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_put_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_put_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_put_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_put_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_amax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_amax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_amax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_amax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_amax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_amax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_amax_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_amax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_amax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_amin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_amin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_amin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_amin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_amin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_amin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_amin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_amin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_amin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_mean_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_mean_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_mean_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_mean_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_mean_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_prod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_prod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_prod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_prod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_prod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_prod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_prod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_prod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_select_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_select_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_select_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_select_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_select_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_select_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_select_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_select_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_select_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_select_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_select_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_select_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_select_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_inner_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_inner_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_inner_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_inner_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_inner_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_inner_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_int_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_int_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_int_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_int_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_int_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_int_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_int_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_int_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_int_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_int_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_int_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_int_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isclose_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isclose_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isclose_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isclose_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isclose_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isclose_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isclose_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isclose_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isclose_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isclose_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isclose_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isclose_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isfinite_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isfinite_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isfinite_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isfinite_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isfinite_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isfinite_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isfinite_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isfinite_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isfinite_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isfinite_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isfinite_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isfinite_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isfinite_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isinf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isinf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isinf_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isinf_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isinf_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isinf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isinf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isinf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isinf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isinf_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isinf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isinf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isinf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isnan_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isnan_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isnan_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isnan_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isnan_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isnan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isnan_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isnan_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isnan_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isnan_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isnan_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isnan_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isneginf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isneginf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isneginf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isneginf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isneginf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isneginf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isneginf_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isneginf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isneginf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isneginf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isposinf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isposinf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isposinf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isposinf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isposinf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isposinf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isposinf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isposinf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isposinf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isposinf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isreal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isreal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isreal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isreal_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isreal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isreal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isreal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isreal_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isreal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isreal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isreal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isreal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isreal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_istft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_istft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_item_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_item_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_item_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_item_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_item_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_item_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_item_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_item_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_item_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_item_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_item_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_item_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_item_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_2inputs_2outputs_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_2inputs_2outputs_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_2inputs_2outputs_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_2inputs_2outputs_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_2inputs_2outputs_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_2inputs_2outputs_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_2inputs_2outputs_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_2inputs_2outputs_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_2inputs_2outputs_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_2inputs_2outputs_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_2inputs_2outputs_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_2inputs_2outputs_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_4inputs_with_extra_args_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_4inputs_with_extra_args_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_4inputs_with_extra_args_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_4inputs_with_extra_args_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_4inputs_with_extra_args_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_4inputs_with_extra_args_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_4inputs_with_extra_args_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_4inputs_with_extra_args_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_4inputs_with_extra_args_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_4inputs_with_extra_args_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_4inputs_with_extra_args_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_4inputs_with_extra_args_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_binary_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_binary_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_binary_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_binary_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_binary_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_binary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_binary_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_binary_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_binary_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_binary_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_binary_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_binary_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_binary_return_by_ref_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_binary_return_by_ref_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_binary_return_by_ref_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_binary_return_by_ref_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_binary_return_by_ref_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_binary_return_by_ref_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_binary_return_by_ref_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_binary_return_by_ref_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_binary_return_by_ref_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_binary_return_by_ref_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_binary_return_by_ref_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_binary_return_by_ref_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_unary_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_unary_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_unary_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_unary_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_unary_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_unary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_unary_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_unary_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_unary_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_unary_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_unary_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_unary_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_kron_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_kron_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_kron_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_kron_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_kron_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_kron_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_kron_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_kron_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_kron_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_kron_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_kron_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_kron_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_kthvalue_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_kthvalue_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_kthvalue_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_kthvalue_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_kthvalue_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_kthvalue_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_kthvalue_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_kthvalue_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_kthvalue_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lcm_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lcm_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lcm_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lcm_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lcm_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ldexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ldexp_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ldexp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ldexp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ldexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ldexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ldexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ldexp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ldexp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ldexp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ldexp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ldexp_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_le_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_le_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_le_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_le_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_le_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_le_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_le_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_le_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_le_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_le_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lerp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lerp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lerp_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lerp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lerp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lerp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lerp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lgamma_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lgamma_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lgamma_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lgamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lgamma_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lgamma_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lgamma_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lgamma_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lgamma_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lgamma_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_cholesky_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_cholesky_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_cholesky_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_cholesky_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_cholesky_ex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_cholesky_ex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_cholesky_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_cholesky_ex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_cond_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_cond_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_cond_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_cond_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_cross_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_cross_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_cross_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_cross_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_cross_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_cross_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_cross_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_cross_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_cross_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_cross_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_cross_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_det_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_det_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_det_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_det_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_diagonal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_diagonal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_diagonal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_diagonal_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_diagonal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_diagonal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_diagonal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_diagonal_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_diagonal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_diagonal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_diagonal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_diagonal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_diagonal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_eig_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_eig_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_eig_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_eig_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_eigh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_eigh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_eigh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_eigh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_eigvals_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_eigvals_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_eigvals_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_eigvals_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_eigvalsh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_eigvalsh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_eigvalsh_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_eigvalsh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_householder_product_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_householder_product_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_householder_product_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_householder_product_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_inv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_inv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_inv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_inv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_inv_ex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_inv_ex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_inv_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_inv_ex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_ldl_factor_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_ldl_factor_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_ldl_factor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_ldl_factor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_ldl_factor_ex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_ldl_factor_ex_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_ldl_factor_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_ldl_factor_ex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_ldl_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_ldl_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_ldl_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_ldl_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_lstsq_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_lstsq_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_lstsq_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_lstsq_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_lstsq_grad_oriented_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_lstsq_grad_oriented_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_lstsq_grad_oriented_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_lstsq_grad_oriented_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_lu_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_lu_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_lu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_lu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_lu_factor_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_lu_factor_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_lu_factor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_lu_factor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_lu_factor_ex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_lu_factor_ex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_lu_factor_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_lu_factor_ex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_lu_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_lu_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_lu_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_lu_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_matrix_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_matrix_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_matrix_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_matrix_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_matrix_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_matrix_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_matrix_power_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_matrix_power_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_matrix_power_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_matrix_power_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_matrix_rank_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_matrix_rank_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_matrix_rank_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_matrix_rank_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_matrix_rank_hermitian_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_matrix_rank_hermitian_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_matrix_rank_hermitian_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_matrix_rank_hermitian_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_multi_dot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_multi_dot_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_multi_dot_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_multi_dot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_multi_dot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_multi_dot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_norm_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_norm_subgradients_at_zero_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_norm_subgradients_at_zero_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_norm_subgradients_at_zero_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_norm_subgradients_at_zero_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_norm_subgradients_at_zero_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_norm_subgradients_at_zero_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_pinv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_pinv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_pinv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_pinv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_pinv_hermitian_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_pinv_hermitian_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_pinv_hermitian_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_pinv_hermitian_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_pinv_singular_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_pinv_singular_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_pinv_singular_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_pinv_singular_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_qr_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_qr_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_qr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_qr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_slogdet_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_slogdet_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_slogdet_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_slogdet_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_solve_ex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_solve_ex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_solve_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_solve_ex_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_solve_triangular_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_solve_triangular_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_solve_triangular_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_solve_triangular_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_svd_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_svd_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_svd_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_svd_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_svdvals_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_svdvals_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_svdvals_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_svdvals_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_tensorinv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_tensorinv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_tensorinv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_tensorinv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_tensorsolve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_tensorsolve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_tensorsolve_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_tensorsolve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_vander_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_vander_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_vander_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_vander_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_vander_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_vander_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_vander_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_vander_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_vander_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_vecdot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_vecdot_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_vecdot_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_vecdot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_vecdot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_vecdot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_vector_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_vector_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_vector_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_vector_norm_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_vector_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_vector_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linspace_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linspace_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linspace_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linspace_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linspace_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linspace_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linspace_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linspace_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linspace_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linspace_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linspace_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linspace_tensor_overload_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linspace_tensor_overload_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linspace_tensor_overload_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linspace_tensor_overload_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linspace_tensor_overload_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linspace_tensor_overload_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linspace_tensor_overload_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linspace_tensor_overload_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linspace_tensor_overload_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linspace_tensor_overload_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linspace_tensor_overload_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log10_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log10_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log10_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log10_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log10_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log10_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log10_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log10_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log10_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log10_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log10_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log10_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log1p_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log1p_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log1p_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log1p_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log1p_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log1p_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log1p_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log1p_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log1p_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log1p_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log1p_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log1p_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_normal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_normal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_normal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_normal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_softmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_softmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_softmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_softmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_softmax_with_dtype_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_softmax_with_dtype_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_softmax_with_dtype_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_softmax_with_dtype_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_softmax_with_dtype_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_softmax_with_dtype_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_softmax_with_dtype_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_softmax_with_dtype_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_softmax_with_dtype_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_softmax_with_dtype_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_softmax_with_dtype_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_softmax_with_dtype_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_softmax_with_dtype_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logaddexp2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logaddexp2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logaddexp2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logaddexp2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logaddexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logaddexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logaddexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logaddexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logcumsumexp_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logcumsumexp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logcumsumexp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logcumsumexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logcumsumexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logcumsumexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logdet_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logdet_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logdet_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logdet_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_and_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_and_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_and_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_and_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_and_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_and_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_and_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_and_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_and_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_and_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_and_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_and_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_not_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_not_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_not_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_not_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_not_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_not_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_not_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_not_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_not_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_not_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_not_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_not_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_or_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_or_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_or_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_or_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_or_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_or_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_or_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_or_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_or_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_or_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_or_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_or_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_xor_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_xor_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_xor_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_xor_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_xor_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_xor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_xor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_xor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_xor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_xor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_xor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_xor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logit_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logit_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logit_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logit_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logit_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logit_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logit_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logspace_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logspace_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logspace_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logspace_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logspace_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logspace_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logspace_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logspace_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logspace_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logspace_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logspace_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logspace_tensor_overload_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logspace_tensor_overload_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logspace_tensor_overload_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logspace_tensor_overload_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logspace_tensor_overload_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logspace_tensor_overload_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logspace_tensor_overload_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logspace_tensor_overload_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logspace_tensor_overload_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logspace_tensor_overload_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logspace_tensor_overload_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logsumexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logsumexp_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logsumexp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logsumexp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logsumexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logsumexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logsumexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logsumexp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logsumexp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logsumexp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logsumexp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logsumexp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_long_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_long_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_long_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_long_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_long_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_long_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_long_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_long_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_long_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_long_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_long_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_long_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_long_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lt_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lu_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lu_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lu_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lu_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lu_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lu_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lu_unpack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lu_unpack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lu_unpack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lu_unpack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mH_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mH_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mH_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mH_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mH_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mH_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mH_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mH_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mH_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mH_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mH_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mH_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mH_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mT_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mT_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mT_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mT_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mT_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mT_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mT_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mT_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mT_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mT_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mT_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mT_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mT_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_amax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_amax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_amax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_amax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_amax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_amax_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_amax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_amax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_amax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_amin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_amin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_amin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_amin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_amin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_amin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_amin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_amin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_amin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_argmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_argmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_argmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_argmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_argmax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_argmax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_argmax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_argmax_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_argmax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_argmin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_argmin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_argmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_argmin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_argmin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_argmin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_argmin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_argmin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_argmin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_cumprod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_cumprod_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_cumprod_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_cumprod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_cumprod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_cumprod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_cumprod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_cumprod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_cumprod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_cumprod_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_cumprod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_cumsum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_cumsum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_cumsum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_cumsum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_cumsum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_cumsum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_cumsum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_cumsum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_cumsum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_cumsum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_cumsum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_fill_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_fill_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_fill_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_fill_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_fill_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_fill_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_fill_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_fill_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_fill_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_fill_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_fill_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_fill_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_fill_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_log_softmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_log_softmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_log_softmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_log_softmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_logaddexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_logaddexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_logaddexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_logaddexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_logsumexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_logsumexp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_logsumexp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_logsumexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_logsumexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_logsumexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_logsumexp_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_logsumexp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_logsumexp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_logsumexp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_logsumexp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_mean_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_mean_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_median_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_median_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_median_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_median_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_normalize_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_normalize_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_normalize_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_normalize_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_normalize_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_normalize_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_prod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_prod_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_prod_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_prod_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_prod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_prod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_prod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_prod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_prod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_prod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_prod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_scatter_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_scatter_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_select_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_select_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_select_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_select_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_select_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_select_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_select_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_select_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_select_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_select_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_select_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_select_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_softmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_softmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_softmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_softmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_softmin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_softmin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_softmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_softmin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_std_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_std_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_std_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_std_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_std_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_std_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_std_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_std_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_std_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_std_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_std_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_sum_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_sum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_sum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_sum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_sum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_sum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_sum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_sum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_sum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_sum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_sum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_sum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_var_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_var_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_var_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_var_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_var_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_var_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_var_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_var_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_var_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_var_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_var_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_matmul_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_matmul_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_matmul_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_matmul_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_matmul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_matmul_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_matrix_exp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_matrix_exp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_matrix_exp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_matrix_exp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_matrix_exp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_matrix_exp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_binary_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_binary_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_binary_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_binary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_binary_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_binary_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_binary_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_binary_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_binary_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_binary_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_pool2d_with_indices_backward_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_pool2d_with_indices_backward_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_pool2d_with_indices_backward_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_pool2d_with_indices_backward_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_reduction_no_dim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_reduction_no_dim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_reduction_no_dim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_reduction_no_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_reduction_no_dim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_reduction_no_dim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_reduction_no_dim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_reduction_no_dim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_reduction_no_dim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_reduction_no_dim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_reduction_with_dim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_reduction_with_dim_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_reduction_with_dim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_reduction_with_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_reduction_with_dim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_reduction_with_dim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_reduction_with_dim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_reduction_with_dim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_reduction_with_dim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_reduction_with_dim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_maximum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_maximum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_maximum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_maximum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_maximum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_maximum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_maximum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_maximum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_maximum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_maximum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mean_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mean_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_median_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_median_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_median_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_median_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_median_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_median_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_median_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_median_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_median_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_meshgrid_list_of_tensors_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_meshgrid_list_of_tensors_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_meshgrid_list_of_tensors_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_meshgrid_list_of_tensors_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_meshgrid_list_of_tensors_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_meshgrid_list_of_tensors_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_meshgrid_list_of_tensors_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_meshgrid_list_of_tensors_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_meshgrid_list_of_tensors_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_meshgrid_list_of_tensors_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_meshgrid_list_of_tensors_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_meshgrid_list_of_tensors_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_meshgrid_variadic_tensors_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_meshgrid_variadic_tensors_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_meshgrid_variadic_tensors_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_meshgrid_variadic_tensors_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_meshgrid_variadic_tensors_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_meshgrid_variadic_tensors_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_meshgrid_variadic_tensors_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_meshgrid_variadic_tensors_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_meshgrid_variadic_tensors_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_meshgrid_variadic_tensors_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_meshgrid_variadic_tensors_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_meshgrid_variadic_tensors_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_min_binary_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_min_binary_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_min_binary_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_min_binary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_min_binary_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_min_binary_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_min_binary_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_min_binary_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_min_binary_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_min_binary_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_min_reduction_no_dim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_min_reduction_no_dim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_min_reduction_no_dim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_min_reduction_no_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_min_reduction_no_dim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_min_reduction_no_dim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_min_reduction_no_dim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_min_reduction_no_dim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_min_reduction_no_dim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_min_reduction_no_dim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_min_reduction_with_dim_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_min_reduction_with_dim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_min_reduction_with_dim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_min_reduction_with_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_min_reduction_with_dim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_min_reduction_with_dim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_min_reduction_with_dim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_min_reduction_with_dim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_min_reduction_with_dim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_min_reduction_with_dim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_minimum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_minimum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_minimum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_minimum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_minimum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_minimum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_minimum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_minimum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_minimum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_minimum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mm_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mode_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mode_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mode_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mode_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mode_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mode_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mode_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mode_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mode_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mode_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_movedim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_movedim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_movedim_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_movedim_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_movedim_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_movedim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_movedim_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_movedim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_movedim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_movedim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_movedim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_movedim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_movedim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_msort_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_msort_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_msort_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_msort_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_msort_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_msort_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_msort_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_msort_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_msort_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_msort_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mul_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mul_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mul_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mul_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mul_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mul_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mul_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mul_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mul_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mul_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mul_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mul_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_multinomial_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_multinomial_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_multinomial_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_multinomial_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mv_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mv_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mvlgamma_mvlgamma_p_1_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mvlgamma_mvlgamma_p_1_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mvlgamma_mvlgamma_p_1_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mvlgamma_mvlgamma_p_1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mvlgamma_mvlgamma_p_1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mvlgamma_mvlgamma_p_1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mvlgamma_mvlgamma_p_1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mvlgamma_mvlgamma_p_1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mvlgamma_mvlgamma_p_3_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mvlgamma_mvlgamma_p_3_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mvlgamma_mvlgamma_p_3_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mvlgamma_mvlgamma_p_3_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mvlgamma_mvlgamma_p_3_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mvlgamma_mvlgamma_p_3_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mvlgamma_mvlgamma_p_3_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mvlgamma_mvlgamma_p_3_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mvlgamma_mvlgamma_p_5_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mvlgamma_mvlgamma_p_5_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mvlgamma_mvlgamma_p_5_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mvlgamma_mvlgamma_p_5_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mvlgamma_mvlgamma_p_5_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mvlgamma_mvlgamma_p_5_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mvlgamma_mvlgamma_p_5_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mvlgamma_mvlgamma_p_5_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nan_to_num_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nan_to_num_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nan_to_num_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nan_to_num_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nan_to_num_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nan_to_num_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nan_to_num_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nan_to_num_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nan_to_num_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nan_to_num_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nanmean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nanmean_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nanmean_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nanmean_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nanmean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nanmean_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nanmean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nanmedian_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nanmedian_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nanmedian_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nanmedian_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nanmedian_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nanmedian_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nanmedian_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nanmedian_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nanmedian_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nanquantile_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nanquantile_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nansum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nansum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nansum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nansum_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nansum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nansum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nansum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nansum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nansum_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nansum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nansum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nansum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nansum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_narrow_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_narrow_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_narrow_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_narrow_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_narrow_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_narrow_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_narrow_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_narrow_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_narrow_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_narrow_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_narrow_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_narrow_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_narrow_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_narrow_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_narrow_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_narrow_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_narrow_cuda_complex32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_narrow_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_narrow_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_narrow_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_narrow_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_narrow_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_narrow_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_narrow_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_narrow_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_narrow_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_native_batch_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_native_batch_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_native_batch_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_native_batch_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_native_dropout_backward_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_native_dropout_backward_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_native_dropout_backward_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_native_dropout_backward_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_native_layer_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_native_layer_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_native_layer_norm_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_native_layer_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ne_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ne_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ne_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ne_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ne_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ne_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ne_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ne_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ne_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ne_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ne_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ne_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_neg_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_neg_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_neg_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_neg_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_neg_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_neg_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_neg_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_neg_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_neg_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_neg_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_neg_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_neg_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_empty_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_empty_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_empty_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_empty_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_empty_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_empty_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_empty_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_empty_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_empty_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_empty_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_empty_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_empty_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_empty_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_empty_strided_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_empty_strided_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_empty_strided_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_empty_strided_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_empty_strided_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_empty_strided_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_empty_strided_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_empty_strided_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_empty_strided_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_empty_strided_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_empty_strided_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_empty_strided_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_empty_strided_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_full_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_full_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_full_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_full_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_full_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_full_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_full_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_full_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_full_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_full_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_full_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_full_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_full_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_ones_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_ones_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_ones_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_ones_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_ones_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_ones_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_ones_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_ones_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_ones_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_ones_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_ones_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_ones_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_ones_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_zeros_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_zeros_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_zeros_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_zeros_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_zeros_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_zeros_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_zeros_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_zeros_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_zeros_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_zeros_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_zeros_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_zeros_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_zeros_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nextafter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nextafter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nextafter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nextafter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_adaptive_avg_pool1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_adaptive_avg_pool1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_adaptive_avg_pool1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_adaptive_avg_pool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_adaptive_avg_pool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_adaptive_avg_pool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_adaptive_avg_pool3d_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_adaptive_avg_pool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_adaptive_avg_pool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_adaptive_max_pool1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_adaptive_max_pool1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_adaptive_max_pool1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_adaptive_max_pool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_adaptive_max_pool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_adaptive_max_pool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_adaptive_max_pool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_adaptive_max_pool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_adaptive_max_pool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_alpha_dropout_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_alpha_dropout_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_alpha_dropout_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_alpha_dropout_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_avg_pool1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_avg_pool1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_avg_pool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_avg_pool1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_avg_pool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_avg_pool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_avg_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_avg_pool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_avg_pool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_avg_pool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_avg_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_avg_pool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_batch_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_batch_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_batch_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_batch_norm_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_batch_norm_without_cudnn_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_batch_norm_without_cudnn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_batch_norm_without_cudnn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_batch_norm_without_cudnn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_bilinear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_bilinear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_bilinear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_bilinear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_binary_cross_entropy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_binary_cross_entropy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_binary_cross_entropy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_binary_cross_entropy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_binary_cross_entropy_with_logits_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_binary_cross_entropy_with_logits_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_binary_cross_entropy_with_logits_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_celu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_celu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_celu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_celu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_channel_shuffle_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_channel_shuffle_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_channel_shuffle_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_channel_shuffle_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_channel_shuffle_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_channel_shuffle_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_channel_shuffle_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_channel_shuffle_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_channel_shuffle_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_channel_shuffle_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_channel_shuffle_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_channel_shuffle_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv1d_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv1d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv1d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv2d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv2d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv2d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv3d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv3d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv3d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv3d_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv_transpose1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv_transpose1d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv_transpose1d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv_transpose1d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv_transpose1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv_transpose1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv_transpose1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv_transpose2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv_transpose2d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv_transpose2d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv_transpose2d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv_transpose2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv_transpose2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv_transpose2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv_transpose3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv_transpose3d_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv_transpose3d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv_transpose3d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv_transpose3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv_transpose3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv_transpose3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_cosine_embedding_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_cosine_embedding_loss_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_cosine_embedding_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_cosine_embedding_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_cosine_embedding_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_cosine_embedding_loss_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_cosine_embedding_loss_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_cosine_embedding_loss_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_cosine_embedding_loss_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_cosine_embedding_loss_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_cosine_similarity_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_cosine_similarity_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_cosine_similarity_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_cosine_similarity_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_cross_entropy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_cross_entropy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_cross_entropy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_cross_entropy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_ctc_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_ctc_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_dropout2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_dropout2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_dropout2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_dropout2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_dropout3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_dropout3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_dropout3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_dropout3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_dropout_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_dropout_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_dropout_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_dropout_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_elu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_elu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_elu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_elu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_embedding_bag_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_embedding_bag_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_embedding_bag_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_embedding_bag_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_embedding_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_embedding_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_embedding_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_embedding_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_feature_alpha_dropout_with_train_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_feature_alpha_dropout_with_train_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_feature_alpha_dropout_with_train_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_feature_alpha_dropout_with_train_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_fractional_max_pool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_fractional_max_pool2d_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_fractional_max_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_fractional_max_pool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_fractional_max_pool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_fractional_max_pool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_fractional_max_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_fractional_max_pool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_gaussian_nll_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_gaussian_nll_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_gaussian_nll_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_gaussian_nll_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_gelu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_gelu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_gelu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_gelu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_glu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_glu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_glu_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_glu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_grid_sample_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_grid_sample_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_grid_sample_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_grid_sample_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_group_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_group_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_group_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_group_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_hardshrink_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_hardshrink_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_hardshrink_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_hardshrink_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_hardsigmoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_hardsigmoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_hardsigmoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_hardsigmoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_hardswish_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_hardswish_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_hardswish_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_hardswish_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_hardtanh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_hardtanh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_hardtanh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_hardtanh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_hardtanh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_hardtanh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_hardtanh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_hardtanh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_hinge_embedding_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_hinge_embedding_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_hinge_embedding_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_hinge_embedding_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_huber_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_huber_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_huber_loss_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_huber_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_instance_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_instance_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_instance_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_instance_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_interpolate_area_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_interpolate_area_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_interpolate_area_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_interpolate_area_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_interpolate_bicubic_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_interpolate_bicubic_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_interpolate_bicubic_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_interpolate_bicubic_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_interpolate_bilinear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_interpolate_bilinear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_interpolate_bilinear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_interpolate_bilinear_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_interpolate_linear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_interpolate_linear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_interpolate_linear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_interpolate_linear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_interpolate_nearest-exact_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_interpolate_nearest-exact_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_interpolate_nearest-exact_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_interpolate_nearest-exact_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_interpolate_nearest_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_interpolate_nearest_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_interpolate_nearest_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_interpolate_nearest_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_interpolate_nearest_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_interpolate_trilinear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_interpolate_trilinear_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_interpolate_trilinear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_interpolate_trilinear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_kl_div_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_kl_div_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_kl_div_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_kl_div_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_l1_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_l1_loss_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_l1_loss_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_l1_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_l1_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_l1_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_layer_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_layer_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_layer_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_layer_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_leaky_relu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_leaky_relu_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_leaky_relu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_leaky_relu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_linear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_linear_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_linear_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_linear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_linear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_linear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_local_response_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_local_response_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_local_response_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_local_response_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_logsigmoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_logsigmoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_logsigmoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_logsigmoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_margin_ranking_loss_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_margin_ranking_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_margin_ranking_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_margin_ranking_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_margin_ranking_loss_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_margin_ranking_loss_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_margin_ranking_loss_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_margin_ranking_loss_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_margin_ranking_loss_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_pool1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_pool1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_pool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_pool1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_pool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_pool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_pool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_pool3d_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_pool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_pool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_unpool1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_unpool1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_unpool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_unpool1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_unpool1d_grad_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_unpool1d_grad_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_unpool1d_grad_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_unpool1d_grad_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_unpool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_unpool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_unpool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_unpool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_unpool2d_grad_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_unpool2d_grad_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_unpool2d_grad_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_unpool2d_grad_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_unpool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_unpool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_unpool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_unpool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_unpool3d_grad_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_unpool3d_grad_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_unpool3d_grad_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_unpool3d_grad_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_mish_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_mish_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_mish_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_mish_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_mse_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_mse_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_mse_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_mse_loss_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_multi_head_attention_forward_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_multi_head_attention_forward_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_multi_head_attention_forward_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_multi_head_attention_forward_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_multi_margin_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_multi_margin_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_multi_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_multi_margin_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_multilabel_margin_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_multilabel_margin_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_multilabel_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_multilabel_margin_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_multilabel_soft_margin_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_multilabel_soft_margin_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_multilabel_soft_margin_loss_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_nll_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_nll_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_nll_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_nll_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_normalize_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_normalize_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_normalize_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_normalize_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_normalize_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_normalize_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_one_hot_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_circular_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_circular_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_circular_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_circular_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_circular_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_circular_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_circular_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_circular_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_circular_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_circular_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_circular_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_circular_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_constant_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_constant_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_constant_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_constant_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_constant_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_constant_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_constant_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_constant_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_constant_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_constant_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_constant_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_constant_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_reflect_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_reflect_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_reflect_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_reflect_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_reflect_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_reflect_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_reflect_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_reflect_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_reflect_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_reflect_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_reflect_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_replicate_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_replicate_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_replicate_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_replicate_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_replicate_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_replicate_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_replicate_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_replicate_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_replicate_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_replicate_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_replicate_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_replicate_negative_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_replicate_negative_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_replicate_negative_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_replicate_negative_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_replicate_negative_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_replicate_negative_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_replicate_negative_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_replicate_negative_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_replicate_negative_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_replicate_negative_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_replicate_negative_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pairwise_distance_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pairwise_distance_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pairwise_distance_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pairwise_distance_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pairwise_distance_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pairwise_distance_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pairwise_distance_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pairwise_distance_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pairwise_distance_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pairwise_distance_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pairwise_distance_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pdist_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pdist_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pixel_shuffle_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pixel_shuffle_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pixel_shuffle_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pixel_shuffle_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pixel_shuffle_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pixel_shuffle_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pixel_shuffle_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pixel_shuffle_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pixel_shuffle_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pixel_shuffle_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pixel_shuffle_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pixel_shuffle_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pixel_unshuffle_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pixel_unshuffle_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pixel_unshuffle_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pixel_unshuffle_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pixel_unshuffle_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pixel_unshuffle_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pixel_unshuffle_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pixel_unshuffle_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pixel_unshuffle_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pixel_unshuffle_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pixel_unshuffle_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pixel_unshuffle_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_poisson_nll_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_poisson_nll_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_poisson_nll_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_poisson_nll_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_poisson_nll_loss_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_poisson_nll_loss_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_poisson_nll_loss_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_poisson_nll_loss_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_poisson_nll_loss_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_prelu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_prelu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_prelu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_prelu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_relu6_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_relu6_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_relu6_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_relu6_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_relu6_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_relu6_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_relu6_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_relu6_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_relu6_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_relu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_relu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_relu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_relu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_relu_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_relu_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_relu_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_relu_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_relu_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_rms_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_rms_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_rms_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_rms_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_rms_norm_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_rms_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_rrelu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_rrelu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_rrelu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_rrelu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_scaled_dot_product_attention_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_scaled_dot_product_attention_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_scaled_dot_product_attention_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_selu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_selu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_selu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_selu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_silu_complex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_silu_complex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_silu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_silu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_silu_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_silu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_smooth_l1_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_smooth_l1_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_smooth_l1_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_smooth_l1_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_soft_margin_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_soft_margin_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_soft_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_soft_margin_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softmin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softmin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softmin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softmin_with_dtype_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softmin_with_dtype_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softmin_with_dtype_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softmin_with_dtype_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softmin_with_dtype_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softmin_with_dtype_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softmin_with_dtype_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softmin_with_dtype_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softmin_with_dtype_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softmin_with_dtype_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softmin_with_dtype_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softplus_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softplus_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softplus_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softplus_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softshrink_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softshrink_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softshrink_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softshrink_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softsign_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softsign_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softsign_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softsign_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softsign_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softsign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softsign_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softsign_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softsign_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softsign_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softsign_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softsign_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_tanhshrink_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_tanhshrink_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_tanhshrink_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_tanhshrink_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_tanhshrink_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_tanhshrink_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_tanhshrink_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_tanhshrink_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_tanhshrink_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_tanhshrink_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_tanhshrink_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_threshold_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_threshold_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_threshold_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_threshold_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_threshold_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_threshold_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_threshold_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_threshold_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_threshold_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_triplet_margin_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_triplet_margin_loss_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_triplet_margin_loss_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_triplet_margin_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_triplet_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_triplet_margin_loss_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_triplet_margin_loss_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_triplet_margin_loss_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_triplet_margin_loss_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_triplet_margin_loss_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_triplet_margin_loss_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_unfold_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_unfold_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_unfold_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_unfold_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_unfold_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_unfold_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_unfold_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_upsample_bilinear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_upsample_bilinear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_upsample_bilinear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_upsample_bilinear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_upsample_nearest_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_upsample_nearest_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_upsample_nearest_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_upsample_nearest_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_upsample_nearest_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nonzero_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nonzero_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nonzero_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nonzero_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nonzero_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nonzero_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nonzero_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nonzero_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nonzero_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nonzero_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nonzero_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nonzero_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nonzero_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nonzero_static_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nonzero_static_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nonzero_static_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nonzero_static_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nonzero_static_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nonzero_static_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nonzero_static_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nonzero_static_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nonzero_static_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nonzero_static_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nonzero_static_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nonzero_static_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nonzero_static_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_norm_fro_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_norm_fro_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_norm_fro_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_norm_fro_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_norm_fro_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_norm_fro_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_norm_inf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_norm_inf_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_norm_inf_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_norm_inf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_norm_inf_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_norm_inf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_norm_nuc_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_norm_nuc_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_norm_nuc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_norm_nuc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_normal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_normal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_normal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_normal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_normal_in_place_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_normal_in_place_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_normal_in_place_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_normal_in_place_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_normal_in_place_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_normal_in_place_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_normal_number_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_normal_number_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_normal_number_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_normal_number_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ones_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ones_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ones_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ones_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ones_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ones_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ones_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ones_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ones_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ones_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ones_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ones_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ones_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ones_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ones_like_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ones_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ones_like_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ones_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ones_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ones_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ones_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ones_like_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ones_like_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ones_like_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ones_like_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ones_like_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ormqr_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ormqr_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ormqr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ormqr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_outer_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_outer_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_outer_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_outer_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_outer_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_outer_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_outer_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_outer_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_outer_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_outer_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_outer_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_outer_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_pca_lowrank_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_pca_lowrank_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_pca_lowrank_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_pca_lowrank_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_permute_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_permute_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_permute_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_permute_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_permute_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_permute_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_permute_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_permute_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_permute_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_permute_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_permute_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_permute_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_permute_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_permute_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_permute_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_permute_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_permute_cuda_complex32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_permute_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_permute_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_permute_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_permute_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_permute_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_permute_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_permute_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_permute_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_permute_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_pinverse_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_pinverse_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_pinverse_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_pinverse_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polar_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polar_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_0_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_0_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_0_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_1_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_1_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_2_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_3_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_3_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_3_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_3_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_3_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_3_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_3_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_3_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_3_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_3_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_4_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_4_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_4_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_4_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_4_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_4_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_4_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_4_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_4_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_4_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_positive_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_positive_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_positive_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_positive_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_positive_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_positive_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_positive_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_positive_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_positive_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_positive_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_positive_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_positive_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_pow_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_pow_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_pow_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_pow_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_pow_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_pow_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_pow_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_pow_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_pow_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_pow_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_pow_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_pow_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_prod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_prod_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_prod_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_prod_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_prod_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_prod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_prod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_prod_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_prod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_prod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_prod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_prod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_put_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_put_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_put_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_put_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_put_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_put_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_put_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_put_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_put_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_put_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_put_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_put_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_qr_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_qr_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_qr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_qr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_quantile_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_quantile_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rad2deg_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rad2deg_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rad2deg_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rad2deg_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rad2deg_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rad2deg_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rad2deg_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rad2deg_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rad2deg_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rad2deg_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rand_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rand_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rand_like_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rand_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rand_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rand_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rand_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_randint_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_randint_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_randint_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_randint_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_randint_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_randint_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_randint_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_randint_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_randint_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_randint_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_randint_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_randint_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_randint_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_randint_like_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_randint_like_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_randint_like_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_randint_like_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_randint_like_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_randn_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_randn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_randn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_randn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_randn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_randn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_randn_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_randn_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_randn_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_randn_like_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_randn_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_randn_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_randn_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_randn_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ravel_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ravel_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ravel_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ravel_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ravel_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ravel_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ravel_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ravel_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ravel_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ravel_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ravel_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ravel_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ravel_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_real_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_real_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_real_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_real_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_real_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_real_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_real_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_real_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_real_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_real_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_real_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_real_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_real_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reciprocal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reciprocal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reciprocal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reciprocal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reciprocal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reciprocal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reciprocal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reciprocal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reciprocal_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reciprocal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reciprocal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reciprocal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_remainder_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_remainder_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_remainder_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_remainder_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_remainder_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_remainder_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_remainder_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_remainder_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_remainder_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_renorm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_renorm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_renorm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_renorm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_renorm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_renorm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_repeat_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_repeat_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_repeat_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_repeat_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_repeat_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_repeat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_repeat_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_repeat_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_repeat_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_repeat_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_repeat_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_repeat_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_repeat_interleave_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_repeat_interleave_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_repeat_interleave_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_repeat_interleave_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_repeat_interleave_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_repeat_interleave_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_repeat_interleave_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_repeat_interleave_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_repeat_interleave_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_repeat_interleave_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_repeat_interleave_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_repeat_interleave_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_repeat_interleave_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reshape_as_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reshape_as_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reshape_as_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reshape_as_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reshape_as_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reshape_as_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reshape_as_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reshape_as_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reshape_as_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reshape_as_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reshape_as_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reshape_as_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reshape_as_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reshape_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reshape_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reshape_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reshape_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reshape_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reshape_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reshape_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reshape_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reshape_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reshape_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reshape_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reshape_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reshape_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resize__cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resize__cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resize__cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resize__cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resize__cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resize__cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resize__cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resize__cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resize__cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resize__cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resize__cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resize__cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resize_as__cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resize_as__cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resize_as__cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resize_as__cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resize_as__cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resize_as__cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resize_as__cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resize_as__cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resize_as__cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resize_as__cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resize_as__cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resize_as__cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resolve_conj_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resolve_conj_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resolve_conj_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resolve_conj_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resolve_conj_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resolve_conj_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resolve_conj_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resolve_conj_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resolve_conj_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resolve_conj_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resolve_conj_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resolve_conj_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resolve_neg_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resolve_neg_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resolve_neg_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resolve_neg_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resolve_neg_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resolve_neg_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resolve_neg_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resolve_neg_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resolve_neg_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resolve_neg_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resolve_neg_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resolve_neg_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resolve_neg_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_roll_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_roll_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_roll_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_roll_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_roll_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_roll_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_roll_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_roll_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_roll_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_roll_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_roll_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_roll_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_roll_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rot90_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rot90_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rot90_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rot90_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rot90_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rot90_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rot90_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rot90_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rot90_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rot90_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rot90_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rot90_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_round_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_round_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_round_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_round_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_round_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_round_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_round_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_round_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_round_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_round_decimals_0_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_round_decimals_0_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_round_decimals_0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_round_decimals_0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_round_decimals_3_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_round_decimals_3_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_round_decimals_3_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_round_decimals_3_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_round_decimals_neg_3_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_round_decimals_neg_3_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_round_decimals_neg_3_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_round_decimals_neg_3_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rsqrt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rsqrt_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rsqrt_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rsqrt_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rsqrt_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rsqrt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rsqrt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rsqrt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rsqrt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rsqrt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rsqrt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rsqrt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rsqrt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rsub_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rsub_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rsub_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rsub_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rsub_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rsub_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rsub_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rsub_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rsub_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rsub_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rsub_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scalar_tensor_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scalar_tensor_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scalar_tensor_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scalar_tensor_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scalar_tensor_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scalar_tensor_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scalar_tensor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scalar_tensor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scalar_tensor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scalar_tensor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scalar_tensor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scalar_tensor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scalar_tensor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_add_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_add_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_add_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_add_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_add_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_add_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_add_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_add_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_add_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_add_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_add_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_amax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_amax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_amax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_amax_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_amax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_amax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_amax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_amax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_amax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_amin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_amin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_amin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_amin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_amin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_amin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_amin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_amin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_amin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_mean_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_mean_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_mean_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_mean_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_mean_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_prod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_prod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_prod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_prod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_prod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_prod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_prod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_prod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_sum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_sum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_sum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_sum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_sum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_sum_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_sum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_sum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_sum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_sum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_searchsorted_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_searchsorted_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_searchsorted_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_searchsorted_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_searchsorted_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_searchsorted_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_searchsorted_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_searchsorted_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_searchsorted_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_select_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_select_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_select_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_select_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_select_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_select_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_select_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_select_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_select_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_select_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_select_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_select_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_select_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_select_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_select_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_select_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_select_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_select_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_select_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_select_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_select_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_select_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_select_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sgn_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sgn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sgn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sgn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sgn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sgn_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sgn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sgn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sgn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sgn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sgn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sgn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sgn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_short_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_short_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_short_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_short_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_short_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_short_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_short_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_short_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_short_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_short_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_short_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_short_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sigmoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sigmoid_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sigmoid_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sigmoid_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sigmoid_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sigmoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sigmoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sigmoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sigmoid_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sigmoid_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sigmoid_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sigmoid_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sigmoid_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sign_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sign_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sign_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sign_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sign_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sign_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sign_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sign_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sign_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_signal_windows_bartlett_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_signal_windows_bartlett_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_signal_windows_blackman_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_signal_windows_blackman_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_signal_windows_cosine_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_signal_windows_cosine_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_signal_windows_exponential_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_signal_windows_exponential_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_signal_windows_gaussian_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_signal_windows_gaussian_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_signal_windows_general_cosine_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_signal_windows_general_cosine_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_signal_windows_general_hamming_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_signal_windows_general_hamming_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_signal_windows_hamming_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_signal_windows_hamming_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_signal_windows_hann_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_signal_windows_hann_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_signal_windows_kaiser_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_signal_windows_kaiser_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_signal_windows_nuttall_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_signal_windows_nuttall_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_signbit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_signbit_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_signbit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_signbit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_signbit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_signbit_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_signbit_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_signbit_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_signbit_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_signbit_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sin_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sin_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sin_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sin_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sinc_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sinc_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sinc_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sinc_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sinc_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sinc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sinc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sinc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sinc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sinc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sinc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sinc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sinh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sinh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sinh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sinh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sinh_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sinh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sinh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sinh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sinh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sinh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sinh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sinh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sinh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_slice_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_slice_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_slice_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_slice_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_slice_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_slice_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_slice_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_slice_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_slice_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_slice_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_slice_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_slice_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_slice_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_slice_scatter_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_slice_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_slice_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_slice_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_slice_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_slice_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_slice_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_slice_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_slice_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_slice_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_softmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_softmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_softmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_softmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_softmax_with_dtype_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_softmax_with_dtype_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_softmax_with_dtype_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_softmax_with_dtype_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_softmax_with_dtype_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_softmax_with_dtype_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_softmax_with_dtype_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_softmax_with_dtype_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_softmax_with_dtype_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_softmax_with_dtype_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_softmax_with_dtype_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_softmax_with_dtype_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sort_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sort_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sort_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sort_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sort_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sort_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sort_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sort_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sort_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sort_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sparse_mm_reduce_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sparse_mm_reduce_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sparse_mm_reduce_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sparse_mm_reduce_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sparse_sampled_addmm_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sparse_sampled_addmm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sparse_sampled_addmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sparse_sampled_addmm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_airy_ai_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_airy_ai_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_airy_ai_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_airy_ai_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_airy_ai_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_airy_ai_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_airy_ai_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_airy_ai_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_bessel_j0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_bessel_j0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_bessel_j0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_bessel_j0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_bessel_j0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_bessel_j0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_bessel_j0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_bessel_j0_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_bessel_j1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_bessel_j1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_bessel_j1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_bessel_j1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_bessel_j1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_bessel_j1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_bessel_j1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_bessel_j1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_bessel_y0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_bessel_y0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_bessel_y0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_bessel_y0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_bessel_y0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_bessel_y0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_bessel_y0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_bessel_y0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_bessel_y1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_bessel_y1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_bessel_y1_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_bessel_y1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_bessel_y1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_bessel_y1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_bessel_y1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_bessel_y1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_chebyshev_polynomial_t_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_chebyshev_polynomial_t_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_chebyshev_polynomial_t_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_chebyshev_polynomial_t_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_chebyshev_polynomial_t_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_chebyshev_polynomial_t_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_chebyshev_polynomial_t_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_chebyshev_polynomial_t_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_chebyshev_polynomial_u_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_chebyshev_polynomial_u_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_chebyshev_polynomial_u_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_chebyshev_polynomial_u_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_chebyshev_polynomial_u_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_chebyshev_polynomial_u_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_chebyshev_polynomial_u_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_chebyshev_polynomial_u_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_chebyshev_polynomial_v_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_chebyshev_polynomial_v_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_chebyshev_polynomial_v_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_chebyshev_polynomial_v_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_chebyshev_polynomial_v_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_chebyshev_polynomial_v_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_chebyshev_polynomial_v_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_chebyshev_polynomial_v_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_chebyshev_polynomial_w_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_chebyshev_polynomial_w_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_chebyshev_polynomial_w_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_chebyshev_polynomial_w_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_chebyshev_polynomial_w_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_chebyshev_polynomial_w_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_chebyshev_polynomial_w_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_chebyshev_polynomial_w_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_entr_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_entr_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_entr_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_entr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_entr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_entr_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_entr_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_entr_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_entr_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_entr_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_erfcx_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_erfcx_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_erfcx_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_erfcx_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_erfcx_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_erfcx_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_erfcx_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_erfcx_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_hermite_polynomial_h_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_hermite_polynomial_h_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_hermite_polynomial_h_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_hermite_polynomial_h_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_hermite_polynomial_h_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_hermite_polynomial_h_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_hermite_polynomial_h_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_hermite_polynomial_h_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_hermite_polynomial_he_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_hermite_polynomial_he_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_hermite_polynomial_he_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_hermite_polynomial_he_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_hermite_polynomial_he_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_hermite_polynomial_he_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_hermite_polynomial_he_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_hermite_polynomial_he_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_i0e_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_i0e_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_i0e_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_i0e_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_i0e_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_i0e_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_i0e_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_i0e_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_i0e_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_i0e_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_i1_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_i1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_i1_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_i1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_i1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_i1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_i1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_i1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_i1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_i1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_i1e_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_i1e_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_i1e_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_i1e_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_i1e_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_i1e_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_i1e_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_i1e_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_i1e_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_i1e_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_laguerre_polynomial_l_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_laguerre_polynomial_l_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_laguerre_polynomial_l_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_laguerre_polynomial_l_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_laguerre_polynomial_l_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_laguerre_polynomial_l_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_laguerre_polynomial_l_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_laguerre_polynomial_l_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_legendre_polynomial_p_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_legendre_polynomial_p_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_legendre_polynomial_p_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_legendre_polynomial_p_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_legendre_polynomial_p_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_legendre_polynomial_p_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_legendre_polynomial_p_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_legendre_polynomial_p_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_log_ndtr_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_log_ndtr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_log_ndtr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_log_ndtr_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_log_ndtr_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_log_ndtr_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_log_ndtr_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_log_ndtr_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_modified_bessel_i0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_modified_bessel_i0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_modified_bessel_i0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_modified_bessel_i0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_modified_bessel_i0_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_modified_bessel_i0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_modified_bessel_i0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_modified_bessel_i0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_modified_bessel_i1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_modified_bessel_i1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_modified_bessel_i1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_modified_bessel_i1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_modified_bessel_i1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_modified_bessel_i1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_modified_bessel_i1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_modified_bessel_i1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_modified_bessel_k0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_modified_bessel_k0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_modified_bessel_k0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_modified_bessel_k0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_modified_bessel_k0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_modified_bessel_k0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_modified_bessel_k0_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_modified_bessel_k0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_modified_bessel_k1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_modified_bessel_k1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_modified_bessel_k1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_modified_bessel_k1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_modified_bessel_k1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_modified_bessel_k1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_modified_bessel_k1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_modified_bessel_k1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_ndtr_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_ndtr_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_ndtr_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_ndtr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_ndtr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_ndtr_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_ndtr_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_ndtr_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_ndtr_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_ndtr_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_ndtri_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_ndtri_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_ndtri_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_ndtri_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_ndtri_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_ndtri_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_ndtri_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_ndtri_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_polygamma_special_polygamma_n_0_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_polygamma_special_polygamma_n_0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_polygamma_special_polygamma_n_0_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_polygamma_special_polygamma_n_0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_polygamma_special_polygamma_n_0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_polygamma_special_polygamma_n_0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_polygamma_special_polygamma_n_0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_polygamma_special_polygamma_n_0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_polygamma_special_polygamma_n_0_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_scaled_modified_bessel_k0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_scaled_modified_bessel_k0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_scaled_modified_bessel_k0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_scaled_modified_bessel_k0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_scaled_modified_bessel_k0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_scaled_modified_bessel_k0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_scaled_modified_bessel_k0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_scaled_modified_bessel_k0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_scaled_modified_bessel_k1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_scaled_modified_bessel_k1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_scaled_modified_bessel_k1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_scaled_modified_bessel_k1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_scaled_modified_bessel_k1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_scaled_modified_bessel_k1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_scaled_modified_bessel_k1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_scaled_modified_bessel_k1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_shifted_chebyshev_polynomial_t_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_shifted_chebyshev_polynomial_t_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_shifted_chebyshev_polynomial_t_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_shifted_chebyshev_polynomial_t_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_shifted_chebyshev_polynomial_t_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_shifted_chebyshev_polynomial_t_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_shifted_chebyshev_polynomial_t_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_shifted_chebyshev_polynomial_u_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_shifted_chebyshev_polynomial_u_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_shifted_chebyshev_polynomial_u_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_shifted_chebyshev_polynomial_u_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_shifted_chebyshev_polynomial_u_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_shifted_chebyshev_polynomial_u_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_shifted_chebyshev_polynomial_u_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_shifted_chebyshev_polynomial_v_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_shifted_chebyshev_polynomial_v_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_shifted_chebyshev_polynomial_v_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_shifted_chebyshev_polynomial_v_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_shifted_chebyshev_polynomial_v_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_shifted_chebyshev_polynomial_v_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_shifted_chebyshev_polynomial_v_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_shifted_chebyshev_polynomial_w_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_shifted_chebyshev_polynomial_w_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_shifted_chebyshev_polynomial_w_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_shifted_chebyshev_polynomial_w_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_shifted_chebyshev_polynomial_w_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_shifted_chebyshev_polynomial_w_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_shifted_chebyshev_polynomial_w_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_spherical_bessel_j0_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_spherical_bessel_j0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_spherical_bessel_j0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_spherical_bessel_j0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_spherical_bessel_j0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_spherical_bessel_j0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_spherical_bessel_j0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_spherical_bessel_j0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_xlog1py_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_xlog1py_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_xlog1py_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_xlog1py_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_xlog1py_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_xlog1py_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_xlog1py_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_xlog1py_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_xlog1py_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_xlog1py_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_zeta_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_zeta_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_zeta_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_zeta_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_zeta_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_zeta_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_zeta_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_zeta_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_list_args_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_list_args_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_list_args_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_list_args_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_list_args_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_list_args_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_list_args_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_list_args_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_list_args_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_list_args_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_list_args_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_list_args_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_with_sizes_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_with_sizes_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_with_sizes_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_with_sizes_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_with_sizes_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_with_sizes_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_with_sizes_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_with_sizes_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_with_sizes_copy_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_with_sizes_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_with_sizes_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_with_sizes_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_with_sizes_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_with_sizes_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_with_sizes_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_with_sizes_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_with_sizes_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_with_sizes_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_with_sizes_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_with_sizes_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_with_sizes_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_with_sizes_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_with_sizes_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_with_sizes_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_with_sizes_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_with_sizes_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sqrt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sqrt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sqrt_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sqrt_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sqrt_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sqrt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sqrt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sqrt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sqrt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sqrt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sqrt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sqrt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sqrt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_square_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_square_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_square_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_square_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_square_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_square_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_square_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_square_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_square_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_square_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_square_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_square_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_multiple_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_multiple_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_multiple_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_multiple_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_multiple_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_multiple_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_multiple_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_multiple_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_multiple_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_multiple_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_multiple_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_multiple_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_multiple_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_stack_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_stack_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_stack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_stack_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_stack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_stack_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_stack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_stack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_stack_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_stack_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_stack_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_stack_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_stack_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_std_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_std_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_std_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_std_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_std_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_std_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_std_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_std_mean_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_std_mean_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_std_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_std_mean_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_std_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_std_mean_unbiased_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_std_mean_unbiased_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_std_mean_unbiased_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_std_mean_unbiased_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_std_mean_unbiased_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_std_mean_unbiased_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_std_unbiased_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_std_unbiased_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_std_unbiased_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_std_unbiased_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_std_unbiased_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_std_unbiased_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_stft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_stft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_stft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_stft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sub_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sub_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sub_cuda_complex32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sub_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sub_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sub_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sub_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sub_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sub_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sub_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sub_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sub_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sum_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sum_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sum_to_size_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sum_to_size_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sum_to_size_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sum_to_size_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sum_to_size_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sum_to_size_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sum_to_size_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sum_to_size_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sum_to_size_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sum_to_size_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sum_to_size_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sum_to_size_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_svd_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_svd_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_svd_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_svd_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_svd_lowrank_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_svd_lowrank_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_svd_lowrank_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_svd_lowrank_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_t_copy_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_t_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_t_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_t_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_t_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_t_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_t_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_t_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_t_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_t_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_t_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_t_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_t_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_t_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_t_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_t_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_t_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_t_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_t_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_t_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_t_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_t_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_t_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_t_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_take_along_dim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_take_along_dim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_take_along_dim_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_take_along_dim_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_take_along_dim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_take_along_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_take_along_dim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_take_along_dim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_take_along_dim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_take_along_dim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_take_along_dim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_take_along_dim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_take_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_take_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_take_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_take_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_take_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_take_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_take_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_take_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_take_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_take_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_take_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_take_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tan_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tan_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tan_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tan_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tan_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tan_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tan_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tan_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tan_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tan_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tan_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tan_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tanh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tanh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tanh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tanh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tanh_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tanh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tanh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tanh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tanh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tanh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tanh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tanh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tanh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tensor_split_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tensor_split_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tensor_split_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tensor_split_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tensor_split_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tensor_split_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tensor_split_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tensor_split_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tensor_split_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tensor_split_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tensor_split_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tensor_split_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tensordot_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tensordot_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tensordot_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tensordot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tensordot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tensordot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tile_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tile_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tile_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tile_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tile_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tile_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tile_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tile_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tile_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tile_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tile_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tile_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_to_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_to_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_to_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_to_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_to_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_to_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_to_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_to_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_to_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_to_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_to_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_to_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_to_sparse_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_to_sparse_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_to_sparse_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_to_sparse_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_to_sparse_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_to_sparse_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_to_sparse_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_to_sparse_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_to_sparse_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_to_sparse_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_to_sparse_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_to_sparse_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_topk_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_topk_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_topk_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_topk_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_topk_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_topk_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_topk_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_topk_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_topk_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_torch__scaled_mm_cuda_float8_e4m3fn, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_torch_ops_aten__efficient_attention_forward_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_torch_ops_aten__efficient_attention_forward_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_torch_ops_aten__efficient_attention_forward_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_torch_ops_aten__flash_attention_forward_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_torch_ops_aten__flash_attention_forward_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_torch_ops_aten__safe_softmax_default_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_torch_ops_aten__safe_softmax_default_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_torch_ops_aten__safe_softmax_default_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_torch_ops_aten__safe_softmax_default_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_torch_ops_aten__safe_softmax_default_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_torch_ops_aten__safe_softmax_default_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_torch_ops_aten__safe_softmax_default_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_torch_ops_aten__safe_softmax_default_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_torch_ops_aten__safe_softmax_default_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trace_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trace_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trace_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trace_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trace_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trace_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trace_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trace_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trace_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trace_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trace_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trace_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trace_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_transpose_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_transpose_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_transpose_copy_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_transpose_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_transpose_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_transpose_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_transpose_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_transpose_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_transpose_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_transpose_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_transpose_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_transpose_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_transpose_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_transpose_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_transpose_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_transpose_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_transpose_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_transpose_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_transpose_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_transpose_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_transpose_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_transpose_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_transpose_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_transpose_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_transpose_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_transpose_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trapezoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trapezoid_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trapezoid_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trapezoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trapezoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trapezoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trapezoid_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trapezoid_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trapezoid_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trapezoid_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trapezoid_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trapz_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trapz_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trapz_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trapz_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trapz_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trapz_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trapz_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trapz_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trapz_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trapz_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trapz_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_triangular_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_triangular_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_triangular_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_triangular_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tril_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tril_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tril_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tril_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tril_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tril_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tril_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tril_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tril_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tril_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tril_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tril_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tril_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tril_indices_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tril_indices_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_triu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_triu_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_triu_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_triu_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_triu_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_triu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_triu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_triu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_triu_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_triu_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_triu_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_triu_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_triu_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_triu_indices_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_triu_indices_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_true_divide_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_true_divide_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_true_divide_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_true_divide_cuda_complex32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_true_divide_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_true_divide_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_true_divide_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_true_divide_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_true_divide_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_true_divide_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_true_divide_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_true_divide_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_true_divide_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trunc_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trunc_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trunc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trunc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trunc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trunc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trunc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trunc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trunc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unbind_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unbind_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unbind_copy_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unbind_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unbind_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unbind_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unbind_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unbind_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unbind_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unbind_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unbind_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unbind_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unbind_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unbind_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unbind_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unbind_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unbind_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unbind_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unbind_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unbind_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unbind_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unbind_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unbind_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unbind_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unbind_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unbind_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unflatten_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unflatten_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unflatten_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unflatten_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unflatten_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unflatten_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unflatten_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unflatten_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unflatten_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unflatten_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unflatten_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unflatten_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unflatten_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unfold_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unfold_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unfold_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unfold_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unfold_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unfold_copy_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unfold_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unfold_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unfold_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unfold_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unfold_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unfold_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unfold_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unfold_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unfold_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unfold_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unfold_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unfold_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unfold_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unfold_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unfold_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unfold_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unfold_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unfold_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unfold_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unfold_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_uniform_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_uniform_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_uniform_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_uniform_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_uniform_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_uniform_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unique_consecutive_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unique_consecutive_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unique_consecutive_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unique_consecutive_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unique_consecutive_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unique_consecutive_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unique_consecutive_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unique_consecutive_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unique_consecutive_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unique_consecutive_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unique_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unique_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unique_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unique_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unique_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unique_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unique_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unique_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unique_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unique_cuda_uint16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unique_cuda_uint32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unique_cuda_uint64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unique_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unravel_index_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unravel_index_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unravel_index_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unravel_index_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unravel_index_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsafe_chunk_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsafe_chunk_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsafe_chunk_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsafe_chunk_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsafe_chunk_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsafe_chunk_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsafe_chunk_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsafe_chunk_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsafe_chunk_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsafe_chunk_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsafe_chunk_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsafe_chunk_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsafe_chunk_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsafe_split_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsafe_split_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsafe_split_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsafe_split_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsafe_split_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsafe_split_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsafe_split_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsafe_split_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsafe_split_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsafe_split_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsafe_split_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsafe_split_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsafe_split_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsqueeze_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsqueeze_copy_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsqueeze_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsqueeze_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsqueeze_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsqueeze_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsqueeze_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsqueeze_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsqueeze_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsqueeze_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsqueeze_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsqueeze_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsqueeze_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsqueeze_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsqueeze_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsqueeze_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsqueeze_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsqueeze_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsqueeze_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsqueeze_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsqueeze_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsqueeze_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsqueeze_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsqueeze_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsqueeze_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsqueeze_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_var_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_var_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_var_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_var_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_var_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_var_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_var_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_var_mean_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_var_mean_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_var_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_var_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_var_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_var_mean_unbiased_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_var_mean_unbiased_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_var_mean_unbiased_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_var_mean_unbiased_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_var_mean_unbiased_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_var_mean_unbiased_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_var_unbiased_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_var_unbiased_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_var_unbiased_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_var_unbiased_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_var_unbiased_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_var_unbiased_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_vdot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_vdot_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_vdot_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_vdot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_vdot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_vdot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_as_complex_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_as_complex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_as_complex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_as_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_as_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_as_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_as_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_as_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_as_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_as_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_as_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_as_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_as_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_as_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_as_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_as_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_as_real_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_as_real_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_copy_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_vsplit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_vsplit_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_vsplit_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_vsplit_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_vsplit_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_vsplit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_vsplit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_vsplit_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_vsplit_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_vsplit_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_vsplit_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_vsplit_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_vsplit_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_vstack_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_vstack_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_vstack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_vstack_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_vstack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_vstack_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_vstack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_vstack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_vstack_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_vstack_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_vstack_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_vstack_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_vstack_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_where_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_where_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_where_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_where_cuda_complex32, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_where_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_where_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_where_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_where_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_where_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_where_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_where_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_where_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_where_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_xlogy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_xlogy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_xlogy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_xlogy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_xlogy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_xlogy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_xlogy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_xlogy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_xlogy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_xlogy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zero__cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zero__cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zero__cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zero__cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zero__cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zero__cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zero__cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zero__cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zero__cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zero__cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zero__cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zero__cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zeros_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zeros_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zeros_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zeros_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zeros_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zeros_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zeros_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zeros_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zeros_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zeros_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zeros_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zeros_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zeros_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zeros_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zeros_like_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zeros_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zeros_like_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zeros_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zeros_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zeros_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zeros_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zeros_like_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zeros_like_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zeros_like_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zeros_like_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zeros_like_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_embedding_bag_byte_prepack_cuda, test/test_meta.py::TestMetaCUDA::test_embedding_bag_byte_unpack_cuda, test/test_meta.py::TestMetaCUDA::test_embedding_bag_dense_backward_mode_1_cuda, test/test_meta.py::TestMetaCUDA::test_embedding_bag_dense_backward_mode_2_cuda, test/test_meta.py::TestMetaCUDA::test_embedding_bag_dense_backward_per_sample_weights_cuda, test/test_meta.py::TestMetaCUDA::test_empty_quantized_cuda, test/test_meta.py::TestMetaCUDA::test_fill__alias_relationship_cuda, test/test_meta.py::TestMetaCUDA::test_fill_stride_cuda, test/test_meta.py::TestMetaCUDA::test_group_norm_backward_output_mask0_cuda, test/test_meta.py::TestMetaCUDA::test_group_norm_backward_output_mask1_cuda, 
test/test_meta.py::TestMetaCUDA::test_group_norm_backward_output_mask2_cuda, test/test_meta.py::TestMetaCUDA::test_group_norm_backward_output_mask3_cuda, test/test_meta.py::TestMetaCUDA::test_group_norm_backward_output_mask4_cuda, test/test_meta.py::TestMetaCUDA::test_group_norm_backward_output_mask5_cuda, test/test_meta.py::TestMetaCUDA::test_group_norm_backward_output_mask6_cuda, test/test_meta.py::TestMetaCUDA::test_group_norm_backward_output_mask7_cuda, test/test_meta.py::TestMetaCUDA::test_huber_loss_backward_cuda, test/test_meta.py::TestMetaCUDA::test_index_select_out_cuda, test/test_meta.py::TestMetaCUDA::test_inplace_bin_ops_error_cuda, test/test_meta.py::TestMetaCUDA::test_inplace_masked_fill_error_cuda, test/test_meta.py::TestMetaCUDA::test_layer_norm_backward_output_mask0_cuda, test/test_meta.py::TestMetaCUDA::test_layer_norm_backward_output_mask1_cuda, test/test_meta.py::TestMetaCUDA::test_layer_norm_backward_output_mask2_cuda, test/test_meta.py::TestMetaCUDA::test_layer_norm_backward_output_mask3_cuda, test/test_meta.py::TestMetaCUDA::test_layer_norm_backward_output_mask4_cuda, test/test_meta.py::TestMetaCUDA::test_layer_norm_backward_output_mask5_cuda, test/test_meta.py::TestMetaCUDA::test_layer_norm_backward_output_mask6_cuda, test/test_meta.py::TestMetaCUDA::test_layer_norm_backward_output_mask7_cuda, test/test_meta.py::TestMetaCUDA::test_local_scalar_dense_call_cuda, test/test_meta.py::TestMetaCUDA::test_map_location_deserialize_cuda, test/test_meta.py::TestMetaCUDA::test_meta__fused_moving_avg_obs_fq_helper_cuda, test/test_meta.py::TestMetaCUDA::test_meta_autograd_no_error_cuda, test/test_meta.py::TestMetaCUDA::test_meta_consistency_out_dtype_mismatch_pow_Tensor_Scalar_cuda, test/test_meta.py::TestMetaCUDA::test_meta_inplace_H_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_H_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_H_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_H_cuda_complex32, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_H_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_H_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_H_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_H_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_H_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_H_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_H_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_H_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_H_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_T_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_T_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_T_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_T_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_T_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_T_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_T_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_T_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_T_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_T_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_T_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_T_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_T_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace___getitem___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace___getitem___cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace___getitem___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace___getitem___cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace___getitem___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace___getitem___cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace___getitem___cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace___getitem___cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace___getitem___cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace___getitem___cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace___getitem___cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace___getitem___cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace___getitem___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace___radd___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace___radd___cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace___radd___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace___radd___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace___radd___cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace___radd___cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace___radd___cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace___radd___cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace___radd___cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace___radd___cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace___radd___cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace___radd___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rand___cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rand___cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rand___cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rand___cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rand___cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rand___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rdiv___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rdiv___cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace___rdiv___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rdiv___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rdiv___cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rdiv___cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rdiv___cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rdiv___cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rdiv___cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rdiv___cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rdiv___cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rdiv___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rmatmul___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rmatmul___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rmatmul___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rmatmul___cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rmatmul___cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rmatmul___cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rmod___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rmod___cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rmod___cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rmod___cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rmod___cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rmod___cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rmod___cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rmod___cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rmod___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rmul___cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace___rmul___cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rmul___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rmul___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rmul___cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rmul___cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rmul___cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rmul___cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rmul___cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rmul___cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rmul___cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rmul___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace___ror___cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace___ror___cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace___ror___cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace___ror___cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace___ror___cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace___ror___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rpow___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rpow___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rpow___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rpow___cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rpow___cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rpow___cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rpow___cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rpow___cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rpow___cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rpow___cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace___rpow___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rsub___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rsub___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rsub___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rsub___cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rsub___cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rsub___cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rsub___cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rsub___cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rsub___cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rsub___cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rsub___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rxor___cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rxor___cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rxor___cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rxor___cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rxor___cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rxor___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__batch_norm_with_update_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__batch_norm_with_update_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__batch_norm_with_update_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__batch_norm_with_update_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__chunk_cat_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__chunk_cat_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__chunk_cat_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__chunk_cat_cuda_complex32, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace__chunk_cat_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__chunk_cat_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__chunk_cat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__chunk_cat_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__chunk_cat_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__chunk_cat_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__chunk_cat_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__chunk_cat_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__chunk_cat_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_abs_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_abs_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_abs_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_abs_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_abs_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_abs_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_abs_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_abs_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_abs_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_abs_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_abs_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_abs_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_acos_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_acos_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_acos_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_acos_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_acos_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_acos_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_acos_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_acos_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_acos_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_acos_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_acos_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_acos_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_add_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_add_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_add_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_add_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_add_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_add_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_add_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_add_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_add_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_add_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_add_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_addcdiv_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_addcdiv_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_addcdiv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_addcdiv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_addcdiv_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_addcdiv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_addcdiv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_addcdiv_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_addcdiv_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_addcdiv_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_addcdiv_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_addcdiv_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_addcmul_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_addcmul_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_addcmul_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_addcmul_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_addcmul_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_addcmul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_addcmul_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_addcmul_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_addcmul_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_addcmul_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_addcmul_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_addcmul_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_asin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_asin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_asin_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_asin_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_asin_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_asin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_asin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_asin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_asin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_asin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_asin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_asin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_atan_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_atan_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_atan_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_atan_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_atan_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_atan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_atan_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_atan_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_atan_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_atan_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_atan_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_atan_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_ceil_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_ceil_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_ceil_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_ceil_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_ceil_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_ceil_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_ceil_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_ceil_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_ceil_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_ceil_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_ceil_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_ceil_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_clamp_max_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_clamp_max_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_clamp_max_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_clamp_max_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_clamp_max_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_clamp_max_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_clamp_max_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_clamp_max_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_clamp_max_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_clamp_max_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_clamp_max_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_clamp_max_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_clamp_min_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_clamp_min_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_clamp_min_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_clamp_min_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_clamp_min_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_clamp_min_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_clamp_min_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_clamp_min_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_clamp_min_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_clamp_min_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_clamp_min_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_clamp_min_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_cos_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_cos_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_cos_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_cos_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_cos_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_cos_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_cos_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_cos_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_cos_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_cos_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_cos_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_cos_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_cosh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_cosh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_cosh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_cosh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_cosh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_cosh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_cosh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_cosh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_cosh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_cosh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_cosh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_cosh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_div_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_div_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_div_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_div_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_div_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_div_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_div_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_div_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_div_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_div_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_div_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_div_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_erf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_erf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_erf_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_erf_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_erf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_erf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_erf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_erf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_erf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_erf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_erf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_erf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_erfc_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_erfc_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_erfc_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_erfc_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_erfc_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_erfc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_erfc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_erfc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_erfc_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_erfc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_erfc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_erfc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_exp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_exp_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_exp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_exp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_exp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_exp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_exp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_exp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_exp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_exp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_exp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_exp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_expm1_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_expm1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_expm1_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_expm1_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_expm1_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_expm1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_expm1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_expm1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_expm1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_expm1_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_expm1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_expm1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_floor_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_floor_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_floor_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_floor_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_floor_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_floor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_floor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_floor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_floor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_floor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_floor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_floor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_frac_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_frac_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_frac_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_frac_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_frac_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_frac_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_frac_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_frac_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_frac_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_frac_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_frac_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_frac_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_lerp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_lerp_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_lerp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_lerp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_lerp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_lerp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_lerp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_lerp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_lerp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_lerp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_lerp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_lerp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_lgamma_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_lgamma_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_lgamma_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_lgamma_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_lgamma_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_lgamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_lgamma_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_lgamma_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_lgamma_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_lgamma_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_lgamma_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_lgamma_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log10_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log10_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log10_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log10_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log10_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log10_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log10_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log10_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log10_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log10_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log10_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log10_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log1p_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log1p_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log1p_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log1p_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log1p_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log1p_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log1p_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log1p_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log1p_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log1p_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log1p_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log1p_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log2_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_max_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_max_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_max_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_max_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_max_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_max_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_max_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_max_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_max_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_max_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_max_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_max_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_maximum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_maximum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_maximum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_maximum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_maximum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_maximum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_maximum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_maximum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_maximum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_maximum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_maximum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_maximum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_minimum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_minimum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_minimum_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_minimum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_minimum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_minimum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_minimum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_minimum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_minimum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_minimum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_minimum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_minimum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_mul_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_mul_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_mul_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_mul_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_mul_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_mul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_mul_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_mul_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_mul_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_mul_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_mul_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_mul_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_neg_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_neg_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_neg_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_neg_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_neg_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_neg_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_neg_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_neg_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_neg_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_neg_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_neg_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_neg_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_norm_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_norm_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_norm_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_norm_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_norm_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_norm_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_pow_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_pow_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_pow_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_pow_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_pow_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_pow_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_pow_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_pow_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_pow_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_pow_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_pow_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_pow_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_reciprocal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_reciprocal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_reciprocal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_reciprocal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_reciprocal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_reciprocal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_reciprocal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_reciprocal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_reciprocal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_reciprocal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_reciprocal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_reciprocal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_round_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_round_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_round_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_round_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_round_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_round_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_round_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_round_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_round_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_round_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_round_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_round_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_rsqrt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_rsqrt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_rsqrt_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_rsqrt_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_rsqrt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_rsqrt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_rsqrt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_rsqrt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_rsqrt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_rsqrt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_rsqrt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_rsqrt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sigmoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sigmoid_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sigmoid_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sigmoid_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sigmoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sigmoid_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sigmoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sigmoid_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sigmoid_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sigmoid_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sigmoid_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sigmoid_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sign_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sign_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sign_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sign_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sign_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sign_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sign_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sign_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sign_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sign_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sign_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sin_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sin_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sin_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sinh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sinh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sinh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sinh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sinh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sinh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sinh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sinh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sinh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sinh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sinh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sinh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sqrt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sqrt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sqrt_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sqrt_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sqrt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sqrt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sqrt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sqrt_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sqrt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sqrt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sqrt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sqrt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sub_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sub_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sub_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sub_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sub_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sub_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sub_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sub_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sub_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sub_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sub_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sub_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_tan_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_tan_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_tan_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_tan_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_tan_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_tan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_tan_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_tan_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_tan_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_tan_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_tan_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_tan_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_tanh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_tanh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_tanh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_tanh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_tanh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_tanh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_tanh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_tanh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_tanh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_tanh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_tanh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_tanh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_trunc_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_trunc_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_trunc_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_trunc_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_trunc_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_trunc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_trunc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_trunc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_trunc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_trunc_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_trunc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_trunc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_zero_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_zero_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_zero_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_zero_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_zero_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_zero_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_zero_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_zero_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_zero_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_zero_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_zero_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_zero_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__native_batch_norm_legit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__native_batch_norm_legit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__native_batch_norm_legit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__native_batch_norm_legit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__segment_reduce_lengths_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__segment_reduce_lengths_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__segment_reduce_lengths_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__segment_reduce_lengths_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__segment_reduce_offsets_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__segment_reduce_offsets_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace__segment_reduce_offsets_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__segment_reduce_offsets_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__softmax_backward_data_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__softmax_backward_data_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__softmax_backward_data_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__softmax_backward_data_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__unsafe_masked_index_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__unsafe_masked_index_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__unsafe_masked_index_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__unsafe_masked_index_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__unsafe_masked_index_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__unsafe_masked_index_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__unsafe_masked_index_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__unsafe_masked_index_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__unsafe_masked_index_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__unsafe_masked_index_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__unsafe_masked_index_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__unsafe_masked_index_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__unsafe_masked_index_put_accumulate_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__unsafe_masked_index_put_accumulate_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__unsafe_masked_index_put_accumulate_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__unsafe_masked_index_put_accumulate_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace__unsafe_masked_index_put_accumulate_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__unsafe_masked_index_put_accumulate_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__unsafe_masked_index_put_accumulate_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__unsafe_masked_index_put_accumulate_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__unsafe_masked_index_put_accumulate_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__unsafe_masked_index_put_accumulate_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__unsafe_masked_index_put_accumulate_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__unsafe_masked_index_put_accumulate_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__upsample_bilinear2d_aa_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__upsample_bilinear2d_aa_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__upsample_bilinear2d_aa_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__upsample_bilinear2d_aa_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_abs_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_abs_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_abs_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_abs_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_abs_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_abs_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_abs_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_abs_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_abs_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_abs_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_abs_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_abs_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_abs_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_acos_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_acos_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_acos_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_acos_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_acos_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_acos_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_acos_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_acos_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_acos_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_acos_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_acos_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_acos_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_acos_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_acosh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_acosh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_acosh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_acosh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_acosh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_acosh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_acosh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_acosh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_acosh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_acosh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_acosh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_acosh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_acosh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_add_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_add_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_add_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_add_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_add_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_add_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_add_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_add_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_add_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_add_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_add_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_add_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addbmm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addbmm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addbmm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addbmm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addbmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addbmm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addcdiv_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addcdiv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addcdiv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addcdiv_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addcdiv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addcdiv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addcmul_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addcmul_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addcmul_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_addcmul_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addcmul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addcmul_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addcmul_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addcmul_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addcmul_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addcmul_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addcmul_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addmm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addmm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addmm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addmm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addmm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addmm_decomposed_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addmm_decomposed_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addmm_decomposed_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addmm_decomposed_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addmm_decomposed_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addmm_decomposed_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addmv_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addmv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addmv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addmv_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addmv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addmv_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_addr_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addr_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addr_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addr_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addr_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addr_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addr_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addr_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addr_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addr_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_alias_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_alias_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_alias_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_alias_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_alias_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_alias_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_alias_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_alias_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_alias_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_alias_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_alias_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_alias_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_alias_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_all_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_all_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_all_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_all_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_all_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_all_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_all_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_all_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_all_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_all_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_all_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_all_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_allclose_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_allclose_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_allclose_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_allclose_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_allclose_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_allclose_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_amax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_amax_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_amax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_amax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_amax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_amax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_amax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_amax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_amax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_amax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_amin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_amin_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_amin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_amin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_amin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_amin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_amin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_amin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_amin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_amin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_aminmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_aminmax_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_aminmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_aminmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_aminmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_aminmax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_aminmax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_aminmax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_aminmax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_aminmax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_angle_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_angle_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_angle_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_angle_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_angle_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_angle_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_angle_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_angle_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_angle_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_angle_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_angle_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_any_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_any_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_any_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_any_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_any_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_any_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_any_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_any_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_any_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_any_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_any_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_any_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_arange_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_arange_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_arange_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_arange_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_arange_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_arange_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_arange_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_arange_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_arange_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argmax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argmax_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_argmax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argmax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argmax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argmin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argmin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argmin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argmin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argmin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argmin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argmin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argmin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argsort_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argsort_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argsort_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argsort_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argsort_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argsort_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argsort_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argsort_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argsort_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argsort_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argwhere_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argwhere_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argwhere_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argwhere_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argwhere_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_argwhere_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argwhere_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argwhere_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argwhere_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argwhere_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argwhere_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argwhere_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_partial_views_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_partial_views_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_partial_views_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_partial_views_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_partial_views_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_partial_views_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_partial_views_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_partial_views_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_partial_views_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_partial_views_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_partial_views_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_partial_views_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_partial_views_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_scatter_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_scatter_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_scatter_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_scatter_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_asin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_asin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_asin_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_asin_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_asin_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_asin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_asin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_asin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_asin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_asin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_asin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_asin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_asin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_asinh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_asinh_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_asinh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_asinh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_asinh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_asinh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_asinh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_asinh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_asinh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_asinh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_asinh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_asinh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_asinh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atan2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atan2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atan2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atan2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atan2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atan2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atan2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atan2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atan2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atan2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atan_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atan_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atan_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atan_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atan_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atan_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atan_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_atan_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atan_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atan_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atan_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atan_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atan_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atanh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atanh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atanh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atanh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atanh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atanh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atanh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atanh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atanh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atanh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atanh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atanh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atanh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_1d_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_1d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_1d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_1d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_1d_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_1d_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_1d_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_1d_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_1d_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_1d_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_2d_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_2d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_2d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_2d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_2d_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_2d_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_2d_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_2d_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_2d_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_3d_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_3d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_3d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_3d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_3d_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_3d_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_3d_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_3d_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_3d_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_3d_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_baddbmm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_baddbmm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_baddbmm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_baddbmm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_baddbmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_baddbmm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bernoulli_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bernoulli_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bernoulli_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bernoulli_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bfloat16_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bfloat16_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bfloat16_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bfloat16_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bfloat16_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bfloat16_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bfloat16_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bfloat16_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bfloat16_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bfloat16_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bfloat16_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bfloat16_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_bfloat16_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bincount_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bincount_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bincount_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bincount_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bincount_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_and_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_and_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_and_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_and_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_and_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_and_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_left_shift_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_left_shift_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_left_shift_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_left_shift_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_left_shift_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_not_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_not_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_not_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_not_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_not_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_not_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_or_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_or_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_or_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_or_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_or_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_or_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_right_shift_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_right_shift_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_right_shift_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_right_shift_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_right_shift_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_xor_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_xor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_xor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_xor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_xor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_xor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_block_diag_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_block_diag_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_block_diag_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_block_diag_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_block_diag_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_block_diag_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_block_diag_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_block_diag_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_block_diag_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_block_diag_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_block_diag_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_block_diag_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_block_diag_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bmm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bmm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bmm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bmm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bmm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bool_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bool_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bool_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bool_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bool_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bool_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bool_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bool_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bool_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bool_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bool_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bool_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bool_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_broadcast_shapes_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_broadcast_tensors_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_broadcast_tensors_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_broadcast_tensors_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_broadcast_tensors_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_broadcast_tensors_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_broadcast_tensors_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_broadcast_tensors_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_broadcast_tensors_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_broadcast_tensors_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_broadcast_tensors_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_broadcast_tensors_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_broadcast_tensors_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_broadcast_to_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_broadcast_to_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_broadcast_to_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_broadcast_to_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_broadcast_to_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_broadcast_to_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_broadcast_to_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_broadcast_to_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_broadcast_to_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_broadcast_to_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_broadcast_to_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_broadcast_to_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bucketize_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bucketize_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bucketize_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bucketize_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bucketize_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bucketize_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bucketize_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_bucketize_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bucketize_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_byte_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_byte_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_byte_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_byte_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_byte_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_byte_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_byte_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_byte_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_byte_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_byte_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_byte_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_byte_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cartesian_prod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cartesian_prod_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cartesian_prod_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cartesian_prod_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cartesian_prod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cartesian_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cartesian_prod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cartesian_prod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cartesian_prod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cartesian_prod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cartesian_prod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cartesian_prod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cat_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_cat_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cat_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cat_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cat_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cat_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cat_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cat_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cat_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cat_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cat_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cat_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cauchy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cauchy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cauchy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cauchy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cdist_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cdist_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cdouble_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cdouble_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cdouble_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cdouble_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cdouble_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cdouble_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cdouble_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cdouble_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cdouble_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cdouble_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_cdouble_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cdouble_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cdouble_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ceil_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ceil_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ceil_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ceil_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ceil_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ceil_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ceil_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ceil_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ceil_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cfloat_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cfloat_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cfloat_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cfloat_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cfloat_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cfloat_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cfloat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cfloat_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cfloat_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cfloat_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cfloat_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cfloat_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cfloat_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_chalf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_chalf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_chalf_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_chalf_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_chalf_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_chalf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_chalf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_chalf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_chalf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_chalf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_chalf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_chalf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_chalf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_char_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_char_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_char_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_char_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_char_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_char_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_char_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_char_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_char_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_char_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_char_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_char_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_char_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cholesky_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cholesky_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cholesky_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cholesky_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cholesky_inverse_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_cholesky_inverse_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cholesky_inverse_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cholesky_inverse_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cholesky_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cholesky_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cholesky_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cholesky_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_chunk_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_chunk_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_chunk_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_chunk_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_chunk_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_chunk_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_chunk_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_chunk_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_chunk_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_chunk_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_chunk_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_chunk_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_chunk_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clamp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clamp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clamp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clamp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clamp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clamp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clamp_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_clamp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clamp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clamp_max_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clamp_max_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clamp_max_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clamp_max_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clamp_max_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clamp_max_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clamp_max_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clamp_max_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clamp_max_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clamp_max_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clamp_min_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clamp_min_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clamp_min_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clamp_min_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clamp_min_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clamp_min_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clamp_min_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clamp_min_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clamp_min_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clamp_min_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clone_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clone_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clone_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clone_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clone_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_clone_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clone_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clone_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clone_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clone_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clone_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clone_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clone_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_column_stack_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_column_stack_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_column_stack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_column_stack_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_column_stack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_column_stack_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_column_stack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_column_stack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_column_stack_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_column_stack_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_column_stack_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_column_stack_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_column_stack_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_combinations_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_combinations_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_combinations_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_combinations_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_combinations_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_combinations_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_combinations_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_combinations_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_combinations_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_combinations_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_combinations_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_combinations_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_complex_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_complex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_complex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_conj_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_conj_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_conj_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_conj_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_conj_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_conj_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_conj_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_conj_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_conj_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_conj_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_conj_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_conj_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_conj_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_conj_physical_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_conj_physical_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_conj_physical_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_conj_physical_cuda_complex32, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_conj_physical_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_conj_physical_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_conj_physical_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_conj_physical_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_conj_physical_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_conj_physical_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_conj_physical_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_conj_physical_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_conj_physical_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_constant_pad_nd_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_constant_pad_nd_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_constant_pad_nd_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_constant_pad_nd_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_constant_pad_nd_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_constant_pad_nd_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_constant_pad_nd_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_constant_pad_nd_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_constant_pad_nd_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_constant_pad_nd_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_constant_pad_nd_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_constant_pad_nd_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_contiguous_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_contiguous_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_contiguous_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_contiguous_cuda_complex32, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_contiguous_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_contiguous_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_contiguous_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_contiguous_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_contiguous_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_contiguous_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_contiguous_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_contiguous_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_contiguous_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_copysign_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_copysign_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_copysign_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_copysign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_copysign_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_copysign_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_copysign_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_copysign_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_copysign_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_copysign_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_corrcoef_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_corrcoef_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_corrcoef_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_corrcoef_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_corrcoef_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_corrcoef_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_corrcoef_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_corrcoef_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_corrcoef_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_corrcoef_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_corrcoef_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cos_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cos_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cos_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cos_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cos_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cos_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cos_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cos_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cos_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cos_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cos_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cos_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cos_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cosh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cosh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cosh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cosh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cosh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cosh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cosh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cosh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cosh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cosh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cosh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cosh_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_cosh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_count_nonzero_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_count_nonzero_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_count_nonzero_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_count_nonzero_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_count_nonzero_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_count_nonzero_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_count_nonzero_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_count_nonzero_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_count_nonzero_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_count_nonzero_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_count_nonzero_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_count_nonzero_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cov_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cov_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cov_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cov_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cov_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cov_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cov_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cov_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cov_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cov_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cov_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cross_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cross_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cross_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_cross_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cross_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cross_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cross_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cross_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cross_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cross_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cross_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cummax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cummax_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cummax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cummax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cummax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cummax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cummax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cummax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cummax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cummax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cummin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cummin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cummin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cummin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cummin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cummin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cummin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cummin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cummin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cummin_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumprod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumprod_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumprod_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumprod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumprod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumprod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumprod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumprod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumprod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumprod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumprod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumsum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumsum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumsum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumsum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumsum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumsum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumsum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumsum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumsum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumsum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumsum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumulative_trapezoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumulative_trapezoid_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumulative_trapezoid_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumulative_trapezoid_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumulative_trapezoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumulative_trapezoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumulative_trapezoid_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumulative_trapezoid_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumulative_trapezoid_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumulative_trapezoid_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumulative_trapezoid_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_deg2rad_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_deg2rad_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_deg2rad_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_deg2rad_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_deg2rad_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_deg2rad_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_deg2rad_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_deg2rad_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_deg2rad_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_deg2rad_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diag_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diag_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diag_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diag_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diag_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diag_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diag_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diag_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diag_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_diag_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diag_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diag_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diag_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diag_embed_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diag_embed_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diag_embed_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diag_embed_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diag_embed_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diag_embed_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diag_embed_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diag_embed_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diag_embed_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diag_embed_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diag_embed_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diag_embed_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diag_embed_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagflat_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagflat_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagflat_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagflat_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagflat_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagflat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagflat_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagflat_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagflat_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagflat_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagflat_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagflat_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_scatter_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_scatter_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diff_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diff_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diff_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diff_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diff_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diff_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diff_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diff_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diff_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diff_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diff_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diff_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_digamma_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_digamma_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_digamma_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_digamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_digamma_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_digamma_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_digamma_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_digamma_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_digamma_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_digamma_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dist_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dist_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dist_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dist_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dist_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dist_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_div_floor_rounding_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_div_floor_rounding_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_div_floor_rounding_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_div_floor_rounding_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_div_floor_rounding_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_div_floor_rounding_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_div_floor_rounding_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_div_floor_rounding_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_div_floor_rounding_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_div_no_rounding_mode_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_div_no_rounding_mode_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_div_no_rounding_mode_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_div_no_rounding_mode_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_div_no_rounding_mode_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_div_no_rounding_mode_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_div_no_rounding_mode_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_div_no_rounding_mode_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_div_no_rounding_mode_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_div_no_rounding_mode_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_div_no_rounding_mode_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_div_no_rounding_mode_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_div_no_rounding_mode_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_div_trunc_rounding_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_div_trunc_rounding_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_div_trunc_rounding_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_div_trunc_rounding_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_div_trunc_rounding_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_div_trunc_rounding_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_div_trunc_rounding_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_div_trunc_rounding_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_div_trunc_rounding_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dot_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dot_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_dot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_double_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_double_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_double_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_double_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_double_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_double_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_double_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_double_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_double_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_double_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_double_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_double_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_double_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dsplit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dsplit_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dsplit_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dsplit_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dsplit_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dsplit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dsplit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dsplit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dsplit_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dsplit_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dsplit_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dsplit_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_dsplit_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dstack_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dstack_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dstack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dstack_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dstack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dstack_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dstack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dstack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dstack_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dstack_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dstack_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dstack_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dstack_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_einsum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_einsum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_einsum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_einsum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_einsum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_einsum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_like_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_like_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_like_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_like_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_like_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_like_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_like_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_permuted_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_permuted_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_permuted_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_permuted_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_permuted_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_permuted_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_permuted_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_permuted_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_permuted_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_permuted_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_permuted_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_permuted_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_permuted_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_strided_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_strided_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_strided_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_strided_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_strided_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_strided_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_strided_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_strided_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_strided_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_strided_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_strided_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_strided_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_eq_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_eq_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_eq_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_eq_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_eq_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_eq_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_eq_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_eq_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_eq_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_eq_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_eq_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_eq_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_eq_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_equal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_equal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_equal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_equal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_equal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_equal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_equal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_equal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_equal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_equal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_equal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_equal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_erf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_erf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_erf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_erf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_erf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_erf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_erf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_erf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_erf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_erf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_erfc_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_erfc_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_erfc_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_erfc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_erfc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_erfc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_erfc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_erfc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_erfc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_erfc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_erfinv_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_erfinv_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_erfinv_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_erfinv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_erfinv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_erfinv_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_erfinv_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_erfinv_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_erfinv_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_erfinv_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_exp2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_exp2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_exp2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_exp2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_exp2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_exp2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_exp2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_exp2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_exp2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_exp2_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_exp2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_exp2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_exp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_exp_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_exp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_exp_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_exp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_exp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_exp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_exp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_exp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_exp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_exp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_exp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_exp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_as_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_as_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_as_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_as_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_as_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_as_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_as_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_as_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_as_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_as_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_as_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_as_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_copy_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expm1_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expm1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expm1_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expm1_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_expm1_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expm1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expm1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expm1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expm1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expm1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expm1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expm1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_exponential_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_exponential_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_exponential_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_exponential_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_eye_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_eye_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_eye_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_eye_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_eye_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_eye_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_eye_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_eye_cuda_float8_e4m3fn, test/test_meta.py::TestMetaCUDA::test_meta_inplace_eye_cuda_float8_e4m3fnuz, test/test_meta.py::TestMetaCUDA::test_meta_inplace_eye_cuda_float8_e5m2, test/test_meta.py::TestMetaCUDA::test_meta_inplace_eye_cuda_float8_e5m2fnuz, test/test_meta.py::TestMetaCUDA::test_meta_inplace_eye_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_eye_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_eye_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_eye_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_eye_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fft2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fft2_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fft2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fft_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fftn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fftn_cuda_complex32, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fftn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fftn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fftshift_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fftshift_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fftshift_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fftshift_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fftshift_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fftshift_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fftshift_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fftshift_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fftshift_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fftshift_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fftshift_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fftshift_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fftshift_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfft2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfft2_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfft2_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfft_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfftn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfftn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfftn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfftn_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifft2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifft2_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifft2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifft_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifft_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifftn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifftn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifftn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifftn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifftshift_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifftshift_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifftshift_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifftshift_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifftshift_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifftshift_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifftshift_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifftshift_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifftshift_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifftshift_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifftshift_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifftshift_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifftshift_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ihfft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ihfft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ihfft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ihfft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ihfft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ihfft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ihfft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ihfft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ihfft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ihfft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ihfft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ihfft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ihfft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ihfft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ihfft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ihfft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ihfft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ihfft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ihfftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ihfftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ihfftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ihfftn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ihfftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ihfftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ihfftn_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ihfftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ihfftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfft2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfft2_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfft2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfft_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfft_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfftn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfftn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfftn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfftn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_rfft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_rfft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_rfft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_rfft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_rfft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_rfft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_rfft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_rfft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_rfft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_rfft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_rfft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_rfft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_rfft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_rfft_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_rfft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_rfft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_rfft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_rfft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_rfftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_rfftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_rfftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_rfftn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_rfftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_rfftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_rfftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_rfftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_rfftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fill_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fill_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fill_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fill_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fill_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fill_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fill_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fill_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fill_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fill_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fill_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fill_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fill_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flatten_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flatten_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_flatten_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flatten_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flatten_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flatten_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flatten_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flatten_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flatten_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flatten_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flatten_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flatten_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flatten_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flip_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flip_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flip_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flip_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flip_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flip_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flip_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flip_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flip_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flip_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flip_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flip_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fliplr_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fliplr_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fliplr_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fliplr_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fliplr_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_fliplr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fliplr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fliplr_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fliplr_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fliplr_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fliplr_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fliplr_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flipud_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flipud_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flipud_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flipud_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flipud_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flipud_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flipud_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flipud_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flipud_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flipud_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flipud_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flipud_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_float_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_float_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_float_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_float_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_float_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_float_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_float_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_float_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_float_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_float_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_float_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_float_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_float_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_float_power_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_float_power_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_float_power_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_float_power_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_float_power_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_float_power_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_float_power_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_float_power_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_float_power_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_float_power_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_float_power_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_float_power_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_floor_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_floor_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_floor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_floor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_floor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_floor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_floor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_floor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_floor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_floor_divide_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_floor_divide_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_floor_divide_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_floor_divide_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_floor_divide_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_floor_divide_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_floor_divide_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_floor_divide_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_floor_divide_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fmax_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fmax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fmax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fmax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fmax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fmax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fmin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fmin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fmin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fmin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fmin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fmin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fmin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fmin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fmin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fmod_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_fmod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fmod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fmod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fmod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fmod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fmod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fmod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fmod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_frac_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_frac_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_frac_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_frac_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_frexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_frexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_frexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_frexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_full_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_full_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_full_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_full_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_full_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_full_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_full_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_full_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_full_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_full_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_full_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_full_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_full_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_full_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_full_like_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_full_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_full_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_full_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_full_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_full_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_full_like_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_full_like_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_full_like_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_full_like_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_full_like_cuda_uint16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_full_like_cuda_uint32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_full_like_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gather_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gather_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gather_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gather_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gather_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gather_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gather_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gather_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gather_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gather_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gather_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gather_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_gcd_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gcd_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gcd_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gcd_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gcd_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ge_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ge_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ge_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ge_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ge_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ge_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ge_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ge_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ge_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ge_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_geometric_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_geometric_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_geometric_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_geometric_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_geometric_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_geometric_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_geometric_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_geometric_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_geometric_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_geqrf_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_geqrf_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_geqrf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_geqrf_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_gradient_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gradient_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gradient_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gradient_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gradient_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gradient_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gradient_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gradient_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gradient_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gradient_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_grid_sampler_2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_grid_sampler_2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_grid_sampler_2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_grid_sampler_2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_grid_sampler_3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_grid_sampler_3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_grid_sampler_3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_grid_sampler_3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gt_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_gt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_half_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_half_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_half_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_half_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_half_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_half_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_half_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_half_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_half_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_half_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_half_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_half_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hash_tensor_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hash_tensor_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hash_tensor_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hash_tensor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hash_tensor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hash_tensor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hash_tensor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hash_tensor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hash_tensor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hash_tensor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_heaviside_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_heaviside_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_heaviside_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_heaviside_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_heaviside_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_heaviside_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_heaviside_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_heaviside_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_heaviside_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_heaviside_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_histc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_histc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_histc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_histc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_histc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_histc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_histc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hsplit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hsplit_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hsplit_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hsplit_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hsplit_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hsplit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hsplit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hsplit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hsplit_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hsplit_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hsplit_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hsplit_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hsplit_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hstack_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hstack_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_hstack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hstack_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hstack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hstack_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hstack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hstack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hstack_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hstack_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hstack_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hstack_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hstack_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hypot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hypot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hypot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hypot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_i0_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_i0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_i0_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_i0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_i0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_i0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_i0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_i0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_i0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_i0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_igamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_igamma_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_igammac_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_igammac_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_imag_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_imag_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_imag_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_add_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_add_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_add_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_add_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_add_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_add_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_add_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_add_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_add_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_add_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_add_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_add_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_copy_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_fill_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_fill_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_fill_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_fill_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_fill_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_fill_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_fill_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_fill_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_fill_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_fill_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_fill_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_fill_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_fill_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_put_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_put_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_put_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_put_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_put_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_put_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_put_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_put_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_put_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_put_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_put_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_put_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_put_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_amax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_amax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_amax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_amax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_amax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_amax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_amax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_amax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_amax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_amin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_amin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_amin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_amin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_amin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_amin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_amin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_amin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_amin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_mean_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_mean_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_mean_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_mean_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_mean_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_mean_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_prod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_prod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_prod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_prod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_prod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_prod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_prod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_prod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_select_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_select_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_select_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_select_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_select_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_select_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_select_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_select_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_select_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_select_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_select_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_select_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_select_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_inner_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_inner_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_inner_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_inner_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_inner_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_inner_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_int_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_int_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_int_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_int_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_int_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_int_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_int_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_int_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_int_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_int_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_int_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_int_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isclose_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isclose_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isclose_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isclose_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isclose_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_isclose_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isclose_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isclose_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isclose_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isclose_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isclose_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isclose_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isfinite_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isfinite_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isfinite_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isfinite_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isfinite_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isfinite_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isfinite_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isfinite_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isfinite_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isfinite_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isfinite_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isfinite_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isfinite_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isin_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_isin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isinf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isinf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isinf_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isinf_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isinf_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isinf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isinf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isinf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isinf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isinf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isinf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isinf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isinf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isnan_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isnan_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isnan_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isnan_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isnan_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isnan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isnan_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isnan_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isnan_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isnan_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isnan_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isnan_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isneginf_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_isneginf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isneginf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isneginf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isneginf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isneginf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isneginf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isneginf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isneginf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isneginf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isposinf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isposinf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isposinf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isposinf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isposinf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isposinf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isposinf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isposinf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isposinf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isposinf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isreal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isreal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isreal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isreal_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isreal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isreal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isreal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isreal_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_isreal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isreal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isreal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isreal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isreal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_istft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_istft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_item_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_item_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_item_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_item_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_item_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_item_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_item_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_item_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_item_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_item_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_item_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_item_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_item_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_2inputs_2outputs_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_2inputs_2outputs_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_2inputs_2outputs_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_2inputs_2outputs_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_2inputs_2outputs_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_2inputs_2outputs_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_2inputs_2outputs_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_2inputs_2outputs_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_2inputs_2outputs_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_2inputs_2outputs_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_2inputs_2outputs_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_2inputs_2outputs_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_4inputs_with_extra_args_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_4inputs_with_extra_args_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_4inputs_with_extra_args_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_4inputs_with_extra_args_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_4inputs_with_extra_args_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_4inputs_with_extra_args_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_4inputs_with_extra_args_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_4inputs_with_extra_args_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_4inputs_with_extra_args_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_4inputs_with_extra_args_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_4inputs_with_extra_args_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_4inputs_with_extra_args_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_binary_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_binary_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_binary_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_binary_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_binary_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_binary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_binary_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_binary_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_binary_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_binary_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_binary_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_binary_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_binary_return_by_ref_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_binary_return_by_ref_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_binary_return_by_ref_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_binary_return_by_ref_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_binary_return_by_ref_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_binary_return_by_ref_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_binary_return_by_ref_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_binary_return_by_ref_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_binary_return_by_ref_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_binary_return_by_ref_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_binary_return_by_ref_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_binary_return_by_ref_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_unary_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_unary_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_unary_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_unary_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_unary_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_unary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_unary_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_unary_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_unary_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_unary_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_unary_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_unary_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_kron_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_kron_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_kron_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_kron_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_kron_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_kron_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_kron_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_kron_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_kron_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_kron_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_kron_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_kron_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_kthvalue_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_kthvalue_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_kthvalue_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_kthvalue_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_kthvalue_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_kthvalue_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_kthvalue_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_kthvalue_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_kthvalue_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lcm_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lcm_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lcm_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lcm_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lcm_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ldexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ldexp_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ldexp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ldexp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ldexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ldexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ldexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ldexp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ldexp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ldexp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ldexp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ldexp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_le_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_le_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_le_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_le_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_le_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_le_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_le_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_le_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_le_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_le_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lerp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lerp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lerp_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lerp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lerp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lerp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lerp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lgamma_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lgamma_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lgamma_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lgamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lgamma_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lgamma_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lgamma_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lgamma_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lgamma_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lgamma_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_cholesky_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_cholesky_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_cholesky_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_cholesky_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_cholesky_ex_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_cholesky_ex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_cholesky_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_cholesky_ex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_cond_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_cond_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_cond_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_cond_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_cross_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_cross_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_cross_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_cross_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_cross_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_cross_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_cross_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_cross_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_cross_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_cross_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_cross_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_det_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_det_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_det_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_det_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_diagonal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_diagonal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_diagonal_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_diagonal_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_diagonal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_diagonal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_diagonal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_diagonal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_diagonal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_diagonal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_diagonal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_diagonal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_diagonal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_eig_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_eig_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_eig_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_eig_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_eigh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_eigh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_eigh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_eigh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_eigvals_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_eigvals_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_eigvals_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_eigvals_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_eigvalsh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_eigvalsh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_eigvalsh_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_eigvalsh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_householder_product_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_householder_product_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_householder_product_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_householder_product_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_inv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_inv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_inv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_inv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_inv_ex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_inv_ex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_inv_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_inv_ex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_ldl_factor_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_ldl_factor_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_ldl_factor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_ldl_factor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_ldl_factor_ex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_ldl_factor_ex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_ldl_factor_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_ldl_factor_ex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_ldl_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_ldl_solve_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_ldl_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_ldl_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_lstsq_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_lstsq_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_lstsq_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_lstsq_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_lstsq_grad_oriented_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_lstsq_grad_oriented_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_lstsq_grad_oriented_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_lstsq_grad_oriented_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_lu_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_lu_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_lu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_lu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_lu_factor_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_lu_factor_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_lu_factor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_lu_factor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_lu_factor_ex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_lu_factor_ex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_lu_factor_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_lu_factor_ex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_lu_solve_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_lu_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_lu_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_lu_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_matrix_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_matrix_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_matrix_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_matrix_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_matrix_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_matrix_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_matrix_power_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_matrix_power_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_matrix_power_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_matrix_power_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_matrix_rank_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_matrix_rank_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_matrix_rank_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_matrix_rank_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_matrix_rank_hermitian_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_matrix_rank_hermitian_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_matrix_rank_hermitian_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_matrix_rank_hermitian_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_multi_dot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_multi_dot_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_multi_dot_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_multi_dot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_multi_dot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_multi_dot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_norm_subgradients_at_zero_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_norm_subgradients_at_zero_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_norm_subgradients_at_zero_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_norm_subgradients_at_zero_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_norm_subgradients_at_zero_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_norm_subgradients_at_zero_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_pinv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_pinv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_pinv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_pinv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_pinv_hermitian_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_pinv_hermitian_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_pinv_hermitian_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_pinv_hermitian_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_pinv_singular_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_pinv_singular_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_pinv_singular_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_pinv_singular_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_qr_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_qr_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_qr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_qr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_slogdet_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_slogdet_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_slogdet_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_slogdet_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_solve_ex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_solve_ex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_solve_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_solve_ex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_solve_triangular_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_solve_triangular_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_solve_triangular_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_solve_triangular_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_svd_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_svd_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_svd_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_svd_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_svdvals_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_svdvals_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_svdvals_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_svdvals_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_tensorinv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_tensorinv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_tensorinv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_tensorinv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_tensorsolve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_tensorsolve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_tensorsolve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_tensorsolve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_vander_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_vander_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_vander_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_vander_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_vander_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_vander_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_vander_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_vander_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_vander_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_vecdot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_vecdot_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_vecdot_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_vecdot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_vecdot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_vecdot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_vector_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_vector_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_vector_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_vector_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_vector_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_vector_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linspace_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linspace_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linspace_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linspace_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linspace_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linspace_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linspace_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linspace_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linspace_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linspace_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linspace_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_linspace_tensor_overload_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linspace_tensor_overload_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linspace_tensor_overload_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linspace_tensor_overload_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linspace_tensor_overload_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linspace_tensor_overload_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linspace_tensor_overload_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linspace_tensor_overload_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linspace_tensor_overload_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linspace_tensor_overload_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linspace_tensor_overload_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log10_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log10_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log10_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log10_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log10_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log10_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log10_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log10_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log10_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log10_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log10_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log10_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log1p_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log1p_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_log1p_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log1p_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log1p_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log1p_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log1p_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log1p_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log1p_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log1p_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log1p_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log1p_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_normal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_normal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_normal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_normal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_softmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_softmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_softmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_softmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_softmax_with_dtype_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_softmax_with_dtype_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_softmax_with_dtype_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_softmax_with_dtype_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_softmax_with_dtype_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_softmax_with_dtype_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_softmax_with_dtype_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_softmax_with_dtype_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_softmax_with_dtype_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_softmax_with_dtype_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_softmax_with_dtype_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_softmax_with_dtype_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_softmax_with_dtype_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logaddexp2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logaddexp2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logaddexp2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logaddexp2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logaddexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logaddexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logaddexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logaddexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logcumsumexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logcumsumexp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logcumsumexp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logcumsumexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logcumsumexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logcumsumexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logdet_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logdet_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logdet_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logdet_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_and_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_and_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_and_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_and_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_and_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_and_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_and_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_and_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_and_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_and_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_and_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_and_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_not_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_not_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_not_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_not_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_not_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_not_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_not_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_not_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_not_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_not_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_not_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_not_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_or_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_or_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_or_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_or_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_or_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_or_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_or_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_or_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_or_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_or_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_or_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_or_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_xor_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_xor_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_xor_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_xor_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_xor_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_xor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_xor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_xor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_xor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_xor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_xor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_xor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logit_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logit_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logit_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logit_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logit_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_logit_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logspace_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logspace_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logspace_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logspace_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logspace_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logspace_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logspace_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logspace_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logspace_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logspace_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logspace_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logspace_tensor_overload_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logspace_tensor_overload_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logspace_tensor_overload_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logspace_tensor_overload_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logspace_tensor_overload_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logspace_tensor_overload_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logspace_tensor_overload_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logspace_tensor_overload_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logspace_tensor_overload_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logspace_tensor_overload_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logspace_tensor_overload_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logsumexp_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_logsumexp_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logsumexp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logsumexp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logsumexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logsumexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logsumexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logsumexp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logsumexp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logsumexp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logsumexp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logsumexp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_long_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_long_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_long_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_long_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_long_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_long_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_long_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_long_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_long_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_long_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_long_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_long_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_long_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lt_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_lt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lu_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lu_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lu_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lu_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lu_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lu_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lu_unpack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lu_unpack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lu_unpack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lu_unpack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mH_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mH_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mH_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mH_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mH_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mH_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mH_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mH_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mH_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mH_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_mH_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mH_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mH_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mT_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mT_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mT_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mT_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mT_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mT_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mT_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mT_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mT_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mT_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mT_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mT_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mT_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_amax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_amax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_amax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_amax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_amax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_amax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_amax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_amax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_amax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_amin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_amin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_amin_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_amin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_amin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_amin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_amin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_amin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_amin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_argmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_argmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_argmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_argmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_argmax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_argmax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_argmax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_argmax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_argmax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_argmin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_argmin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_argmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_argmin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_argmin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_argmin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_argmin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_argmin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_argmin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_cumprod_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_cumprod_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_cumprod_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_cumprod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_cumprod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_cumprod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_cumprod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_cumprod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_cumprod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_cumprod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_cumprod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_cumsum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_cumsum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_cumsum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_cumsum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_cumsum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_cumsum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_cumsum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_cumsum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_cumsum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_cumsum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_cumsum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_fill_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_fill_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_fill_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_fill_cuda_complex32, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_fill_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_fill_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_fill_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_fill_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_fill_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_fill_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_fill_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_fill_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_fill_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_log_softmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_log_softmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_log_softmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_log_softmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_logaddexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_logaddexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_logaddexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_logaddexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_logsumexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_logsumexp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_logsumexp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_logsumexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_logsumexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_logsumexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_logsumexp_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_logsumexp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_logsumexp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_logsumexp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_logsumexp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_mean_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_mean_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_median_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_median_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_median_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_median_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_normalize_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_normalize_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_normalize_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_normalize_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_normalize_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_normalize_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_prod_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_prod_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_prod_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_prod_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_prod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_prod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_prod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_prod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_prod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_prod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_prod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_scatter_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_scatter_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_select_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_select_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_select_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_select_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_select_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_select_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_select_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_select_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_select_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_select_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_select_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_select_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_softmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_softmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_softmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_softmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_softmin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_softmin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_softmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_softmin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_std_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_std_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_std_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_std_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_std_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_std_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_std_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_std_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_std_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_std_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_std_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_sum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_sum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_sum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_sum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_sum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_sum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_sum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_sum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_sum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_sum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_sum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_sum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_var_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_var_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_var_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_var_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_var_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_var_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_var_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_var_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_var_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_var_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_var_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_matmul_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_matmul_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_matmul_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_matmul_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_matmul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_matmul_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_matrix_exp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_matrix_exp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_matrix_exp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_matrix_exp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_matrix_exp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_matrix_exp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_binary_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_binary_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_binary_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_binary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_binary_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_binary_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_binary_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_binary_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_binary_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_binary_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_pool2d_with_indices_backward_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_pool2d_with_indices_backward_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_pool2d_with_indices_backward_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_pool2d_with_indices_backward_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_reduction_no_dim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_reduction_no_dim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_reduction_no_dim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_reduction_no_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_reduction_no_dim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_reduction_no_dim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_reduction_no_dim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_reduction_no_dim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_reduction_no_dim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_reduction_no_dim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_reduction_with_dim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_reduction_with_dim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_reduction_with_dim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_reduction_with_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_reduction_with_dim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_reduction_with_dim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_reduction_with_dim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_reduction_with_dim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_reduction_with_dim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_reduction_with_dim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_maximum_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_maximum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_maximum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_maximum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_maximum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_maximum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_maximum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_maximum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_maximum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_maximum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mean_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mean_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_median_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_median_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_median_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_median_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_median_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_median_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_median_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_median_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_median_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_meshgrid_list_of_tensors_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_meshgrid_list_of_tensors_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_meshgrid_list_of_tensors_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_meshgrid_list_of_tensors_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_meshgrid_list_of_tensors_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_meshgrid_list_of_tensors_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_meshgrid_list_of_tensors_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_meshgrid_list_of_tensors_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_meshgrid_list_of_tensors_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_meshgrid_list_of_tensors_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_meshgrid_list_of_tensors_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_meshgrid_list_of_tensors_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_meshgrid_variadic_tensors_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_meshgrid_variadic_tensors_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_meshgrid_variadic_tensors_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_meshgrid_variadic_tensors_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_meshgrid_variadic_tensors_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_meshgrid_variadic_tensors_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_meshgrid_variadic_tensors_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_meshgrid_variadic_tensors_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_meshgrid_variadic_tensors_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_meshgrid_variadic_tensors_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_meshgrid_variadic_tensors_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_meshgrid_variadic_tensors_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_min_binary_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_min_binary_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_min_binary_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_min_binary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_min_binary_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_min_binary_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_min_binary_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_min_binary_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_min_binary_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_min_binary_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_min_reduction_no_dim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_min_reduction_no_dim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_min_reduction_no_dim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_min_reduction_no_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_min_reduction_no_dim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_min_reduction_no_dim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_min_reduction_no_dim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_min_reduction_no_dim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_min_reduction_no_dim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_min_reduction_no_dim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_min_reduction_with_dim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_min_reduction_with_dim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_min_reduction_with_dim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_min_reduction_with_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_min_reduction_with_dim_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_min_reduction_with_dim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_min_reduction_with_dim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_min_reduction_with_dim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_min_reduction_with_dim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_min_reduction_with_dim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_minimum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_minimum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_minimum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_minimum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_minimum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_minimum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_minimum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_minimum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_minimum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_minimum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mode_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mode_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mode_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mode_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mode_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mode_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_mode_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mode_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mode_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mode_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_movedim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_movedim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_movedim_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_movedim_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_movedim_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_movedim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_movedim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_movedim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_movedim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_movedim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_movedim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_movedim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_movedim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_msort_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_msort_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_msort_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_msort_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_msort_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_msort_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_msort_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_msort_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_msort_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_msort_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mul_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_mul_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mul_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mul_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mul_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mul_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mul_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mul_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mul_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mul_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mul_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mul_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_multinomial_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_multinomial_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_multinomial_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_multinomial_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mv_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mv_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mvlgamma_mvlgamma_p_1_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mvlgamma_mvlgamma_p_1_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mvlgamma_mvlgamma_p_1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mvlgamma_mvlgamma_p_1_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_mvlgamma_mvlgamma_p_1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mvlgamma_mvlgamma_p_1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mvlgamma_mvlgamma_p_1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mvlgamma_mvlgamma_p_1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mvlgamma_mvlgamma_p_3_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mvlgamma_mvlgamma_p_3_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mvlgamma_mvlgamma_p_3_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mvlgamma_mvlgamma_p_3_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mvlgamma_mvlgamma_p_3_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mvlgamma_mvlgamma_p_3_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mvlgamma_mvlgamma_p_3_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mvlgamma_mvlgamma_p_3_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mvlgamma_mvlgamma_p_5_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mvlgamma_mvlgamma_p_5_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mvlgamma_mvlgamma_p_5_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mvlgamma_mvlgamma_p_5_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mvlgamma_mvlgamma_p_5_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mvlgamma_mvlgamma_p_5_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mvlgamma_mvlgamma_p_5_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mvlgamma_mvlgamma_p_5_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nan_to_num_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_nan_to_num_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nan_to_num_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nan_to_num_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nan_to_num_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nan_to_num_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nan_to_num_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nan_to_num_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nan_to_num_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nan_to_num_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nanmean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nanmean_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nanmean_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nanmean_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nanmean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nanmean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nanmean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nanmedian_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nanmedian_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nanmedian_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nanmedian_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nanmedian_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nanmedian_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nanmedian_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nanmedian_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nanmedian_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nanquantile_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_nanquantile_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nansum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nansum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nansum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nansum_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nansum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nansum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nansum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nansum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nansum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nansum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nansum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nansum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nansum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_narrow_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_narrow_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_narrow_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_narrow_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_narrow_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_narrow_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_narrow_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_narrow_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_narrow_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_narrow_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_narrow_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_narrow_copy_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_narrow_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_narrow_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_narrow_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_narrow_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_narrow_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_narrow_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_narrow_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_narrow_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_narrow_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_narrow_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_narrow_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_narrow_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_narrow_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_narrow_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_native_batch_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_native_batch_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_native_batch_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_native_batch_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_native_dropout_backward_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_native_dropout_backward_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_native_dropout_backward_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_native_dropout_backward_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_native_layer_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_native_layer_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_native_layer_norm_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_native_layer_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ne_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ne_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ne_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ne_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ne_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ne_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ne_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ne_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ne_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ne_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ne_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ne_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_neg_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_neg_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_neg_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_neg_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_neg_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_neg_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_neg_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_neg_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_neg_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_neg_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_neg_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_neg_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_empty_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_empty_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_empty_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_empty_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_empty_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_empty_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_empty_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_empty_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_empty_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_empty_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_empty_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_empty_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_empty_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_empty_strided_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_empty_strided_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_empty_strided_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_empty_strided_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_empty_strided_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_empty_strided_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_empty_strided_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_empty_strided_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_empty_strided_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_empty_strided_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_empty_strided_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_empty_strided_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_empty_strided_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_full_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_full_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_full_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_full_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_full_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_full_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_full_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_full_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_full_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_full_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_full_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_full_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_full_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_ones_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_ones_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_ones_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_ones_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_ones_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_ones_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_ones_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_ones_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_ones_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_ones_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_ones_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_ones_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_ones_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_zeros_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_zeros_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_zeros_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_zeros_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_zeros_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_zeros_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_zeros_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_zeros_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_zeros_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_zeros_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_zeros_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_zeros_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_zeros_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nextafter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nextafter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nextafter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nextafter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_adaptive_avg_pool1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_adaptive_avg_pool1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_adaptive_avg_pool1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_adaptive_avg_pool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_adaptive_avg_pool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_adaptive_avg_pool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_adaptive_avg_pool3d_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_adaptive_avg_pool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_adaptive_avg_pool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_adaptive_max_pool1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_adaptive_max_pool1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_adaptive_max_pool1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_adaptive_max_pool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_adaptive_max_pool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_adaptive_max_pool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_adaptive_max_pool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_adaptive_max_pool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_adaptive_max_pool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_alpha_dropout_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_alpha_dropout_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_alpha_dropout_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_alpha_dropout_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_avg_pool1d_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_avg_pool1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_avg_pool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_avg_pool1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_avg_pool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_avg_pool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_avg_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_avg_pool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_avg_pool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_avg_pool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_avg_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_avg_pool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_batch_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_batch_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_batch_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_batch_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_batch_norm_without_cudnn_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_batch_norm_without_cudnn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_batch_norm_without_cudnn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_batch_norm_without_cudnn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_bilinear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_bilinear_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_bilinear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_bilinear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_binary_cross_entropy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_binary_cross_entropy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_binary_cross_entropy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_binary_cross_entropy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_binary_cross_entropy_with_logits_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_binary_cross_entropy_with_logits_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_binary_cross_entropy_with_logits_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_celu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_celu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_celu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_celu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_channel_shuffle_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_channel_shuffle_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_channel_shuffle_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_channel_shuffle_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_channel_shuffle_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_channel_shuffle_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_channel_shuffle_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_channel_shuffle_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_channel_shuffle_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_channel_shuffle_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_channel_shuffle_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_channel_shuffle_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv1d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv1d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv1d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv2d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv2d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv2d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv3d_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv3d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv3d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv_transpose1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv_transpose1d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv_transpose1d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv_transpose1d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv_transpose1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv_transpose1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv_transpose1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv_transpose2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv_transpose2d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv_transpose2d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv_transpose2d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv_transpose2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv_transpose2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv_transpose2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv_transpose3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv_transpose3d_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv_transpose3d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv_transpose3d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv_transpose3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv_transpose3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv_transpose3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_cosine_embedding_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_cosine_embedding_loss_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_cosine_embedding_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_cosine_embedding_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_cosine_embedding_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_cosine_embedding_loss_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_cosine_embedding_loss_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_cosine_embedding_loss_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_cosine_embedding_loss_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_cosine_embedding_loss_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_cosine_similarity_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_cosine_similarity_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_cosine_similarity_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_cosine_similarity_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_cross_entropy_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_cross_entropy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_cross_entropy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_cross_entropy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_ctc_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_ctc_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_dropout2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_dropout2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_dropout2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_dropout2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_dropout3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_dropout3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_dropout3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_dropout3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_dropout_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_dropout_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_dropout_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_dropout_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_elu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_elu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_elu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_elu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_embedding_bag_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_embedding_bag_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_embedding_bag_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_embedding_bag_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_embedding_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_embedding_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_embedding_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_embedding_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_feature_alpha_dropout_with_train_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_feature_alpha_dropout_with_train_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_feature_alpha_dropout_with_train_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_fractional_max_pool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_fractional_max_pool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_fractional_max_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_fractional_max_pool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_fractional_max_pool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_fractional_max_pool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_fractional_max_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_fractional_max_pool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_gaussian_nll_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_gaussian_nll_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_gaussian_nll_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_gaussian_nll_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_gelu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_gelu_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_gelu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_gelu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_glu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_glu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_glu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_glu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_grid_sample_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_grid_sample_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_grid_sample_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_grid_sample_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_group_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_group_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_group_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_group_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_hardshrink_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_hardshrink_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_hardshrink_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_hardshrink_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_hardsigmoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_hardsigmoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_hardsigmoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_hardsigmoid_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_hardswish_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_hardswish_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_hardswish_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_hardswish_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_hardtanh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_hardtanh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_hardtanh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_hardtanh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_hardtanh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_hardtanh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_hardtanh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_hardtanh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_hinge_embedding_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_hinge_embedding_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_hinge_embedding_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_hinge_embedding_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_huber_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_huber_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_huber_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_huber_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_instance_norm_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_instance_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_instance_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_instance_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_interpolate_area_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_interpolate_area_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_interpolate_area_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_interpolate_area_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_interpolate_bicubic_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_interpolate_bicubic_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_interpolate_bicubic_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_interpolate_bicubic_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_interpolate_bilinear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_interpolate_bilinear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_interpolate_bilinear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_interpolate_bilinear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_interpolate_linear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_interpolate_linear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_interpolate_linear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_interpolate_linear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_interpolate_nearest-exact_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_interpolate_nearest-exact_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_interpolate_nearest-exact_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_interpolate_nearest-exact_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_interpolate_nearest_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_interpolate_nearest_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_interpolate_nearest_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_interpolate_nearest_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_interpolate_nearest_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_interpolate_trilinear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_interpolate_trilinear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_interpolate_trilinear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_interpolate_trilinear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_kl_div_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_kl_div_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_kl_div_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_kl_div_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_l1_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_l1_loss_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_l1_loss_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_l1_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_l1_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_l1_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_layer_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_layer_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_layer_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_layer_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_leaky_relu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_leaky_relu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_leaky_relu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_leaky_relu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_linear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_linear_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_linear_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_linear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_linear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_linear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_local_response_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_local_response_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_local_response_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_local_response_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_logsigmoid_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_logsigmoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_logsigmoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_logsigmoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_margin_ranking_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_margin_ranking_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_margin_ranking_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_margin_ranking_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_margin_ranking_loss_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_margin_ranking_loss_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_margin_ranking_loss_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_margin_ranking_loss_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_margin_ranking_loss_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_pool1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_pool1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_pool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_pool1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_pool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_pool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_pool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_pool3d_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_pool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_pool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_unpool1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_unpool1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_unpool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_unpool1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_unpool1d_grad_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_unpool1d_grad_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_unpool1d_grad_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_unpool1d_grad_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_unpool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_unpool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_unpool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_unpool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_unpool2d_grad_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_unpool2d_grad_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_unpool2d_grad_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_unpool2d_grad_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_unpool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_unpool3d_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_unpool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_unpool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_unpool3d_grad_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_unpool3d_grad_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_unpool3d_grad_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_unpool3d_grad_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_mish_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_mish_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_mish_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_mish_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_mse_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_mse_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_mse_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_mse_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_multi_head_attention_forward_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_multi_head_attention_forward_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_multi_head_attention_forward_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_multi_head_attention_forward_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_multi_margin_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_multi_margin_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_multi_margin_loss_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_multi_margin_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_multilabel_margin_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_multilabel_margin_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_multilabel_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_multilabel_margin_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_multilabel_soft_margin_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_multilabel_soft_margin_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_multilabel_soft_margin_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_nll_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_nll_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_nll_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_nll_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_normalize_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_normalize_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_normalize_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_normalize_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_normalize_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_normalize_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_one_hot_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_circular_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_circular_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_circular_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_circular_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_circular_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_circular_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_circular_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_circular_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_circular_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_circular_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_circular_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_circular_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_constant_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_constant_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_constant_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_constant_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_constant_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_constant_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_constant_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_constant_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_constant_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_constant_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_constant_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_constant_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_reflect_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_reflect_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_reflect_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_reflect_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_reflect_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_reflect_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_reflect_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_reflect_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_reflect_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_reflect_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_reflect_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_replicate_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_replicate_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_replicate_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_replicate_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_replicate_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_replicate_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_replicate_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_replicate_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_replicate_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_replicate_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_replicate_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_replicate_negative_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_replicate_negative_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_replicate_negative_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_replicate_negative_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_replicate_negative_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_replicate_negative_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_replicate_negative_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_replicate_negative_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_replicate_negative_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_replicate_negative_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_replicate_negative_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pairwise_distance_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pairwise_distance_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pairwise_distance_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pairwise_distance_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pairwise_distance_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pairwise_distance_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pairwise_distance_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pairwise_distance_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pairwise_distance_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pairwise_distance_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pairwise_distance_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pdist_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pdist_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pixel_shuffle_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pixel_shuffle_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pixel_shuffle_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pixel_shuffle_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pixel_shuffle_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pixel_shuffle_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pixel_shuffle_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pixel_shuffle_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pixel_shuffle_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pixel_shuffle_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pixel_shuffle_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pixel_shuffle_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pixel_unshuffle_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pixel_unshuffle_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pixel_unshuffle_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pixel_unshuffle_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pixel_unshuffle_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pixel_unshuffle_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pixel_unshuffle_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pixel_unshuffle_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pixel_unshuffle_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pixel_unshuffle_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pixel_unshuffle_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pixel_unshuffle_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_poisson_nll_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_poisson_nll_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_poisson_nll_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_poisson_nll_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_poisson_nll_loss_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_poisson_nll_loss_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_poisson_nll_loss_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_poisson_nll_loss_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_poisson_nll_loss_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_prelu_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_prelu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_prelu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_prelu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_relu6_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_relu6_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_relu6_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_relu6_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_relu6_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_relu6_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_relu6_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_relu6_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_relu6_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_relu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_relu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_relu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_relu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_relu_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_relu_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_relu_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_relu_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_relu_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_rms_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_rms_norm_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_rms_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_rms_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_rms_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_rms_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_rrelu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_rrelu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_rrelu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_rrelu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_scaled_dot_product_attention_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_scaled_dot_product_attention_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_scaled_dot_product_attention_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_selu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_selu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_selu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_selu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_silu_complex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_silu_complex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_silu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_silu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_silu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_silu_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_smooth_l1_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_smooth_l1_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_smooth_l1_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_smooth_l1_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_soft_margin_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_soft_margin_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_soft_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_soft_margin_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softmin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softmin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softmin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softmin_with_dtype_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softmin_with_dtype_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softmin_with_dtype_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softmin_with_dtype_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softmin_with_dtype_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softmin_with_dtype_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softmin_with_dtype_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softmin_with_dtype_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softmin_with_dtype_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softmin_with_dtype_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softmin_with_dtype_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softplus_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softplus_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softplus_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softplus_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softshrink_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softshrink_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softshrink_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softshrink_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softsign_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softsign_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softsign_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softsign_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softsign_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softsign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softsign_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softsign_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softsign_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softsign_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softsign_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softsign_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_tanhshrink_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_tanhshrink_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_tanhshrink_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_tanhshrink_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_tanhshrink_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_tanhshrink_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_tanhshrink_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_tanhshrink_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_tanhshrink_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_tanhshrink_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_tanhshrink_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_threshold_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_threshold_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_threshold_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_threshold_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_threshold_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_threshold_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_threshold_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_threshold_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_threshold_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_triplet_margin_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_triplet_margin_loss_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_triplet_margin_loss_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_triplet_margin_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_triplet_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_triplet_margin_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_triplet_margin_loss_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_triplet_margin_loss_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_triplet_margin_loss_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_triplet_margin_loss_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_triplet_margin_loss_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_unfold_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_unfold_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_unfold_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_unfold_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_unfold_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_unfold_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_unfold_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_upsample_bilinear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_upsample_bilinear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_upsample_bilinear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_upsample_bilinear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_upsample_nearest_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_upsample_nearest_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_upsample_nearest_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_upsample_nearest_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_upsample_nearest_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nonzero_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nonzero_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nonzero_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nonzero_cuda_complex32, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_nonzero_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nonzero_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nonzero_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nonzero_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nonzero_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nonzero_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nonzero_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nonzero_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nonzero_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nonzero_static_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nonzero_static_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nonzero_static_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nonzero_static_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nonzero_static_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nonzero_static_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nonzero_static_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nonzero_static_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nonzero_static_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nonzero_static_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nonzero_static_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nonzero_static_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nonzero_static_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_norm_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_norm_fro_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_norm_fro_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_norm_fro_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_norm_fro_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_norm_fro_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_norm_fro_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_norm_inf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_norm_inf_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_norm_inf_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_norm_inf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_norm_inf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_norm_inf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_norm_nuc_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_norm_nuc_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_norm_nuc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_norm_nuc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_normal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_normal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_normal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_normal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_normal_in_place_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_normal_in_place_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_normal_in_place_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_normal_in_place_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_normal_in_place_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_normal_in_place_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_normal_number_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_normal_number_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_normal_number_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_normal_number_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ones_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ones_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ones_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ones_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ones_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ones_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ones_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ones_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ones_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ones_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ones_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ones_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ones_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ones_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ones_like_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ones_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ones_like_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ones_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ones_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ones_like_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_ones_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ones_like_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ones_like_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ones_like_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ones_like_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ones_like_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ormqr_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ormqr_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ormqr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ormqr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_outer_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_outer_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_outer_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_outer_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_outer_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_outer_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_outer_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_outer_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_outer_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_outer_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_outer_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_outer_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_pca_lowrank_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_pca_lowrank_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_pca_lowrank_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_pca_lowrank_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_permute_copy_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_permute_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_permute_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_permute_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_permute_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_permute_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_permute_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_permute_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_permute_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_permute_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_permute_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_permute_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_permute_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_permute_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_permute_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_permute_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_permute_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_permute_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_permute_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_permute_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_permute_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_permute_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_permute_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_permute_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_permute_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_permute_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_pinverse_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_pinverse_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_pinverse_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_pinverse_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polar_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polar_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_0_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_0_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_1_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_1_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_1_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_3_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_3_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_3_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_3_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_3_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_3_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_3_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_3_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_3_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_3_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_4_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_4_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_4_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_4_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_4_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_4_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_4_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_4_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_4_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_4_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_positive_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_positive_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_positive_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_positive_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_positive_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_positive_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_positive_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_positive_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_positive_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_positive_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_positive_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_positive_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_pow_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_pow_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_pow_cuda_complex32, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_pow_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_pow_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_pow_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_pow_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_pow_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_pow_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_pow_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_pow_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_pow_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_prod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_prod_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_prod_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_prod_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_prod_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_prod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_prod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_prod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_prod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_prod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_prod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_prod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_put_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_put_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_put_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_put_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_put_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_put_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_put_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_put_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_put_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_put_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_put_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_put_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_qr_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_qr_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_qr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_qr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_quantile_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_quantile_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rad2deg_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rad2deg_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rad2deg_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rad2deg_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rad2deg_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rad2deg_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rad2deg_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rad2deg_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rad2deg_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rad2deg_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rand_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rand_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rand_like_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rand_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rand_like_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_rand_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rand_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_randint_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_randint_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_randint_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_randint_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_randint_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_randint_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_randint_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_randint_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_randint_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_randint_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_randint_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_randint_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_randint_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_randint_like_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_randint_like_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_randint_like_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_randint_like_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_randint_like_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_randn_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_randn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_randn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_randn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_randn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_randn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_randn_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_randn_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_randn_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_randn_like_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_randn_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_randn_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_randn_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_randn_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ravel_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ravel_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ravel_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ravel_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ravel_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ravel_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ravel_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ravel_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ravel_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ravel_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ravel_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ravel_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ravel_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_real_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_real_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_real_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_real_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_real_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_real_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_real_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_real_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_real_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_real_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_real_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_real_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_real_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reciprocal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reciprocal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reciprocal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reciprocal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reciprocal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reciprocal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reciprocal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reciprocal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reciprocal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reciprocal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reciprocal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reciprocal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_remainder_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_remainder_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_remainder_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_remainder_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_remainder_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_remainder_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_remainder_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_remainder_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_remainder_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_renorm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_renorm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_renorm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_renorm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_renorm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_renorm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_repeat_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_repeat_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_repeat_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_repeat_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_repeat_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_repeat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_repeat_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_repeat_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_repeat_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_repeat_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_repeat_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_repeat_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_repeat_interleave_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_repeat_interleave_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_repeat_interleave_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_repeat_interleave_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_repeat_interleave_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_repeat_interleave_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_repeat_interleave_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_repeat_interleave_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_repeat_interleave_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_repeat_interleave_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_repeat_interleave_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_repeat_interleave_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_repeat_interleave_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reshape_as_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reshape_as_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reshape_as_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reshape_as_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reshape_as_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reshape_as_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reshape_as_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reshape_as_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reshape_as_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reshape_as_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reshape_as_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reshape_as_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reshape_as_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reshape_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reshape_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reshape_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reshape_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reshape_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reshape_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reshape_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reshape_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_reshape_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reshape_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reshape_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reshape_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reshape_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resize__cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resize__cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resize__cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resize__cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resize__cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resize__cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resize__cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resize__cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resize__cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resize__cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resize__cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resize__cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resize_as__cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resize_as__cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resize_as__cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resize_as__cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resize_as__cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resize_as__cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resize_as__cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resize_as__cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resize_as__cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resize_as__cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_resize_as__cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resize_as__cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resolve_conj_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resolve_conj_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resolve_conj_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resolve_conj_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resolve_conj_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resolve_conj_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resolve_conj_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resolve_conj_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resolve_conj_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resolve_conj_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resolve_conj_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resolve_conj_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resolve_neg_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resolve_neg_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resolve_neg_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resolve_neg_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resolve_neg_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resolve_neg_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resolve_neg_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resolve_neg_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resolve_neg_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resolve_neg_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resolve_neg_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resolve_neg_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_resolve_neg_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_roll_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_roll_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_roll_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_roll_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_roll_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_roll_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_roll_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_roll_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_roll_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_roll_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_roll_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_roll_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_roll_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rot90_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rot90_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rot90_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rot90_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rot90_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rot90_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rot90_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rot90_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rot90_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rot90_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rot90_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rot90_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_round_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_round_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_round_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_round_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_round_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_round_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_round_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_round_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_round_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_round_decimals_0_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_round_decimals_0_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_round_decimals_0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_round_decimals_0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_round_decimals_3_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_round_decimals_3_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_round_decimals_3_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_round_decimals_3_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_round_decimals_neg_3_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_round_decimals_neg_3_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_round_decimals_neg_3_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_round_decimals_neg_3_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rsqrt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rsqrt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rsqrt_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rsqrt_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rsqrt_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rsqrt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rsqrt_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_rsqrt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rsqrt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rsqrt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rsqrt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rsqrt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rsqrt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rsub_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rsub_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rsub_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rsub_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rsub_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rsub_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rsub_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rsub_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rsub_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rsub_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rsub_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scalar_tensor_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scalar_tensor_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scalar_tensor_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scalar_tensor_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scalar_tensor_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scalar_tensor_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scalar_tensor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scalar_tensor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scalar_tensor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scalar_tensor_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_scalar_tensor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scalar_tensor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scalar_tensor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_add_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_add_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_add_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_add_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_add_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_add_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_add_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_add_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_add_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_add_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_add_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_amax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_amax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_amax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_amax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_amax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_amax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_amax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_amax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_amax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_amin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_amin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_amin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_amin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_amin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_amin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_amin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_amin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_amin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_mean_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_mean_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_mean_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_mean_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_mean_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_mean_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_prod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_prod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_prod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_prod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_prod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_prod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_prod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_prod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_sum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_sum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_sum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_sum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_sum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_sum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_sum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_sum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_sum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_sum_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_searchsorted_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_searchsorted_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_searchsorted_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_searchsorted_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_searchsorted_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_searchsorted_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_searchsorted_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_searchsorted_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_searchsorted_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_select_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_select_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_select_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_select_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_select_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_select_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_select_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_select_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_select_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_select_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_select_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_select_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_select_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_select_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_select_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_select_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_select_scatter_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_select_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_select_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_select_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_select_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_select_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_select_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sgn_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sgn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sgn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sgn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sgn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sgn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sgn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sgn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sgn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sgn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sgn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sgn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sgn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_short_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_short_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_short_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_short_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_short_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_short_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_short_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_short_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_short_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_short_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_short_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_short_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sigmoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sigmoid_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sigmoid_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sigmoid_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sigmoid_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sigmoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sigmoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sigmoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sigmoid_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sigmoid_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sigmoid_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sigmoid_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sigmoid_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sign_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sign_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sign_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sign_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sign_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sign_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sign_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sign_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sign_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_signal_windows_bartlett_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_signal_windows_bartlett_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_signal_windows_blackman_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_signal_windows_blackman_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_signal_windows_cosine_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_signal_windows_cosine_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_signal_windows_exponential_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_signal_windows_exponential_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_signal_windows_gaussian_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_signal_windows_gaussian_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_signal_windows_general_cosine_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_signal_windows_general_cosine_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_signal_windows_general_hamming_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_signal_windows_general_hamming_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_signal_windows_hamming_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_signal_windows_hamming_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_signal_windows_hann_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_signal_windows_hann_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_signal_windows_kaiser_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_signal_windows_kaiser_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_signal_windows_nuttall_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_signal_windows_nuttall_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_signbit_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_signbit_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_signbit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_signbit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_signbit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_signbit_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_signbit_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_signbit_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_signbit_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_signbit_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sin_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sin_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sin_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sinc_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sinc_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sinc_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sinc_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sinc_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sinc_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_sinc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sinc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sinc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sinc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sinc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sinc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sinh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sinh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sinh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sinh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sinh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sinh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sinh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sinh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sinh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sinh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sinh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sinh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sinh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_slice_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_slice_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_slice_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_slice_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_slice_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_slice_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_slice_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_slice_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_slice_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_slice_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_slice_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_slice_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_slice_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_slice_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_slice_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_slice_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_slice_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_slice_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_slice_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_slice_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_slice_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_slice_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_slice_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_softmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_softmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_softmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_softmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_softmax_with_dtype_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_softmax_with_dtype_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_softmax_with_dtype_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_softmax_with_dtype_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_softmax_with_dtype_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_softmax_with_dtype_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_softmax_with_dtype_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_softmax_with_dtype_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_softmax_with_dtype_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_softmax_with_dtype_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_softmax_with_dtype_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_softmax_with_dtype_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sort_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sort_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sort_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sort_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sort_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sort_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sort_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sort_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sort_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sort_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sparse_mm_reduce_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sparse_mm_reduce_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sparse_mm_reduce_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sparse_mm_reduce_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sparse_sampled_addmm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sparse_sampled_addmm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sparse_sampled_addmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sparse_sampled_addmm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_airy_ai_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_airy_ai_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_airy_ai_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_airy_ai_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_airy_ai_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_airy_ai_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_airy_ai_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_airy_ai_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_bessel_j0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_bessel_j0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_bessel_j0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_bessel_j0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_bessel_j0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_bessel_j0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_bessel_j0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_bessel_j0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_bessel_j1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_bessel_j1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_bessel_j1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_bessel_j1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_bessel_j1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_bessel_j1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_bessel_j1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_bessel_j1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_bessel_y0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_bessel_y0_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_bessel_y0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_bessel_y0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_bessel_y0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_bessel_y0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_bessel_y0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_bessel_y0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_bessel_y1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_bessel_y1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_bessel_y1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_bessel_y1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_bessel_y1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_bessel_y1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_bessel_y1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_bessel_y1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_chebyshev_polynomial_t_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_chebyshev_polynomial_t_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_chebyshev_polynomial_t_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_chebyshev_polynomial_t_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_chebyshev_polynomial_t_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_chebyshev_polynomial_t_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_chebyshev_polynomial_t_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_chebyshev_polynomial_t_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_chebyshev_polynomial_u_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_chebyshev_polynomial_u_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_chebyshev_polynomial_u_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_chebyshev_polynomial_u_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_chebyshev_polynomial_u_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_chebyshev_polynomial_u_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_chebyshev_polynomial_u_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_chebyshev_polynomial_u_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_chebyshev_polynomial_v_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_chebyshev_polynomial_v_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_chebyshev_polynomial_v_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_chebyshev_polynomial_v_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_chebyshev_polynomial_v_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_chebyshev_polynomial_v_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_chebyshev_polynomial_v_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_chebyshev_polynomial_v_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_chebyshev_polynomial_w_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_chebyshev_polynomial_w_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_chebyshev_polynomial_w_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_chebyshev_polynomial_w_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_chebyshev_polynomial_w_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_chebyshev_polynomial_w_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_chebyshev_polynomial_w_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_chebyshev_polynomial_w_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_entr_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_entr_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_entr_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_entr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_entr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_entr_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_entr_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_entr_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_entr_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_entr_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_erfcx_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_erfcx_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_erfcx_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_erfcx_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_erfcx_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_erfcx_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_erfcx_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_erfcx_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_hermite_polynomial_h_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_hermite_polynomial_h_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_hermite_polynomial_h_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_hermite_polynomial_h_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_hermite_polynomial_h_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_hermite_polynomial_h_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_hermite_polynomial_h_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_hermite_polynomial_h_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_hermite_polynomial_he_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_hermite_polynomial_he_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_hermite_polynomial_he_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_hermite_polynomial_he_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_hermite_polynomial_he_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_hermite_polynomial_he_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_hermite_polynomial_he_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_hermite_polynomial_he_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_i0e_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_i0e_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_i0e_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_i0e_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_i0e_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_i0e_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_i0e_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_i0e_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_i0e_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_i0e_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_i1_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_i1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_i1_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_i1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_i1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_i1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_i1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_i1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_i1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_i1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_i1e_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_i1e_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_i1e_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_i1e_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_i1e_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_i1e_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_i1e_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_i1e_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_i1e_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_i1e_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_laguerre_polynomial_l_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_laguerre_polynomial_l_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_laguerre_polynomial_l_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_laguerre_polynomial_l_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_laguerre_polynomial_l_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_laguerre_polynomial_l_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_laguerre_polynomial_l_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_laguerre_polynomial_l_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_legendre_polynomial_p_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_legendre_polynomial_p_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_legendre_polynomial_p_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_legendre_polynomial_p_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_legendre_polynomial_p_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_legendre_polynomial_p_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_legendre_polynomial_p_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_legendre_polynomial_p_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_log_ndtr_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_log_ndtr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_log_ndtr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_log_ndtr_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_log_ndtr_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_log_ndtr_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_log_ndtr_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_log_ndtr_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_modified_bessel_i0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_modified_bessel_i0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_modified_bessel_i0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_modified_bessel_i0_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_modified_bessel_i0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_modified_bessel_i0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_modified_bessel_i0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_modified_bessel_i0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_modified_bessel_i1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_modified_bessel_i1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_modified_bessel_i1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_modified_bessel_i1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_modified_bessel_i1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_modified_bessel_i1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_modified_bessel_i1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_modified_bessel_i1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_modified_bessel_k0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_modified_bessel_k0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_modified_bessel_k0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_modified_bessel_k0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_modified_bessel_k0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_modified_bessel_k0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_modified_bessel_k0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_modified_bessel_k0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_modified_bessel_k1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_modified_bessel_k1_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_modified_bessel_k1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_modified_bessel_k1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_modified_bessel_k1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_modified_bessel_k1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_modified_bessel_k1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_modified_bessel_k1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_ndtr_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_ndtr_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_ndtr_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_ndtr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_ndtr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_ndtr_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_ndtr_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_ndtr_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_ndtr_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_ndtr_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_ndtri_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_ndtri_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_ndtri_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_ndtri_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_ndtri_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_ndtri_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_ndtri_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_ndtri_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_polygamma_special_polygamma_n_0_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_polygamma_special_polygamma_n_0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_polygamma_special_polygamma_n_0_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_polygamma_special_polygamma_n_0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_polygamma_special_polygamma_n_0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_polygamma_special_polygamma_n_0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_polygamma_special_polygamma_n_0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_polygamma_special_polygamma_n_0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_polygamma_special_polygamma_n_0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_scaled_modified_bessel_k0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_scaled_modified_bessel_k0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_scaled_modified_bessel_k0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_scaled_modified_bessel_k0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_scaled_modified_bessel_k0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_scaled_modified_bessel_k0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_scaled_modified_bessel_k0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_scaled_modified_bessel_k0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_scaled_modified_bessel_k1_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_scaled_modified_bessel_k1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_scaled_modified_bessel_k1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_scaled_modified_bessel_k1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_scaled_modified_bessel_k1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_scaled_modified_bessel_k1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_scaled_modified_bessel_k1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_scaled_modified_bessel_k1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_shifted_chebyshev_polynomial_t_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_shifted_chebyshev_polynomial_t_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_shifted_chebyshev_polynomial_t_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_shifted_chebyshev_polynomial_t_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_shifted_chebyshev_polynomial_t_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_shifted_chebyshev_polynomial_t_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_shifted_chebyshev_polynomial_t_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_shifted_chebyshev_polynomial_u_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_shifted_chebyshev_polynomial_u_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_shifted_chebyshev_polynomial_u_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_shifted_chebyshev_polynomial_u_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_shifted_chebyshev_polynomial_u_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_shifted_chebyshev_polynomial_u_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_shifted_chebyshev_polynomial_u_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_shifted_chebyshev_polynomial_v_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_shifted_chebyshev_polynomial_v_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_shifted_chebyshev_polynomial_v_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_shifted_chebyshev_polynomial_v_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_shifted_chebyshev_polynomial_v_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_shifted_chebyshev_polynomial_v_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_shifted_chebyshev_polynomial_v_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_shifted_chebyshev_polynomial_w_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_shifted_chebyshev_polynomial_w_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_shifted_chebyshev_polynomial_w_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_shifted_chebyshev_polynomial_w_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_shifted_chebyshev_polynomial_w_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_shifted_chebyshev_polynomial_w_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_shifted_chebyshev_polynomial_w_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_spherical_bessel_j0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_spherical_bessel_j0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_spherical_bessel_j0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_spherical_bessel_j0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_spherical_bessel_j0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_spherical_bessel_j0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_spherical_bessel_j0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_spherical_bessel_j0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_xlog1py_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_xlog1py_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_xlog1py_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_xlog1py_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_xlog1py_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_xlog1py_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_xlog1py_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_xlog1py_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_xlog1py_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_xlog1py_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_zeta_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_zeta_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_zeta_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_zeta_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_zeta_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_zeta_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_zeta_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_zeta_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_list_args_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_list_args_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_list_args_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_list_args_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_list_args_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_list_args_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_list_args_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_list_args_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_list_args_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_list_args_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_list_args_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_list_args_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_with_sizes_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_with_sizes_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_with_sizes_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_with_sizes_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_with_sizes_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_with_sizes_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_with_sizes_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_with_sizes_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_with_sizes_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_with_sizes_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_with_sizes_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_with_sizes_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_with_sizes_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_with_sizes_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_with_sizes_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_with_sizes_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_with_sizes_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_with_sizes_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_with_sizes_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_with_sizes_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_with_sizes_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_with_sizes_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_with_sizes_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_with_sizes_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_with_sizes_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_with_sizes_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sqrt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sqrt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sqrt_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sqrt_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sqrt_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sqrt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sqrt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sqrt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sqrt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sqrt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sqrt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sqrt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sqrt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_square_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_square_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_square_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_square_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_square_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_square_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_square_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_square_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_square_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_square_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_square_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_square_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_multiple_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_multiple_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_multiple_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_multiple_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_multiple_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_multiple_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_multiple_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_multiple_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_multiple_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_multiple_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_multiple_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_multiple_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_multiple_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_stack_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_stack_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_stack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_stack_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_stack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_stack_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_stack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_stack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_stack_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_stack_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_stack_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_stack_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_stack_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_std_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_std_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_std_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_std_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_std_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_std_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_std_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_std_mean_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_std_mean_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_std_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_std_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_std_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_std_mean_unbiased_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_std_mean_unbiased_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_std_mean_unbiased_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_std_mean_unbiased_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_std_mean_unbiased_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_std_mean_unbiased_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_std_unbiased_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_std_unbiased_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_std_unbiased_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_std_unbiased_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_std_unbiased_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_std_unbiased_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_stft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_stft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_stft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_stft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sub_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sub_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sub_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sub_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sub_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sub_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sub_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sub_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sub_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sub_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sub_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sub_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sum_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sum_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_sum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sum_to_size_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sum_to_size_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sum_to_size_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sum_to_size_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sum_to_size_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sum_to_size_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sum_to_size_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sum_to_size_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sum_to_size_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sum_to_size_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sum_to_size_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sum_to_size_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_svd_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_svd_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_svd_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_svd_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_svd_lowrank_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_svd_lowrank_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_svd_lowrank_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_svd_lowrank_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_t_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_t_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_t_copy_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_t_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_t_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_t_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_t_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_t_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_t_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_t_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_t_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_t_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_t_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_t_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_t_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_t_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_t_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_t_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_t_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_t_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_t_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_t_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_t_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_t_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_take_along_dim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_take_along_dim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_take_along_dim_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_take_along_dim_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_take_along_dim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_take_along_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_take_along_dim_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_take_along_dim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_take_along_dim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_take_along_dim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_take_along_dim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_take_along_dim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_take_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_take_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_take_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_take_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_take_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_take_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_take_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_take_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_take_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_take_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_take_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_take_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tan_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tan_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tan_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tan_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tan_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tan_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tan_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tan_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tan_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tan_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_tan_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tan_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tanh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tanh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tanh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tanh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tanh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tanh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tanh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tanh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tanh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tanh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tanh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tanh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tanh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tensor_split_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tensor_split_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tensor_split_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tensor_split_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tensor_split_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tensor_split_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tensor_split_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tensor_split_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tensor_split_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tensor_split_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tensor_split_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tensor_split_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_tensordot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tensordot_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tensordot_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tensordot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tensordot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tensordot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tile_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tile_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tile_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tile_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tile_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tile_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tile_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tile_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tile_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tile_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tile_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tile_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_to_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_to_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_to_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_to_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_to_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_to_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_to_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_to_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_to_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_to_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_to_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_to_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_to_sparse_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_to_sparse_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_to_sparse_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_to_sparse_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_to_sparse_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_to_sparse_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_to_sparse_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_to_sparse_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_to_sparse_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_to_sparse_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_to_sparse_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_to_sparse_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_topk_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_topk_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_topk_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_topk_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_topk_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_topk_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_topk_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_topk_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_topk_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_torch__scaled_mm_cuda_float8_e4m3fn, test/test_meta.py::TestMetaCUDA::test_meta_inplace_torch_ops_aten__efficient_attention_forward_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_torch_ops_aten__efficient_attention_forward_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_torch_ops_aten__efficient_attention_forward_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_torch_ops_aten__flash_attention_forward_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_torch_ops_aten__flash_attention_forward_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_torch_ops_aten__safe_softmax_default_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_torch_ops_aten__safe_softmax_default_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_torch_ops_aten__safe_softmax_default_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_torch_ops_aten__safe_softmax_default_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_torch_ops_aten__safe_softmax_default_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_torch_ops_aten__safe_softmax_default_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_torch_ops_aten__safe_softmax_default_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_torch_ops_aten__safe_softmax_default_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_torch_ops_aten__safe_softmax_default_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trace_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trace_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trace_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trace_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trace_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trace_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trace_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trace_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trace_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_trace_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trace_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trace_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trace_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_transpose_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_transpose_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_transpose_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_transpose_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_transpose_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_transpose_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_transpose_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_transpose_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_transpose_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_transpose_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_transpose_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_transpose_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_transpose_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_transpose_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_transpose_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_transpose_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_transpose_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_transpose_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_transpose_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_transpose_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_transpose_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_transpose_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_transpose_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_transpose_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_transpose_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_transpose_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trapezoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trapezoid_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trapezoid_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trapezoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trapezoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trapezoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trapezoid_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trapezoid_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trapezoid_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trapezoid_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trapezoid_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trapz_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trapz_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trapz_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trapz_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trapz_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trapz_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trapz_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trapz_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trapz_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trapz_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trapz_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_triangular_solve_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_triangular_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_triangular_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_triangular_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tril_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tril_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tril_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tril_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tril_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tril_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tril_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tril_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tril_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tril_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tril_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tril_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tril_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tril_indices_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tril_indices_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_triu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_triu_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_triu_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_triu_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_triu_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_triu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_triu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_triu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_triu_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_triu_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_triu_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_triu_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_triu_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_triu_indices_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_triu_indices_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_true_divide_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_true_divide_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_true_divide_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_true_divide_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_true_divide_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_true_divide_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_true_divide_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_true_divide_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_true_divide_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_true_divide_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_true_divide_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_true_divide_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_true_divide_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trunc_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trunc_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trunc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trunc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trunc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trunc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trunc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trunc_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_trunc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unbind_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unbind_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unbind_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unbind_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unbind_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unbind_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unbind_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unbind_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unbind_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unbind_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unbind_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unbind_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unbind_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unbind_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unbind_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unbind_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unbind_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unbind_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unbind_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unbind_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unbind_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unbind_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unbind_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unbind_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unbind_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unbind_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_unflatten_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unflatten_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unflatten_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unflatten_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unflatten_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unflatten_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unflatten_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unflatten_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unflatten_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unflatten_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unflatten_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unflatten_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unflatten_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unfold_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unfold_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unfold_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unfold_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unfold_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unfold_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unfold_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unfold_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unfold_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unfold_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unfold_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unfold_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unfold_copy_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_unfold_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unfold_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unfold_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unfold_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unfold_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unfold_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unfold_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unfold_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unfold_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unfold_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unfold_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unfold_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unfold_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_uniform_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_uniform_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_uniform_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_uniform_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_uniform_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_uniform_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unique_consecutive_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unique_consecutive_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unique_consecutive_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unique_consecutive_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unique_consecutive_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unique_consecutive_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unique_consecutive_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_unique_consecutive_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unique_consecutive_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unique_consecutive_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unique_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unique_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unique_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unique_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unique_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unique_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unique_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unique_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unique_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unique_cuda_uint16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unique_cuda_uint32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unique_cuda_uint64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unique_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unravel_index_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unravel_index_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unravel_index_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unravel_index_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unravel_index_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsafe_chunk_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsafe_chunk_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsafe_chunk_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsafe_chunk_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsafe_chunk_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsafe_chunk_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsafe_chunk_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsafe_chunk_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsafe_chunk_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsafe_chunk_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsafe_chunk_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsafe_chunk_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsafe_chunk_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsafe_split_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsafe_split_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsafe_split_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsafe_split_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsafe_split_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsafe_split_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsafe_split_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsafe_split_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsafe_split_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsafe_split_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsafe_split_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsafe_split_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsafe_split_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsqueeze_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsqueeze_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsqueeze_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsqueeze_copy_cuda_complex32, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsqueeze_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsqueeze_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsqueeze_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsqueeze_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsqueeze_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsqueeze_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsqueeze_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsqueeze_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsqueeze_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsqueeze_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsqueeze_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsqueeze_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsqueeze_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsqueeze_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsqueeze_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsqueeze_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsqueeze_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsqueeze_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsqueeze_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsqueeze_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsqueeze_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsqueeze_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_var_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_var_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_var_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_var_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_var_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_var_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_var_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_var_mean_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_var_mean_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_var_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_var_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_var_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_var_mean_unbiased_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_var_mean_unbiased_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_var_mean_unbiased_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_var_mean_unbiased_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_var_mean_unbiased_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_var_mean_unbiased_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_var_unbiased_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_var_unbiased_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_var_unbiased_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_var_unbiased_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_var_unbiased_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_var_unbiased_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_vdot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_vdot_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_vdot_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_vdot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_vdot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_vdot_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_as_complex_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_as_complex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_as_complex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_as_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_as_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_as_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_as_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_as_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_as_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_as_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_as_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_as_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_as_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_as_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_as_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_as_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_as_real_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_as_real_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_copy_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_vsplit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_vsplit_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_vsplit_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_vsplit_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_vsplit_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_vsplit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_vsplit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_vsplit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_vsplit_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_vsplit_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_vsplit_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_vsplit_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_vsplit_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_vstack_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_vstack_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_vstack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_vstack_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_vstack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_vstack_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_vstack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_vstack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_vstack_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_vstack_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_vstack_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_vstack_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_vstack_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_where_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_where_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_where_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_where_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_where_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_where_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_where_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_where_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_where_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_where_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_where_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_where_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_where_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_xlogy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_xlogy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_xlogy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_xlogy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_xlogy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_xlogy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_xlogy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_xlogy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_xlogy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_xlogy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zero__cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zero__cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zero__cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zero__cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zero__cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zero__cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zero__cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zero__cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zero__cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zero__cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zero__cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zero__cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zeros_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zeros_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zeros_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zeros_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zeros_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zeros_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_meta_inplace_zeros_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zeros_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zeros_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zeros_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zeros_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zeros_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zeros_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zeros_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zeros_like_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zeros_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zeros_like_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zeros_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zeros_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zeros_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zeros_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zeros_like_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zeros_like_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zeros_like_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zeros_like_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zeros_like_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_H_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_H_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_H_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_H_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_H_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_H_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_H_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_H_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_H_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_H_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_H_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_H_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_H_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_T_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_T_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_T_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_T_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_T_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_T_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_T_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_T_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_T_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_T_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_T_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_T_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_T_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace___getitem___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace___getitem___cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace___getitem___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace___getitem___cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace___getitem___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace___getitem___cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace___getitem___cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace___getitem___cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace___getitem___cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace___getitem___cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace___getitem___cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace___getitem___cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace___getitem___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace___radd___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace___radd___cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace___radd___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace___radd___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace___radd___cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace___radd___cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace___radd___cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace___radd___cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace___radd___cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace___radd___cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace___radd___cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace___radd___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rand___cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rand___cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rand___cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rand___cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rand___cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rand___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rdiv___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rdiv___cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rdiv___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rdiv___cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace___rdiv___cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rdiv___cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rdiv___cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rdiv___cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rdiv___cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rdiv___cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rdiv___cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rdiv___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rmatmul___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rmatmul___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rmatmul___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rmatmul___cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rmatmul___cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rmatmul___cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rmod___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rmod___cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rmod___cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rmod___cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rmod___cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rmod___cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rmod___cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rmod___cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rmod___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rmul___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rmul___cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rmul___cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace___rmul___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rmul___cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rmul___cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rmul___cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rmul___cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rmul___cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rmul___cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rmul___cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rmul___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace___ror___cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace___ror___cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace___ror___cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace___ror___cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace___ror___cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace___ror___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rpow___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rpow___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rpow___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rpow___cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rpow___cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rpow___cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rpow___cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rpow___cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rpow___cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rpow___cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rpow___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rsub___cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace___rsub___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rsub___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rsub___cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rsub___cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rsub___cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rsub___cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rsub___cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rsub___cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rsub___cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rsub___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rxor___cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rxor___cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rxor___cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rxor___cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rxor___cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rxor___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__batch_norm_with_update_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__batch_norm_with_update_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__batch_norm_with_update_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__batch_norm_with_update_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__chunk_cat_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__chunk_cat_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__chunk_cat_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__chunk_cat_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__chunk_cat_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace__chunk_cat_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__chunk_cat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__chunk_cat_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__chunk_cat_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__chunk_cat_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__chunk_cat_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__chunk_cat_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__chunk_cat_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_abs_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_abs_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_abs_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_abs_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_abs_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_abs_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_abs_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_abs_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_abs_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_abs_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_abs_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_abs_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_acos_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_acos_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_acos_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_acos_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_acos_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_acos_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_acos_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_acos_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_acos_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_acos_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_acos_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_acos_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_add_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_add_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_add_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_add_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_add_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_add_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_add_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_add_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_add_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_add_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_add_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_addcdiv_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_addcdiv_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_addcdiv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_addcdiv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_addcdiv_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_addcdiv_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_addcdiv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_addcdiv_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_addcdiv_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_addcdiv_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_addcdiv_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_addcdiv_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_addcmul_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_addcmul_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_addcmul_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_addcmul_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_addcmul_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_addcmul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_addcmul_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_addcmul_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_addcmul_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_addcmul_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_addcmul_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_addcmul_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_asin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_asin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_asin_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_asin_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_asin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_asin_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_asin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_asin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_asin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_asin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_asin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_asin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_atan_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_atan_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_atan_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_atan_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_atan_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_atan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_atan_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_atan_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_atan_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_atan_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_atan_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_atan_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_ceil_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_ceil_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_ceil_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_ceil_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_ceil_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_ceil_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_ceil_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_ceil_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_ceil_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_ceil_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_ceil_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_ceil_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_clamp_max_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_clamp_max_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_clamp_max_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_clamp_max_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_clamp_max_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_clamp_max_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_clamp_max_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_clamp_max_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_clamp_max_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_clamp_max_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_clamp_max_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_clamp_max_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_clamp_min_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_clamp_min_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_clamp_min_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_clamp_min_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_clamp_min_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_clamp_min_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_clamp_min_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_clamp_min_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_clamp_min_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_clamp_min_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_clamp_min_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_clamp_min_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_cos_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_cos_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_cos_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_cos_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_cos_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_cos_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_cos_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_cos_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_cos_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_cos_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_cos_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_cos_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_cosh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_cosh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_cosh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_cosh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_cosh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_cosh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_cosh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_cosh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_cosh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_cosh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_cosh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_cosh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_div_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_div_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_div_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_div_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_div_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_div_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_div_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_div_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_div_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_div_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_div_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_div_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_erf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_erf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_erf_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_erf_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_erf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_erf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_erf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_erf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_erf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_erf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_erf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_erf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_erfc_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_erfc_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_erfc_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_erfc_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_erfc_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_erfc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_erfc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_erfc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_erfc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_erfc_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_erfc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_erfc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_exp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_exp_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_exp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_exp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_exp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_exp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_exp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_exp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_exp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_exp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_exp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_exp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_expm1_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_expm1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_expm1_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_expm1_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_expm1_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_expm1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_expm1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_expm1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_expm1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_expm1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_expm1_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_expm1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_floor_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_floor_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_floor_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_floor_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_floor_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_floor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_floor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_floor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_floor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_floor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_floor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_floor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_frac_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_frac_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_frac_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_frac_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_frac_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_frac_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_frac_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_frac_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_frac_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_frac_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_frac_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_frac_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_lerp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_lerp_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_lerp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_lerp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_lerp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_lerp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_lerp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_lerp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_lerp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_lerp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_lerp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_lerp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_lgamma_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_lgamma_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_lgamma_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_lgamma_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_lgamma_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_lgamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_lgamma_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_lgamma_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_lgamma_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_lgamma_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_lgamma_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_lgamma_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log10_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log10_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log10_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log10_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log10_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log10_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log10_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log10_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log10_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log10_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log10_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log10_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log1p_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log1p_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log1p_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log1p_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log1p_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log1p_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log1p_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log1p_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log1p_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log1p_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log1p_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log1p_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_max_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_max_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_max_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_max_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_max_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_max_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_max_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_max_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_max_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_max_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_max_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_max_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_maximum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_maximum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_maximum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_maximum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_maximum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_maximum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_maximum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_maximum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_maximum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_maximum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_maximum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_maximum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_minimum_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_minimum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_minimum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_minimum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_minimum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_minimum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_minimum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_minimum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_minimum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_minimum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_minimum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_minimum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_mul_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_mul_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_mul_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_mul_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_mul_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_mul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_mul_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_mul_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_mul_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_mul_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_mul_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_mul_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_neg_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_neg_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_neg_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_neg_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_neg_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_neg_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_neg_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_neg_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_neg_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_neg_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_neg_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_neg_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_norm_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_norm_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_norm_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_norm_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_norm_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_norm_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_pow_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_pow_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_pow_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_pow_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_pow_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_pow_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_pow_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_pow_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_pow_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_pow_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_pow_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_pow_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_reciprocal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_reciprocal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_reciprocal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_reciprocal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_reciprocal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_reciprocal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_reciprocal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_reciprocal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_reciprocal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_reciprocal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_reciprocal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_reciprocal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_round_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_round_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_round_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_round_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_round_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_round_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_round_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_round_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_round_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_round_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_round_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_round_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_rsqrt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_rsqrt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_rsqrt_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_rsqrt_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_rsqrt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_rsqrt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_rsqrt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_rsqrt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_rsqrt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_rsqrt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_rsqrt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_rsqrt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sigmoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sigmoid_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sigmoid_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sigmoid_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sigmoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sigmoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sigmoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sigmoid_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sigmoid_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sigmoid_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sigmoid_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sigmoid_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sign_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sign_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sign_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sign_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sign_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sign_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sign_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sign_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sign_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sign_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sign_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sin_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sin_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sinh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sinh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sinh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sinh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sinh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sinh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sinh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sinh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sinh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sinh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sinh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sinh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sqrt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sqrt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sqrt_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sqrt_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sqrt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sqrt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sqrt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sqrt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sqrt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sqrt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sqrt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sqrt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sub_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sub_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sub_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sub_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sub_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sub_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sub_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sub_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sub_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sub_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sub_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sub_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_tan_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_tan_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_tan_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_tan_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_tan_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_tan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_tan_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_tan_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_tan_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_tan_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_tan_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_tan_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_tanh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_tanh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_tanh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_tanh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_tanh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_tanh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_tanh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_tanh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_tanh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_tanh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_tanh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_tanh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_trunc_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_trunc_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_trunc_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_trunc_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_trunc_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_trunc_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_trunc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_trunc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_trunc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_trunc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_trunc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_trunc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_zero_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_zero_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_zero_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_zero_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_zero_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_zero_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_zero_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_zero_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_zero_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_zero_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_zero_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_zero_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__native_batch_norm_legit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__native_batch_norm_legit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__native_batch_norm_legit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__native_batch_norm_legit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__segment_reduce_lengths_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__segment_reduce_lengths_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace__segment_reduce_lengths_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__segment_reduce_lengths_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__segment_reduce_offsets_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__segment_reduce_offsets_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__segment_reduce_offsets_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__segment_reduce_offsets_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__softmax_backward_data_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__softmax_backward_data_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__softmax_backward_data_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__softmax_backward_data_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__unsafe_masked_index_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__unsafe_masked_index_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__unsafe_masked_index_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__unsafe_masked_index_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__unsafe_masked_index_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__unsafe_masked_index_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__unsafe_masked_index_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__unsafe_masked_index_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__unsafe_masked_index_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__unsafe_masked_index_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__unsafe_masked_index_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__unsafe_masked_index_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace__unsafe_masked_index_put_accumulate_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__unsafe_masked_index_put_accumulate_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__unsafe_masked_index_put_accumulate_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__unsafe_masked_index_put_accumulate_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__unsafe_masked_index_put_accumulate_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__unsafe_masked_index_put_accumulate_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__unsafe_masked_index_put_accumulate_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__unsafe_masked_index_put_accumulate_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__unsafe_masked_index_put_accumulate_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__unsafe_masked_index_put_accumulate_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__unsafe_masked_index_put_accumulate_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__unsafe_masked_index_put_accumulate_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__upsample_bilinear2d_aa_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__upsample_bilinear2d_aa_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__upsample_bilinear2d_aa_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__upsample_bilinear2d_aa_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_abs_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_abs_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_abs_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_abs_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_abs_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_abs_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_abs_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_abs_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_abs_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_abs_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_abs_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_abs_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_abs_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_acos_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_acos_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_acos_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_acos_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_acos_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_acos_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_acos_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_acos_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_acos_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_acos_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_acos_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_acos_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_acos_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_acosh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_acosh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_acosh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_acosh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_acosh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_acosh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_acosh_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_acosh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_acosh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_acosh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_acosh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_acosh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_acosh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_add_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_add_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_add_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_add_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_add_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_add_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_add_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_add_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_add_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_add_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_add_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_add_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addbmm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addbmm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addbmm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addbmm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addbmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addbmm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addcdiv_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addcdiv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addcdiv_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_addcdiv_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addcdiv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addcdiv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addcmul_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addcmul_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addcmul_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addcmul_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addcmul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addcmul_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addcmul_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addcmul_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addcmul_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addcmul_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addcmul_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addmm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addmm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addmm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addmm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addmm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addmm_decomposed_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addmm_decomposed_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addmm_decomposed_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addmm_decomposed_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addmm_decomposed_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addmm_decomposed_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_addmv_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addmv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addmv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addmv_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addmv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addmv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addr_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addr_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addr_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addr_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addr_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addr_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addr_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addr_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addr_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addr_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_alias_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_alias_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_alias_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_alias_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_alias_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_alias_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_alias_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_alias_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_alias_copy_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_alias_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_alias_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_alias_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_alias_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_all_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_all_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_all_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_all_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_all_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_all_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_all_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_all_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_all_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_all_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_all_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_all_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_allclose_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_allclose_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_allclose_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_allclose_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_allclose_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_allclose_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_amax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_amax_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_amax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_amax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_amax_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_amax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_amax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_amax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_amax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_amax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_amin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_amin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_amin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_amin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_amin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_amin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_amin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_amin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_amin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_amin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_aminmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_aminmax_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_aminmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_aminmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_aminmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_aminmax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_aminmax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_aminmax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_aminmax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_aminmax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_angle_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_angle_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_angle_cuda_complex32, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_angle_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_angle_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_angle_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_angle_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_angle_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_angle_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_angle_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_angle_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_any_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_any_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_any_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_any_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_any_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_any_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_any_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_any_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_any_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_any_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_any_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_any_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_arange_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_arange_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_arange_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_arange_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_arange_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_arange_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_arange_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_arange_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_arange_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argmax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argmax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argmax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argmax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argmax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argmin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argmin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argmin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argmin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argmin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argmin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argmin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argmin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argsort_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argsort_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argsort_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argsort_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argsort_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argsort_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argsort_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argsort_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_argsort_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argsort_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argwhere_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argwhere_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argwhere_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argwhere_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argwhere_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argwhere_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argwhere_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argwhere_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argwhere_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argwhere_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argwhere_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argwhere_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_copy_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_partial_views_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_partial_views_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_partial_views_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_partial_views_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_partial_views_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_partial_views_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_partial_views_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_partial_views_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_partial_views_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_partial_views_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_partial_views_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_partial_views_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_partial_views_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_scatter_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_scatter_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_scatter_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_asin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_asin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_asin_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_asin_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_asin_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_asin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_asin_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_asin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_asin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_asin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_asin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_asin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_asin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_asinh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_asinh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_asinh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_asinh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_asinh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_asinh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_asinh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_asinh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_asinh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_asinh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_asinh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_asinh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_asinh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atan2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atan2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atan2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atan2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atan2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atan2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atan2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atan2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atan2_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_atan2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atan_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atan_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atan_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atan_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atan_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atan_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atan_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atan_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atan_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atan_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atan_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atan_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atanh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atanh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atanh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atanh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atanh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atanh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atanh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atanh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atanh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atanh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atanh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atanh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atanh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_1d_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_1d_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_1d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_1d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_1d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_1d_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_1d_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_1d_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_1d_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_1d_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_2d_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_2d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_2d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_2d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_2d_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_2d_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_2d_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_2d_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_2d_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_3d_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_3d_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_3d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_3d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_3d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_3d_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_3d_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_3d_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_3d_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_3d_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_baddbmm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_baddbmm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_baddbmm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_baddbmm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_baddbmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_baddbmm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bernoulli_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bernoulli_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bernoulli_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bernoulli_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bfloat16_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bfloat16_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bfloat16_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bfloat16_cuda_complex32, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_bfloat16_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bfloat16_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bfloat16_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bfloat16_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bfloat16_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bfloat16_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bfloat16_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bfloat16_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bfloat16_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bincount_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bincount_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bincount_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bincount_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bincount_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_and_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_and_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_and_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_and_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_and_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_and_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_left_shift_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_left_shift_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_left_shift_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_left_shift_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_left_shift_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_not_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_not_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_not_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_not_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_not_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_not_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_or_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_or_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_or_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_or_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_or_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_or_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_right_shift_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_right_shift_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_right_shift_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_right_shift_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_right_shift_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_xor_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_xor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_xor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_xor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_xor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_xor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_block_diag_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_block_diag_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_block_diag_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_block_diag_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_block_diag_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_block_diag_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_block_diag_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_block_diag_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_block_diag_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_block_diag_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_block_diag_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_block_diag_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_block_diag_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bmm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bmm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bmm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bmm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bmm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bool_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bool_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bool_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bool_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bool_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bool_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bool_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bool_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bool_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bool_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bool_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_bool_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bool_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_broadcast_shapes_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_broadcast_tensors_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_broadcast_tensors_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_broadcast_tensors_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_broadcast_tensors_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_broadcast_tensors_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_broadcast_tensors_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_broadcast_tensors_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_broadcast_tensors_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_broadcast_tensors_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_broadcast_tensors_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_broadcast_tensors_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_broadcast_tensors_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_broadcast_to_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_broadcast_to_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_broadcast_to_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_broadcast_to_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_broadcast_to_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_broadcast_to_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_broadcast_to_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_broadcast_to_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_broadcast_to_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_broadcast_to_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_broadcast_to_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_broadcast_to_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bucketize_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bucketize_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bucketize_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bucketize_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bucketize_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bucketize_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bucketize_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bucketize_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bucketize_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_byte_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_byte_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_byte_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_byte_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_byte_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_byte_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_byte_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_byte_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_byte_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_byte_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_byte_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_byte_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cartesian_prod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cartesian_prod_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cartesian_prod_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_cartesian_prod_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cartesian_prod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cartesian_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cartesian_prod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cartesian_prod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cartesian_prod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cartesian_prod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cartesian_prod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cartesian_prod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cat_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cat_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cat_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cat_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cat_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cat_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cat_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cat_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cat_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cat_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cat_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cat_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cauchy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cauchy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cauchy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cauchy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cdist_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_cdist_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cdouble_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cdouble_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cdouble_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cdouble_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cdouble_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cdouble_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cdouble_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cdouble_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cdouble_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cdouble_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cdouble_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cdouble_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cdouble_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ceil_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ceil_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ceil_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ceil_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ceil_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ceil_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ceil_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ceil_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ceil_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cfloat_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cfloat_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cfloat_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cfloat_cuda_complex32, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_cfloat_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cfloat_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cfloat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cfloat_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cfloat_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cfloat_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cfloat_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cfloat_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cfloat_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_chalf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_chalf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_chalf_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_chalf_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_chalf_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_chalf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_chalf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_chalf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_chalf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_chalf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_chalf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_chalf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_chalf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_char_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_char_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_char_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_char_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_char_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_char_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_char_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_char_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_char_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_char_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_char_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_char_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_char_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cholesky_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cholesky_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cholesky_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cholesky_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cholesky_inverse_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cholesky_inverse_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cholesky_inverse_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cholesky_inverse_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cholesky_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cholesky_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cholesky_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cholesky_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_chunk_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_chunk_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_chunk_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_chunk_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_chunk_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_chunk_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_chunk_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_chunk_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_chunk_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_chunk_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_chunk_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_chunk_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_chunk_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clamp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clamp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clamp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clamp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clamp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clamp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clamp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clamp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clamp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clamp_max_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clamp_max_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clamp_max_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clamp_max_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clamp_max_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clamp_max_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clamp_max_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clamp_max_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clamp_max_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clamp_max_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clamp_min_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_clamp_min_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clamp_min_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clamp_min_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clamp_min_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clamp_min_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clamp_min_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clamp_min_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clamp_min_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clamp_min_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clone_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clone_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clone_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clone_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clone_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clone_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clone_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clone_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clone_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clone_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clone_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clone_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clone_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_column_stack_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_column_stack_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_column_stack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_column_stack_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_column_stack_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_column_stack_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_column_stack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_column_stack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_column_stack_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_column_stack_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_column_stack_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_column_stack_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_column_stack_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_combinations_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_combinations_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_combinations_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_combinations_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_combinations_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_combinations_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_combinations_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_combinations_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_combinations_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_combinations_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_combinations_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_combinations_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_complex_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_complex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_complex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_conj_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_conj_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_conj_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_conj_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_conj_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_conj_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_conj_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_conj_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_conj_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_conj_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_conj_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_conj_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_conj_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_conj_physical_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_conj_physical_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_conj_physical_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_conj_physical_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_conj_physical_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_conj_physical_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_conj_physical_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_conj_physical_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_conj_physical_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_conj_physical_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_conj_physical_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_conj_physical_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_conj_physical_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_constant_pad_nd_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_constant_pad_nd_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_constant_pad_nd_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_constant_pad_nd_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_constant_pad_nd_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_constant_pad_nd_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_constant_pad_nd_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_constant_pad_nd_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_constant_pad_nd_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_constant_pad_nd_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_constant_pad_nd_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_constant_pad_nd_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_contiguous_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_contiguous_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_contiguous_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_contiguous_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_contiguous_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_contiguous_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_contiguous_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_contiguous_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_contiguous_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_contiguous_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_contiguous_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_contiguous_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_contiguous_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_copysign_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_copysign_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_copysign_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_copysign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_copysign_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_copysign_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_copysign_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_copysign_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_copysign_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_copysign_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_corrcoef_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_corrcoef_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_corrcoef_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_corrcoef_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_corrcoef_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_corrcoef_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_corrcoef_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_corrcoef_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_corrcoef_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_corrcoef_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_corrcoef_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cos_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cos_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cos_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cos_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cos_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cos_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cos_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cos_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cos_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_cos_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cos_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cos_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cos_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cosh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cosh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cosh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cosh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cosh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cosh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cosh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cosh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cosh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cosh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cosh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cosh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cosh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_count_nonzero_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_count_nonzero_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_count_nonzero_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_count_nonzero_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_count_nonzero_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_count_nonzero_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_count_nonzero_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_count_nonzero_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_count_nonzero_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_count_nonzero_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_count_nonzero_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_count_nonzero_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cov_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cov_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cov_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cov_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cov_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cov_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cov_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cov_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cov_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cov_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cov_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cross_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cross_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cross_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cross_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cross_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cross_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cross_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cross_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cross_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cross_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cross_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cummax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cummax_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cummax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cummax_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_cummax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cummax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cummax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cummax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cummax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cummax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cummin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cummin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cummin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cummin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cummin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cummin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cummin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cummin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cummin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cummin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumprod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumprod_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumprod_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumprod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumprod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumprod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumprod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumprod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumprod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumprod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumprod_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumsum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumsum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumsum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumsum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumsum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumsum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumsum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumsum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumsum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumsum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumsum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumulative_trapezoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumulative_trapezoid_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumulative_trapezoid_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumulative_trapezoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumulative_trapezoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumulative_trapezoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumulative_trapezoid_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumulative_trapezoid_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumulative_trapezoid_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumulative_trapezoid_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumulative_trapezoid_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_deg2rad_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_deg2rad_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_deg2rad_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_deg2rad_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_deg2rad_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_deg2rad_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_deg2rad_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_deg2rad_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_deg2rad_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_deg2rad_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diag_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diag_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diag_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diag_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diag_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diag_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diag_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diag_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diag_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diag_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diag_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diag_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diag_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diag_embed_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diag_embed_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diag_embed_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diag_embed_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diag_embed_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diag_embed_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diag_embed_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_diag_embed_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diag_embed_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diag_embed_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diag_embed_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diag_embed_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diag_embed_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagflat_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagflat_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagflat_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagflat_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagflat_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagflat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagflat_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagflat_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagflat_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagflat_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagflat_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagflat_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_copy_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_scatter_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_scatter_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_scatter_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diff_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diff_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diff_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diff_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diff_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diff_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diff_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diff_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diff_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diff_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diff_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diff_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_digamma_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_digamma_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_digamma_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_digamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_digamma_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_digamma_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_digamma_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_digamma_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_digamma_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_digamma_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_dist_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dist_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dist_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dist_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dist_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dist_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_div_floor_rounding_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_div_floor_rounding_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_div_floor_rounding_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_div_floor_rounding_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_div_floor_rounding_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_div_floor_rounding_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_div_floor_rounding_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_div_floor_rounding_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_div_floor_rounding_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_div_no_rounding_mode_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_div_no_rounding_mode_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_div_no_rounding_mode_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_div_no_rounding_mode_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_div_no_rounding_mode_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_div_no_rounding_mode_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_div_no_rounding_mode_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_div_no_rounding_mode_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_div_no_rounding_mode_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_div_no_rounding_mode_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_div_no_rounding_mode_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_div_no_rounding_mode_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_div_no_rounding_mode_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_div_trunc_rounding_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_div_trunc_rounding_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_div_trunc_rounding_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_div_trunc_rounding_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_div_trunc_rounding_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_div_trunc_rounding_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_div_trunc_rounding_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_div_trunc_rounding_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_div_trunc_rounding_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dot_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dot_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_double_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_double_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_double_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_double_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_double_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_double_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_double_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_double_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_double_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_double_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_double_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_double_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_double_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dsplit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dsplit_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dsplit_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dsplit_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dsplit_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dsplit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dsplit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dsplit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dsplit_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dsplit_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dsplit_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dsplit_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dsplit_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dstack_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dstack_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dstack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dstack_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dstack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dstack_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dstack_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_dstack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dstack_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dstack_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dstack_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dstack_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dstack_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_einsum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_einsum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_einsum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_einsum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_einsum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_einsum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_like_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_like_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_like_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_like_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_like_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_like_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_like_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_permuted_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_permuted_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_permuted_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_permuted_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_permuted_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_permuted_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_permuted_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_permuted_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_permuted_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_permuted_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_permuted_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_permuted_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_permuted_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_strided_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_strided_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_strided_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_strided_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_strided_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_strided_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_strided_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_strided_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_strided_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_strided_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_strided_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_strided_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_eq_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_eq_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_eq_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_eq_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_eq_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_eq_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_eq_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_eq_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_eq_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_eq_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_eq_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_eq_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_eq_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_equal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_equal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_equal_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_equal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_equal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_equal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_equal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_equal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_equal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_equal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_equal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_equal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_erf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_erf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_erf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_erf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_erf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_erf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_erf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_erf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_erf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_erf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_erfc_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_erfc_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_erfc_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_erfc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_erfc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_erfc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_erfc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_erfc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_erfc_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_erfc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_erfinv_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_erfinv_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_erfinv_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_erfinv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_erfinv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_erfinv_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_erfinv_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_erfinv_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_erfinv_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_erfinv_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_exp2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_exp2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_exp2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_exp2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_exp2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_exp2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_exp2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_exp2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_exp2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_exp2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_exp2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_exp2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_exp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_exp_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_exp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_exp_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_exp_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_exp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_exp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_exp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_exp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_exp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_exp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_exp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_exp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_as_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_as_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_as_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_as_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_as_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_as_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_as_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_as_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_as_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_as_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_as_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_as_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_copy_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expm1_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expm1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expm1_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expm1_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expm1_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expm1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expm1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expm1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expm1_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_expm1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expm1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expm1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_exponential_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_exponential_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_exponential_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_exponential_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_eye_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_eye_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_eye_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_eye_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_eye_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_eye_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_eye_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_eye_cuda_float8_e4m3fn, test/test_meta.py::TestMetaCUDA::test_meta_outplace_eye_cuda_float8_e4m3fnuz, test/test_meta.py::TestMetaCUDA::test_meta_outplace_eye_cuda_float8_e5m2, test/test_meta.py::TestMetaCUDA::test_meta_outplace_eye_cuda_float8_e5m2fnuz, test/test_meta.py::TestMetaCUDA::test_meta_outplace_eye_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_eye_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_eye_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_eye_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_eye_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fft2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fft2_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fft2_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fft_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fftn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fftn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fftn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fftn_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fftshift_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fftshift_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fftshift_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fftshift_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fftshift_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fftshift_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fftshift_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fftshift_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fftshift_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fftshift_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fftshift_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fftshift_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fftshift_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfft2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfft2_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfft2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfft2_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfft_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfftn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfftn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfftn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfftn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfftn_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifft2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifft2_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifft2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifft_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifft_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifftn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifftn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifftn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifftn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifftshift_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifftshift_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifftshift_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifftshift_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifftshift_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifftshift_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifftshift_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifftshift_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifftshift_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifftshift_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifftshift_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifftshift_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifftshift_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ihfft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ihfft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ihfft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ihfft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ihfft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ihfft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ihfft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ihfft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ihfft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ihfft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ihfft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ihfft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ihfft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ihfft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ihfft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ihfft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ihfft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ihfft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ihfftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ihfftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ihfftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ihfftn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ihfftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ihfftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ihfftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ihfftn_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ihfftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfft2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfft2_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfft2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfft_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfftn_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfftn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfftn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfftn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfftn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_rfft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_rfft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_rfft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_rfft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_rfft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_rfft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_rfft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_rfft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_rfft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_rfft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_rfft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_rfft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_rfft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_rfft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_rfft_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_rfft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_rfft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_rfft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_rfftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_rfftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_rfftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_rfftn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_rfftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_rfftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_rfftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_rfftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_rfftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fill_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fill_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fill_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fill_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fill_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fill_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fill_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fill_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fill_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fill_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fill_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fill_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fill_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flatten_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flatten_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_flatten_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flatten_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flatten_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flatten_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flatten_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flatten_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flatten_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flatten_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flatten_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flatten_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flatten_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flip_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flip_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flip_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flip_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flip_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flip_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flip_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flip_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flip_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flip_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flip_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flip_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fliplr_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fliplr_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fliplr_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fliplr_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_fliplr_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fliplr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fliplr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fliplr_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fliplr_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fliplr_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fliplr_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fliplr_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flipud_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flipud_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flipud_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flipud_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flipud_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flipud_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flipud_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flipud_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flipud_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flipud_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flipud_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flipud_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_float_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_float_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_float_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_float_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_float_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_float_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_float_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_float_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_float_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_float_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_float_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_float_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_float_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_float_power_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_float_power_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_float_power_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_float_power_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_float_power_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_float_power_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_float_power_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_float_power_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_float_power_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_float_power_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_float_power_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_float_power_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_floor_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_floor_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_floor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_floor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_floor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_floor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_floor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_floor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_floor_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_floor_divide_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_floor_divide_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_floor_divide_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_floor_divide_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_floor_divide_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_floor_divide_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_floor_divide_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_floor_divide_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_floor_divide_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fmax_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fmax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fmax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fmax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fmax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fmax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fmin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fmin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fmin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fmin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fmin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fmin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fmin_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_fmin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fmin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fmod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fmod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fmod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fmod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fmod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fmod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fmod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fmod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fmod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_frac_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_frac_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_frac_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_frac_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_frexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_frexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_frexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_frexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_full_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_full_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_full_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_full_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_full_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_full_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_full_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_full_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_full_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_full_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_full_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_full_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_full_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_full_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_full_like_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_full_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_full_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_full_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_full_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_full_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_full_like_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_full_like_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_full_like_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_full_like_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_full_like_cuda_uint16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_full_like_cuda_uint32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_full_like_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gather_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gather_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gather_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gather_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gather_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gather_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gather_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gather_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gather_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_gather_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gather_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gather_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gcd_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gcd_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gcd_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gcd_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gcd_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ge_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ge_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ge_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ge_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ge_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ge_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ge_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ge_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ge_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ge_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_geometric_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_geometric_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_geometric_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_geometric_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_geometric_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_geometric_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_geometric_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_geometric_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_geometric_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_geqrf_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_geqrf_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_geqrf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_geqrf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gradient_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gradient_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gradient_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gradient_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gradient_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gradient_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gradient_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gradient_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gradient_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gradient_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_grid_sampler_2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_grid_sampler_2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_grid_sampler_2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_grid_sampler_2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_grid_sampler_3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_grid_sampler_3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_grid_sampler_3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_grid_sampler_3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gt_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_gt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_half_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_half_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_half_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_half_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_half_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_half_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_half_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_half_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_half_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_half_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_half_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_half_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hash_tensor_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hash_tensor_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hash_tensor_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hash_tensor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hash_tensor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hash_tensor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hash_tensor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hash_tensor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hash_tensor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hash_tensor_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_heaviside_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_heaviside_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_heaviside_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_heaviside_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_heaviside_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_heaviside_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_heaviside_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_heaviside_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_heaviside_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_heaviside_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_histc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_histc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_histc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_histc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_histc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_histc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_histc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hsplit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hsplit_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hsplit_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hsplit_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hsplit_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hsplit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hsplit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hsplit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hsplit_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hsplit_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_hsplit_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hsplit_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hsplit_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hstack_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hstack_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hstack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hstack_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hstack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hstack_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hstack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hstack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hstack_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hstack_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hstack_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hstack_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hstack_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hypot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hypot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hypot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hypot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_i0_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_i0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_i0_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_i0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_i0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_i0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_i0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_i0_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_i0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_i0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_igamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_igamma_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_igammac_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_igammac_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_imag_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_imag_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_imag_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_add_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_add_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_add_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_add_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_add_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_add_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_add_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_add_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_add_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_add_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_add_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_add_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_copy_cuda_complex32, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_fill_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_fill_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_fill_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_fill_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_fill_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_fill_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_fill_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_fill_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_fill_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_fill_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_fill_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_fill_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_fill_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_put_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_put_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_put_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_put_cuda_complex32, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_put_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_put_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_put_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_put_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_put_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_put_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_put_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_put_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_put_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_amax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_amax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_amax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_amax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_amax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_amax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_amax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_amax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_amax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_amin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_amin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_amin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_amin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_amin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_amin_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_amin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_amin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_amin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_mean_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_mean_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_mean_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_mean_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_mean_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_prod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_prod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_prod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_prod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_prod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_prod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_prod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_prod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_select_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_select_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_select_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_select_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_select_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_select_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_select_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_select_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_select_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_select_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_select_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_select_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_select_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_inner_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_inner_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_inner_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_inner_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_inner_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_inner_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_int_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_int_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_int_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_int_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_int_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_int_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_int_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_int_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_int_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_int_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_int_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_int_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isclose_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isclose_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isclose_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isclose_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isclose_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isclose_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isclose_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isclose_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isclose_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isclose_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isclose_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isclose_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isfinite_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isfinite_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isfinite_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isfinite_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isfinite_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isfinite_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isfinite_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isfinite_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isfinite_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isfinite_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isfinite_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isfinite_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isfinite_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isin_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_isin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isinf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isinf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isinf_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isinf_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isinf_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isinf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isinf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isinf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isinf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isinf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isinf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isinf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isinf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isnan_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isnan_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isnan_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isnan_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isnan_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isnan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isnan_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_isnan_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isnan_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isnan_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isnan_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isnan_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isneginf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isneginf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isneginf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isneginf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isneginf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isneginf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isneginf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isneginf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isneginf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isneginf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isposinf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isposinf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isposinf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isposinf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isposinf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isposinf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isposinf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isposinf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isposinf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isposinf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isreal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isreal_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_isreal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isreal_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isreal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isreal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isreal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isreal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isreal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isreal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isreal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isreal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isreal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_istft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_istft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_item_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_item_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_item_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_item_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_item_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_item_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_item_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_item_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_item_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_item_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_item_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_item_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_item_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_2inputs_2outputs_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_2inputs_2outputs_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_2inputs_2outputs_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_2inputs_2outputs_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_2inputs_2outputs_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_2inputs_2outputs_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_2inputs_2outputs_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_2inputs_2outputs_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_2inputs_2outputs_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_2inputs_2outputs_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_2inputs_2outputs_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_2inputs_2outputs_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_4inputs_with_extra_args_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_4inputs_with_extra_args_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_4inputs_with_extra_args_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_4inputs_with_extra_args_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_4inputs_with_extra_args_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_4inputs_with_extra_args_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_4inputs_with_extra_args_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_4inputs_with_extra_args_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_4inputs_with_extra_args_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_4inputs_with_extra_args_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_4inputs_with_extra_args_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_4inputs_with_extra_args_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_binary_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_binary_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_binary_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_binary_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_binary_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_binary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_binary_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_binary_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_binary_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_binary_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_binary_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_binary_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_binary_return_by_ref_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_binary_return_by_ref_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_binary_return_by_ref_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_binary_return_by_ref_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_binary_return_by_ref_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_binary_return_by_ref_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_binary_return_by_ref_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_binary_return_by_ref_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_binary_return_by_ref_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_binary_return_by_ref_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_binary_return_by_ref_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_binary_return_by_ref_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_unary_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_unary_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_unary_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_unary_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_unary_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_unary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_unary_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_unary_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_unary_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_unary_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_unary_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_unary_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_kron_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_kron_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_kron_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_kron_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_kron_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_kron_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_kron_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_kron_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_kron_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_kron_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_kron_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_kron_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_kthvalue_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_kthvalue_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_kthvalue_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_kthvalue_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_kthvalue_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_kthvalue_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_kthvalue_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_kthvalue_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_kthvalue_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lcm_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lcm_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lcm_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lcm_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lcm_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ldexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ldexp_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ldexp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ldexp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ldexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ldexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ldexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ldexp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ldexp_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_ldexp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ldexp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ldexp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_le_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_le_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_le_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_le_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_le_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_le_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_le_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_le_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_le_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_le_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lerp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lerp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lerp_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lerp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lerp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lerp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lerp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lgamma_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lgamma_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lgamma_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lgamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lgamma_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lgamma_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lgamma_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lgamma_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_lgamma_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lgamma_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_cholesky_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_cholesky_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_cholesky_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_cholesky_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_cholesky_ex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_cholesky_ex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_cholesky_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_cholesky_ex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_cond_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_cond_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_cond_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_cond_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_cross_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_cross_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_cross_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_cross_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_cross_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_cross_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_cross_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_cross_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_cross_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_cross_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_cross_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_det_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_det_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_det_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_det_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_diagonal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_diagonal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_diagonal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_diagonal_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_diagonal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_diagonal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_diagonal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_diagonal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_diagonal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_diagonal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_diagonal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_diagonal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_diagonal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_eig_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_eig_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_eig_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_eig_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_eigh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_eigh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_eigh_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_eigh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_eigvals_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_eigvals_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_eigvals_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_eigvals_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_eigvalsh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_eigvalsh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_eigvalsh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_eigvalsh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_householder_product_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_householder_product_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_householder_product_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_householder_product_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_inv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_inv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_inv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_inv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_inv_ex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_inv_ex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_inv_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_inv_ex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_ldl_factor_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_ldl_factor_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_ldl_factor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_ldl_factor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_ldl_factor_ex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_ldl_factor_ex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_ldl_factor_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_ldl_factor_ex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_ldl_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_ldl_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_ldl_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_ldl_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_lstsq_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_lstsq_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_lstsq_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_lstsq_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_lstsq_grad_oriented_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_lstsq_grad_oriented_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_lstsq_grad_oriented_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_lstsq_grad_oriented_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_lu_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_lu_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_lu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_lu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_lu_factor_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_lu_factor_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_lu_factor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_lu_factor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_lu_factor_ex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_lu_factor_ex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_lu_factor_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_lu_factor_ex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_lu_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_lu_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_lu_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_lu_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_matrix_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_matrix_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_matrix_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_matrix_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_matrix_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_matrix_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_matrix_power_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_matrix_power_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_matrix_power_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_matrix_power_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_matrix_rank_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_matrix_rank_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_matrix_rank_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_matrix_rank_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_matrix_rank_hermitian_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_matrix_rank_hermitian_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_matrix_rank_hermitian_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_matrix_rank_hermitian_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_multi_dot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_multi_dot_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_multi_dot_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_multi_dot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_multi_dot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_multi_dot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_norm_subgradients_at_zero_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_norm_subgradients_at_zero_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_norm_subgradients_at_zero_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_norm_subgradients_at_zero_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_norm_subgradients_at_zero_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_norm_subgradients_at_zero_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_pinv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_pinv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_pinv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_pinv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_pinv_hermitian_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_pinv_hermitian_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_pinv_hermitian_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_pinv_hermitian_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_pinv_singular_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_pinv_singular_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_pinv_singular_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_pinv_singular_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_qr_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_qr_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_qr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_qr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_slogdet_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_slogdet_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_slogdet_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_slogdet_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_solve_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_solve_ex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_solve_ex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_solve_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_solve_ex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_solve_triangular_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_solve_triangular_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_solve_triangular_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_solve_triangular_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_svd_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_svd_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_svd_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_svd_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_svdvals_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_svdvals_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_svdvals_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_svdvals_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_tensorinv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_tensorinv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_tensorinv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_tensorinv_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_tensorsolve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_tensorsolve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_tensorsolve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_tensorsolve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_vander_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_vander_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_vander_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_vander_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_vander_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_vander_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_vander_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_vander_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_vander_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_vecdot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_vecdot_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_vecdot_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_vecdot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_vecdot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_vecdot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_vector_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_vector_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_vector_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_vector_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_vector_norm_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_vector_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linspace_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linspace_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linspace_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linspace_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linspace_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linspace_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linspace_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linspace_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linspace_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linspace_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linspace_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linspace_tensor_overload_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linspace_tensor_overload_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linspace_tensor_overload_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linspace_tensor_overload_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linspace_tensor_overload_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linspace_tensor_overload_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linspace_tensor_overload_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linspace_tensor_overload_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linspace_tensor_overload_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linspace_tensor_overload_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linspace_tensor_overload_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log10_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_log10_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log10_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log10_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log10_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log10_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log10_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log10_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log10_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log10_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log10_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log10_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log1p_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log1p_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log1p_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log1p_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log1p_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log1p_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log1p_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log1p_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log1p_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log1p_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log1p_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log1p_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log2_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_log2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_normal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_normal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_normal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_normal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_softmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_softmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_softmax_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_softmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_softmax_with_dtype_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_softmax_with_dtype_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_softmax_with_dtype_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_softmax_with_dtype_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_softmax_with_dtype_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_softmax_with_dtype_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_softmax_with_dtype_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_softmax_with_dtype_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_softmax_with_dtype_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_softmax_with_dtype_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_softmax_with_dtype_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_softmax_with_dtype_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_softmax_with_dtype_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logaddexp2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logaddexp2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logaddexp2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logaddexp2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logaddexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logaddexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logaddexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logaddexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logcumsumexp_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_logcumsumexp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logcumsumexp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logcumsumexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logcumsumexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logcumsumexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logdet_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logdet_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logdet_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logdet_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_and_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_and_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_and_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_and_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_and_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_and_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_and_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_and_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_and_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_and_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_and_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_and_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_not_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_not_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_not_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_not_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_not_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_not_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_not_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_not_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_not_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_not_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_not_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_not_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_or_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_or_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_or_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_or_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_or_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_or_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_or_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_or_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_or_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_or_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_or_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_or_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_xor_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_xor_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_xor_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_xor_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_xor_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_xor_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_xor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_xor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_xor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_xor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_xor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_xor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logit_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logit_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logit_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logit_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logit_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logit_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logspace_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logspace_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logspace_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logspace_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logspace_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logspace_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logspace_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logspace_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logspace_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logspace_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logspace_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_logspace_tensor_overload_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logspace_tensor_overload_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logspace_tensor_overload_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logspace_tensor_overload_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logspace_tensor_overload_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logspace_tensor_overload_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logspace_tensor_overload_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logspace_tensor_overload_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logspace_tensor_overload_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logspace_tensor_overload_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logspace_tensor_overload_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logsumexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logsumexp_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logsumexp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logsumexp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logsumexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logsumexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logsumexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logsumexp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logsumexp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logsumexp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logsumexp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logsumexp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_long_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_long_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_long_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_long_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_long_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_long_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_long_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_long_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_long_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_long_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_long_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_long_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_long_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lu_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lu_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lu_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lu_solve_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_lu_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lu_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lu_unpack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lu_unpack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lu_unpack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lu_unpack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mH_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mH_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mH_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mH_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mH_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mH_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mH_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mH_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mH_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mH_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mH_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mH_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mH_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mT_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mT_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mT_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mT_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mT_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mT_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mT_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mT_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mT_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_mT_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mT_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mT_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mT_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_amax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_amax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_amax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_amax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_amax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_amax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_amax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_amax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_amax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_amin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_amin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_amin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_amin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_amin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_amin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_amin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_amin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_amin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_argmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_argmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_argmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_argmax_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_argmax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_argmax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_argmax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_argmax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_argmax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_argmin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_argmin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_argmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_argmin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_argmin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_argmin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_argmin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_argmin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_argmin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_cumprod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_cumprod_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_cumprod_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_cumprod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_cumprod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_cumprod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_cumprod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_cumprod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_cumprod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_cumprod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_cumprod_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_cumsum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_cumsum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_cumsum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_cumsum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_cumsum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_cumsum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_cumsum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_cumsum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_cumsum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_cumsum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_cumsum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_fill_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_fill_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_fill_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_fill_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_fill_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_fill_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_fill_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_fill_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_fill_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_fill_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_fill_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_fill_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_fill_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_log_softmax_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_log_softmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_log_softmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_log_softmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_logaddexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_logaddexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_logaddexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_logaddexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_logsumexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_logsumexp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_logsumexp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_logsumexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_logsumexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_logsumexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_logsumexp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_logsumexp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_logsumexp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_logsumexp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_logsumexp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_mean_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_mean_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_mean_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_median_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_median_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_median_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_median_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_normalize_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_normalize_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_normalize_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_normalize_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_normalize_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_normalize_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_prod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_prod_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_prod_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_prod_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_prod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_prod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_prod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_prod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_prod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_prod_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_prod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_scatter_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_scatter_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_select_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_select_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_select_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_select_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_select_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_select_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_select_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_select_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_select_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_select_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_select_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_select_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_softmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_softmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_softmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_softmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_softmin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_softmin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_softmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_softmin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_std_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_std_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_std_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_std_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_std_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_std_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_std_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_std_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_std_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_std_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_std_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_sum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_sum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_sum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_sum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_sum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_sum_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_sum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_sum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_sum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_sum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_sum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_sum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_var_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_var_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_var_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_var_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_var_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_var_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_var_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_var_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_var_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_var_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_var_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_matmul_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_matmul_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_matmul_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_matmul_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_matmul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_matmul_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_matrix_exp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_matrix_exp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_matrix_exp_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_matrix_exp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_matrix_exp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_matrix_exp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_binary_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_binary_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_binary_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_binary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_binary_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_binary_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_binary_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_binary_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_binary_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_binary_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_pool2d_with_indices_backward_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_pool2d_with_indices_backward_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_pool2d_with_indices_backward_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_pool2d_with_indices_backward_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_reduction_no_dim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_reduction_no_dim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_reduction_no_dim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_reduction_no_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_reduction_no_dim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_reduction_no_dim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_reduction_no_dim_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_reduction_no_dim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_reduction_no_dim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_reduction_no_dim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_reduction_with_dim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_reduction_with_dim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_reduction_with_dim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_reduction_with_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_reduction_with_dim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_reduction_with_dim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_reduction_with_dim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_reduction_with_dim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_reduction_with_dim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_reduction_with_dim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_maximum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_maximum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_maximum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_maximum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_maximum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_maximum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_maximum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_maximum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_maximum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_maximum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mean_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_mean_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mean_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_median_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_median_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_median_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_median_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_median_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_median_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_median_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_median_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_median_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_meshgrid_list_of_tensors_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_meshgrid_list_of_tensors_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_meshgrid_list_of_tensors_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_meshgrid_list_of_tensors_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_meshgrid_list_of_tensors_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_meshgrid_list_of_tensors_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_meshgrid_list_of_tensors_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_meshgrid_list_of_tensors_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_meshgrid_list_of_tensors_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_meshgrid_list_of_tensors_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_meshgrid_list_of_tensors_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_meshgrid_list_of_tensors_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_meshgrid_variadic_tensors_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_meshgrid_variadic_tensors_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_meshgrid_variadic_tensors_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_meshgrid_variadic_tensors_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_meshgrid_variadic_tensors_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_meshgrid_variadic_tensors_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_meshgrid_variadic_tensors_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_meshgrid_variadic_tensors_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_meshgrid_variadic_tensors_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_meshgrid_variadic_tensors_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_meshgrid_variadic_tensors_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_meshgrid_variadic_tensors_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_min_binary_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_min_binary_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_min_binary_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_min_binary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_min_binary_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_min_binary_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_min_binary_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_min_binary_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_min_binary_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_min_binary_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_min_reduction_no_dim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_min_reduction_no_dim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_min_reduction_no_dim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_min_reduction_no_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_min_reduction_no_dim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_min_reduction_no_dim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_min_reduction_no_dim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_min_reduction_no_dim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_min_reduction_no_dim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_min_reduction_no_dim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_min_reduction_with_dim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_min_reduction_with_dim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_min_reduction_with_dim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_min_reduction_with_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_min_reduction_with_dim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_min_reduction_with_dim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_min_reduction_with_dim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_min_reduction_with_dim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_min_reduction_with_dim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_min_reduction_with_dim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_minimum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_minimum_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_minimum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_minimum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_minimum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_minimum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_minimum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_minimum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_minimum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_minimum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mode_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mode_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mode_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mode_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mode_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mode_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mode_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mode_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mode_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mode_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_movedim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_movedim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_movedim_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_movedim_cuda_complex32, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_movedim_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_movedim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_movedim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_movedim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_movedim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_movedim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_movedim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_movedim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_movedim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_msort_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_msort_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_msort_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_msort_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_msort_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_msort_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_msort_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_msort_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_msort_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_msort_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mul_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mul_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mul_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mul_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mul_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mul_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mul_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mul_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_mul_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mul_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mul_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mul_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_multinomial_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_multinomial_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_multinomial_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_multinomial_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mv_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mv_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mvlgamma_mvlgamma_p_1_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mvlgamma_mvlgamma_p_1_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mvlgamma_mvlgamma_p_1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mvlgamma_mvlgamma_p_1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mvlgamma_mvlgamma_p_1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mvlgamma_mvlgamma_p_1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mvlgamma_mvlgamma_p_1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mvlgamma_mvlgamma_p_1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mvlgamma_mvlgamma_p_3_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mvlgamma_mvlgamma_p_3_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mvlgamma_mvlgamma_p_3_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mvlgamma_mvlgamma_p_3_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mvlgamma_mvlgamma_p_3_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mvlgamma_mvlgamma_p_3_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mvlgamma_mvlgamma_p_3_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mvlgamma_mvlgamma_p_3_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mvlgamma_mvlgamma_p_5_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mvlgamma_mvlgamma_p_5_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mvlgamma_mvlgamma_p_5_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mvlgamma_mvlgamma_p_5_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mvlgamma_mvlgamma_p_5_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mvlgamma_mvlgamma_p_5_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mvlgamma_mvlgamma_p_5_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mvlgamma_mvlgamma_p_5_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nan_to_num_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nan_to_num_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nan_to_num_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nan_to_num_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nan_to_num_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nan_to_num_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nan_to_num_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nan_to_num_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_nan_to_num_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nan_to_num_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nanmean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nanmean_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nanmean_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nanmean_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nanmean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nanmean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nanmean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nanmedian_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nanmedian_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nanmedian_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nanmedian_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nanmedian_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nanmedian_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nanmedian_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nanmedian_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nanmedian_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nanquantile_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nanquantile_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nansum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nansum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nansum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nansum_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nansum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nansum_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_nansum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nansum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nansum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nansum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nansum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nansum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nansum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_narrow_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_narrow_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_narrow_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_narrow_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_narrow_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_narrow_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_narrow_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_narrow_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_narrow_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_narrow_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_narrow_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_narrow_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_narrow_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_narrow_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_narrow_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_narrow_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_narrow_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_narrow_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_narrow_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_narrow_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_narrow_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_narrow_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_narrow_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_narrow_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_narrow_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_narrow_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_native_batch_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_native_batch_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_native_batch_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_native_batch_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_native_dropout_backward_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_native_dropout_backward_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_native_dropout_backward_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_native_dropout_backward_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_native_layer_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_native_layer_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_native_layer_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_native_layer_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ne_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ne_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ne_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ne_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ne_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ne_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_ne_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ne_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ne_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ne_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ne_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ne_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_neg_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_neg_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_neg_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_neg_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_neg_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_neg_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_neg_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_neg_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_neg_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_neg_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_neg_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_neg_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_empty_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_empty_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_empty_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_empty_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_empty_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_empty_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_empty_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_empty_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_empty_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_empty_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_empty_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_empty_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_empty_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_empty_strided_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_empty_strided_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_empty_strided_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_empty_strided_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_empty_strided_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_empty_strided_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_empty_strided_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_empty_strided_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_empty_strided_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_empty_strided_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_empty_strided_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_empty_strided_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_empty_strided_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_full_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_full_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_full_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_full_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_full_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_full_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_full_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_full_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_full_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_full_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_full_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_full_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_full_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_ones_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_ones_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_ones_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_ones_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_ones_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_ones_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_ones_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_ones_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_ones_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_ones_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_ones_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_ones_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_ones_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_zeros_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_zeros_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_zeros_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_zeros_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_zeros_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_zeros_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_zeros_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_zeros_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_zeros_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_zeros_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_zeros_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_zeros_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_zeros_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nextafter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nextafter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nextafter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nextafter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_adaptive_avg_pool1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_adaptive_avg_pool1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_adaptive_avg_pool1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_adaptive_avg_pool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_adaptive_avg_pool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_adaptive_avg_pool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_adaptive_avg_pool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_adaptive_avg_pool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_adaptive_avg_pool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_adaptive_max_pool1d_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_adaptive_max_pool1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_adaptive_max_pool1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_adaptive_max_pool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_adaptive_max_pool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_adaptive_max_pool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_adaptive_max_pool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_adaptive_max_pool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_adaptive_max_pool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_alpha_dropout_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_alpha_dropout_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_alpha_dropout_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_alpha_dropout_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_avg_pool1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_avg_pool1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_avg_pool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_avg_pool1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_avg_pool2d_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_avg_pool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_avg_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_avg_pool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_avg_pool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_avg_pool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_avg_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_avg_pool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_batch_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_batch_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_batch_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_batch_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_batch_norm_without_cudnn_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_batch_norm_without_cudnn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_batch_norm_without_cudnn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_batch_norm_without_cudnn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_bilinear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_bilinear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_bilinear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_bilinear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_binary_cross_entropy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_binary_cross_entropy_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_binary_cross_entropy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_binary_cross_entropy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_binary_cross_entropy_with_logits_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_binary_cross_entropy_with_logits_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_binary_cross_entropy_with_logits_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_celu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_celu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_celu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_celu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_channel_shuffle_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_channel_shuffle_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_channel_shuffle_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_channel_shuffle_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_channel_shuffle_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_channel_shuffle_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_channel_shuffle_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_channel_shuffle_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_channel_shuffle_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_channel_shuffle_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_channel_shuffle_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_channel_shuffle_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv1d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv1d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv1d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv2d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv2d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv2d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv3d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv3d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv3d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv3d_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv_transpose1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv_transpose1d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv_transpose1d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv_transpose1d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv_transpose1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv_transpose1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv_transpose1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv_transpose2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv_transpose2d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv_transpose2d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv_transpose2d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv_transpose2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv_transpose2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv_transpose2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv_transpose3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv_transpose3d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv_transpose3d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv_transpose3d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv_transpose3d_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv_transpose3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv_transpose3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_cosine_embedding_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_cosine_embedding_loss_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_cosine_embedding_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_cosine_embedding_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_cosine_embedding_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_cosine_embedding_loss_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_cosine_embedding_loss_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_cosine_embedding_loss_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_cosine_embedding_loss_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_cosine_embedding_loss_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_cosine_similarity_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_cosine_similarity_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_cosine_similarity_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_cosine_similarity_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_cross_entropy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_cross_entropy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_cross_entropy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_cross_entropy_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_ctc_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_ctc_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_dropout2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_dropout2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_dropout2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_dropout2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_dropout3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_dropout3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_dropout3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_dropout3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_dropout_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_dropout_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_dropout_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_dropout_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_elu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_elu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_elu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_elu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_embedding_bag_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_embedding_bag_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_embedding_bag_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_embedding_bag_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_embedding_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_embedding_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_embedding_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_embedding_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_feature_alpha_dropout_with_train_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_feature_alpha_dropout_with_train_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_feature_alpha_dropout_with_train_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_fractional_max_pool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_fractional_max_pool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_fractional_max_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_fractional_max_pool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_fractional_max_pool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_fractional_max_pool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_fractional_max_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_fractional_max_pool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_gaussian_nll_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_gaussian_nll_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_gaussian_nll_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_gaussian_nll_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_gelu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_gelu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_gelu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_gelu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_glu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_glu_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_glu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_glu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_grid_sample_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_grid_sample_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_grid_sample_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_grid_sample_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_group_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_group_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_group_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_group_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_hardshrink_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_hardshrink_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_hardshrink_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_hardshrink_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_hardsigmoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_hardsigmoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_hardsigmoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_hardsigmoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_hardswish_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_hardswish_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_hardswish_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_hardswish_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_hardtanh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_hardtanh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_hardtanh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_hardtanh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_hardtanh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_hardtanh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_hardtanh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_hardtanh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_hinge_embedding_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_hinge_embedding_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_hinge_embedding_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_hinge_embedding_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_huber_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_huber_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_huber_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_huber_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_instance_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_instance_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_instance_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_instance_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_interpolate_area_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_interpolate_area_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_interpolate_area_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_interpolate_area_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_interpolate_bicubic_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_interpolate_bicubic_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_interpolate_bicubic_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_interpolate_bicubic_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_interpolate_bilinear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_interpolate_bilinear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_interpolate_bilinear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_interpolate_bilinear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_interpolate_linear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_interpolate_linear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_interpolate_linear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_interpolate_linear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_interpolate_nearest-exact_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_interpolate_nearest-exact_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_interpolate_nearest-exact_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_interpolate_nearest-exact_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_interpolate_nearest_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_interpolate_nearest_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_interpolate_nearest_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_interpolate_nearest_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_interpolate_nearest_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_interpolate_trilinear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_interpolate_trilinear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_interpolate_trilinear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_interpolate_trilinear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_kl_div_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_kl_div_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_kl_div_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_kl_div_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_l1_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_l1_loss_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_l1_loss_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_l1_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_l1_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_l1_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_layer_norm_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_layer_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_layer_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_layer_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_leaky_relu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_leaky_relu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_leaky_relu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_leaky_relu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_linear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_linear_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_linear_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_linear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_linear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_linear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_local_response_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_local_response_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_local_response_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_local_response_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_logsigmoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_logsigmoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_logsigmoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_logsigmoid_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_margin_ranking_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_margin_ranking_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_margin_ranking_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_margin_ranking_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_margin_ranking_loss_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_margin_ranking_loss_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_margin_ranking_loss_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_margin_ranking_loss_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_margin_ranking_loss_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_pool1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_pool1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_pool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_pool1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_pool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_pool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_pool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_pool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_pool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_pool3d_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_unpool1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_unpool1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_unpool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_unpool1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_unpool1d_grad_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_unpool1d_grad_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_unpool1d_grad_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_unpool1d_grad_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_unpool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_unpool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_unpool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_unpool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_unpool2d_grad_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_unpool2d_grad_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_unpool2d_grad_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_unpool2d_grad_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_unpool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_unpool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_unpool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_unpool3d_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_unpool3d_grad_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_unpool3d_grad_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_unpool3d_grad_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_unpool3d_grad_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_mish_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_mish_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_mish_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_mish_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_mse_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_mse_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_mse_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_mse_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_multi_head_attention_forward_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_multi_head_attention_forward_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_multi_head_attention_forward_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_multi_head_attention_forward_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_multi_margin_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_multi_margin_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_multi_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_multi_margin_loss_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_multilabel_margin_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_multilabel_margin_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_multilabel_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_multilabel_margin_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_multilabel_soft_margin_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_multilabel_soft_margin_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_multilabel_soft_margin_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_nll_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_nll_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_nll_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_nll_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_normalize_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_normalize_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_normalize_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_normalize_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_normalize_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_normalize_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_one_hot_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_circular_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_circular_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_circular_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_circular_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_circular_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_circular_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_circular_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_circular_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_circular_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_circular_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_circular_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_circular_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_constant_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_constant_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_constant_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_constant_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_constant_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_constant_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_constant_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_constant_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_constant_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_constant_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_constant_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_constant_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_reflect_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_reflect_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_reflect_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_reflect_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_reflect_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_reflect_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_reflect_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_reflect_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_reflect_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_reflect_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_reflect_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_replicate_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_replicate_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_replicate_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_replicate_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_replicate_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_replicate_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_replicate_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_replicate_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_replicate_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_replicate_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_replicate_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_replicate_negative_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_replicate_negative_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_replicate_negative_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_replicate_negative_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_replicate_negative_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_replicate_negative_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_replicate_negative_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_replicate_negative_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_replicate_negative_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_replicate_negative_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_replicate_negative_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pairwise_distance_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pairwise_distance_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pairwise_distance_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pairwise_distance_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pairwise_distance_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pairwise_distance_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pairwise_distance_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pairwise_distance_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pairwise_distance_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pairwise_distance_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pairwise_distance_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pdist_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pdist_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pixel_shuffle_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pixel_shuffle_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pixel_shuffle_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pixel_shuffle_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pixel_shuffle_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pixel_shuffle_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pixel_shuffle_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pixel_shuffle_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pixel_shuffle_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pixel_shuffle_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pixel_shuffle_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pixel_shuffle_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pixel_unshuffle_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pixel_unshuffle_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pixel_unshuffle_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pixel_unshuffle_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pixel_unshuffle_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pixel_unshuffle_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pixel_unshuffle_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pixel_unshuffle_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pixel_unshuffle_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pixel_unshuffle_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pixel_unshuffle_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pixel_unshuffle_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_poisson_nll_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_poisson_nll_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_poisson_nll_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_poisson_nll_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_poisson_nll_loss_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_poisson_nll_loss_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_poisson_nll_loss_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_poisson_nll_loss_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_poisson_nll_loss_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_prelu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_prelu_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_prelu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_prelu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_relu6_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_relu6_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_relu6_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_relu6_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_relu6_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_relu6_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_relu6_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_relu6_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_relu6_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_relu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_relu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_relu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_relu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_relu_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_relu_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_relu_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_relu_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_relu_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_rms_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_rms_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_rms_norm_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_rms_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_rms_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_rms_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_rrelu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_rrelu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_rrelu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_rrelu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_scaled_dot_product_attention_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_scaled_dot_product_attention_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_scaled_dot_product_attention_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_selu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_selu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_selu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_selu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_silu_complex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_silu_complex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_silu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_silu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_silu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_silu_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_smooth_l1_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_smooth_l1_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_smooth_l1_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_smooth_l1_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_soft_margin_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_soft_margin_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_soft_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_soft_margin_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softmin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softmin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softmin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softmin_with_dtype_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softmin_with_dtype_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softmin_with_dtype_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softmin_with_dtype_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softmin_with_dtype_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softmin_with_dtype_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softmin_with_dtype_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softmin_with_dtype_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softmin_with_dtype_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softmin_with_dtype_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softmin_with_dtype_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softplus_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softplus_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softplus_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softplus_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softshrink_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softshrink_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softshrink_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softshrink_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softsign_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softsign_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softsign_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softsign_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softsign_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softsign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softsign_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softsign_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softsign_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softsign_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softsign_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softsign_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_tanhshrink_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_tanhshrink_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_tanhshrink_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_tanhshrink_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_tanhshrink_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_tanhshrink_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_tanhshrink_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_tanhshrink_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_tanhshrink_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_tanhshrink_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_tanhshrink_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_threshold_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_threshold_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_threshold_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_threshold_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_threshold_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_threshold_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_threshold_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_threshold_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_threshold_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_triplet_margin_loss_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_triplet_margin_loss_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_triplet_margin_loss_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_triplet_margin_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_triplet_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_triplet_margin_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_triplet_margin_loss_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_triplet_margin_loss_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_triplet_margin_loss_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_triplet_margin_loss_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_triplet_margin_loss_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_unfold_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_unfold_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_unfold_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_unfold_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_unfold_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_unfold_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_unfold_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_upsample_bilinear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_upsample_bilinear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_upsample_bilinear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_upsample_bilinear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_upsample_nearest_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_upsample_nearest_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_upsample_nearest_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_upsample_nearest_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_upsample_nearest_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nonzero_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nonzero_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_nonzero_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nonzero_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nonzero_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nonzero_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nonzero_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nonzero_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nonzero_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nonzero_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nonzero_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nonzero_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nonzero_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nonzero_static_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nonzero_static_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nonzero_static_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nonzero_static_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nonzero_static_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nonzero_static_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nonzero_static_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nonzero_static_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nonzero_static_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nonzero_static_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nonzero_static_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nonzero_static_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nonzero_static_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_norm_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_norm_fro_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_norm_fro_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_norm_fro_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_norm_fro_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_norm_fro_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_norm_fro_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_norm_inf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_norm_inf_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_norm_inf_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_norm_inf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_norm_inf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_norm_inf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_norm_nuc_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_norm_nuc_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_norm_nuc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_norm_nuc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_normal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_normal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_normal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_normal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_normal_in_place_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_normal_in_place_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_normal_in_place_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_normal_in_place_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_normal_in_place_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_normal_in_place_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_normal_number_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_normal_number_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_normal_number_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_normal_number_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ones_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ones_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ones_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ones_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ones_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ones_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ones_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ones_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ones_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ones_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ones_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ones_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ones_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ones_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ones_like_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ones_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ones_like_cuda_complex32, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_ones_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ones_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ones_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ones_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ones_like_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ones_like_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ones_like_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ones_like_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ones_like_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ormqr_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ormqr_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ormqr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ormqr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_outer_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_outer_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_outer_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_outer_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_outer_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_outer_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_outer_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_outer_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_outer_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_outer_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_outer_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_outer_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_pca_lowrank_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_pca_lowrank_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_pca_lowrank_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_pca_lowrank_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_permute_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_permute_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_permute_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_permute_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_permute_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_permute_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_permute_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_permute_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_permute_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_permute_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_permute_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_permute_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_permute_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_permute_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_permute_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_permute_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_permute_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_permute_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_permute_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_permute_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_permute_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_permute_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_permute_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_permute_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_permute_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_permute_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_pinverse_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_pinverse_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_pinverse_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_pinverse_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polar_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polar_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_0_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_0_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_1_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_1_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_1_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_3_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_3_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_3_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_3_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_3_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_3_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_3_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_3_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_3_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_3_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_4_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_4_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_4_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_4_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_4_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_4_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_4_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_4_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_4_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_4_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_positive_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_positive_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_positive_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_positive_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_positive_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_positive_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_positive_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_positive_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_positive_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_positive_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_positive_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_positive_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_pow_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_pow_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_pow_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_pow_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_pow_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_pow_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_pow_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_pow_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_pow_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_pow_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_pow_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_pow_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_prod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_prod_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_prod_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_prod_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_prod_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_prod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_prod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_prod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_prod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_prod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_prod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_prod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_put_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_put_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_put_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_put_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_put_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_put_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_put_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_put_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_put_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_put_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_put_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_put_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_qr_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_qr_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_qr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_qr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_quantile_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_quantile_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rad2deg_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rad2deg_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rad2deg_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rad2deg_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rad2deg_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rad2deg_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rad2deg_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rad2deg_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rad2deg_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rad2deg_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rand_like_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_rand_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rand_like_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rand_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rand_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rand_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rand_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_randint_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_randint_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_randint_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_randint_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_randint_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_randint_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_randint_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_randint_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_randint_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_randint_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_randint_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_randint_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_randint_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_randint_like_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_randint_like_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_randint_like_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_randint_like_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_randint_like_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_randn_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_randn_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_randn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_randn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_randn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_randn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_randn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_randn_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_randn_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_randn_like_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_randn_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_randn_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_randn_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_randn_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ravel_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ravel_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ravel_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ravel_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ravel_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ravel_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ravel_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ravel_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ravel_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ravel_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ravel_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ravel_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ravel_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_real_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_real_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_real_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_real_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_real_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_real_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_real_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_real_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_real_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_real_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_real_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_real_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_real_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reciprocal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reciprocal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reciprocal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reciprocal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reciprocal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reciprocal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reciprocal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reciprocal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reciprocal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reciprocal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reciprocal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reciprocal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_remainder_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_remainder_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_remainder_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_remainder_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_remainder_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_remainder_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_remainder_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_remainder_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_remainder_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_renorm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_renorm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_renorm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_renorm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_renorm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_renorm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_repeat_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_repeat_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_repeat_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_repeat_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_repeat_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_repeat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_repeat_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_repeat_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_repeat_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_repeat_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_repeat_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_repeat_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_repeat_interleave_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_repeat_interleave_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_repeat_interleave_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_repeat_interleave_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_repeat_interleave_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_repeat_interleave_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_repeat_interleave_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_repeat_interleave_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_repeat_interleave_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_repeat_interleave_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_repeat_interleave_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_repeat_interleave_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_repeat_interleave_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reshape_as_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reshape_as_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reshape_as_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reshape_as_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reshape_as_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reshape_as_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reshape_as_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reshape_as_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reshape_as_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reshape_as_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reshape_as_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reshape_as_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reshape_as_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reshape_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reshape_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_reshape_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reshape_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reshape_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reshape_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reshape_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reshape_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reshape_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reshape_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reshape_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reshape_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reshape_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resize__cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resize__cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resize__cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resize__cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resize__cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resize__cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resize__cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resize__cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resize__cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resize__cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resize__cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resize__cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resize_as__cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resize_as__cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resize_as__cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_resize_as__cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resize_as__cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resize_as__cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resize_as__cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resize_as__cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resize_as__cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resize_as__cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resize_as__cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resize_as__cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resolve_conj_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resolve_conj_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resolve_conj_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resolve_conj_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resolve_conj_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resolve_conj_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resolve_conj_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resolve_conj_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resolve_conj_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resolve_conj_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resolve_conj_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resolve_conj_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resolve_neg_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resolve_neg_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resolve_neg_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resolve_neg_cuda_complex32, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_resolve_neg_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resolve_neg_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resolve_neg_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resolve_neg_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resolve_neg_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resolve_neg_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resolve_neg_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resolve_neg_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resolve_neg_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_roll_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_roll_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_roll_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_roll_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_roll_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_roll_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_roll_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_roll_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_roll_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_roll_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_roll_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_roll_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_roll_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rot90_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rot90_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rot90_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rot90_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rot90_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_rot90_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rot90_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rot90_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rot90_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rot90_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rot90_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rot90_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_round_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_round_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_round_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_round_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_round_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_round_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_round_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_round_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_round_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_round_decimals_0_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_round_decimals_0_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_round_decimals_0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_round_decimals_0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_round_decimals_3_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_round_decimals_3_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_round_decimals_3_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_round_decimals_3_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_round_decimals_neg_3_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_round_decimals_neg_3_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_round_decimals_neg_3_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_round_decimals_neg_3_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rsqrt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rsqrt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rsqrt_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rsqrt_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rsqrt_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rsqrt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rsqrt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rsqrt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rsqrt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rsqrt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rsqrt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rsqrt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rsqrt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rsub_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rsub_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rsub_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rsub_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rsub_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rsub_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rsub_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rsub_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rsub_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rsub_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rsub_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scalar_tensor_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_scalar_tensor_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scalar_tensor_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scalar_tensor_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scalar_tensor_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scalar_tensor_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scalar_tensor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scalar_tensor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scalar_tensor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scalar_tensor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scalar_tensor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scalar_tensor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scalar_tensor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_add_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_add_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_add_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_add_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_add_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_add_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_add_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_add_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_add_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_add_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_add_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_amax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_amax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_amax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_amax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_amax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_amax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_amax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_amax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_amax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_amin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_amin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_amin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_amin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_amin_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_amin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_amin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_amin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_amin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_mean_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_mean_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_mean_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_mean_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_mean_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_prod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_prod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_prod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_prod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_prod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_prod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_prod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_prod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_sum_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_sum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_sum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_sum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_sum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_sum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_sum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_sum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_sum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_sum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_searchsorted_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_searchsorted_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_searchsorted_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_searchsorted_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_searchsorted_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_searchsorted_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_searchsorted_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_searchsorted_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_searchsorted_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_select_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_select_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_select_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_select_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_select_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_select_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_select_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_select_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_select_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_select_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_select_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_select_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_select_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_select_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_select_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_select_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_select_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_select_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_select_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_select_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_select_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_select_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_select_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sgn_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sgn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sgn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sgn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sgn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sgn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sgn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sgn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sgn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sgn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sgn_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_sgn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sgn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_short_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_short_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_short_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_short_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_short_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_short_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_short_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_short_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_short_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_short_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_short_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_short_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sigmoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sigmoid_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sigmoid_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sigmoid_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sigmoid_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sigmoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sigmoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sigmoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sigmoid_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sigmoid_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sigmoid_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sigmoid_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sigmoid_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_sign_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sign_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sign_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sign_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sign_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sign_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sign_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sign_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sign_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_signal_windows_bartlett_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_signal_windows_bartlett_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_signal_windows_blackman_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_signal_windows_blackman_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_signal_windows_cosine_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_signal_windows_cosine_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_signal_windows_exponential_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_signal_windows_exponential_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_signal_windows_gaussian_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_signal_windows_gaussian_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_signal_windows_general_cosine_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_signal_windows_general_cosine_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_signal_windows_general_hamming_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_signal_windows_general_hamming_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_signal_windows_hamming_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_signal_windows_hamming_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_signal_windows_hann_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_signal_windows_hann_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_signal_windows_kaiser_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_signal_windows_kaiser_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_signal_windows_nuttall_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_signal_windows_nuttall_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_signbit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_signbit_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_signbit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_signbit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_signbit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_signbit_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_signbit_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_signbit_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_signbit_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_signbit_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sin_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sin_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sin_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sin_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_sin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sinc_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sinc_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sinc_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sinc_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sinc_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sinc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sinc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sinc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sinc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sinc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sinc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sinc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sinh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sinh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sinh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sinh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sinh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sinh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sinh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sinh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sinh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sinh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sinh_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_sinh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sinh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_slice_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_slice_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_slice_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_slice_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_slice_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_slice_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_slice_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_slice_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_slice_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_slice_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_slice_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_slice_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_slice_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_slice_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_slice_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_slice_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_slice_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_slice_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_slice_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_slice_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_slice_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_slice_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_slice_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_softmax_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_softmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_softmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_softmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_softmax_with_dtype_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_softmax_with_dtype_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_softmax_with_dtype_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_softmax_with_dtype_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_softmax_with_dtype_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_softmax_with_dtype_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_softmax_with_dtype_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_softmax_with_dtype_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_softmax_with_dtype_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_softmax_with_dtype_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_softmax_with_dtype_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_softmax_with_dtype_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sort_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sort_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sort_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sort_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sort_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sort_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sort_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sort_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sort_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sort_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_sparse_mm_reduce_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sparse_mm_reduce_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sparse_mm_reduce_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sparse_mm_reduce_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sparse_sampled_addmm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sparse_sampled_addmm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sparse_sampled_addmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sparse_sampled_addmm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_airy_ai_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_airy_ai_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_airy_ai_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_airy_ai_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_airy_ai_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_airy_ai_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_airy_ai_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_airy_ai_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_bessel_j0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_bessel_j0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_bessel_j0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_bessel_j0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_bessel_j0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_bessel_j0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_bessel_j0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_bessel_j0_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_bessel_j1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_bessel_j1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_bessel_j1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_bessel_j1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_bessel_j1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_bessel_j1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_bessel_j1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_bessel_j1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_bessel_y0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_bessel_y0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_bessel_y0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_bessel_y0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_bessel_y0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_bessel_y0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_bessel_y0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_bessel_y0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_bessel_y1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_bessel_y1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_bessel_y1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_bessel_y1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_bessel_y1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_bessel_y1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_bessel_y1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_bessel_y1_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_chebyshev_polynomial_t_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_chebyshev_polynomial_t_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_chebyshev_polynomial_t_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_chebyshev_polynomial_t_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_chebyshev_polynomial_t_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_chebyshev_polynomial_t_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_chebyshev_polynomial_t_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_chebyshev_polynomial_t_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_chebyshev_polynomial_u_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_chebyshev_polynomial_u_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_chebyshev_polynomial_u_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_chebyshev_polynomial_u_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_chebyshev_polynomial_u_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_chebyshev_polynomial_u_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_chebyshev_polynomial_u_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_chebyshev_polynomial_u_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_chebyshev_polynomial_v_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_chebyshev_polynomial_v_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_chebyshev_polynomial_v_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_chebyshev_polynomial_v_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_chebyshev_polynomial_v_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_chebyshev_polynomial_v_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_chebyshev_polynomial_v_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_chebyshev_polynomial_v_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_chebyshev_polynomial_w_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_chebyshev_polynomial_w_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_chebyshev_polynomial_w_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_chebyshev_polynomial_w_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_chebyshev_polynomial_w_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_chebyshev_polynomial_w_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_chebyshev_polynomial_w_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_chebyshev_polynomial_w_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_entr_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_entr_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_entr_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_entr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_entr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_entr_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_entr_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_entr_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_entr_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_entr_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_erfcx_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_erfcx_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_erfcx_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_erfcx_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_erfcx_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_erfcx_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_erfcx_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_erfcx_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_hermite_polynomial_h_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_hermite_polynomial_h_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_hermite_polynomial_h_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_hermite_polynomial_h_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_hermite_polynomial_h_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_hermite_polynomial_h_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_hermite_polynomial_h_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_hermite_polynomial_h_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_hermite_polynomial_he_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_hermite_polynomial_he_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_hermite_polynomial_he_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_hermite_polynomial_he_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_hermite_polynomial_he_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_hermite_polynomial_he_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_hermite_polynomial_he_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_hermite_polynomial_he_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_i0e_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_i0e_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_i0e_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_i0e_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_i0e_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_i0e_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_i0e_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_i0e_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_i0e_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_i0e_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_i1_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_i1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_i1_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_i1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_i1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_i1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_i1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_i1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_i1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_i1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_i1e_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_i1e_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_i1e_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_i1e_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_i1e_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_i1e_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_i1e_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_i1e_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_i1e_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_i1e_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_laguerre_polynomial_l_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_laguerre_polynomial_l_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_laguerre_polynomial_l_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_laguerre_polynomial_l_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_laguerre_polynomial_l_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_laguerre_polynomial_l_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_laguerre_polynomial_l_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_laguerre_polynomial_l_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_legendre_polynomial_p_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_legendre_polynomial_p_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_legendre_polynomial_p_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_legendre_polynomial_p_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_legendre_polynomial_p_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_legendre_polynomial_p_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_legendre_polynomial_p_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_legendre_polynomial_p_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_log_ndtr_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_log_ndtr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_log_ndtr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_log_ndtr_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_log_ndtr_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_log_ndtr_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_log_ndtr_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_log_ndtr_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_modified_bessel_i0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_modified_bessel_i0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_modified_bessel_i0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_modified_bessel_i0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_modified_bessel_i0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_modified_bessel_i0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_modified_bessel_i0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_modified_bessel_i0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_modified_bessel_i1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_modified_bessel_i1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_modified_bessel_i1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_modified_bessel_i1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_modified_bessel_i1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_modified_bessel_i1_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_modified_bessel_i1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_modified_bessel_i1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_modified_bessel_k0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_modified_bessel_k0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_modified_bessel_k0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_modified_bessel_k0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_modified_bessel_k0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_modified_bessel_k0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_modified_bessel_k0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_modified_bessel_k0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_modified_bessel_k1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_modified_bessel_k1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_modified_bessel_k1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_modified_bessel_k1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_modified_bessel_k1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_modified_bessel_k1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_modified_bessel_k1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_modified_bessel_k1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_ndtr_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_ndtr_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_ndtr_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_ndtr_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_ndtr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_ndtr_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_ndtr_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_ndtr_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_ndtr_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_ndtr_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_ndtri_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_ndtri_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_ndtri_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_ndtri_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_ndtri_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_ndtri_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_ndtri_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_ndtri_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_polygamma_special_polygamma_n_0_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_polygamma_special_polygamma_n_0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_polygamma_special_polygamma_n_0_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_polygamma_special_polygamma_n_0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_polygamma_special_polygamma_n_0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_polygamma_special_polygamma_n_0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_polygamma_special_polygamma_n_0_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_polygamma_special_polygamma_n_0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_polygamma_special_polygamma_n_0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_scaled_modified_bessel_k0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_scaled_modified_bessel_k0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_scaled_modified_bessel_k0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_scaled_modified_bessel_k0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_scaled_modified_bessel_k0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_scaled_modified_bessel_k0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_scaled_modified_bessel_k0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_scaled_modified_bessel_k0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_scaled_modified_bessel_k1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_scaled_modified_bessel_k1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_scaled_modified_bessel_k1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_scaled_modified_bessel_k1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_scaled_modified_bessel_k1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_scaled_modified_bessel_k1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_scaled_modified_bessel_k1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_scaled_modified_bessel_k1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_shifted_chebyshev_polynomial_t_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_shifted_chebyshev_polynomial_t_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_shifted_chebyshev_polynomial_t_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_shifted_chebyshev_polynomial_t_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_shifted_chebyshev_polynomial_t_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_shifted_chebyshev_polynomial_t_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_shifted_chebyshev_polynomial_t_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_shifted_chebyshev_polynomial_t_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_shifted_chebyshev_polynomial_u_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_shifted_chebyshev_polynomial_u_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_shifted_chebyshev_polynomial_u_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_shifted_chebyshev_polynomial_u_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_shifted_chebyshev_polynomial_u_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_shifted_chebyshev_polynomial_u_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_shifted_chebyshev_polynomial_u_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_shifted_chebyshev_polynomial_v_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_shifted_chebyshev_polynomial_v_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_shifted_chebyshev_polynomial_v_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_shifted_chebyshev_polynomial_v_cuda_int32, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_shifted_chebyshev_polynomial_v_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_shifted_chebyshev_polynomial_v_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_shifted_chebyshev_polynomial_v_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_shifted_chebyshev_polynomial_w_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_shifted_chebyshev_polynomial_w_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_shifted_chebyshev_polynomial_w_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_shifted_chebyshev_polynomial_w_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_shifted_chebyshev_polynomial_w_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_shifted_chebyshev_polynomial_w_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_shifted_chebyshev_polynomial_w_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_spherical_bessel_j0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_spherical_bessel_j0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_spherical_bessel_j0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_spherical_bessel_j0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_spherical_bessel_j0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_spherical_bessel_j0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_spherical_bessel_j0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_spherical_bessel_j0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_xlog1py_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_xlog1py_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_xlog1py_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_xlog1py_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_xlog1py_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_xlog1py_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_xlog1py_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_xlog1py_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_xlog1py_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_xlog1py_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_zeta_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_zeta_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_zeta_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_zeta_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_zeta_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_zeta_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_zeta_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_zeta_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_cuda_int16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_list_args_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_list_args_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_list_args_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_list_args_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_list_args_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_list_args_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_list_args_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_list_args_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_list_args_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_list_args_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_list_args_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_list_args_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_with_sizes_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_with_sizes_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_with_sizes_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_with_sizes_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_with_sizes_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_with_sizes_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_with_sizes_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_with_sizes_copy_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_with_sizes_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_with_sizes_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_with_sizes_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_with_sizes_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_with_sizes_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_with_sizes_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_with_sizes_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_with_sizes_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_with_sizes_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_with_sizes_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_with_sizes_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_with_sizes_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_with_sizes_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_with_sizes_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_with_sizes_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_with_sizes_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_with_sizes_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_with_sizes_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sqrt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sqrt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sqrt_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sqrt_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sqrt_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sqrt_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_sqrt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sqrt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sqrt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sqrt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sqrt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sqrt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sqrt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_square_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_square_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_square_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_square_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_square_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_square_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_square_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_square_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_square_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_square_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_square_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_square_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_copy_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_multiple_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_multiple_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_multiple_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_multiple_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_multiple_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_multiple_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_multiple_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_multiple_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_multiple_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_multiple_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_multiple_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_multiple_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_multiple_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_stack_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_stack_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_stack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_stack_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_stack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_stack_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_stack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_stack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_stack_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_stack_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_stack_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_stack_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_stack_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_std_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_std_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_std_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_std_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_std_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_std_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_std_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_std_mean_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_std_mean_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_std_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_std_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_std_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_std_mean_unbiased_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_std_mean_unbiased_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_std_mean_unbiased_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_std_mean_unbiased_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_std_mean_unbiased_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_std_mean_unbiased_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_std_unbiased_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_std_unbiased_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_std_unbiased_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_std_unbiased_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_std_unbiased_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_std_unbiased_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_stft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_stft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_stft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_stft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sub_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sub_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sub_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sub_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sub_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_sub_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sub_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sub_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sub_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sub_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sub_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sub_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sum_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sum_to_size_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sum_to_size_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sum_to_size_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sum_to_size_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sum_to_size_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sum_to_size_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sum_to_size_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_sum_to_size_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sum_to_size_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sum_to_size_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sum_to_size_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sum_to_size_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_svd_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_svd_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_svd_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_svd_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_svd_lowrank_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_svd_lowrank_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_svd_lowrank_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_svd_lowrank_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_t_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_t_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_t_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_t_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_t_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_t_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_t_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_t_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_t_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_t_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_t_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_t_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_t_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_t_cuda_bool, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_t_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_t_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_t_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_t_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_t_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_t_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_t_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_t_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_t_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_t_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_take_along_dim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_take_along_dim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_take_along_dim_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_take_along_dim_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_take_along_dim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_take_along_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_take_along_dim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_take_along_dim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_take_along_dim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_take_along_dim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_take_along_dim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_take_along_dim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_take_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_take_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_take_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_take_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_take_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_take_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_take_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_take_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_take_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_take_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_take_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_take_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tan_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tan_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tan_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tan_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tan_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tan_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tan_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tan_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tan_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tan_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tan_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tan_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tanh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tanh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tanh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tanh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tanh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tanh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tanh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tanh_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_tanh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tanh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tanh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tanh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tanh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tensor_split_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tensor_split_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tensor_split_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tensor_split_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tensor_split_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tensor_split_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tensor_split_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tensor_split_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tensor_split_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tensor_split_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tensor_split_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tensor_split_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tensordot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tensordot_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tensordot_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tensordot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tensordot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tensordot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tile_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tile_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tile_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_tile_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tile_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tile_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tile_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tile_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tile_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tile_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tile_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tile_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_to_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_to_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_to_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_to_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_to_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_to_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_to_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_to_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_to_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_to_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_to_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_to_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_to_sparse_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_to_sparse_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_to_sparse_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_to_sparse_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_to_sparse_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_to_sparse_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_to_sparse_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_to_sparse_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_to_sparse_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_to_sparse_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_to_sparse_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_to_sparse_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_topk_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_topk_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_topk_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_topk_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_topk_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_topk_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_topk_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_topk_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_topk_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_torch__scaled_mm_cuda_float8_e4m3fn, test/test_meta.py::TestMetaCUDA::test_meta_outplace_torch_ops_aten__efficient_attention_forward_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_torch_ops_aten__efficient_attention_forward_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_torch_ops_aten__efficient_attention_forward_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_torch_ops_aten__flash_attention_forward_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_torch_ops_aten__flash_attention_forward_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_torch_ops_aten__safe_softmax_default_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_torch_ops_aten__safe_softmax_default_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_torch_ops_aten__safe_softmax_default_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_torch_ops_aten__safe_softmax_default_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_torch_ops_aten__safe_softmax_default_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_torch_ops_aten__safe_softmax_default_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_torch_ops_aten__safe_softmax_default_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_torch_ops_aten__safe_softmax_default_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_torch_ops_aten__safe_softmax_default_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trace_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trace_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trace_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trace_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trace_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trace_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trace_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trace_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trace_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trace_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trace_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trace_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trace_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_transpose_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_transpose_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_transpose_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_transpose_copy_cuda_complex32, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_transpose_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_transpose_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_transpose_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_transpose_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_transpose_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_transpose_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_transpose_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_transpose_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_transpose_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_transpose_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_transpose_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_transpose_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_transpose_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_transpose_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_transpose_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_transpose_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_transpose_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_transpose_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_transpose_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_transpose_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_transpose_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_transpose_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trapezoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trapezoid_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trapezoid_cuda_complex64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_trapezoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trapezoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trapezoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trapezoid_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trapezoid_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trapezoid_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trapezoid_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trapezoid_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trapz_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trapz_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trapz_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trapz_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trapz_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trapz_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trapz_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trapz_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trapz_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trapz_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trapz_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_triangular_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_triangular_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_triangular_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_triangular_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tril_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tril_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tril_cuda_complex128, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_tril_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tril_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tril_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tril_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tril_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tril_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tril_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tril_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tril_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tril_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tril_indices_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tril_indices_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_triu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_triu_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_triu_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_triu_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_triu_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_triu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_triu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_triu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_triu_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_triu_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_triu_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_triu_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_triu_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_triu_indices_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_triu_indices_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_true_divide_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_true_divide_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_true_divide_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_true_divide_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_true_divide_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_true_divide_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_true_divide_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_true_divide_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_true_divide_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_true_divide_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_true_divide_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_true_divide_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_true_divide_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trunc_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trunc_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trunc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trunc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trunc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trunc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trunc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trunc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trunc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unbind_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unbind_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unbind_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unbind_copy_cuda_complex32, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_unbind_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unbind_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unbind_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unbind_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unbind_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unbind_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unbind_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unbind_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unbind_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unbind_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unbind_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unbind_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unbind_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unbind_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unbind_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unbind_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unbind_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unbind_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unbind_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unbind_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unbind_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unbind_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unflatten_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unflatten_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unflatten_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unflatten_cuda_complex32, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_unflatten_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unflatten_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unflatten_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unflatten_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unflatten_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unflatten_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unflatten_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unflatten_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unflatten_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unfold_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unfold_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unfold_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unfold_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unfold_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unfold_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unfold_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unfold_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unfold_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unfold_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unfold_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unfold_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unfold_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unfold_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unfold_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unfold_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unfold_cuda_complex32, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_unfold_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unfold_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unfold_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unfold_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unfold_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unfold_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unfold_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unfold_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unfold_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_uniform_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_uniform_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_uniform_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_uniform_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_uniform_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_uniform_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unique_consecutive_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unique_consecutive_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unique_consecutive_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unique_consecutive_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unique_consecutive_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unique_consecutive_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unique_consecutive_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unique_consecutive_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unique_consecutive_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unique_consecutive_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_unique_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unique_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unique_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unique_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unique_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unique_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unique_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unique_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unique_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unique_cuda_uint16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unique_cuda_uint32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unique_cuda_uint64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unique_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unravel_index_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unravel_index_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unravel_index_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unravel_index_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unravel_index_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsafe_chunk_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsafe_chunk_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsafe_chunk_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsafe_chunk_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsafe_chunk_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsafe_chunk_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsafe_chunk_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsafe_chunk_cuda_float64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsafe_chunk_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsafe_chunk_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsafe_chunk_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsafe_chunk_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsafe_chunk_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsafe_split_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsafe_split_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsafe_split_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsafe_split_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsafe_split_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsafe_split_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsafe_split_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsafe_split_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsafe_split_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsafe_split_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsafe_split_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsafe_split_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsafe_split_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsqueeze_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsqueeze_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsqueeze_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsqueeze_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsqueeze_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsqueeze_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsqueeze_copy_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsqueeze_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsqueeze_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsqueeze_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsqueeze_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsqueeze_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsqueeze_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsqueeze_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsqueeze_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsqueeze_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsqueeze_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsqueeze_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsqueeze_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsqueeze_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsqueeze_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsqueeze_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsqueeze_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsqueeze_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsqueeze_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsqueeze_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_var_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_var_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_var_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_var_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_var_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_var_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_var_mean_cuda_bfloat16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_var_mean_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_var_mean_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_var_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_var_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_var_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_var_mean_unbiased_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_var_mean_unbiased_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_var_mean_unbiased_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_var_mean_unbiased_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_var_mean_unbiased_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_var_mean_unbiased_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_var_unbiased_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_var_unbiased_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_var_unbiased_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_var_unbiased_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_var_unbiased_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_var_unbiased_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_vdot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_vdot_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_vdot_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_vdot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_vdot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_vdot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_as_complex_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_as_complex_cuda_float32, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_as_complex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_as_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_as_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_as_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_as_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_as_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_as_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_as_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_as_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_as_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_as_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_as_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_as_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_as_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_as_real_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_as_real_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_copy_cuda_int64, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_vsplit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_vsplit_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_vsplit_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_vsplit_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_vsplit_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_vsplit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_vsplit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_vsplit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_vsplit_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_vsplit_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_vsplit_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_vsplit_cuda_int8, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_vsplit_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_vstack_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_vstack_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_vstack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_vstack_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_vstack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_vstack_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_vstack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_vstack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_vstack_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_vstack_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_vstack_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_vstack_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_vstack_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_where_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_where_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_where_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_where_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_where_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_where_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_where_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_where_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_where_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_where_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_where_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_where_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_where_cuda_uint8, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_xlogy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_xlogy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_xlogy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_xlogy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_xlogy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_xlogy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_xlogy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_xlogy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_xlogy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_xlogy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zero__cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zero__cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zero__cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zero__cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zero__cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zero__cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zero__cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zero__cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zero__cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zero__cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zero__cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zero__cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zeros_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zeros_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zeros_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zeros_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zeros_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zeros_cuda_float16, 
test/test_meta.py::TestMetaCUDA::test_meta_outplace_zeros_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zeros_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zeros_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zeros_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zeros_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zeros_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zeros_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zeros_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zeros_like_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zeros_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zeros_like_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zeros_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zeros_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zeros_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zeros_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zeros_like_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zeros_like_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zeros_like_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zeros_like_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zeros_like_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_mixed_dtype_for_native_layer_norm_backward_float16_bias_dtype2_cuda, test/test_meta.py::TestMetaCUDA::test_mixed_dtype_for_native_layer_norm_backward_float16_float16_cuda, test/test_meta.py::TestMetaCUDA::test_mixed_dtype_for_native_layer_norm_backward_float16_float32_cuda, test/test_meta.py::TestMetaCUDA::test_mixed_dtype_for_native_layer_norm_backward_float32_bias_dtype2_cuda, 
test/test_meta.py::TestMetaCUDA::test_mixed_dtype_for_native_layer_norm_backward_float32_float16_cuda, test/test_meta.py::TestMetaCUDA::test_mixed_dtype_for_native_layer_norm_backward_float32_float32_cuda, test/test_meta.py::TestMetaCUDA::test_nan_to_num_cuda, test/test_meta.py::TestMetaCUDA::test_nonzero_cuda, test/test_meta.py::TestMetaCUDA::test_quantized_embedding_bag_cuda, test/test_meta.py::TestMetaCUDA::test_segment_reduce_backward_cuda, test/test_meta.py::TestMetaCUDA::test_stride_for_index_Tensor_cuda, test/test_meta.py::TestMetaCUDA::test_triangular_solve_out_cuda 2025-10-10T02:09:11.1545649Z 2025-10-10T02:09:12.4758494Z Running test_decomp 7/16 ... [2025-10-10 02:09:12.475227] 2025-10-10T02:09:12.4759043Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:09:12.4760728Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_decomp.py', '-m', 'not serial', '--shard-id=7', '--num-shards=16', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:09:12.475642] 2025-10-10T02:10:18.7518936Z 2025-10-10T02:10:18.7519854Z test_decomp 6/16 was successful, full logs can be found in artifacts with path test/test-reports/test_decomp_6.16_b6aa06fa43228dc4_.log 2025-10-10T02:10:18.7670634Z Running 544 items in this shard: test/test_decomp.py::TestDecompCUDA::test_comprehensive_T_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_T_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_T_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive___getitem___cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive___radd___cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive___radd___cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive___radd___cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rmod___cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rmul___cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rsub___cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rxor___cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive__chunk_cat_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive__upsample_bilinear2d_aa_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_abs_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_abs_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_acos_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addcdiv_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addcmul_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addcmul_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addmv_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addmv_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_allclose_cuda_float16, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_aminmax_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_angle_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_any_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_arange_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_arange_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argwhere_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_partial_views_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atan2_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atan2_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atleast_2d_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_baddbmm_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bernoulli_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bfloat16_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bitwise_and_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bitwise_or_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bitwise_or_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bool_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bool_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bool_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bool_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_broadcast_to_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_byte_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_byte_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cat_cuda_float16, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_ceil_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ceil_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cfloat_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cfloat_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_chalf_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_chalf_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_char_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_chunk_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_chunk_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_chunk_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clamp_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clamp_max_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clamp_max_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_column_stack_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_column_stack_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_conj_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_conj_physical_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_constant_pad_nd_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_constant_pad_nd_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_constant_pad_nd_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_contiguous_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_corrcoef_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cos_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cos_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cos_cuda_int32, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_cosh_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cummax_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cummin_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cummin_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cumprod_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cumsum_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_deg2rad_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diag_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diag_embed_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagflat_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagonal_copy_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagonal_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagonal_scatter_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagonal_scatter_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diff_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_dist_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_div_trunc_rounding_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_dsplit_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_dstack_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_permuted_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_strided_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_eq_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_erfinv_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expand_cuda_bfloat16, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_eye_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_eye_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fft_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fftn_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fftshift_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fftshift_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfft2_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfft2_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfft2_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfft2_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfftn_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfftn_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifft2_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifft_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifftn_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifftshift_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_flipud_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_float_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_float_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_float_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_float_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_floor_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_floor_divide_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fmin_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fmod_cuda_float16, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_fmod_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_full_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_full_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_full_like_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gather_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gather_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gather_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_geometric_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_grid_sampler_3d_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_half_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_add_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_add_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_put_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_amax_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_amin_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_inner_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isclose_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isin_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isin_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isinf_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isinf_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isreal_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isreal_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_binary_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_binary_cuda_float64, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_unary_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_unary_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lgamma_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lgamma_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_cholesky_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_cond_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_cond_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_diagonal_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_eig_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_inv_ex_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_lu_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_lu_solve_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_matrix_norm_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_matrix_norm_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_matrix_power_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_qr_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_svd_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linspace_tensor_overload_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log1p_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log1p_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_not_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_xor_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logit_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logsumexp_cuda_uint8, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_long_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lu_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lu_solve_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mT_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mT_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_argmax_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_argmin_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_cumprod_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_cumsum_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_cumsum_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_cumsum_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_fill_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_mean_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_norm_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_select_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_std_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_sum_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_sum_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_var_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_maximum_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mean_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_meshgrid_list_of_tensors_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_meshgrid_variadic_tensors_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_min_reduction_no_dim_cuda_bool, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_min_reduction_no_dim_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mm_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_movedim_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_multinomial_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mvlgamma_mvlgamma_p_1_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nanmedian_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ne_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_neg_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_empty_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_empty_strided_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_ones_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_adaptive_avg_pool2d_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_adaptive_avg_pool3d_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_avg_pool1d_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_avg_pool3d_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_binary_cross_entropy_with_logits_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv_transpose1d_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_cosine_embedding_loss_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_cosine_embedding_loss_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_dropout3d_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_elu_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_elu_cuda_float64, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_feature_alpha_dropout_without_train_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_fractional_max_pool3d_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_glu_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_group_norm_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_hardtanh_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_interpolate_nearest-exact_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_kl_div_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_l1_loss_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_unpool2d_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_unpool2d_grad_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_constant_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_reflect_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_reflect_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pixel_unshuffle_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pixel_unshuffle_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_smooth_l1_loss_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softmin_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softmin_with_dtype_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_triplet_margin_with_distance_loss_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_unfold_cuda_float64, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_nonzero_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ones_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ones_like_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ones_like_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ormqr_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_pca_lowrank_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_positive_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_pow_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_prod_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_prod_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_put_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_put_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rand_like_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_randint_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ravel_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_remainder_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_renorm_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_reshape_as_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_reshape_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resolve_neg_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rot90_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_round_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rsub_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rsub_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scalar_tensor_cuda_complex128, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_scalar_tensor_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scalar_tensor_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_amax_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_amin_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_mean_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_prod_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_prod_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_sum_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_searchsorted_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_select_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_select_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_select_scatter_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sgn_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_signbit_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_signbit_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sin_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sinc_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sinh_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sinh_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_slice_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_slice_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sort_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sparse_sampled_addmm_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_j0_cuda_bool, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_j1_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_j1_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_chebyshev_polynomial_t_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_chebyshev_polynomial_u_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_chebyshev_polynomial_w_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_erfcx_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_hermite_polynomial_he_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_i1e_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_legendre_polynomial_p_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_ndtr_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_polygamma_special_polygamma_n_0_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_u_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_w_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_spherical_bessel_j0_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_spherical_bessel_j0_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_spherical_bessel_j0_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_zeta_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_with_sizes_copy_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_with_sizes_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sqrt_cuda_bfloat16, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_sqrt_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sqrt_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sqrt_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_squeeze_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_squeeze_multiple_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_stack_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_std_mean_unbiased_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sub_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_t_copy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_t_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_take_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tan_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tanh_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tensor_split_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tensor_split_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_to_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_topk_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_torch_ops_aten__efficient_attention_forward_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_torch_ops_aten__safe_softmax_default_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_transpose_copy_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_transpose_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trapezoid_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trapz_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_triu_cuda_float16, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_triu_indices_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_true_divide_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_true_divide_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_true_divide_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unbind_copy_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unflatten_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unflatten_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unfold_copy_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unfold_copy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unfold_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsqueeze_copy_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsqueeze_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_var_mean_unbiased_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_var_mean_unbiased_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_as_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_as_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_vsplit_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_vsplit_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_vsplit_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_xlogy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_zeros_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick__unsafe_masked_index_put_accumulate_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick__unsafe_masked_index_put_accumulate_cuda_float16, 
test/test_decomp.py::TestDecompCUDA::test_quick__unsafe_masked_index_put_accumulate_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_abs_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_abs_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_acos_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_acos_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_addmm_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_addmm_decomposed_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_alias_copy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_all_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_aminmax_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_aminmax_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_as_strided_copy_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_asinh_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_atan_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_bitwise_and_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_bitwise_not_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_bitwise_or_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_block_diag_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_bucketize_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_cat_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_cat_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_ceil_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_clamp_min_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_clone_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_clone_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_conj_physical_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_conj_physical_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_copysign_cuda_int16, 
test/test_decomp.py::TestDecompCUDA::test_quick_copysign_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_logsumexp_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_sinc_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_take_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_xlogy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_deg2rad_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_diag_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_diagonal_copy_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_diagonal_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_diagonal_scatter_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_div_floor_rounding_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_empty_like_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_empty_strided_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_erf_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_erfc_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_erfinv_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_expand_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_expm1_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_expm1_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_eye_cuda_float8_e4m3fnuz, test/test_decomp.py::TestDecompCUDA::test_quick_fft_fft2_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_fft2_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_fftn_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_fftn_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_fftn_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_fftn_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_fft_hfft2_cuda_bool, 
test/test_decomp.py::TestDecompCUDA::test_quick_fft_ifft2_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ifft2_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ifft2_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ifftn_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ihfftn_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ihfftn_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfft2_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfft_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfft_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfftn_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfftn_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_fft_rfft2_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_fill_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_fill_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_floor_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_floor_divide_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_floor_divide_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_floor_divide_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_fmax_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_fmax_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_fmod_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_frac_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_full_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_ge_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_gt_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_gt_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_igamma_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_igamma_cuda_float64, 
test/test_decomp.py::TestDecompCUDA::test_quick_index_copy_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_index_fill_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_isin_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_isin_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_isnan_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_isposinf_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_isposinf_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_item_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_item_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_le_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_le_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_le_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_lerp_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_linalg_cross_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_linalg_diagonal_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_logical_not_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_logical_not_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_logspace_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_logsumexp_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_masked_fill_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_masked_fill_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_meshgrid_list_of_tensors_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_meshgrid_variadic_tensors_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_minimum_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_minimum_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_minimum_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_mul_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_mul_cuda_int64, 
test/test_decomp.py::TestDecompCUDA::test_quick_mv_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_nan_to_num_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_nansum_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_narrow_copy_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_narrow_copy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_native_batch_norm_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_ne_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_new_full_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_new_full_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_new_ones_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_new_zeros_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_new_zeros_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_binary_cross_entropy_with_logits_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_embedding_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_embedding_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_glu_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_hardsigmoid_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_max_unpool2d_grad_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_pad_constant_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_pad_constant_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_relu6_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_softplus_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_softshrink_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_norm_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_ones_cuda_float32, 
test/test_decomp.py::TestDecompCUDA::test_quick_ones_like_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_permute_copy_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_permute_copy_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_renorm_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_round_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_round_decimals_3_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_rsqrt_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_select_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_select_scatter_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_sin_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_sinh_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_slice_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_slice_scatter_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_special_i1_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_special_i1_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_special_log_ndtr_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_special_ndtr_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_special_xlog1py_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_special_zeta_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_split_list_args_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_split_with_sizes_copy_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_split_with_sizes_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_split_with_sizes_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_copy_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_stack_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_std_cuda_float16, 
test/test_decomp.py::TestDecompCUDA::test_quick_std_mean_unbiased_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_std_unbiased_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_sub_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_sum_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_tan_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_tan_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_tanh_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_tanh_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_trace_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_trace_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_transpose_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_trunc_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_unbind_copy_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_unbind_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_uniform_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_vdot_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_vdot_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_xlogy_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_xlogy_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_zero__cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_zeros_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_zeros_like_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_zeros_like_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_rnn_decomp_module_nn_LSTM_train_mode_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_rnn_decomp_module_nn_RNN_eval_mode_cuda_float32, test/test_decomp.py::DecompOneOffTestsCUDA::test_exponential_non_inf_cuda 2025-10-10T02:10:18.7813997Z 2025-10-10T02:10:22.5577145Z Running test_decomp 10/16 ... 
[2025-10-10 02:10:22.557182] 2025-10-10T02:10:22.5577678Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:10:22.5580432Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_decomp.py', '-m', 'not serial', '--shard-id=10', '--num-shards=16', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:10:22.557578] 2025-10-10T02:11:23.5223003Z 2025-10-10T02:11:23.5223891Z test_decomp 7/16 was successful, full logs can be found in artifacts with path test/test-reports/test_decomp_7.16_b85ff1c0038feadd_.log 2025-10-10T02:11:23.5380719Z Running 551 items in this shard: test/test_decomp.py::TestDecompCUDA::test_batch_norm_unflatten_weight_bias_cuda, test/test_decomp.py::TestDecompCUDA::test_cat_single_input_cuda, test/test_decomp.py::TestDecompCUDA::test_comprehensive_T_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive___getitem___cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive___radd___cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rand___cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rmod___cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rmul___cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rpow___cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rsub___cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rxor___cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive__chunk_cat_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_abs_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_acosh_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_add_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_add_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_add_cuda_uint8, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_addcdiv_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_alias_copy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_amin_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_angle_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_angle_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argmax_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argmin_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argwhere_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_partial_views_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_scatter_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_asinh_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atan_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atan_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atan_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atanh_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atanh_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atleast_1d_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atleast_1d_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bernoulli_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bfloat16_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bitwise_not_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bitwise_not_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bitwise_right_shift_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_broadcast_tensors_cuda_float32, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_broadcast_tensors_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bucketize_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_byte_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cat_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cat_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cdist_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cdouble_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_char_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clamp_min_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clone_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_column_stack_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_combinations_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_corrcoef_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cos_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cosh_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_count_nonzero_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_count_nonzero_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cross_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cumsum_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cumulative_trapezoid_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cumulative_trapezoid_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cumulative_trapezoid_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagflat_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagonal_copy_cuda_int16, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagonal_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diff_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diff_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diff_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_digamma_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_dot_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_dstack_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_einsum_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_permuted_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_strided_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_eq_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_eq_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_equal_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_equal_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_equal_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_erf_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_erf_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_erfc_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_erfinv_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_exp2_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_exp2_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_exp_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expm1_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expm1_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_eye_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fft2_cuda_complex64, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fft2_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fft2_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfft2_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfftn_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfftn_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfftn_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifft2_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifft2_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifftshift_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_irfft2_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_irfftn_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_irfftn_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_rfft_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_rfftn_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_rfftn_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_flip_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fliplr_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_flipud_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_float_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_float_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_float_power_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_floor_divide_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_full_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ge_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gradient_cuda_float16, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_gradient_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gt_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_hash_tensor_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_heaviside_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_add_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_copy_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_fill_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_fill_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_put_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_mean_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_select_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_inner_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isclose_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isclose_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isfinite_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isin_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isnan_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isposinf_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isreal_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isreal_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_binary_return_by_ref_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_binary_return_by_ref_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_kron_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ldexp_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_le_cuda_int16, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_cross_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_diagonal_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_inv_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_ldl_factor_ex_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_ldl_factor_ex_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_lstsq_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_matrix_rank_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_multi_dot_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_solve_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_solve_ex_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_tensorinv_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_tensorsolve_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_vector_norm_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log10_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logaddexp_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_and_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_not_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_xor_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logspace_tensor_overload_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lt_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lt_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lt_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mH_cuda_complex32, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_mH_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mH_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mT_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_amin_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_argmax_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_cumprod_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_cumprod_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_cumprod_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_cumsum_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_log_softmax_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_logaddexp_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_logsumexp_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_logsumexp_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_logsumexp_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_prod_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_scatter_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_scatter_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_softmax_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_var_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_var_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_matmul_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_matrix_exp_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_meshgrid_variadic_tensors_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_meshgrid_variadic_tensors_cuda_uint8, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_min_binary_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_min_binary_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mm_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mode_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_movedim_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_movedim_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_movedim_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mvlgamma_mvlgamma_p_1_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mvlgamma_mvlgamma_p_3_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mvlgamma_mvlgamma_p_5_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_narrow_copy_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_native_layer_norm_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ne_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_neg_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_empty_strided_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_zeros_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nextafter_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_alpha_dropout_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_batch_norm_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_binary_cross_entropy_with_logits_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_celu_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_channel_shuffle_cuda_int8, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv1d_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv_transpose2d_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv_transpose2d_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv_transpose2d_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_dropout_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_embedding_bag_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_feature_alpha_dropout_without_train_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_fractional_max_pool2d_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_hardswish_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_interpolate_bilinear_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_interpolate_nearest_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_logsigmoid_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_margin_ranking_loss_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_pool1d_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_unpool2d_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_multi_head_attention_forward_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_constant_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_replicate_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_replicate_cuda_int64, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_replicate_negative_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pdist_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pixel_shuffle_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pixel_unshuffle_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_poisson_nll_loss_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_relu6_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_relu_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_relu_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_rms_norm_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_smooth_l1_loss_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_smooth_l1_loss_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softmin_with_dtype_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softmin_with_dtype_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softsign_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softsign_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_threshold_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_triplet_margin_loss_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_triplet_margin_loss_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_unfold_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_upsample_nearest_cuda_bfloat16, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_normal_in_place_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ones_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_outer_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_pinverse_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_0_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_1_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_4_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_pow_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_pow_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_pow_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_prod_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_randn_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_randn_like_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_randn_like_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ravel_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_repeat_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_repeat_interleave_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resolve_neg_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_roll_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rot90_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rot90_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rot90_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_round_decimals_neg_3_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rsqrt_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rsub_cuda_bfloat16, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_add_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_amax_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_searchsorted_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_select_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sign_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sign_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_signal_windows_general_cosine_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_signal_windows_nuttall_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_signbit_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sinh_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_slice_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_slice_scatter_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_slice_scatter_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_softmax_with_dtype_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_softmax_with_dtype_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sparse_mm_reduce_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_airy_ai_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_chebyshev_polynomial_t_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_entr_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_erfcx_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_hermite_polynomial_h_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_hermite_polynomial_h_cuda_float64, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_i1_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_i1_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_i1_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_i1e_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_modified_bessel_k0_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_ndtr_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_polygamma_special_polygamma_n_0_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_scaled_modified_bessel_k0_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_scaled_modified_bessel_k1_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_u_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_u_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_spherical_bessel_j0_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_xlog1py_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_xlog1py_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_list_args_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_with_sizes_copy_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_with_sizes_copy_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_with_sizes_copy_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_with_sizes_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_square_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_squeeze_copy_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_squeeze_multiple_cuda_bfloat16, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_squeeze_multiple_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_stack_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_std_mean_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_std_mean_unbiased_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_std_unbiased_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_stft_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sub_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sum_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sum_to_size_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_take_along_dim_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_take_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_take_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tan_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tan_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tan_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tan_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tanh_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tensor_split_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tensor_split_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tile_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_to_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_to_sparse_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_to_sparse_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trapezoid_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trapz_cuda_complex64, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_tril_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unbind_copy_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unflatten_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unfold_copy_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unfold_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_uniform_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unique_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unique_cuda_uint64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsafe_split_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsafe_split_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsqueeze_copy_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsqueeze_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_vdot_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_as_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_copy_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_vsplit_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_vsplit_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_vstack_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_vstack_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_where_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_xlogy_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_xlogy_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_zero__cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_zero__cuda_float16, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_zeros_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_zeros_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_zeros_like_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_zeros_like_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick__unsafe_masked_index_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_abs_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_abs_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_acos_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_acosh_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_acosh_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_add_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_add_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_addcmul_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_addmm_decomposed_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_addmm_decomposed_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_addmv_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_alias_copy_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_all_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_aminmax_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_any_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_arange_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_as_strided_copy_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_as_strided_copy_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_asinh_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_atan2_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_baddbmm_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_bitwise_left_shift_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_bitwise_not_cuda_int16, 
test/test_decomp.py::TestDecompCUDA::test_quick_bitwise_right_shift_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_bitwise_xor_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_bitwise_xor_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_cat_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_clamp_min_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_clone_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_clone_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_conj_physical_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_conj_physical_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_constant_pad_nd_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_constant_pad_nd_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_constant_pad_nd_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_index_fill_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_nn_functional_max_unpool2d_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_std_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_triu_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_cosh_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_cosh_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_cumprod_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_cumprod_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_cumsum_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_diagonal_copy_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_diagonal_copy_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_digamma_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_div_no_rounding_mode_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_eq_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_exp_cuda_int32, 
test/test_decomp.py::TestDecompCUDA::test_quick_expand_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_expand_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_expm1_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_expm1_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_eye_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_fft_fft2_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_fft_fft_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_fft_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_fft_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_fft_hfft2_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_hfft2_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_hfft2_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_hfft2_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_fft_hfft_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_hfft_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_hfft_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ifft_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ihfft_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ihfft_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ihfftn_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfftn_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_fmin_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_full_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_full_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_geometric_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_heaviside_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_i0_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_i0_cuda_int32, 
test/test_decomp.py::TestDecompCUDA::test_quick_index_add_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_index_add_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_index_add_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_index_add_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_index_copy_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_index_fill_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_index_select_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_isin_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_linalg_vector_norm_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_linspace_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_log_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_log_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_logaddexp2_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_logical_and_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_logical_xor_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_logspace_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_logspace_tensor_overload_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_masked_fill_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_maximum_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_meshgrid_list_of_tensors_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_minimum_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_nan_to_num_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_ne_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_new_empty_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_new_zeros_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_new_zeros_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_nextafter_cuda_bfloat16, 
test/test_decomp.py::TestDecompCUDA::test_quick_nextafter_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_gelu_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_hardsigmoid_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_hardtanh_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_huber_loss_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_leaky_relu_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_unfold_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_norm_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_norm_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_norm_inf_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_normal_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_normal_in_place_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_ones_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_permute_copy_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_prod_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_randn_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_reciprocal_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_remainder_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_repeat_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_rot90_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_round_decimals_3_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_select_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_sgn_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_sgn_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_sigmoid_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_signbit_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_sinc_cuda_complex128, 
test/test_decomp.py::TestDecompCUDA::test_quick_sinc_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_slice_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_slice_scatter_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_softmax_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_special_entr_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_special_ndtr_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_special_xlog1py_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_split_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_split_list_args_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_split_list_args_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_split_list_args_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_split_with_sizes_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_copy_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_sum_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_sum_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_tan_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_tanh_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_transpose_copy_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_transpose_copy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_transpose_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_tril_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_tril_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_triu_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_triu_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_unbind_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_unbind_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_unfold_copy_cuda_bfloat16, 
test/test_decomp.py::TestDecompCUDA::test_quick_unfold_copy_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_unfold_copy_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_unfold_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_unsafe_split_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_unsqueeze_copy_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_unsqueeze_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_unsqueeze_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_var_mean_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_view_copy_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_zero__cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_zeros_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_zeros_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_zeros_like_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_rnn_decomp_module_nn_GRU_train_mode_cuda_float64
2025-10-10T02:11:23.5529157Z
2025-10-10T02:11:27.3856944Z Running test_decomp 15/16 ... [2025-10-10 02:11:27.385117]
2025-10-10T02:11:27.3857621Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set
2025-10-10T02:11:27.3859056Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_decomp.py', '-m', 'not serial', '--shard-id=15', '--num-shards=16', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:11:27.385498]
2025-10-10T02:11:35.1662340Z
2025-10-10T02:11:35.1663078Z test_decomp 15/16 was successful, full logs can be found in artifacts with path test/test-reports/test_decomp_15.16_674eae0c73a8ecc0_.log
2025-10-10T02:11:35.1810765Z Running 525 items in this shard: test/test_decomp.py::TestDecompCUDA::test_comprehensive_H_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive___radd___cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rmod___cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rmul___cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive___ror___cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rpow___cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rsub___cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rxor___cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rxor___cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive__batch_norm_with_update_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_abs_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_acos_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_acosh_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_acosh_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addbmm_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addcmul_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addcmul_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addcmul_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addmm_decomposed_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addr_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_all_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_amin_cuda_bfloat16,
test/test_decomp.py::TestDecompCUDA::test_comprehensive_aminmax_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_aminmax_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_arange_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argmax_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argsort_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argwhere_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atan2_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atan_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atanh_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atleast_1d_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atleast_2d_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atleast_3d_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atleast_3d_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bfloat16_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bitwise_not_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_block_diag_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bucketize_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cdist_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_char_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_combinations_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_conj_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_conj_physical_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_conj_physical_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_conj_physical_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_conj_physical_cuda_int32, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_cov_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cross_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cummax_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cummax_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cummax_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cummin_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cumprod_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cumprod_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cumsum_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cumsum_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_deg2rad_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diag_embed_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagflat_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagonal_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagonal_scatter_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diff_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_digamma_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_dist_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_div_floor_rounding_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_div_floor_rounding_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_div_no_rounding_mode_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_div_no_rounding_mode_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_double_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_dstack_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_like_cuda_bfloat16, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_permuted_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_strided_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_strided_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_eq_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_erfc_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_erfinv_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_exp2_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expand_copy_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expand_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fftshift_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fftshift_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfft2_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfft_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfftn_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfftn_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifft2_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifftshift_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifftshift_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifftshift_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifftshift_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ihfft2_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ihfft_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ihfft_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_irfft2_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_irfft2_cuda_float64, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_irfft2_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_irfft_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_irfftn_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fill_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fill_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_flatten_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fliplr_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_float_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_float_power_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_floor_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_floor_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_floor_divide_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_floor_divide_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_floor_divide_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fmax_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fmax_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fmin_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fmod_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_frac_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_full_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_full_like_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gather_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gcd_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_grid_sampler_3d_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gt_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_half_cuda_uint8, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_hash_tensor_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_heaviside_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_i0_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_add_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_copy_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_fill_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_fill_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_put_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_mean_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_select_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_inner_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isfinite_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isnan_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isneginf_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isposinf_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_2inputs_2outputs_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_2inputs_2outputs_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_binary_return_by_ref_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_unary_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_kron_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ldexp_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_cross_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_diagonal_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_eig_cuda_float64, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_lstsq_grad_oriented_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_lu_solve_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_norm_subgradients_at_zero_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_tensorsolve_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_vecdot_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linspace_tensor_overload_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log10_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log10_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log_normal_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log_normal_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log_softmax_with_dtype_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logcumsumexp_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_not_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logit_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lt_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lu_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lu_unpack_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_amin_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_amin_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_cumprod_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_cumsum_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_fill_cuda_bfloat16, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_logaddexp_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_logsumexp_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_scatter_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_select_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_var_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_matmul_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_max_pool2d_with_indices_backward_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mm_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_msort_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mul_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_multinomial_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mv_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nanmedian_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nansum_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nansum_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nansum_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_narrow_copy_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_empty_strided_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_full_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_full_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_zeros_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_zeros_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_adaptive_avg_pool2d_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_adaptive_max_pool1d_cuda_float16, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_avg_pool3d_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_binary_cross_entropy_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_celu_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv1d_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv2d_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv_transpose3d_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv_transpose3d_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv_transpose3d_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_dropout2d_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_dropout3d_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_embedding_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_fractional_max_pool2d_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_hardsigmoid_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_hinge_embedding_loss_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_instance_norm_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_kl_div_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_linear_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_logsigmoid_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_pool3d_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_unpool3d_cuda_float32, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_multi_margin_loss_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_nll_loss_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_replicate_negative_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pdist_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pixel_shuffle_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pixel_shuffle_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pixel_unshuffle_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pixel_unshuffle_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_rrelu_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_selu_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_silu_complex_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softsign_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_unfold_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nonzero_static_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nonzero_static_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_norm_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_norm_nuc_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_norm_nuc_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_norm_nuc_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_norm_nuc_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_normal_in_place_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ones_cuda_uint8, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_ones_like_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ones_like_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_outer_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_permute_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_permute_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_0_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_0_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_3_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_positive_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_prod_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_prod_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_qr_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rad2deg_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_randint_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_randint_like_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_reciprocal_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_renorm_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_repeat_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_repeat_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resize__cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resize__cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resize__cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resolve_neg_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_roll_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_round_cuda_int64, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_round_decimals_3_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scalar_tensor_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scalar_tensor_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scalar_tensor_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_add_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_amax_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_prod_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_select_scatter_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_select_scatter_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_short_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sigmoid_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sigmoid_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_signal_windows_kaiser_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_signal_windows_kaiser_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_signbit_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_signbit_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sinh_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_slice_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_softmax_with_dtype_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sort_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_y1_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_y1_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_chebyshev_polynomial_u_cuda_float32, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_chebyshev_polynomial_w_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_hermite_polynomial_h_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_hermite_polynomial_he_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_i1e_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_i1e_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_laguerre_polynomial_l_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_laguerre_polynomial_l_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_log_ndtr_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_log_ndtr_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_log_ndtr_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_modified_bessel_i1_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_ndtr_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_ndtri_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_list_args_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_with_sizes_copy_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sqrt_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_square_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_squeeze_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_std_mean_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sub_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sum_to_size_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_t_copy_cuda_float16, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_take_along_dim_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_take_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_to_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_to_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trace_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trace_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_transpose_copy_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_transpose_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tril_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_triu_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_triu_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_true_divide_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unbind_copy_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unbind_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unfold_copy_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsafe_chunk_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsafe_split_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_var_mean_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_var_mean_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_copy_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_vstack_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_vstack_cuda_float32, 
test/test_decomp.py::TestDecompCUDA::test_quick__chunk_cat_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick__chunk_cat_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick__native_batch_norm_legit_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick__softmax_backward_data_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick__unsafe_masked_index_put_accumulate_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_add_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_addcdiv_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_addcmul_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_addmv_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_alias_copy_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_aminmax_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_as_strided_scatter_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_asin_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_atan_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_atan_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_atanh_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_baddbmm_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_baddbmm_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_bitwise_or_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_bitwise_right_shift_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_bitwise_xor_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_cat_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_clamp_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_alias_copy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_std_mean_unbiased_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_cosh_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_count_nonzero_cuda_float32, 
test/test_decomp.py::TestDecompCUDA::test_quick_deg2rad_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_diag_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_diag_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_diag_embed_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_diagonal_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_diagonal_scatter_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_erf_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_erfc_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_erfinv_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_exp2_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_expm1_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_expm1_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_exponential_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_eye_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_hfft_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_fft_hfftn_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_fft_hfftn_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ifft_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ihfft2_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ihfft_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ihfftn_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ihfftn_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfft_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfft_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfftn_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfftn_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_rfft_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_flip_cuda_int16, 
test/test_decomp.py::TestDecompCUDA::test_quick_fmin_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_fmin_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_full_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_full_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_ge_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_i0_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_igammac_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_index_add_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_index_add_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_index_copy_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_index_select_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_isinf_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_isneginf_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_isneginf_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_isneginf_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_lgamma_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_linalg_diagonal_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_linalg_diagonal_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_linalg_vector_norm_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_linalg_vector_norm_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_log10_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_log_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_logaddexp2_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_logical_and_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_logical_not_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_logit_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_logspace_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_logspace_cuda_float64, 
test/test_decomp.py::TestDecompCUDA::test_quick_logspace_tensor_overload_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_lt_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_lt_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_masked_fill_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_maximum_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_meshgrid_list_of_tensors_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_meshgrid_variadic_tensors_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_mul_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_mvlgamma_mvlgamma_p_3_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_mvlgamma_mvlgamma_p_5_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_mvlgamma_mvlgamma_p_5_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_nansum_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_narrow_copy_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_ne_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_neg_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_new_empty_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_new_empty_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_new_empty_strided_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_new_empty_strided_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_new_full_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_new_zeros_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_new_zeros_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_elu_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_hardtanh_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_leaky_relu_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_mse_loss_cuda_bfloat16, 
test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_pad_constant_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_relu_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_relu_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_silu_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_softplus_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_softshrink_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_unfold_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_norm_fro_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_norm_inf_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_normal_number_mean_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_ones_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_permute_copy_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_permute_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_permute_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_pow_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_pow_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_pow_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_prod_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_rad2deg_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_randn_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_reciprocal_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_rot90_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_round_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_sgn_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_sigmoid_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_sigmoid_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_sigmoid_cuda_int8, 
test/test_decomp.py::TestDecompCUDA::test_quick_sigmoid_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_sign_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_sinc_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_sinh_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_sinh_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_special_entr_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_special_i0e_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_special_i1_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_special_i1e_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_special_i1e_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_special_zeta_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_split_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_split_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_split_list_args_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_split_with_sizes_copy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_sqrt_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_sqrt_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_multiple_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_multiple_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_stack_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_std_mean_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_sub_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_sub_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_t_copy_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_t_copy_cuda_bool, 
test/test_decomp.py::TestDecompCUDA::test_quick_t_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_take_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_tan_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_tan_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_tan_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_tanh_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_trace_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_transpose_copy_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_transpose_copy_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_transpose_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_tril_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_triu_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_trunc_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_unbind_copy_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_unbind_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_unbind_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_var_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_var_mean_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_vdot_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_vdot_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_view_copy_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_view_copy_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_view_copy_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_view_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_xlogy_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_rnn_decomp_module_nn_GRU_eval_mode_cuda_float32, test/test_decomp.py::DecompOneOffTestsCUDA::test_rms_norm_decomp_cuda_cuda, test/test_decomp.py::HasDecompTest::test_aten_core_operators 2025-10-10T02:11:35.1951226Z 2025-10-10T02:11:38.9517703Z Running 
test_decomp 16/16 ... [2025-10-10 02:11:38.951190]
2025-10-10T02:11:38.9518216Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set
2025-10-10T02:11:38.9519380Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_decomp.py', '-m', 'not serial', '--shard-id=16', '--num-shards=16', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:11:38.951591]
2025-10-10T02:11:46.7801129Z
2025-10-10T02:11:46.7801950Z test_decomp 10/16 was successful, full logs can be found in artifacts with path test/test-reports/test_decomp_10.16_c2717449774e4579_.log
2025-10-10T02:11:46.7962772Z Running 570 items in this shard: test/test_decomp.py::TestDecompCUDA::test_bernoulli_default_cuda, test/test_decomp.py::TestDecompCUDA::test_comprehensive_H_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_H_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_T_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_T_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive___getitem___cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive___radd___cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rdiv___cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rmatmul___cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rmul___cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rpow___cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rpow___cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rsub___cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive__segment_reduce_offsets_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_acos_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_acos_cuda_int32, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_acosh_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addcdiv_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addmm_decomposed_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_alias_copy_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_all_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_allclose_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_amax_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_amin_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_aminmax_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_angle_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_any_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_any_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_arange_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argmax_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argmin_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argsort_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argwhere_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argwhere_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argwhere_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_copy_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_asinh_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_asinh_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atan2_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atanh_cuda_int16, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_atleast_1d_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atleast_3d_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bfloat16_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bfloat16_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bfloat16_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bfloat16_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bfloat16_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bitwise_not_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_block_diag_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bool_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_broadcast_tensors_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_broadcast_tensors_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_broadcast_to_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bucketize_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bucketize_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_byte_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_byte_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cartesian_prod_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cdouble_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cfloat_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cfloat_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_char_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cholesky_solve_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_chunk_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clamp_max_cuda_float16, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_clamp_max_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clamp_min_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clamp_min_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clamp_min_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clone_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_column_stack_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_combinations_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_conj_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_constant_pad_nd_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_contiguous_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_copysign_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_corrcoef_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cosh_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cosh_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_count_nonzero_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cross_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cummax_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cummin_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cummin_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_deg2rad_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diag_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diag_embed_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diag_embed_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagflat_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagflat_cuda_float64, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_diff_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diff_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_digamma_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_digamma_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_digamma_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_digamma_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_dist_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_div_floor_rounding_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_dstack_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_dstack_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_like_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_strided_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_equal_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_erf_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_erfc_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_erfinv_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_erfinv_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expand_as_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expand_copy_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expand_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expm1_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expm1_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fft_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fft_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fft_cuda_int8, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fftn_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fftshift_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fftshift_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfftn_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifft2_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifftn_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ihfft_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_irfft_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_irfftn_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_rfft_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_rfft_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_rfft_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_rfftn_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fill_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_flatten_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_flip_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_flipud_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fmin_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_full_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gather_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gather_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_geometric_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_geqrf_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_geqrf_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gradient_cuda_complex128, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_hash_tensor_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_heaviside_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_histc_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_hsplit_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_hsplit_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_hstack_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_i0_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_i0_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_i0_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_igammac_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_imag_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_imag_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_add_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_add_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_amin_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_amin_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_amin_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_amin_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_prod_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_prod_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_select_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isclose_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isfinite_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isinf_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isnan_cuda_bfloat16, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_isnan_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isneginf_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_istft_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_item_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_item_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_2inputs_2outputs_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_4inputs_with_extra_args_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_binary_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_binary_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_binary_return_by_ref_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_binary_return_by_ref_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_kthvalue_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_le_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_le_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lerp_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lgamma_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_cross_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_diagonal_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_eigh_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_eigh_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_eigvals_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_eigvalsh_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_inv_ex_cuda_complex64, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_lstsq_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_lstsq_grad_oriented_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_lu_factor_ex_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_matrix_norm_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_matrix_rank_hermitian_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_norm_subgradients_at_zero_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_pinv_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_pinv_hermitian_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_slogdet_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_vecdot_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_vecdot_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linspace_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linspace_tensor_overload_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log2_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log_softmax_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log_softmax_with_dtype_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_and_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_not_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_not_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_not_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_xor_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logspace_tensor_overload_cuda_float16, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_logsumexp_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_long_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lt_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_argmin_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_argmin_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_cumsum_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_logsumexp_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_normalize_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_prod_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_prod_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_select_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_softmin_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_sum_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_var_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_var_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_max_binary_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_maximum_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_median_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_meshgrid_variadic_tensors_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_min_binary_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_minimum_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mm_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_movedim_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mul_cuda_int32, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_multinomial_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mvlgamma_mvlgamma_p_3_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nan_to_num_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nansum_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nansum_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_narrow_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_narrow_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_native_dropout_backward_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_neg_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_empty_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_empty_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_empty_strided_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_zeros_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_adaptive_avg_pool1d_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_channel_shuffle_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_channel_shuffle_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv3d_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv3d_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_cosine_embedding_loss_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_ctc_loss_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_dropout3d_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_dropout_cuda_float32, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_embedding_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_embedding_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_feature_alpha_dropout_without_train_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_fractional_max_pool2d_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_hinge_embedding_loss_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_huber_loss_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_interpolate_bicubic_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_unpool2d_grad_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_mse_loss_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_multi_margin_loss_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_circular_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_constant_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_replicate_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_replicate_negative_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pairwise_distance_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_poisson_nll_loss_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_relu6_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_relu6_cuda_int64, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_silu_complex_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softmin_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softshrink_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softsign_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_triplet_margin_loss_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_triplet_margin_loss_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_upsample_nearest_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_upsample_nearest_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nonzero_static_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_norm_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_norm_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_normal_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_normal_number_mean_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ones_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ones_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ones_like_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ones_like_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ormqr_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_permute_copy_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_0_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_3_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_3_cuda_int32, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_3_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_4_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_prod_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_put_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_randint_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_randint_like_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_randint_like_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_randn_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_real_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_real_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_real_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_reciprocal_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_remainder_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_renorm_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_repeat_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_repeat_interleave_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_repeat_interleave_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_repeat_interleave_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resize__cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resize_as__cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resolve_conj_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resolve_neg_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_roll_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rot90_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rsqrt_cuda_int16, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_scalar_tensor_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scalar_tensor_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_add_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_add_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_amax_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_prod_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_select_scatter_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sgn_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sign_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sign_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_signal_windows_bartlett_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_signal_windows_general_hamming_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sin_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sin_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sin_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sin_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sinc_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sinh_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sinh_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_slice_scatter_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_softmax_with_dtype_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sort_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sparse_sampled_addmm_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_j0_cuda_float64, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_y0_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_y1_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_chebyshev_polynomial_v_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_chebyshev_polynomial_w_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_entr_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_hermite_polynomial_h_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_hermite_polynomial_he_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_i0e_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_i0e_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_legendre_polynomial_p_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_modified_bessel_i1_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_modified_bessel_k1_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_ndtr_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_polygamma_special_polygamma_n_0_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_polygamma_special_polygamma_n_0_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_scaled_modified_bessel_k0_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_v_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_w_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_zeta_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_zeta_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_cuda_float16, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_std_mean_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sub_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sub_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sum_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sum_to_size_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sum_to_size_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tan_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_to_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_to_sparse_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_to_sparse_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_topk_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trace_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_transpose_copy_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_transpose_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_transpose_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trapezoid_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tril_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tril_indices_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_triu_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_triu_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_true_divide_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_true_divide_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trunc_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unique_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsafe_chunk_cuda_bool, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsafe_chunk_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsafe_chunk_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsafe_split_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsafe_split_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsqueeze_copy_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_var_mean_unbiased_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_var_unbiased_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_var_unbiased_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_as_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_as_real_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_copy_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_zero__cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick__chunk_cat_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick__unsafe_masked_index_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_abs_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_acos_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_acosh_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_addcmul_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_addcmul_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_addmm_decomposed_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_alias_copy_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_amax_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_amax_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_amin_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_as_strided_copy_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_as_strided_scatter_cuda_float32, 
test/test_decomp.py::TestDecompCUDA::test_quick_as_strided_scatter_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_atan2_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_atan_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_atan_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_baddbmm_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_bitwise_not_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_block_diag_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_ceil_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_clamp_max_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_clamp_max_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_clamp_max_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_conj_physical_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_masked_fill_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_nn_functional_max_unpool2d_grad_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_trace_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_cos_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_cos_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_cosh_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_cosh_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_count_nonzero_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_cumprod_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_cumsum_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_diag_embed_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_diagonal_copy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_diagonal_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_diagonal_scatter_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_empty_like_cuda_uint8, 
test/test_decomp.py::TestDecompCUDA::test_quick_empty_strided_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_eq_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_eq_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_erfc_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_expand_copy_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_expand_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_expand_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_expm1_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_exponential_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_eye_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_fft2_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_fftn_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_rfft2_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_rfft_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_fill_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_floor_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_fmax_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_frexp_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_ge_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_ge_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_geometric_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_grid_sampler_2d_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_hypot_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_isin_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_isneginf_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_isneginf_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_le_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_le_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_linalg_diagonal_cuda_uint8, 
test/test_decomp.py::TestDecompCUDA::test_quick_linspace_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_linspace_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_linspace_tensor_overload_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_log1p_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_log_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_log_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_log_normal_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_logaddexp_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_logical_and_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_logical_and_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_logit_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_logspace_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_logspace_tensor_overload_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_logspace_tensor_overload_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_mean_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_narrow_copy_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_native_dropout_backward_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_neg_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_neg_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_new_empty_strided_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_new_full_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_new_ones_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_new_ones_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_elu_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_hardsigmoid_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_max_unpool3d_cuda_float16, 
test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_max_unpool3d_grad_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_relu6_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_relu6_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_relu_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_rrelu_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_softshrink_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_norm_nuc_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_normal_number_mean_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_permute_copy_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_permute_copy_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_permute_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_polar_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_pow_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_pow_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_prod_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_remainder_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_repeat_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_roll_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_round_decimals_0_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_round_decimals_neg_3_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_rsqrt_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_select_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_select_scatter_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_select_scatter_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_sin_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_sinc_cuda_complex64, 
test/test_decomp.py::TestDecompCUDA::test_quick_sinh_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_slice_scatter_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_softmax_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_special_erfcx_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_special_i0e_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_special_i0e_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_special_i1e_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_special_ndtr_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_special_ndtr_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_special_ndtri_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_special_zeta_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_split_list_args_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_split_with_sizes_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_sqrt_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_copy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_multiple_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_std_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_std_mean_unbiased_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_std_unbiased_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_transpose_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_triu_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_unbind_copy_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_unbind_copy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_unbind_copy_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_unfold_copy_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_unfold_copy_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_unfold_cuda_int16, 
test/test_decomp.py::TestDecompCUDA::test_quick_unfold_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_uniform_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_uniform_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_unsafe_split_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_unsqueeze_copy_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_unsqueeze_copy_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_var_mean_unbiased_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_var_unbiased_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_view_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_view_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_where_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_xlogy_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_zero__cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_zeros_like_cuda_complex32, test/test_decomp.py::DecompOneOffTestsCUDA::test_native_layer_norm_cpu_decomp_cuda 2025-10-10T02:11:46.8116400Z 2025-10-10T02:11:50.6483039Z Running xpu/test_conv 1/1 ... [2025-10-10 02:11:50.647652] 2025-10-10T02:11:50.6483531Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:11:50.6485239Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'xpu/test_conv.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:11:50.648075] 2025-10-10T02:11:54.8214748Z 2025-10-10T02:11:54.8215976Z xpu/test_conv 1/1 was successful, full logs can be found in artifacts with path test/test-reports/xpu.test_conv_1.1_c0e4f3c7aae6dfed_.log 2025-10-10T02:11:54.8217015Z Running 0 items in this shard: 2025-10-10T02:11:54.8217321Z 2025-10-10T02:11:58.6927813Z Running functorch/test_ops 2/2 ... 
[2025-10-10 02:11:58.692185] 2025-10-10T02:11:58.6928633Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:11:58.6931017Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'functorch/test_ops.py', '-m', 'not serial', '--shard-id=2', '--num-shards=2', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:11:58.692594] 2025-10-10T02:16:22.1112826Z 2025-10-10T02:16:22.1113750Z test_decomp 16/16 was successful, full logs can be found in artifacts with path test/test-reports/test_decomp_16.16_bf1d120015c9ac2d_.log 2025-10-10T02:16:22.1276820Z Running 579 items in this shard: test/test_decomp.py::TestDecompCUDA::test_comprehensive___getitem___cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive___radd___cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rand___cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rdiv___cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rmatmul___cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rmatmul___cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rmod___cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rmul___cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive___ror___cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rpow___cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive__chunk_cat_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive__chunk_cat_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive__unsafe_masked_index_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive__unsafe_masked_index_put_accumulate_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_abs_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_add_cuda_bool, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_add_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addmv_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_alias_copy_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_aminmax_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_any_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_any_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_any_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argwhere_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_copy_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_copy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_copy_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_partial_views_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_scatter_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_asin_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_asinh_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atan_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atan_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atan_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atanh_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atleast_1d_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atleast_1d_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atleast_3d_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bfloat16_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_block_diag_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bool_cuda_bfloat16, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_bool_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bool_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_broadcast_to_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bucketize_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bucketize_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cartesian_prod_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cartesian_prod_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cat_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cat_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cdouble_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cholesky_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cholesky_solve_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_chunk_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_chunk_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clone_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_combinations_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_conj_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_conj_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_conj_physical_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_constant_pad_nd_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_contiguous_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_contiguous_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_copysign_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cos_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cosh_cuda_bfloat16, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_cosh_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cummin_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cumprod_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagflat_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagonal_copy_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagonal_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagonal_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagonal_scatter_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diff_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_dist_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_div_trunc_rounding_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_double_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_einsum_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_erf_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_erfc_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_exp_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expand_as_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expand_as_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expand_as_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expm1_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expm1_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fftn_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifft_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifftn_cuda_float32, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifftn_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifftn_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ihfft2_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ihfft2_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ihfftn_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_irfft2_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_irfft2_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_irfftn_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_rfft_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fill_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fliplr_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fliplr_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_flipud_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_flipud_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_float_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fmax_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fmin_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_full_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gcd_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_geometric_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_grid_sampler_3d_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_half_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_half_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_hash_tensor_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_heaviside_cuda_int32, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_histc_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_hsplit_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_hstack_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_igamma_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_add_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_copy_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_copy_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_amax_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_inner_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_int_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isclose_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isclose_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isnan_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isnan_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isneginf_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isposinf_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isreal_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_item_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_item_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_2inputs_2outputs_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_4inputs_with_extra_args_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_4inputs_with_extra_args_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_4inputs_with_extra_args_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_unary_cuda_bfloat16, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_kthvalue_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lcm_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lcm_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ldexp_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lerp_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lgamma_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lgamma_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_diagonal_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_eigvalsh_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_lu_factor_ex_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_norm_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_pinv_singular_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_solve_ex_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_solve_ex_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linspace_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linspace_tensor_overload_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linspace_tensor_overload_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log10_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log10_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log1p_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log1p_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log2_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log2_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log_cuda_float32, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_logaddexp_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_and_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_and_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_and_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_or_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_xor_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logspace_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logspace_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logspace_tensor_overload_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logsumexp_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_long_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lu_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lu_unpack_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_amin_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_argmax_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_argmax_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_argmax_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_fill_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_log_softmax_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_logaddexp_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_logsumexp_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_mean_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_mean_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_norm_cuda_bfloat16, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_normalize_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_prod_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_scatter_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_select_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_select_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_matrix_exp_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_matrix_exp_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_max_reduction_no_dim_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_max_reduction_no_dim_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mean_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_median_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_meshgrid_list_of_tensors_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_min_binary_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_min_reduction_no_dim_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_minimum_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mode_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_movedim_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_movedim_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mvlgamma_mvlgamma_p_1_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mvlgamma_mvlgamma_p_3_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mvlgamma_mvlgamma_p_3_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mvlgamma_mvlgamma_p_5_cuda_float32, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_nan_to_num_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nanmean_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nanmedian_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nansum_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_narrow_copy_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_native_dropout_backward_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ne_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_neg_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_zeros_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_zeros_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nextafter_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_adaptive_max_pool2d_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_alpha_dropout_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_batch_norm_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_batch_norm_without_cudnn_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_celu_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_channel_shuffle_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_channel_shuffle_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv_transpose1d_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv_transpose2d_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_cosine_embedding_loss_cuda_int8, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_cosine_similarity_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_dropout2d_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_dropout_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_embedding_bag_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_glu_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_group_norm_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_hardswish_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_interpolate_linear_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_interpolate_nearest_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_l1_loss_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_linear_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_pool1d_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_pool2d_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_pool3d_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_unpool2d_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_multilabel_margin_loss_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_multilabel_soft_margin_loss_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_normalize_cuda_complex64, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_reflect_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_replicate_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_replicate_negative_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pixel_shuffle_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pixel_unshuffle_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_poisson_nll_loss_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_scaled_dot_product_attention_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softsign_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_tanhshrink_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_threshold_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_triplet_margin_loss_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_triplet_margin_loss_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_triplet_margin_with_distance_loss_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_upsample_nearest_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nonzero_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nonzero_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_normal_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ones_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ones_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_0_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_1_cuda_int64, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_2_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_3_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_pow_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ravel_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_real_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_reciprocal_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_repeat_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_repeat_interleave_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_repeat_interleave_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_reshape_as_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resize_as__cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resize_as__cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resize_as__cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_round_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_round_decimals_0_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rsqrt_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rsqrt_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rsqrt_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_mean_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_sum_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_sum_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_searchsorted_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_searchsorted_cuda_float64, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_select_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_select_scatter_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sgn_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sigmoid_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sign_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sign_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_signal_windows_bartlett_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sinc_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_slice_scatter_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_softmax_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_airy_ai_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_j0_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_chebyshev_polynomial_v_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_entr_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_erfcx_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_hermite_polynomial_h_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_laguerre_polynomial_l_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_laguerre_polynomial_l_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_log_ndtr_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_modified_bessel_i1_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_t_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_cuda_float32, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_with_sizes_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_square_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_squeeze_copy_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_squeeze_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_squeeze_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_squeeze_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_std_mean_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_std_mean_unbiased_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_std_unbiased_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sum_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sum_to_size_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_t_copy_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_take_along_dim_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tanh_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_torch_ops_aten__safe_softmax_default_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_torch_ops_aten__safe_softmax_default_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trace_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_transpose_copy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_transpose_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_transpose_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trapezoid_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_triangular_solve_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tril_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_true_divide_cuda_bfloat16, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_true_divide_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trunc_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trunc_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unbind_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unbind_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unfold_copy_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unique_consecutive_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unique_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unravel_index_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_var_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_var_mean_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_var_mean_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_var_mean_unbiased_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_copy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_vstack_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_zeros_like_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick__native_batch_norm_legit_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick__unsafe_masked_index_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick__unsafe_masked_index_put_accumulate_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_abs_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_abs_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_acos_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_add_cuda_int32, 
test/test_decomp.py::TestDecompCUDA::test_quick_addcmul_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_addmm_decomposed_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_addmv_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_addr_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_alias_copy_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_all_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_amax_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_amin_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_amin_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_any_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_any_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_arange_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_as_strided_scatter_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_asinh_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_bitwise_not_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_bitwise_right_shift_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_block_diag_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_clamp_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_clamp_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_clamp_min_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_clone_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_complex_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_copysign_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_copysign_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_copysign_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_diagonal_copy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_norm_inf_cuda_float64, 
test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_roll_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_sgn_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_std_mean_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_cos_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_cos_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_cos_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_cos_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_cosh_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_cosh_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_count_nonzero_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_cumprod_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_cumsum_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_diag_embed_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_diag_embed_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_diag_embed_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_diagonal_copy_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_diagonal_scatter_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_digamma_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_div_no_rounding_mode_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_div_trunc_rounding_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_empty_like_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_empty_like_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_empty_like_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_empty_like_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_empty_strided_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_erf_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_erfinv_cuda_int8, 
test/test_decomp.py::TestDecompCUDA::test_quick_expand_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_eye_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_fft_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_fftn_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_hfft_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ifft_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ihfft2_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ihfft2_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ihfft_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ihfft_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ihfftn_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ihfftn_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_rfft2_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_rfft2_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_fft_rfftn_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_fill_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_fill_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_flip_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_flip_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_flip_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_flip_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_floor_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_fmin_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_fmin_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_fmod_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_frac_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_full_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_gcd_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_ge_cuda_float16, 
test/test_decomp.py::TestDecompCUDA::test_quick_ge_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_geometric_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_grid_sampler_2d_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_i0_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_index_add_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_index_fill_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_index_select_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_index_select_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_isinf_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_isneginf_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_lcm_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_lerp_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_lgamma_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_lgamma_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_linalg_cross_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_linalg_diagonal_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_linspace_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_linspace_tensor_overload_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_log10_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_log10_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_log1p_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_log2_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_logical_not_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_logical_or_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_logical_or_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_logical_xor_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_logical_xor_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_logical_xor_cuda_int32, 
test/test_decomp.py::TestDecompCUDA::test_quick_logit_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_logit_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_logspace_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_logspace_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_logspace_tensor_overload_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_lt_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_lt_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_meshgrid_list_of_tensors_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_meshgrid_list_of_tensors_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_meshgrid_variadic_tensors_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_mul_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_mvlgamma_mvlgamma_p_3_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_mvlgamma_mvlgamma_p_3_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_nansum_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_nansum_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_ne_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_ne_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_ne_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_neg_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_new_empty_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_new_empty_strided_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_new_empty_strided_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_new_empty_strided_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_glu_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_hardswish_cuda_float64, 
test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_hardtanh_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_hardtanh_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_max_unpool3d_grad_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_silu_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_unfold_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_norm_nuc_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_ones_like_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_ones_like_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_ones_like_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_permute_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_prod_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_rad2deg_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_randn_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_reciprocal_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_repeat_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_roll_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_roll_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_roll_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_rsqrt_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_rsub_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_rsub_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_rsub_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_rsub_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_rsub_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_select_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_select_scatter_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_sigmoid_cuda_float16, 
test/test_decomp.py::TestDecompCUDA::test_quick_sign_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_sin_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_sinc_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_sinh_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_slice_scatter_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_special_log_ndtr_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_special_ndtr_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_special_ndtri_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_special_zeta_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_split_with_sizes_copy_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_split_with_sizes_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_sqrt_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_copy_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_multiple_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_stack_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_stack_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_std_mean_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_sub_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_t_copy_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_take_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_tanh_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_trace_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_trace_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_transpose_copy_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_transpose_copy_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_transpose_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_transpose_cuda_float32, 
test/test_decomp.py::TestDecompCUDA::test_quick_trunc_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_unbind_copy_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_unfold_copy_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_unfold_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_unfold_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_unsafe_split_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_unsafe_split_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_unsafe_split_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_unsqueeze_copy_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_var_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_var_mean_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_vdot_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_view_copy_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_view_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_view_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_zeros_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_zeros_cuda_uint8, test/test_decomp.py::HasDecompTest::test_conv1d_decomposition 2025-10-10T02:16:22.1437215Z 2025-10-10T02:16:25.8951041Z Running test_datapipe 1/1 ... [2025-10-10 02:16:25.894492] 2025-10-10T02:16:25.8951599Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:16:25.8953178Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_datapipe.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:16:25.894907] 2025-10-10T02:16:30.1678811Z 2025-10-10T02:16:30.1679502Z test_datapipe 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_datapipe_1.1_dc5c5546e5068041_.log 2025-10-10T02:16:30.1705856Z Running 93 items in this shard: test/test_datapipe.py::TestDataChunk::test_as_string, test/test_datapipe.py::TestDataChunk::test_getitem, test/test_datapipe.py::TestDataChunk::test_iter, test/test_datapipe.py::TestDataChunk::test_len, test/test_datapipe.py::TestDataChunk::test_random_shuffle, test/test_datapipe.py::TestDataChunk::test_reverse, test/test_datapipe.py::TestDataChunk::test_sort, test/test_datapipe.py::TestStreamWrapper::test_api, test/test_datapipe.py::TestStreamWrapper::test_dir, test/test_datapipe.py::TestStreamWrapper::test_pickle, test/test_datapipe.py::TestStreamWrapper::test_repr, test/test_datapipe.py::TestIterableDataPipeBasic::test_demux_mux_datapipe, test/test_datapipe.py::TestIterableDataPipeBasic::test_groupby_iterable_datapipe, test/test_datapipe.py::TestIterableDataPipeBasic::test_listdirfiles_iterable_datapipe, test/test_datapipe.py::TestIterableDataPipeBasic::test_listdirfilesdeterministic_iterable_datapipe, test/test_datapipe.py::TestIterableDataPipeBasic::test_map_with_col_file_handle_datapipe, test/test_datapipe.py::TestIterableDataPipeBasic::test_openfilesfromdisk_iterable_datapipe, test/test_datapipe.py::TestIterableDataPipeBasic::test_routeddecoder_iterable_datapipe, test/test_datapipe.py::TestCaptureDataFrame::test_basic_capture, test/test_datapipe.py::TestDataFramesPipes::test_batch, test/test_datapipe.py::TestDataFramesPipes::test_capture, test/test_datapipe.py::TestDataFramesPipes::test_collate, test/test_datapipe.py::TestDataFramesPipes::test_filter, test/test_datapipe.py::TestDataFramesPipes::test_shuffle, test/test_datapipe.py::TestDataFramesPipes::test_unbatch, test/test_datapipe.py::TestFunctionalIterDataPipe::test_batch_iterdatapipe, 
test/test_datapipe.py::TestFunctionalIterDataPipe::test_collate_iterdatapipe, test/test_datapipe.py::TestFunctionalIterDataPipe::test_concat_iterdatapipe, test/test_datapipe.py::TestFunctionalIterDataPipe::test_demux_iterdatapipe, test/test_datapipe.py::TestFunctionalIterDataPipe::test_docstring, test/test_datapipe.py::TestFunctionalIterDataPipe::test_filter_datapipe, test/test_datapipe.py::TestFunctionalIterDataPipe::test_fork_iterdatapipe, test/test_datapipe.py::TestFunctionalIterDataPipe::test_iterable_wrapper_datapipe, test/test_datapipe.py::TestFunctionalIterDataPipe::test_map_dict_with_col_iterdatapipe, test/test_datapipe.py::TestFunctionalIterDataPipe::test_map_iterdatapipe, test/test_datapipe.py::TestFunctionalIterDataPipe::test_map_tuple_list_with_col_iterdatapipe, test/test_datapipe.py::TestFunctionalIterDataPipe::test_mux_iterdatapipe, test/test_datapipe.py::TestFunctionalIterDataPipe::test_sampler_iterdatapipe, test/test_datapipe.py::TestFunctionalIterDataPipe::test_serializable, test/test_datapipe.py::TestFunctionalIterDataPipe::test_serializable_with_dill, test/test_datapipe.py::TestFunctionalIterDataPipe::test_shuffler_iterdatapipe, test/test_datapipe.py::TestFunctionalIterDataPipe::test_stream_reader_iterdatapipe, test/test_datapipe.py::TestFunctionalIterDataPipe::test_unbatch_iterdatapipe, test/test_datapipe.py::TestFunctionalIterDataPipe::test_zip_iterdatapipe, test/test_datapipe.py::TestFunctionalMapDataPipe::test_batch_mapdatapipe, test/test_datapipe.py::TestFunctionalMapDataPipe::test_concat_mapdatapipe, test/test_datapipe.py::TestFunctionalMapDataPipe::test_docstring, test/test_datapipe.py::TestFunctionalMapDataPipe::test_map_mapdatapipe, test/test_datapipe.py::TestFunctionalMapDataPipe::test_sequence_wrapper_datapipe, test/test_datapipe.py::TestFunctionalMapDataPipe::test_serializable, test/test_datapipe.py::TestFunctionalMapDataPipe::test_serializable_with_dill, test/test_datapipe.py::TestFunctionalMapDataPipe::test_shuffler_mapdatapipe, 
test/test_datapipe.py::TestFunctionalMapDataPipe::test_zip_mapdatapipe, test/test_datapipe.py::TestTyping::test_compile_time, test/test_datapipe.py::TestTyping::test_construct_time, test/test_datapipe.py::TestTyping::test_isinstance, test/test_datapipe.py::TestTyping::test_issubinstance, test/test_datapipe.py::TestTyping::test_protocol, test/test_datapipe.py::TestTyping::test_reinforce, test/test_datapipe.py::TestTyping::test_runtime, test/test_datapipe.py::TestTyping::test_subtype, test/test_datapipe.py::TestGraph::test_simple_traverse, test/test_datapipe.py::TestGraph::test_traverse_circular_datapipe, test/test_datapipe.py::TestGraph::test_traverse_forked, test/test_datapipe.py::TestGraph::test_traverse_mapdatapipe, test/test_datapipe.py::TestGraph::test_traverse_mixdatapipe, test/test_datapipe.py::TestGraph::test_traverse_unhashable_datapipe, test/test_datapipe.py::TestSerialization::test_spawn_lambdas_iter, test/test_datapipe.py::TestSerialization::test_spawn_lambdas_map, test/test_datapipe.py::TestCircularSerialization::test_circular_serialization_with_dill, test/test_datapipe.py::TestCircularSerialization::test_circular_serialization_with_pickle, test/test_datapipe.py::TestSharding::test_legacy_custom_sharding, test/test_datapipe.py::TestSharding::test_legacy_custom_sharding_with_old_dataloader, test/test_datapipe.py::TestSharding::test_multi_sharding, test/test_datapipe.py::TestSharding::test_old_dataloader, test/test_datapipe.py::TestSharding::test_sharding_groups, test/test_datapipe.py::TestSharding::test_sharding_length, test/test_datapipe.py::TestSharding::test_simple_sharding, test/test_datapipe.py::TestIterDataPipeSingletonConstraint::test_iterdatapipe_singleton_buggy, test/test_datapipe.py::TestIterDataPipeSingletonConstraint::test_iterdatapipe_singleton_constraint_multiple_outputs, test/test_datapipe.py::TestIterDataPipeSingletonConstraint::test_iterdatapipe_singleton_generator, 
test/test_datapipe.py::TestIterDataPipeSingletonConstraint::test_iterdatapipe_singleton_new_object, test/test_datapipe.py::TestIterDataPipeSingletonConstraint::test_iterdatapipe_singleton_self_next, test/test_datapipe.py::TestIterDataPipeCountSampleYielded::test_iterdatapipe_sample_yielded_generator_function, test/test_datapipe.py::TestIterDataPipeCountSampleYielded::test_iterdatapipe_sample_yielded_generator_function_exception, test/test_datapipe.py::TestIterDataPipeCountSampleYielded::test_iterdatapipe_sample_yielded_next, test/test_datapipe.py::TestIterDataPipeCountSampleYielded::test_iterdatapipe_sample_yielded_next_exception, test/test_datapipe.py::TestIterDataPipeCountSampleYielded::test_iterdatapipe_sample_yielded_return_self, test/test_datapipe.py::TestIterDataPipeGraphFastForward::test_simple_snapshot_custom_non_generator, test/test_datapipe.py::TestIterDataPipeGraphFastForward::test_simple_snapshot_custom_self_next, test/test_datapipe.py::TestIterDataPipeGraphFastForward::test_simple_snapshot_graph, test/test_datapipe.py::TestIterDataPipeGraphFastForward::test_simple_snapshot_graph_repeated, test/test_datapipe.py::TestIterDataPipeGraphFastForward::test_simple_snapshot_graph_with_serialization 2025-10-10T02:16:30.1730832Z 2025-10-10T02:16:34.0943799Z Running lazy/test_generator 1/1 ... [2025-10-10 02:16:34.093765] 2025-10-10T02:16:34.0944231Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:16:34.0945975Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'lazy/test_generator.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:16:34.094215] 2025-10-10T02:16:38.1668844Z 2025-10-10T02:16:38.1670049Z lazy/test_generator 1/1 was successful, full logs can be found in artifacts with path test/test-reports/lazy.test_generator_1.1_3db541494829351f_.log 2025-10-10T02:16:38.1671263Z Running 2 items in this shard: test/lazy/test_generator.py::LazyGeneratorTest::test_generator, test/lazy/test_generator.py::LazyGeneratorTest::test_generator_causes_multiple_compiles 2025-10-10T02:16:38.1671966Z 2025-10-10T02:16:42.0314545Z Running torch_np/numpy_tests/lib/test_type_check 1/1 ... [2025-10-10 02:16:42.030850] 2025-10-10T02:16:42.0315345Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:16:42.0317060Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'torch_np/numpy_tests/lib/test_type_check.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:16:42.031232] 2025-10-10T02:16:46.2044346Z 2025-10-10T02:16:46.2046065Z torch_np/numpy_tests/lib/test_type_check 1/1 was successful, full logs can be found in artifacts with path test/test-reports/torch_np.numpy_tests.lib.test_type_check_1.1_d4097b0680d9fabe_.log 2025-10-10T02:16:46.2069949Z Running 50 items in this shard: test/torch_np/numpy_tests/lib/test_type_check.py::TestCommonType::test_basic, test/torch_np/numpy_tests/lib/test_type_check.py::TestMintypecode::test_default_1, test/torch_np/numpy_tests/lib/test_type_check.py::TestMintypecode::test_default_2, test/torch_np/numpy_tests/lib/test_type_check.py::TestMintypecode::test_default_3, test/torch_np/numpy_tests/lib/test_type_check.py::TestIsscalar::test_basic, test/torch_np/numpy_tests/lib/test_type_check.py::TestReal::test_cmplx, test/torch_np/numpy_tests/lib/test_type_check.py::TestReal::test_real, test/torch_np/numpy_tests/lib/test_type_check.py::TestImag::test_cmplx, 
test/torch_np/numpy_tests/lib/test_type_check.py::TestImag::test_real, test/torch_np/numpy_tests/lib/test_type_check.py::TestIscomplex::test_fail, test/torch_np/numpy_tests/lib/test_type_check.py::TestIscomplex::test_pass, test/torch_np/numpy_tests/lib/test_type_check.py::TestIsreal::test_fail, test/torch_np/numpy_tests/lib/test_type_check.py::TestIsreal::test_isreal_real, test/torch_np/numpy_tests/lib/test_type_check.py::TestIsreal::test_pass, test/torch_np/numpy_tests/lib/test_type_check.py::TestIscomplexobj::test_basic, test/torch_np/numpy_tests/lib/test_type_check.py::TestIscomplexobj::test_list, test/torch_np/numpy_tests/lib/test_type_check.py::TestIscomplexobj::test_scalar, test/torch_np/numpy_tests/lib/test_type_check.py::TestIsrealobj::test_basic, test/torch_np/numpy_tests/lib/test_type_check.py::TestIsnan::test_complex, test/torch_np/numpy_tests/lib/test_type_check.py::TestIsnan::test_complex1, test/torch_np/numpy_tests/lib/test_type_check.py::TestIsnan::test_goodvalues, test/torch_np/numpy_tests/lib/test_type_check.py::TestIsnan::test_ind, test/torch_np/numpy_tests/lib/test_type_check.py::TestIsnan::test_integer, test/torch_np/numpy_tests/lib/test_type_check.py::TestIsnan::test_neginf, test/torch_np/numpy_tests/lib/test_type_check.py::TestIsnan::test_posinf, test/torch_np/numpy_tests/lib/test_type_check.py::TestIsfinite::test_complex, test/torch_np/numpy_tests/lib/test_type_check.py::TestIsfinite::test_complex1, test/torch_np/numpy_tests/lib/test_type_check.py::TestIsfinite::test_goodvalues, test/torch_np/numpy_tests/lib/test_type_check.py::TestIsfinite::test_ind, test/torch_np/numpy_tests/lib/test_type_check.py::TestIsfinite::test_integer, test/torch_np/numpy_tests/lib/test_type_check.py::TestIsfinite::test_neginf, test/torch_np/numpy_tests/lib/test_type_check.py::TestIsfinite::test_posinf, test/torch_np/numpy_tests/lib/test_type_check.py::TestIsinf::test_goodvalues, test/torch_np/numpy_tests/lib/test_type_check.py::TestIsinf::test_ind, 
test/torch_np/numpy_tests/lib/test_type_check.py::TestIsinf::test_neginf, test/torch_np/numpy_tests/lib/test_type_check.py::TestIsinf::test_neginf_scalar, test/torch_np/numpy_tests/lib/test_type_check.py::TestIsinf::test_posinf, test/torch_np/numpy_tests/lib/test_type_check.py::TestIsinf::test_posinf_scalar, test/torch_np/numpy_tests/lib/test_type_check.py::TestIsposinf::test_generic, test/torch_np/numpy_tests/lib/test_type_check.py::TestIsneginf::test_generic, test/torch_np/numpy_tests/lib/test_type_check.py::TestNanToNum::test_array, test/torch_np/numpy_tests/lib/test_type_check.py::TestNanToNum::test_complex_bad, test/torch_np/numpy_tests/lib/test_type_check.py::TestNanToNum::test_complex_bad2, test/torch_np/numpy_tests/lib/test_type_check.py::TestNanToNum::test_complex_good, test/torch_np/numpy_tests/lib/test_type_check.py::TestNanToNum::test_do_not_rewrite_previous_keyword, test/torch_np/numpy_tests/lib/test_type_check.py::TestNanToNum::test_float, test/torch_np/numpy_tests/lib/test_type_check.py::TestNanToNum::test_generic, test/torch_np/numpy_tests/lib/test_type_check.py::TestNanToNum::test_integer, test/torch_np/numpy_tests/lib/test_type_check.py::TestRealIfClose::test_basic, test/torch_np/numpy_tests/lib/test_type_check.py::TestArrayConversion::test_asfarray 2025-10-10T02:16:46.2092628Z 2025-10-10T02:16:50.0762205Z Running lazy/test_debug_util 1/1 ... [2025-10-10 02:16:50.075606] 2025-10-10T02:16:50.0762830Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:16:50.0764356Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'lazy/test_debug_util.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:16:50.076017] 2025-10-10T02:16:54.1484782Z 2025-10-10T02:16:54.1485845Z lazy/test_debug_util 1/1 was successful, full logs can be found in artifacts with path test/test-reports/lazy.test_debug_util_1.1_782c9fff9d33d04b_.log 2025-10-10T02:16:54.1487448Z Running 1 items in this shard: test/lazy/test_debug_util.py::DebugUtilTest::test_get_python_frames 2025-10-10T02:16:54.1488107Z 2025-10-10T02:16:58.0461253Z Running test_jit_llga_fuser 1/1 ... [2025-10-10 02:16:58.045484] 2025-10-10T02:16:58.0461977Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:16:58.0463802Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_jit_llga_fuser.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:16:58.045909] 2025-10-10T02:17:02.4197596Z 2025-10-10T02:17:02.4199054Z test_jit_llga_fuser 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_jit_llga_fuser_1.1_95e48d3d30174322_.log 2025-10-10T02:17:02.4233056Z Running 107 items in this shard: test/test_jit_llga_fuser.py::TestEnableDisableLlgaFuser::test_context_manager, test/test_jit_llga_fuser.py::TestDynamoAOT::test_dynamo_aot_ts_onednn, test/test_jit_llga_fuser.py::TestModel::test_vision_alexnet_bfloat16, test/test_jit_llga_fuser.py::TestModel::test_vision_alexnet_float32, test/test_jit_llga_fuser.py::TestModel::test_vision_densenet121_bfloat16, test/test_jit_llga_fuser.py::TestModel::test_vision_densenet121_float32, test/test_jit_llga_fuser.py::TestModel::test_vision_densenet161_bfloat16, test/test_jit_llga_fuser.py::TestModel::test_vision_densenet161_float32, test/test_jit_llga_fuser.py::TestModel::test_vision_densenet169_bfloat16, test/test_jit_llga_fuser.py::TestModel::test_vision_densenet169_float32, test/test_jit_llga_fuser.py::TestModel::test_vision_densenet201_bfloat16, 
test/test_jit_llga_fuser.py::TestModel::test_vision_densenet201_float32, test/test_jit_llga_fuser.py::TestModel::test_vision_efficientnet_b0_bfloat16, test/test_jit_llga_fuser.py::TestModel::test_vision_efficientnet_b0_float32, test/test_jit_llga_fuser.py::TestModel::test_vision_efficientnet_b1_bfloat16, test/test_jit_llga_fuser.py::TestModel::test_vision_efficientnet_b1_float32, test/test_jit_llga_fuser.py::TestModel::test_vision_efficientnet_b2_bfloat16, test/test_jit_llga_fuser.py::TestModel::test_vision_efficientnet_b2_float32, test/test_jit_llga_fuser.py::TestModel::test_vision_efficientnet_b3_bfloat16, test/test_jit_llga_fuser.py::TestModel::test_vision_efficientnet_b3_float32, test/test_jit_llga_fuser.py::TestModel::test_vision_efficientnet_b4_bfloat16, test/test_jit_llga_fuser.py::TestModel::test_vision_efficientnet_b4_float32, test/test_jit_llga_fuser.py::TestModel::test_vision_efficientnet_b5_bfloat16, test/test_jit_llga_fuser.py::TestModel::test_vision_efficientnet_b5_float32, test/test_jit_llga_fuser.py::TestModel::test_vision_efficientnet_b6_bfloat16, test/test_jit_llga_fuser.py::TestModel::test_vision_efficientnet_b6_float32, test/test_jit_llga_fuser.py::TestModel::test_vision_efficientnet_b7_bfloat16, test/test_jit_llga_fuser.py::TestModel::test_vision_efficientnet_b7_float32, test/test_jit_llga_fuser.py::TestModel::test_vision_googlenet_bfloat16, test/test_jit_llga_fuser.py::TestModel::test_vision_googlenet_float32, test/test_jit_llga_fuser.py::TestModel::test_vision_mnasnet1_0_bfloat16, test/test_jit_llga_fuser.py::TestModel::test_vision_mnasnet1_0_float32, test/test_jit_llga_fuser.py::TestModel::test_vision_mobilenet_v2_bfloat16, test/test_jit_llga_fuser.py::TestModel::test_vision_mobilenet_v2_float32, test/test_jit_llga_fuser.py::TestModel::test_vision_mobilenet_v3_large_bfloat16, test/test_jit_llga_fuser.py::TestModel::test_vision_mobilenet_v3_large_float32, test/test_jit_llga_fuser.py::TestModel::test_vision_regnet_y_400mf_bfloat16, 
test/test_jit_llga_fuser.py::TestModel::test_vision_regnet_y_400mf_float32, test/test_jit_llga_fuser.py::TestModel::test_vision_resnet50_bfloat16, test/test_jit_llga_fuser.py::TestModel::test_vision_resnet50_float32, test/test_jit_llga_fuser.py::TestModel::test_vision_resnext101_32x8d_bfloat16, test/test_jit_llga_fuser.py::TestModel::test_vision_resnext101_32x8d_float32, test/test_jit_llga_fuser.py::TestModel::test_vision_resnext50_32x4d_bfloat16, test/test_jit_llga_fuser.py::TestModel::test_vision_resnext50_32x4d_float32, test/test_jit_llga_fuser.py::TestModel::test_vision_shufflenet_v2_x1_0_bfloat16, test/test_jit_llga_fuser.py::TestModel::test_vision_shufflenet_v2_x1_0_float32, test/test_jit_llga_fuser.py::TestModel::test_vision_squeezenet1_0_bfloat16, test/test_jit_llga_fuser.py::TestModel::test_vision_squeezenet1_0_float32, test/test_jit_llga_fuser.py::TestModel::test_vision_vgg16_bfloat16, test/test_jit_llga_fuser.py::TestModel::test_vision_vgg16_float32, test/test_jit_llga_fuser.py::TestModel::test_vision_wide_resnet50_2_bfloat16, test/test_jit_llga_fuser.py::TestModel::test_vision_wide_resnet50_2_float32, test/test_jit_llga_fuser.py::TestFusionPatternCUDA::test_bn2d_eltwise_cuda_bfloat16, test/test_jit_llga_fuser.py::TestFusionPatternCUDA::test_bn2d_eltwise_cuda_float32, test/test_jit_llga_fuser.py::TestFusionPatternCUDA::test_conv2d_bn_cuda_bfloat16, test/test_jit_llga_fuser.py::TestFusionPatternCUDA::test_conv2d_bn_cuda_float32, test/test_jit_llga_fuser.py::TestFusionPatternCUDA::test_conv2d_bn_relu_cuda_bfloat16, test/test_jit_llga_fuser.py::TestFusionPatternCUDA::test_conv2d_bn_relu_cuda_float32, test/test_jit_llga_fuser.py::TestFusionPatternCUDA::test_conv2d_clamp_cuda_bfloat16, test/test_jit_llga_fuser.py::TestFusionPatternCUDA::test_conv2d_clamp_cuda_float32, test/test_jit_llga_fuser.py::TestFusionPatternCUDA::test_conv2d_eltwise_cuda_bfloat16, test/test_jit_llga_fuser.py::TestFusionPatternCUDA::test_conv2d_eltwise_cuda_float32, 
test/test_jit_llga_fuser.py::TestFusionPatternCUDA::test_conv2d_silu_cuda_bfloat16, test/test_jit_llga_fuser.py::TestFusionPatternCUDA::test_conv2d_silu_cuda_float32, test/test_jit_llga_fuser.py::TestFusionPatternCUDA::test_conv2d_sum_cuda_bfloat16, test/test_jit_llga_fuser.py::TestFusionPatternCUDA::test_conv2d_sum_cuda_float32, test/test_jit_llga_fuser.py::TestFusionPatternCUDA::test_ensure_tensor_is_rewrapped_cuda_bfloat16, test/test_jit_llga_fuser.py::TestFusionPatternCUDA::test_ensure_tensor_is_rewrapped_cuda_float32, test/test_jit_llga_fuser.py::TestFusionPatternCUDA::test_linear_eltwise_cuda_bfloat16, test/test_jit_llga_fuser.py::TestFusionPatternCUDA::test_linear_eltwise_cuda_float32, test/test_jit_llga_fuser.py::TestFusionPatternCUDA::test_rewrap_tensor_input_to_pytorch_cuda_bfloat16, test/test_jit_llga_fuser.py::TestFusionPatternCUDA::test_rewrap_tensor_input_to_pytorch_cuda_float32, test/test_jit_llga_fuser.py::TestFusionPatternCUDA::test_wildcard_cuda_bfloat16, test/test_jit_llga_fuser.py::TestFusionPatternCUDA::test_wildcard_cuda_float32, test/test_jit_llga_fuser.py::TestFusionPatternCUDA::test_wildcard_unsupported_dtype_cuda_int32, test/test_jit_llga_fuser.py::TestOpCUDA::test_add_cuda_bfloat16, test/test_jit_llga_fuser.py::TestOpCUDA::test_add_cuda_float32, test/test_jit_llga_fuser.py::TestOpCUDA::test_add_scalar_cuda_bfloat16, test/test_jit_llga_fuser.py::TestOpCUDA::test_add_scalar_cuda_float32, test/test_jit_llga_fuser.py::TestOpCUDA::test_addmm_cuda_bfloat16, test/test_jit_llga_fuser.py::TestOpCUDA::test_addmm_cuda_float32, test/test_jit_llga_fuser.py::TestOpCUDA::test_avg_pool2d_cuda_bfloat16, test/test_jit_llga_fuser.py::TestOpCUDA::test_avg_pool2d_cuda_float32, test/test_jit_llga_fuser.py::TestOpCUDA::test_bn2d_cuda_bfloat16, test/test_jit_llga_fuser.py::TestOpCUDA::test_bn2d_cuda_float32, test/test_jit_llga_fuser.py::TestOpCUDA::test_cat_cuda_bfloat16, test/test_jit_llga_fuser.py::TestOpCUDA::test_cat_cuda_float32, 
test/test_jit_llga_fuser.py::TestOpCUDA::test_conv2d_cuda_bfloat16, test/test_jit_llga_fuser.py::TestOpCUDA::test_conv2d_cuda_float32, test/test_jit_llga_fuser.py::TestOpCUDA::test_eltwise_cuda_bfloat16, test/test_jit_llga_fuser.py::TestOpCUDA::test_eltwise_cuda_float32, test/test_jit_llga_fuser.py::TestOpCUDA::test_identity_binary_cuda_bfloat16, test/test_jit_llga_fuser.py::TestOpCUDA::test_identity_binary_cuda_float32, test/test_jit_llga_fuser.py::TestOpCUDA::test_layer_norm_cuda_bfloat16, test/test_jit_llga_fuser.py::TestOpCUDA::test_layer_norm_cuda_float32, test/test_jit_llga_fuser.py::TestOpCUDA::test_linear_cuda_bfloat16, test/test_jit_llga_fuser.py::TestOpCUDA::test_linear_cuda_float32, test/test_jit_llga_fuser.py::TestOpCUDA::test_max_pool2d_cuda_bfloat16, test/test_jit_llga_fuser.py::TestOpCUDA::test_max_pool2d_cuda_float32, test/test_jit_llga_fuser.py::TestOpCUDA::test_mul_cuda_bfloat16, test/test_jit_llga_fuser.py::TestOpCUDA::test_mul_cuda_float32, test/test_jit_llga_fuser.py::TestOpCUDA::test_softmax_cuda_bfloat16, test/test_jit_llga_fuser.py::TestOpCUDA::test_softmax_cuda_float32, test/test_jit_llga_fuser.py::TestOpCUDA::test_typecheck_cuda_bfloat16, test/test_jit_llga_fuser.py::TestOpCUDA::test_typecheck_cuda_float32, test/test_jit_llga_fuser.py::TestOpCUDA::test_variable_kernel_avg_pool2d_cuda_bfloat16, test/test_jit_llga_fuser.py::TestOpCUDA::test_variable_kernel_avg_pool2d_cuda_float32 2025-10-10T02:17:02.4264888Z 2025-10-10T02:17:06.3125745Z Running test_numa_binding 1/1 ... [2025-10-10 02:17:06.311991] 2025-10-10T02:17:06.3126259Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:17:06.3127970Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_numa_binding.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:17:06.312388] 2025-10-10T02:17:10.4361538Z 2025-10-10T02:17:10.4362379Z test_numa_binding 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_numa_binding_1.1_ef4381c1447d526b_.log 2025-10-10T02:17:10.4368566Z Running 19 items in this shard: test/test_numa_binding.py::NumaBindingTest::test_binds_to_node_0_if_node_stored_as_minus_one, test/test_numa_binding.py::NumaBindingTest::test_callable_entrypoint_basic, test/test_numa_binding.py::NumaBindingTest::test_core_complex_numa_binding_with_extra_l3, test/test_numa_binding.py::NumaBindingTest::test_core_complex_numa_binding_with_fewer_l3_than_gpu, test/test_numa_binding.py::NumaBindingTest::test_core_complex_prefers_caches_with_more_cpus, test/test_numa_binding.py::NumaBindingTest::test_core_complex_tiebreak_prefers_lower_cache_key, test/test_numa_binding.py::NumaBindingTest::test_default_numa_binding, test/test_numa_binding.py::NumaBindingTest::test_exclusive_numa_binding, test/test_numa_binding.py::NumaBindingTest::test_exclusive_raises_if_too_few_physical_cores, test/test_numa_binding.py::NumaBindingTest::test_explicit_numa_options_overrides_default, test/test_numa_binding.py::NumaBindingTest::test_fallback, test/test_numa_binding.py::NumaBindingTest::test_get_range_str_from_ints, test/test_numa_binding.py::NumaBindingTest::test_get_set_of_int_from_ranges_str, test/test_numa_binding.py::NumaBindingTest::test_no_numa_binding_if_numa_options_not_provided, test/test_numa_binding.py::NumaBindingTest::test_node_numa_binding, test/test_numa_binding.py::NumaBindingTest::test_nproc_must_equal_cuda_device_count_to_use_default_numa_options, test/test_numa_binding.py::NumaBindingTest::test_raises_if_binding_to_empty_set, test/test_numa_binding.py::NumaBindingTest::test_socket_numa_binding_with_multiple_numa_per_socket, test/test_numa_binding.py::NumaBindingTest::test_socket_numa_binding_with_single_numa_per_socket 2025-10-10T02:17:10.4374338Z 2025-10-10T02:17:14.3409870Z 
Running torch_np/numpy_tests/lib/test_histograms 1/1 ... [2025-10-10 02:17:14.340394] 2025-10-10T02:17:14.3410497Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:17:14.3412436Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'torch_np/numpy_tests/lib/test_histograms.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:17:14.340790] 2025-10-10T02:17:19.1152150Z 2025-10-10T02:17:19.1153156Z torch_np/numpy_tests/lib/test_histograms 1/1 was successful, full logs can be found in artifacts with path test/test-reports/torch_np.numpy_tests.lib.test_histograms_1.1_3b65da280a49a1f9_.log 2025-10-10T02:17:19.1175465Z Running 60 items in this shard: test/torch_np/numpy_tests/lib/test_histograms.py::TestHistogram::test_arr_weights_mismatch, test/torch_np/numpy_tests/lib/test_histograms.py::TestHistogram::test_big_arrays, test/torch_np/numpy_tests/lib/test_histograms.py::TestHistogram::test_bin_array_dims, test/torch_np/numpy_tests/lib/test_histograms.py::TestHistogram::test_bin_edge_cases, test/torch_np/numpy_tests/lib/test_histograms.py::TestHistogram::test_bool_conversion, test/torch_np/numpy_tests/lib/test_histograms.py::TestHistogram::test_density, test/torch_np/numpy_tests/lib/test_histograms.py::TestHistogram::test_empty, test/torch_np/numpy_tests/lib/test_histograms.py::TestHistogram::test_error_binnum_type, test/torch_np/numpy_tests/lib/test_histograms.py::TestHistogram::test_exotic_weights, test/torch_np/numpy_tests/lib/test_histograms.py::TestHistogram::test_f32_rounding, test/torch_np/numpy_tests/lib/test_histograms.py::TestHistogram::test_finite_range, test/torch_np/numpy_tests/lib/test_histograms.py::TestHistogram::test_histogram_bin_edges, test/torch_np/numpy_tests/lib/test_histograms.py::TestHistogram::test_invalid_range, 
test/torch_np/numpy_tests/lib/test_histograms.py::TestHistogram::test_last_bin_inclusive_range, test/torch_np/numpy_tests/lib/test_histograms.py::TestHistogram::test_no_side_effects, test/torch_np/numpy_tests/lib/test_histograms.py::TestHistogram::test_object_array_of_0d, test/torch_np/numpy_tests/lib/test_histograms.py::TestHistogram::test_one_bin, test/torch_np/numpy_tests/lib/test_histograms.py::TestHistogram::test_outliers, test/torch_np/numpy_tests/lib/test_histograms.py::TestHistogram::test_precision, test/torch_np/numpy_tests/lib/test_histograms.py::TestHistogram::test_signed_overflow_bounds, test/torch_np/numpy_tests/lib/test_histograms.py::TestHistogram::test_signed_overflow_bounds_2, test/torch_np/numpy_tests/lib/test_histograms.py::TestHistogram::test_simple, test/torch_np/numpy_tests/lib/test_histograms.py::TestHistogram::test_some_nan_values, test/torch_np/numpy_tests/lib/test_histograms.py::TestHistogram::test_type, test/torch_np/numpy_tests/lib/test_histograms.py::TestHistogram::test_unsigned_monotonicity_check, test/torch_np/numpy_tests/lib/test_histograms.py::TestHistogram::test_weights, test/torch_np/numpy_tests/lib/test_histograms.py::TestHistogramOptimBinNums::test_empty, test/torch_np/numpy_tests/lib/test_histograms.py::TestHistogramOptimBinNums::test_incorrect_methods, test/torch_np/numpy_tests/lib/test_histograms.py::TestHistogramOptimBinNums::test_limited_variance, test/torch_np/numpy_tests/lib/test_histograms.py::TestHistogramOptimBinNums::test_novariance, test/torch_np/numpy_tests/lib/test_histograms.py::TestHistogramOptimBinNums::test_outlier, test/torch_np/numpy_tests/lib/test_histograms.py::TestHistogramOptimBinNums::test_scott_vs_stone, test/torch_np/numpy_tests/lib/test_histograms.py::TestHistogramOptimBinNums::test_signed_integer_data_bins_auto, test/torch_np/numpy_tests/lib/test_histograms.py::TestHistogramOptimBinNums::test_signed_integer_data_bins_doane, 
test/torch_np/numpy_tests/lib/test_histograms.py::TestHistogramOptimBinNums::test_signed_integer_data_bins_fd, test/torch_np/numpy_tests/lib/test_histograms.py::TestHistogramOptimBinNums::test_signed_integer_data_bins_rice, test/torch_np/numpy_tests/lib/test_histograms.py::TestHistogramOptimBinNums::test_signed_integer_data_bins_scott, test/torch_np/numpy_tests/lib/test_histograms.py::TestHistogramOptimBinNums::test_signed_integer_data_bins_stone, test/torch_np/numpy_tests/lib/test_histograms.py::TestHistogramOptimBinNums::test_signed_integer_data_bins_sturges, test/torch_np/numpy_tests/lib/test_histograms.py::TestHistogramOptimBinNums::test_simple, test/torch_np/numpy_tests/lib/test_histograms.py::TestHistogramOptimBinNums::test_simple_range, test/torch_np/numpy_tests/lib/test_histograms.py::TestHistogramOptimBinNums::test_simple_weighted, test/torch_np/numpy_tests/lib/test_histograms.py::TestHistogramOptimBinNums::test_small, test/torch_np/numpy_tests/lib/test_histograms.py::TestHistogramdd::test_bins_array, test/torch_np/numpy_tests/lib/test_histograms.py::TestHistogramdd::test_bins_error_2, test/torch_np/numpy_tests/lib/test_histograms.py::TestHistogramdd::test_bins_errors, test/torch_np/numpy_tests/lib/test_histograms.py::TestHistogramdd::test_density_non_uniform_1d, test/torch_np/numpy_tests/lib/test_histograms.py::TestHistogramdd::test_density_non_uniform_2d, test/torch_np/numpy_tests/lib/test_histograms.py::TestHistogramdd::test_edge_dtype, test/torch_np/numpy_tests/lib/test_histograms.py::TestHistogramdd::test_empty, test/torch_np/numpy_tests/lib/test_histograms.py::TestHistogramdd::test_equal_edges, test/torch_np/numpy_tests/lib/test_histograms.py::TestHistogramdd::test_finite_range, test/torch_np/numpy_tests/lib/test_histograms.py::TestHistogramdd::test_identical_samples, test/torch_np/numpy_tests/lib/test_histograms.py::TestHistogramdd::test_inf_edges, test/torch_np/numpy_tests/lib/test_histograms.py::TestHistogramdd::test_large_integers, 
test/torch_np/numpy_tests/lib/test_histograms.py::TestHistogramdd::test_rightmost_binedge, test/torch_np/numpy_tests/lib/test_histograms.py::TestHistogramdd::test_shape_3d, test/torch_np/numpy_tests/lib/test_histograms.py::TestHistogramdd::test_shape_4d, test/torch_np/numpy_tests/lib/test_histograms.py::TestHistogramdd::test_simple, test/torch_np/numpy_tests/lib/test_histograms.py::TestHistogramdd::test_weights 2025-10-10T02:17:19.1195231Z 2025-10-10T02:17:23.0109001Z Running benchmark_utils/test_benchmark_utils 1/1 ... [2025-10-10 02:17:23.010350] 2025-10-10T02:17:23.0109502Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:17:23.0111476Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'benchmark_utils/test_benchmark_utils.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:17:23.010796] 2025-10-10T02:17:29.4612119Z 2025-10-10T02:17:29.4612834Z test_decomp 1/16 was successful, full logs can be found in artifacts with path test/test-reports/test_decomp_1.16_6946e487729ae572_.log 2025-10-10T02:17:29.4775614Z Running 579 items in this shard: test/test_decomp.py::TestDecompCUDA::test_comprehensive___radd___cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rand___cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rand___cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rmod___cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rmul___cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rpow___cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rpow___cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rsub___cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive__unsafe_masked_index_cuda_bfloat16, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_acos_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_acosh_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_add_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addcmul_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addmm_decomposed_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_all_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_all_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_all_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_amax_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_amax_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_amin_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_angle_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_angle_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_any_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argsort_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_copy_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_copy_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_partial_views_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_asin_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atan2_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atanh_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atanh_cuda_uint8, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_atleast_1d_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atleast_3d_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_baddbmm_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_baddbmm_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bincount_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bitwise_and_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bitwise_left_shift_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bitwise_left_shift_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bitwise_or_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bitwise_right_shift_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bitwise_xor_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_block_diag_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bmm_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bool_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_broadcast_tensors_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_broadcast_to_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_broadcast_to_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cartesian_prod_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cartesian_prod_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cat_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cat_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cdouble_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ceil_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_chalf_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_chalf_cuda_float32, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_chunk_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_chunk_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clamp_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clamp_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clamp_max_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clone_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clone_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clone_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_column_stack_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_column_stack_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_combinations_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_combinations_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_conj_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_conj_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_conj_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_conj_physical_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_contiguous_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cosh_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_count_nonzero_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cov_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cross_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cumulative_trapezoid_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_deg2rad_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diag_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diag_cuda_int64, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_diag_embed_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagflat_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagflat_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagonal_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_div_no_rounding_mode_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_div_no_rounding_mode_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_double_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_double_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_dsplit_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_dsplit_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_einsum_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_einsum_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_like_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_like_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_eq_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_equal_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_erfc_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expand_copy_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expand_copy_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expand_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expand_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_eye_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fftn_cuda_int16, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fftshift_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fftshift_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfftn_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ihfft_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ihfft_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_irfft_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_irfft_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_irfftn_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_irfftn_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_rfft2_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fill_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_flip_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_flipud_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_float_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_floor_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fmax_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fmax_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fmod_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_frac_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_frexp_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_full_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gcd_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_geometric_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gt_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_half_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_hash_tensor_cuda_float64, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_i0_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_copy_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_put_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_amax_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_amax_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_mean_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_prod_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_select_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_int_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isinf_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isinf_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_item_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_item_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_4inputs_with_extra_args_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_binary_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_binary_return_by_ref_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_unary_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lcm_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_le_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lerp_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_cholesky_ex_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_cross_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_ldl_factor_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_lu_factor_cuda_float64, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_lu_solve_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_matrix_rank_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_norm_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_norm_subgradients_at_zero_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_norm_subgradients_at_zero_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_pinv_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_slogdet_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_svdvals_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_tensorinv_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_vander_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log10_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log1p_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log_softmax_with_dtype_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logaddexp2_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logaddexp_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logdet_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_or_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_xor_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logit_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logit_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logit_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_long_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lt_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mH_cuda_float64, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_argmax_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_cumsum_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_fill_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_fill_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_median_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_normalize_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_prod_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_scatter_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_softmin_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_std_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_matmul_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_max_binary_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_max_reduction_no_dim_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_max_reduction_with_dim_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_min_reduction_no_dim_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_min_reduction_no_dim_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_min_reduction_with_dim_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mode_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_movedim_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_msort_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mul_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mv_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mvlgamma_mvlgamma_p_1_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nansum_cuda_int16, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_nansum_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_narrow_copy_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_narrow_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_narrow_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_narrow_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_native_batch_norm_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ne_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_neg_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_full_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_full_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_full_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_ones_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_ones_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_ones_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_ones_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_adaptive_avg_pool2d_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_channel_shuffle_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_channel_shuffle_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv_transpose2d_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_cosine_embedding_loss_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_cosine_embedding_loss_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_cosine_embedding_loss_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_cross_entropy_cuda_bfloat16, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_elu_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_feature_alpha_dropout_with_train_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_feature_alpha_dropout_without_train_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_gaussian_nll_loss_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_glu_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_grid_sample_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_hardswish_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_hardswish_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_instance_norm_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_interpolate_area_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_interpolate_area_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_interpolate_bicubic_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_interpolate_bilinear_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_interpolate_linear_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_l1_loss_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_layer_norm_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_local_response_norm_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_margin_ranking_loss_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_pool2d_cuda_float32, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_unpool3d_grad_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_multilabel_margin_loss_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_multilabel_margin_loss_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_normalize_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_circular_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_reflect_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_reflect_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_replicate_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_replicate_negative_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_relu6_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_relu_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softmin_with_dtype_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_tanhshrink_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_threshold_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_triplet_margin_loss_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_triplet_margin_with_distance_loss_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_unfold_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_upsample_bilinear_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nonzero_cuda_complex128, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_norm_fro_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_norm_inf_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_normal_in_place_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ormqr_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_outer_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_outer_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_permute_copy_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_1_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_1_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_2_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_positive_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_pow_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_put_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_put_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rad2deg_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rand_like_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rand_like_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_randint_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_randint_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_real_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_real_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_reciprocal_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_remainder_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_reshape_as_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_reshape_cuda_uint8, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_resolve_conj_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resolve_neg_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resolve_neg_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_roll_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rot90_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_round_decimals_0_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_mean_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_prod_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_prod_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_sum_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_searchsorted_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_select_scatter_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_short_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sigmoid_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sign_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sin_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sin_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sin_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sin_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_slice_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_softmax_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sort_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_j0_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_j1_cuda_int8, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_y0_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_y0_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_y1_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_chebyshev_polynomial_v_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_chebyshev_polynomial_w_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_chebyshev_polynomial_w_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_i1_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_laguerre_polynomial_l_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_log_ndtr_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_log_ndtr_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_modified_bessel_i0_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_modified_bessel_i0_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_modified_bessel_i0_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_modified_bessel_i1_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_modified_bessel_k0_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_modified_bessel_k0_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_modified_bessel_k0_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_scaled_modified_bessel_k0_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_scaled_modified_bessel_k1_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_list_args_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_square_cuda_bool, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_square_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_squeeze_multiple_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_squeeze_multiple_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_stack_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_std_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_stft_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sub_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sum_to_size_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_t_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_take_along_dim_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_take_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_take_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_take_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tanh_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tanh_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tensor_split_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tile_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_to_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_to_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_to_sparse_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_to_sparse_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trapz_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tril_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_true_divide_cuda_float16, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_true_divide_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trunc_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unbind_copy_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unbind_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unflatten_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unflatten_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unflatten_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unfold_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unfold_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unique_consecutive_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unique_consecutive_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unique_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsafe_split_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsqueeze_copy_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsqueeze_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_var_unbiased_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_var_unbiased_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_vdot_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_as_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_as_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_vsplit_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_where_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_where_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_zero__cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_zeros_like_cuda_complex64, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_zeros_like_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick__chunk_cat_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick__unsafe_masked_index_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick__unsafe_masked_index_put_accumulate_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick__upsample_bilinear2d_aa_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_addmm_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_addmm_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_addmv_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_addr_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_addr_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_alias_copy_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_all_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_amax_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_amin_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_arange_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_as_strided_scatter_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_asinh_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_atan_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_atanh_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_baddbmm_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_bitwise_or_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_block_diag_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_cauchy_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_ceil_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_clamp_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_clamp_max_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_clamp_max_cuda_int16, 
test/test_decomp.py::TestDecompCUDA::test_quick_clamp_min_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_conj_physical_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_conj_physical_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_conj_physical_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_copysign_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_diag_embed_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_logaddexp_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_mvlgamma_mvlgamma_p_3_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_native_dropout_backward_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_nn_functional_hardshrink_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_norm_fro_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_unfold_copy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_unsqueeze_copy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_cosh_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_cumsum_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_diag_embed_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_diag_embed_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_diagonal_copy_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_diagonal_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_diagonal_scatter_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_dist_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_dist_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_div_no_rounding_mode_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_div_no_rounding_mode_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_div_trunc_rounding_cuda_float64, 
test/test_decomp.py::TestDecompCUDA::test_quick_div_trunc_rounding_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_dot_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_exp2_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_exp_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_exp_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_expand_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_expm1_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_exponential_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_eye_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_fft2_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_fftn_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_hfft_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_hfftn_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ifft2_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ifft2_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ifft2_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ifft2_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ifft_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ifftn_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ifftn_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ihfft_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfft_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_rfft2_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_rfft_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_flip_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_floor_divide_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_fmax_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_hypot_cuda_bfloat16, 
test/test_decomp.py::TestDecompCUDA::test_quick_index_fill_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_index_select_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_index_select_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_isinf_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_isnan_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_isnan_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_isnan_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_linalg_cross_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_linspace_tensor_overload_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_log1p_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_log2_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_log2_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_log_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_log_softmax_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_logaddexp2_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_logaddexp_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_logspace_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_logspace_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_lt_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_masked_fill_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_maximum_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_meshgrid_variadic_tensors_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_mv_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_mvlgamma_mvlgamma_p_1_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_mvlgamma_mvlgamma_p_5_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_nan_to_num_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_nan_to_num_cuda_int16, 
test/test_decomp.py::TestDecompCUDA::test_quick_nan_to_num_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_nansum_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_nansum_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_narrow_copy_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_native_layer_norm_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_new_full_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_new_ones_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_new_ones_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_new_zeros_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_embedding_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_hardtanh_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_max_unpool2d_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_pad_constant_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_relu_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_norm_fro_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_norm_fro_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_ones_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_ones_like_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_prod_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_rad2deg_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_reciprocal_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_reciprocal_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_round_decimals_0_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_select_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_select_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_select_cuda_uint8, 
test/test_decomp.py::TestDecompCUDA::test_quick_select_scatter_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_select_scatter_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_sgn_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_sign_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_signbit_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_sinh_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_sinh_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_slice_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_slice_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_softmax_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_special_entr_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_special_erfcx_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_special_erfcx_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_special_i1e_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_special_log_ndtr_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_special_xlog1py_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_split_list_args_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_split_with_sizes_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_split_with_sizes_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_split_with_sizes_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_split_with_sizes_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_sqrt_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_copy_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_multiple_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_multiple_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_std_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_t_cuda_int16, 
test/test_decomp.py::TestDecompCUDA::test_quick_tanh_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_trace_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_transpose_copy_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_transpose_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_triu_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_trunc_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_unbind_copy_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_unbind_copy_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_unbind_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_unfold_copy_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_unfold_copy_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_unsqueeze_copy_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_unsqueeze_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_unsqueeze_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_unsqueeze_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_var_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_var_mean_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_view_copy_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_view_copy_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_where_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_where_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_zero__cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_zeros_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_zeros_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_zeros_like_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_zeros_like_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_zeros_like_cuda_float16, 
test/test_decomp.py::TestDecompCUDA::test_rnn_decomp_module_nn_LSTM_train_mode_cuda_float64, test/test_decomp.py::DecompOneOffTestsCUDA::test_sdpa_nn_functional_scaled_dot_product_attention_cuda_bfloat16
2025-10-10T02:17:29.4931560Z
2025-10-10T02:17:33.3716440Z Running torch_np/numpy_tests/core/test_scalarmath 1/1 ... [2025-10-10 02:17:33.371041]
2025-10-10T02:17:33.3716990Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set
2025-10-10T02:17:33.3718082Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'torch_np/numpy_tests/core/test_scalarmath.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:17:33.371417]
2025-10-10T02:18:23.5308069Z
2025-10-10T02:18:23.5309117Z torch_np/numpy_tests/core/test_scalarmath 1/1 was successful, full logs can be found in artifacts with path test/test-reports/torch_np.numpy_tests.core.test_scalarmath_1.1_fc9dd0dc170e7c55_.log
2025-10-10T02:18:23.5391826Z Running 186 items in this shard: test/torch_np/numpy_tests/core/test_scalarmath.py::TestTypes::test_leak, test/torch_np/numpy_tests/core/test_scalarmath.py::TestTypes::test_type_add, test/torch_np/numpy_tests/core/test_scalarmath.py::TestTypes::test_type_create, test/torch_np/numpy_tests/core/test_scalarmath.py::TestTypes::test_types, test/torch_np/numpy_tests/core/test_scalarmath.py::TestBaseMath::test_blocked, test/torch_np/numpy_tests/core/test_scalarmath.py::TestBaseMath::test_lower_align, test/torch_np/numpy_tests/core/test_scalarmath.py::TestPower::test_integers_to_negative_integer_power, test/torch_np/numpy_tests/core/test_scalarmath.py::TestPower::test_large_types, test/torch_np/numpy_tests/core/test_scalarmath.py::TestPower::test_mixed_types, test/torch_np/numpy_tests/core/test_scalarmath.py::TestPower::test_modular_power, test/torch_np/numpy_tests/core/test_scalarmath.py::TestPower::test_small_types, 
test/torch_np/numpy_tests/core/test_scalarmath.py::TestModulus::test_float_modulus_corner_cases_dt_d, test/torch_np/numpy_tests/core/test_scalarmath.py::TestModulus::test_float_modulus_corner_cases_dt_e, test/torch_np/numpy_tests/core/test_scalarmath.py::TestModulus::test_float_modulus_corner_cases_dt_f, test/torch_np/numpy_tests/core/test_scalarmath.py::TestModulus::test_float_modulus_exact, test/torch_np/numpy_tests/core/test_scalarmath.py::TestModulus::test_float_modulus_roundoff, test/torch_np/numpy_tests/core/test_scalarmath.py::TestModulus::test_modulus_basic, test/torch_np/numpy_tests/core/test_scalarmath.py::TestComplexDivision::test_branches, test/torch_np/numpy_tests/core/test_scalarmath.py::TestComplexDivision::test_signed_zeros, test/torch_np/numpy_tests/core/test_scalarmath.py::TestComplexDivision::test_zero_division, test/torch_np/numpy_tests/core/test_scalarmath.py::TestConversion::test_iinfo_long_values_1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestConversion::test_iinfo_long_values_2, test/torch_np/numpy_tests/core/test_scalarmath.py::TestConversion::test_int_from_long, test/torch_np/numpy_tests/core/test_scalarmath.py::TestConversion::test_int_raise_behaviour, test/torch_np/numpy_tests/core/test_scalarmath.py::TestConversion::test_numpy_scalar_relational_operators, test/torch_np/numpy_tests/core/test_scalarmath.py::TestConversion::test_numpy_scalar_relational_operators_2, test/torch_np/numpy_tests/core/test_scalarmath.py::TestConversion::test_scalar_comparison_to_none, test/torch_np/numpy_tests/core/test_scalarmath.py::TestRepr::test_float_repr, test/torch_np/numpy_tests/core/test_scalarmath.py::TestMultiply::test_no_seq_repeat_basic_array_like, test/torch_np/numpy_tests/core/test_scalarmath.py::TestMultiply::test_seq_repeat, test/torch_np/numpy_tests/core/test_scalarmath.py::TestNegative::test_exceptions, test/torch_np/numpy_tests/core/test_scalarmath.py::TestNegative::test_result, 
test/torch_np/numpy_tests/core/test_scalarmath.py::TestSubtract::test_exceptions, test/torch_np/numpy_tests/core/test_scalarmath.py::TestSubtract::test_result, test/torch_np/numpy_tests/core/test_scalarmath.py::TestAbs::test_builtin_abs_dtype0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestAbs::test_builtin_abs_dtype1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestAbs::test_builtin_abs_dtype2, test/torch_np/numpy_tests/core/test_scalarmath.py::TestAbs::test_builtin_abs_dtype3, test/torch_np/numpy_tests/core/test_scalarmath.py::TestAbs::test_builtin_abs_dtype4, test/torch_np/numpy_tests/core/test_scalarmath.py::TestAbs::test_numpy_abs_dtype0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestAbs::test_numpy_abs_dtype1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestAbs::test_numpy_abs_dtype2, test/torch_np/numpy_tests/core/test_scalarmath.py::TestAbs::test_numpy_abs_dtype3, test/torch_np/numpy_tests/core/test_scalarmath.py::TestAbs::test_numpy_abs_dtype4, test/torch_np/numpy_tests/core/test_scalarmath.py::TestBitShifts::test_shift_all_bits_type_code_B_op0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestBitShifts::test_shift_all_bits_type_code_B_op1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestBitShifts::test_shift_all_bits_type_code_b_op0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestBitShifts::test_shift_all_bits_type_code_b_op1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestBitShifts::test_shift_all_bits_type_code_h_op0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestBitShifts::test_shift_all_bits_type_code_h_op1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestBitShifts::test_shift_all_bits_type_code_i_op0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestBitShifts::test_shift_all_bits_type_code_i_op1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestBitShifts::test_shift_all_bits_type_code_l_op0, 
test/torch_np/numpy_tests/core/test_scalarmath.py::TestBitShifts::test_shift_all_bits_type_code_l_op1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestHash::test_complex_hashes_type_code_D, test/torch_np/numpy_tests/core/test_scalarmath.py::TestHash::test_complex_hashes_type_code_F, test/torch_np/numpy_tests/core/test_scalarmath.py::TestHash::test_float_and_complex_hashes_type_code_D, test/torch_np/numpy_tests/core/test_scalarmath.py::TestHash::test_float_and_complex_hashes_type_code_F, test/torch_np/numpy_tests/core/test_scalarmath.py::TestHash::test_float_and_complex_hashes_type_code_d, test/torch_np/numpy_tests/core/test_scalarmath.py::TestHash::test_float_and_complex_hashes_type_code_e, test/torch_np/numpy_tests/core/test_scalarmath.py::TestHash::test_float_and_complex_hashes_type_code_f, test/torch_np/numpy_tests/core/test_scalarmath.py::TestHash::test_integer_hashes_type_code_B, test/torch_np/numpy_tests/core/test_scalarmath.py::TestHash::test_integer_hashes_type_code_b, test/torch_np/numpy_tests/core/test_scalarmath.py::TestHash::test_integer_hashes_type_code_h, test/torch_np/numpy_tests/core/test_scalarmath.py::TestHash::test_integer_hashes_type_code_i, test/torch_np/numpy_tests/core/test_scalarmath.py::TestHash::test_integer_hashes_type_code_l, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_integer_operation_divbyzero_dtype_B_operation0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_integer_operation_divbyzero_dtype_B_operation1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_integer_operation_divbyzero_dtype_b_operation0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_integer_operation_divbyzero_dtype_b_operation1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_integer_operation_divbyzero_dtype_h_operation0, 
test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_integer_operation_divbyzero_dtype_h_operation1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_integer_operation_divbyzero_dtype_i_operation0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_integer_operation_divbyzero_dtype_i_operation1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_integer_operation_divbyzero_dtype_l_operation0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_integer_operation_divbyzero_dtype_l_operation1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_integer_operation_overflow_dtype_B_operation0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_integer_operation_overflow_dtype_B_operation1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_integer_operation_overflow_dtype_B_operation2, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_integer_operation_overflow_dtype_b_operation0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_integer_operation_overflow_dtype_b_operation1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_integer_operation_overflow_dtype_b_operation2, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_integer_operation_overflow_dtype_h_operation0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_integer_operation_overflow_dtype_h_operation1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_integer_operation_overflow_dtype_h_operation2, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_integer_operation_overflow_dtype_i_operation0, 
test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_integer_operation_overflow_dtype_i_operation1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_integer_operation_overflow_dtype_i_operation2, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_integer_operation_overflow_dtype_l_operation0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_integer_operation_overflow_dtype_l_operation1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_integer_operation_overflow_dtype_l_operation2, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_signed_integer_overflow_dtype_b_operation0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_signed_integer_overflow_dtype_b_operation1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_signed_integer_overflow_dtype_b_operation2, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_signed_integer_overflow_dtype_b_operation3, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_signed_integer_overflow_dtype_h_operation0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_signed_integer_overflow_dtype_h_operation1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_signed_integer_overflow_dtype_h_operation2, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_signed_integer_overflow_dtype_h_operation3, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_signed_integer_overflow_dtype_i_operation0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_signed_integer_overflow_dtype_i_operation1, 
test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_signed_integer_overflow_dtype_i_operation2, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_signed_integer_overflow_dtype_i_operation3, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_signed_integer_overflow_dtype_l_operation0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_signed_integer_overflow_dtype_l_operation1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_signed_integer_overflow_dtype_l_operation2, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_signed_integer_overflow_dtype_l_operation3, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_unsigned_integer_overflow_dtype_B, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____add_____rop_____radd___op8_cmp_False_subtype0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____add_____rop_____radd___op8_cmp_False_subtype1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____add_____rop_____radd___op8_cmp_False_subtype2, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____add_____rop_____radd___op8_cmp_False_subtype3, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____eq_____rop_____eq___op2_cmp_True_subtype0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____eq_____rop_____eq___op2_cmp_True_subtype1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____eq_____rop_____eq___op2_cmp_True_subtype2, 
test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____eq_____rop_____eq___op2_cmp_True_subtype3, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____floordiv_____rop_____rfloordiv___op6_cmp_False_subtype0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____floordiv_____rop_____rfloordiv___op6_cmp_False_subtype1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____floordiv_____rop_____rfloordiv___op6_cmp_False_subtype2, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____floordiv_____rop_____rfloordiv___op6_cmp_False_subtype3, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____ge_____rop_____le___op5_cmp_True_subtype0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____ge_____rop_____le___op5_cmp_True_subtype1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____ge_____rop_____le___op5_cmp_True_subtype2, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____ge_____rop_____le___op5_cmp_True_subtype3, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____gt_____rop_____lt___op4_cmp_True_subtype0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____gt_____rop_____lt___op4_cmp_True_subtype1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____gt_____rop_____lt___op4_cmp_True_subtype2, 
test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____gt_____rop_____lt___op4_cmp_True_subtype3, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____le_____rop_____ge___op1_cmp_True_subtype0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____le_____rop_____ge___op1_cmp_True_subtype1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____le_____rop_____ge___op1_cmp_True_subtype2, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____le_____rop_____ge___op1_cmp_True_subtype3, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____lt_____rop_____gt___op0_cmp_True_subtype0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____lt_____rop_____gt___op0_cmp_True_subtype1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____lt_____rop_____gt___op0_cmp_True_subtype2, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____lt_____rop_____gt___op0_cmp_True_subtype3, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____mod_____rop_____rmod___op9_cmp_False_subtype0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____mod_____rop_____rmod___op9_cmp_False_subtype1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____mod_____rop_____rmod___op9_cmp_False_subtype2, 
test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____mod_____rop_____rmod___op9_cmp_False_subtype3, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____mul_____rop_____rmul___op10_cmp_False_subtype0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____mul_____rop_____rmul___op10_cmp_False_subtype1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____mul_____rop_____rmul___op10_cmp_False_subtype2, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____mul_____rop_____rmul___op10_cmp_False_subtype3, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____ne_____rop_____ne___op3_cmp_True_subtype0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____ne_____rop_____ne___op3_cmp_True_subtype1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____ne_____rop_____ne___op3_cmp_True_subtype2, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____ne_____rop_____ne___op3_cmp_True_subtype3, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____pow_____rop_____rpow___op11_cmp_False_subtype0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____pow_____rop_____rpow___op11_cmp_False_subtype1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____pow_____rop_____rpow___op11_cmp_False_subtype2, 
test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____pow_____rop_____rpow___op11_cmp_False_subtype3, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____sub_____rop_____rsub___op12_cmp_False_subtype0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____sub_____rop_____rsub___op12_cmp_False_subtype1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____sub_____rop_____rsub___op12_cmp_False_subtype2, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____sub_____rop_____rsub___op12_cmp_False_subtype3, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____truediv_____rop_____rtruediv___op7_cmp_False_subtype0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____truediv_____rop_____rtruediv___op7_cmp_False_subtype1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____truediv_____rop_____rtruediv___op7_cmp_False_subtype2, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____truediv_____rop_____rtruediv___op7_cmp_False_subtype3, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_subclass_deferral___op_____add_____rop_____radd___op8_cmp_False_sctype0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_subclass_deferral___op_____add_____rop_____radd___op8_cmp_False_sctype1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_subclass_deferral___op_____eq_____rop_____eq___op2_cmp_True_sctype0, 
test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_subclass_deferral___op_____eq_____rop_____eq___op2_cmp_True_sctype1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_subclass_deferral___op_____floordiv_____rop_____rfloordiv___op6_cmp_False_sctype0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_subclass_deferral___op_____floordiv_____rop_____rfloordiv___op6_cmp_False_sctype1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_subclass_deferral___op_____ge_____rop_____le___op5_cmp_True_sctype0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_subclass_deferral___op_____ge_____rop_____le___op5_cmp_True_sctype1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_subclass_deferral___op_____gt_____rop_____lt___op4_cmp_True_sctype0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_subclass_deferral___op_____gt_____rop_____lt___op4_cmp_True_sctype1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_subclass_deferral___op_____le_____rop_____ge___op1_cmp_True_sctype0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_subclass_deferral___op_____le_____rop_____ge___op1_cmp_True_sctype1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_subclass_deferral___op_____lt_____rop_____gt___op0_cmp_True_sctype0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_subclass_deferral___op_____lt_____rop_____gt___op0_cmp_True_sctype1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_subclass_deferral___op_____mod_____rop_____rmod___op9_cmp_False_sctype0, 
test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_subclass_deferral___op_____mod_____rop_____rmod___op9_cmp_False_sctype1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_subclass_deferral___op_____mul_____rop_____rmul___op10_cmp_False_sctype0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_subclass_deferral___op_____mul_____rop_____rmul___op10_cmp_False_sctype1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_subclass_deferral___op_____ne_____rop_____ne___op3_cmp_True_sctype0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_subclass_deferral___op_____ne_____rop_____ne___op3_cmp_True_sctype1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_subclass_deferral___op_____pow_____rop_____rpow___op11_cmp_False_sctype0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_subclass_deferral___op_____pow_____rop_____rpow___op11_cmp_False_sctype1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_subclass_deferral___op_____sub_____rop_____rsub___op12_cmp_False_sctype0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_subclass_deferral___op_____sub_____rop_____rsub___op12_cmp_False_sctype1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_subclass_deferral___op_____truediv_____rop_____rtruediv___op7_cmp_False_sctype0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_subclass_deferral___op_____truediv_____rop_____rtruediv___op7_cmp_False_sctype1 2025-10-10T02:18:23.5472691Z 2025-10-10T02:18:27.3061294Z Running test_indexing 1/1 ... 
[2025-10-10 02:18:27.305646] 2025-10-10T02:18:27.3061886Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:18:27.3064087Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_indexing.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:18:27.306034] 2025-10-10T02:18:31.6299122Z 2025-10-10T02:18:31.6300301Z test_indexing 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_indexing_1.1_1aabc744c25f1d2e_.log 2025-10-10T02:18:31.6379868Z Running 185 items in this shard: test/test_indexing.py::TestIndexingCUDA::test_advancedindex_big_cuda, test/test_indexing.py::TestIndexingCUDA::test_advancedindex_cuda_float16, test/test_indexing.py::TestIndexingCUDA::test_advancedindex_cuda_float64, test/test_indexing.py::TestIndexingCUDA::test_basic_advanced_combined_cuda, test/test_indexing.py::TestIndexingCUDA::test_bool_indices_accumulate_cuda, test/test_indexing.py::TestIndexingCUDA::test_bool_indices_cuda, test/test_indexing.py::TestIndexingCUDA::test_bool_mask_assignment_cuda, test/test_indexing.py::TestIndexingCUDA::test_byte_mask2d_cuda, test/test_indexing.py::TestIndexingCUDA::test_byte_mask_accumulate_cuda, test/test_indexing.py::TestIndexingCUDA::test_byte_mask_cuda, test/test_indexing.py::TestIndexingCUDA::test_byte_tensor_assignment_cuda, test/test_indexing.py::TestIndexingCUDA::test_cpu_indices_cuda, test/test_indexing.py::TestIndexingCUDA::test_cuda_broadcast_index_use_deterministic_algorithms_cuda, test/test_indexing.py::TestIndexingCUDA::test_ellipsis_tensor_cuda, test/test_indexing.py::TestIndexingCUDA::test_empty_index_cuda, test/test_indexing.py::TestIndexingCUDA::test_empty_ndim_index_bool_cuda, test/test_indexing.py::TestIndexingCUDA::test_empty_ndim_index_cuda, test/test_indexing.py::TestIndexingCUDA::test_empty_slice_cuda, 
test/test_indexing.py::TestIndexingCUDA::test_errors_index_copy_cuda, test/test_indexing.py::TestIndexingCUDA::test_gather_take_along_dim_cross_device_cuda_float32, test/test_indexing.py::TestIndexingCUDA::test_getitem_scalars_cuda, test/test_indexing.py::TestIndexingCUDA::test_index_add_deterministic_cuda, test/test_indexing.py::TestIndexingCUDA::test_index_copy_cuda_bfloat16, test/test_indexing.py::TestIndexingCUDA::test_index_copy_cuda_bool, test/test_indexing.py::TestIndexingCUDA::test_index_copy_cuda_complex128, test/test_indexing.py::TestIndexingCUDA::test_index_copy_cuda_complex64, test/test_indexing.py::TestIndexingCUDA::test_index_copy_cuda_float16, test/test_indexing.py::TestIndexingCUDA::test_index_copy_cuda_float32, test/test_indexing.py::TestIndexingCUDA::test_index_copy_cuda_float64, test/test_indexing.py::TestIndexingCUDA::test_index_copy_cuda_int16, test/test_indexing.py::TestIndexingCUDA::test_index_copy_cuda_int32, test/test_indexing.py::TestIndexingCUDA::test_index_copy_cuda_int64, test/test_indexing.py::TestIndexingCUDA::test_index_copy_cuda_int8, test/test_indexing.py::TestIndexingCUDA::test_index_copy_cuda_uint8, test/test_indexing.py::TestIndexingCUDA::test_index_copy_deterministic_cuda, test/test_indexing.py::TestIndexingCUDA::test_index_copy_scalars_cuda_bfloat16, test/test_indexing.py::TestIndexingCUDA::test_index_copy_scalars_cuda_bool, test/test_indexing.py::TestIndexingCUDA::test_index_copy_scalars_cuda_complex128, test/test_indexing.py::TestIndexingCUDA::test_index_copy_scalars_cuda_complex64, test/test_indexing.py::TestIndexingCUDA::test_index_copy_scalars_cuda_float16, test/test_indexing.py::TestIndexingCUDA::test_index_copy_scalars_cuda_float32, test/test_indexing.py::TestIndexingCUDA::test_index_copy_scalars_cuda_float64, test/test_indexing.py::TestIndexingCUDA::test_index_copy_scalars_cuda_int16, test/test_indexing.py::TestIndexingCUDA::test_index_copy_scalars_cuda_int32, 
test/test_indexing.py::TestIndexingCUDA::test_index_copy_scalars_cuda_int64, test/test_indexing.py::TestIndexingCUDA::test_index_copy_scalars_cuda_int8, test/test_indexing.py::TestIndexingCUDA::test_index_copy_scalars_cuda_uint8, test/test_indexing.py::TestIndexingCUDA::test_index_cuda, test/test_indexing.py::TestIndexingCUDA::test_index_fill_cuda_bfloat16, test/test_indexing.py::TestIndexingCUDA::test_index_fill_cuda_bool, test/test_indexing.py::TestIndexingCUDA::test_index_fill_cuda_complex128, test/test_indexing.py::TestIndexingCUDA::test_index_fill_cuda_complex64, test/test_indexing.py::TestIndexingCUDA::test_index_fill_cuda_float16, test/test_indexing.py::TestIndexingCUDA::test_index_fill_cuda_float32, test/test_indexing.py::TestIndexingCUDA::test_index_fill_cuda_float64, test/test_indexing.py::TestIndexingCUDA::test_index_fill_cuda_int16, test/test_indexing.py::TestIndexingCUDA::test_index_fill_cuda_int32, test/test_indexing.py::TestIndexingCUDA::test_index_fill_cuda_int64, test/test_indexing.py::TestIndexingCUDA::test_index_fill_cuda_int8, test/test_indexing.py::TestIndexingCUDA::test_index_fill_cuda_uint8, test/test_indexing.py::TestIndexingCUDA::test_index_getitem_copy_bools_slices_cuda, test/test_indexing.py::TestIndexingCUDA::test_index_ind_dtype_cuda, test/test_indexing.py::TestIndexingCUDA::test_index_limits_cuda, test/test_indexing.py::TestIndexingCUDA::test_index_put_accumulate_duplicate_indices_cuda, test/test_indexing.py::TestIndexingCUDA::test_index_put_accumulate_empty_cuda, test/test_indexing.py::TestIndexingCUDA::test_index_put_accumulate_expanded_values_cuda, test/test_indexing.py::TestIndexingCUDA::test_index_put_accumulate_non_contiguous_cuda, test/test_indexing.py::TestIndexingCUDA::test_index_put_deterministic_with_optional_tensors_cuda, test/test_indexing.py::TestIndexingCUDA::test_index_put_large_indices_cuda, test/test_indexing.py::TestIndexingCUDA::test_index_put_non_accumulate_deterministic_cuda, 
test/test_indexing.py::TestIndexingCUDA::test_index_put_src_datatype_cuda_bfloat16, test/test_indexing.py::TestIndexingCUDA::test_index_put_src_datatype_cuda_bool, test/test_indexing.py::TestIndexingCUDA::test_index_put_src_datatype_cuda_complex128, test/test_indexing.py::TestIndexingCUDA::test_index_put_src_datatype_cuda_complex64, test/test_indexing.py::TestIndexingCUDA::test_index_put_src_datatype_cuda_float16, test/test_indexing.py::TestIndexingCUDA::test_index_put_src_datatype_cuda_float8_e4m3fn, test/test_indexing.py::TestIndexingCUDA::test_index_put_src_datatype_cuda_float8_e5m2, test/test_indexing.py::TestIndexingCUDA::test_index_put_src_datatype_cuda_int64, test/test_indexing.py::TestIndexingCUDA::test_index_reduce_reduce_amax_cuda_bfloat16, test/test_indexing.py::TestIndexingCUDA::test_index_reduce_reduce_amax_cuda_float16, test/test_indexing.py::TestIndexingCUDA::test_index_reduce_reduce_amax_cuda_float32, test/test_indexing.py::TestIndexingCUDA::test_index_reduce_reduce_amax_cuda_float64, test/test_indexing.py::TestIndexingCUDA::test_index_reduce_reduce_amax_cuda_int16, test/test_indexing.py::TestIndexingCUDA::test_index_reduce_reduce_amax_cuda_int32, test/test_indexing.py::TestIndexingCUDA::test_index_reduce_reduce_amax_cuda_int64, test/test_indexing.py::TestIndexingCUDA::test_index_reduce_reduce_amax_cuda_int8, test/test_indexing.py::TestIndexingCUDA::test_index_reduce_reduce_amax_cuda_uint8, test/test_indexing.py::TestIndexingCUDA::test_index_reduce_reduce_amin_cuda_bfloat16, test/test_indexing.py::TestIndexingCUDA::test_index_reduce_reduce_amin_cuda_float16, test/test_indexing.py::TestIndexingCUDA::test_index_reduce_reduce_amin_cuda_float32, test/test_indexing.py::TestIndexingCUDA::test_index_reduce_reduce_amin_cuda_float64, test/test_indexing.py::TestIndexingCUDA::test_index_reduce_reduce_amin_cuda_int16, test/test_indexing.py::TestIndexingCUDA::test_index_reduce_reduce_amin_cuda_int32, 
test/test_indexing.py::TestIndexingCUDA::test_index_reduce_reduce_amin_cuda_int64, test/test_indexing.py::TestIndexingCUDA::test_index_reduce_reduce_amin_cuda_int8, test/test_indexing.py::TestIndexingCUDA::test_index_reduce_reduce_amin_cuda_uint8, test/test_indexing.py::TestIndexingCUDA::test_index_reduce_reduce_mean_cuda_bfloat16, test/test_indexing.py::TestIndexingCUDA::test_index_reduce_reduce_mean_cuda_float16, test/test_indexing.py::TestIndexingCUDA::test_index_reduce_reduce_mean_cuda_float32, test/test_indexing.py::TestIndexingCUDA::test_index_reduce_reduce_mean_cuda_float64, test/test_indexing.py::TestIndexingCUDA::test_index_reduce_reduce_mean_cuda_int16, test/test_indexing.py::TestIndexingCUDA::test_index_reduce_reduce_mean_cuda_int32, test/test_indexing.py::TestIndexingCUDA::test_index_reduce_reduce_mean_cuda_int64, test/test_indexing.py::TestIndexingCUDA::test_index_reduce_reduce_mean_cuda_int8, test/test_indexing.py::TestIndexingCUDA::test_index_reduce_reduce_mean_cuda_uint8, test/test_indexing.py::TestIndexingCUDA::test_index_reduce_reduce_prod_cuda_bfloat16, test/test_indexing.py::TestIndexingCUDA::test_index_reduce_reduce_prod_cuda_float16, test/test_indexing.py::TestIndexingCUDA::test_index_reduce_reduce_prod_cuda_float32, test/test_indexing.py::TestIndexingCUDA::test_index_reduce_reduce_prod_cuda_float64, test/test_indexing.py::TestIndexingCUDA::test_index_reduce_reduce_prod_cuda_int16, test/test_indexing.py::TestIndexingCUDA::test_index_reduce_reduce_prod_cuda_int32, test/test_indexing.py::TestIndexingCUDA::test_index_reduce_reduce_prod_cuda_int64, test/test_indexing.py::TestIndexingCUDA::test_index_reduce_reduce_prod_cuda_int8, test/test_indexing.py::TestIndexingCUDA::test_index_reduce_reduce_prod_cuda_uint8, test/test_indexing.py::TestIndexingCUDA::test_index_scalar_with_bool_mask_cuda, test/test_indexing.py::TestIndexingCUDA::test_index_select_cuda_bfloat16, test/test_indexing.py::TestIndexingCUDA::test_index_select_cuda_bool, 
test/test_indexing.py::TestIndexingCUDA::test_index_select_cuda_complex128, test/test_indexing.py::TestIndexingCUDA::test_index_select_cuda_complex64, test/test_indexing.py::TestIndexingCUDA::test_index_select_cuda_float16, test/test_indexing.py::TestIndexingCUDA::test_index_select_cuda_float32, test/test_indexing.py::TestIndexingCUDA::test_index_select_cuda_float64, test/test_indexing.py::TestIndexingCUDA::test_index_select_cuda_float8_e4m3fn, test/test_indexing.py::TestIndexingCUDA::test_index_select_cuda_float8_e4m3fnuz, test/test_indexing.py::TestIndexingCUDA::test_index_select_cuda_float8_e5m2, test/test_indexing.py::TestIndexingCUDA::test_index_select_cuda_float8_e5m2fnuz, test/test_indexing.py::TestIndexingCUDA::test_index_select_cuda_int16, test/test_indexing.py::TestIndexingCUDA::test_index_select_cuda_int32, test/test_indexing.py::TestIndexingCUDA::test_index_select_cuda_int64, test/test_indexing.py::TestIndexingCUDA::test_index_select_cuda_int8, test/test_indexing.py::TestIndexingCUDA::test_index_select_cuda_uint8, test/test_indexing.py::TestIndexingCUDA::test_index_setitem_bools_slices_cuda, test/test_indexing.py::TestIndexingCUDA::test_index_src_datatype_cuda_bfloat16, test/test_indexing.py::TestIndexingCUDA::test_index_src_datatype_cuda_bool, test/test_indexing.py::TestIndexingCUDA::test_index_src_datatype_cuda_float16, test/test_indexing.py::TestIndexingCUDA::test_index_src_datatype_cuda_int64, test/test_indexing.py::TestIndexingCUDA::test_int_assignment_cuda, test/test_indexing.py::TestIndexingCUDA::test_int_indices2d_cuda, test/test_indexing.py::TestIndexingCUDA::test_int_indices_broadcast_cuda, test/test_indexing.py::TestIndexingCUDA::test_int_indices_cuda, test/test_indexing.py::TestIndexingCUDA::test_invalid_device_cuda, test/test_indexing.py::TestIndexingCUDA::test_invalid_index_cuda, test/test_indexing.py::TestIndexingCUDA::test_jit_indexing_cuda, test/test_indexing.py::TestIndexingCUDA::test_list_indices_cuda, 
test/test_indexing.py::TestIndexingCUDA::test_multi_dimensional_bool_mask_assignment_cuda, test/test_indexing.py::TestIndexingCUDA::test_multi_dimensional_bool_mask_cuda, test/test_indexing.py::TestIndexingCUDA::test_multiple_bool_indices_cuda, test/test_indexing.py::TestIndexingCUDA::test_multiple_byte_mask_cuda, test/test_indexing.py::TestIndexingCUDA::test_multiple_int_cuda, test/test_indexing.py::TestIndexingCUDA::test_none_cuda, test/test_indexing.py::TestIndexingCUDA::test_out_of_bound_index_cuda, test/test_indexing.py::TestIndexingCUDA::test_set_item_to_scalar_tensor_cuda, test/test_indexing.py::TestIndexingCUDA::test_setitem_expansion_error_cuda, test/test_indexing.py::TestIndexingCUDA::test_setitem_scalars_cuda, test/test_indexing.py::TestIndexingCUDA::test_single_int_cuda, test/test_indexing.py::TestIndexingCUDA::test_step_assignment_cuda, test/test_indexing.py::TestIndexingCUDA::test_step_cuda, test/test_indexing.py::TestIndexingCUDA::test_take_along_dim_cuda_float32, test/test_indexing.py::TestIndexingCUDA::test_take_along_dim_cuda_int64, test/test_indexing.py::TestIndexingCUDA::test_take_along_dim_invalid_cuda_float32, test/test_indexing.py::TestIndexingCUDA::test_take_along_dim_invalid_cuda_int64, test/test_indexing.py::TestIndexingCUDA::test_unravel_index_errors_cuda, test/test_indexing.py::TestIndexingCUDA::test_variable_slicing_cuda, test/test_indexing.py::TestIndexingCUDA::test_zero_dim_index_cuda, test/test_indexing.py::NumpyTestsCUDA::test_boolean_assignment_value_mismatch_cuda, test/test_indexing.py::NumpyTestsCUDA::test_boolean_indexing_alldims_cuda, test/test_indexing.py::NumpyTestsCUDA::test_boolean_indexing_onedim_cuda, test/test_indexing.py::NumpyTestsCUDA::test_boolean_indexing_twodim_cuda, test/test_indexing.py::NumpyTestsCUDA::test_boolean_indexing_weirdness_cuda, test/test_indexing.py::NumpyTestsCUDA::test_boolean_indexing_weirdness_tensors_cuda, test/test_indexing.py::NumpyTestsCUDA::test_boolean_list_indexing_cuda, 
test/test_indexing.py::NumpyTestsCUDA::test_boolean_shape_mismatch_cuda, test/test_indexing.py::NumpyTestsCUDA::test_broadcast_subspace_cuda, test/test_indexing.py::NumpyTestsCUDA::test_broaderrors_indexing_cuda, test/test_indexing.py::NumpyTestsCUDA::test_ellipsis_index_cuda, test/test_indexing.py::NumpyTestsCUDA::test_empty_fancy_index_cuda, test/test_indexing.py::NumpyTestsCUDA::test_empty_tuple_index_cuda, test/test_indexing.py::NumpyTestsCUDA::test_everything_returns_views_cuda, test/test_indexing.py::NumpyTestsCUDA::test_index_is_larger_cuda, test/test_indexing.py::NumpyTestsCUDA::test_index_no_floats_cuda, test/test_indexing.py::NumpyTestsCUDA::test_none_index_cuda, test/test_indexing.py::NumpyTestsCUDA::test_single_bool_index_cuda, test/test_indexing.py::NumpyTestsCUDA::test_single_int_index_cuda, test/test_indexing.py::NumpyTestsCUDA::test_trivial_fancy_out_of_bounds_cuda, test/test_indexing.py::NumpyTestsCUDA::test_truncate_leading_1s_cuda 2025-10-10T02:18:31.6458854Z 2025-10-10T02:18:35.5545893Z Running profiler/test_torch_tidy 1/1 ... [2025-10-10 02:18:35.553996] 2025-10-10T02:18:35.5546513Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:18:35.5548797Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'profiler/test_torch_tidy.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:18:35.554419] 2025-10-10T02:18:39.5768735Z 2025-10-10T02:18:39.5769757Z profiler/test_torch_tidy 1/1 was successful, full logs can be found in artifacts with path test/test-reports/profiler.test_torch_tidy_1.1_b59fbc7b8680e5c8_.log 2025-10-10T02:18:39.5777146Z Running 22 items in this shard: test/profiler/test_torch_tidy.py::TestTorchTidyProfiler::test_allocation_id_uniqueness, test/profiler/test_torch_tidy.py::TestTorchTidyProfiler::test_allocation_ids, test/profiler/test_torch_tidy.py::TestTorchTidyProfiler::test_allocation_ids_with_other_ops, test/profiler/test_torch_tidy.py::TestTorchTidyProfiler::test_allocations, test/profiler/test_torch_tidy.py::TestTorchTidyProfiler::test_extra_fields, test/profiler/test_torch_tidy.py::TestTorchTidyProfiler::test_impl_reuse, test/profiler/test_torch_tidy.py::TestTorchTidyProfiler::test_mkldnn_tensors, test/profiler/test_torch_tidy.py::TestTorchTidyProfiler::test_module_and_optimizer_ids, test/profiler/test_torch_tidy.py::TestTorchTidyProfiler::test_nnmodule_params, test/profiler/test_torch_tidy.py::TestTorchTidyProfiler::test_optimizer, test/profiler/test_torch_tidy.py::TestTorchTidyProfiler::test_optimizer_parameters_adam, test/profiler/test_torch_tidy.py::TestTorchTidyProfiler::test_optimizer_parameters_sgd, test/profiler/test_torch_tidy.py::TestTorchTidyProfiler::test_pointers_and_ids, test/profiler/test_torch_tidy.py::TestTorchTidyProfiler::test_refcounts, test/profiler/test_torch_tidy.py::TestTorchTidyProfiler::test_scalar_ins, test/profiler/test_torch_tidy.py::TestTorchTidyProfiler::test_sparse_tensors, test/profiler/test_torch_tidy.py::TestTorchTidyProfiler::test_tensor_lists, test/profiler/test_torch_tidy.py::TestTorchTidyProfiler::test_tensor_properties, test/profiler/test_torch_tidy.py::TestTorchTidyProfiler::test_tensorimpl_invalidation_full, test/profiler/test_torch_tidy.py::TestTorchTidyProfiler::test_tensorimpl_invalidation_keep_alive, 
test/profiler/test_torch_tidy.py::TestTorchTidyProfiler::test_tensorimpl_invalidation_scalar_args, test/profiler/test_torch_tidy.py::TestTorchTidyProfiler::test_tensorimpl_invalidation_set 2025-10-10T02:18:39.5784000Z 2025-10-10T02:18:43.4389341Z Running nn/test_module_hooks 1/1 ... [2025-10-10 02:18:43.438270] 2025-10-10T02:18:43.4390420Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:18:43.4393132Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'nn/test_module_hooks.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:18:43.438655] 2025-10-10T02:18:47.5114154Z 2025-10-10T02:18:47.5115266Z nn/test_module_hooks 1/1 was successful, full logs can be found in artifacts with path test/test-reports/nn.test_module_hooks_1.1_e2c8b40e464fd47e_.log 2025-10-10T02:18:47.5143909Z Running 53 items in this shard: test/nn/test_module_hooks.py::TestModuleHooks::test_always_called_forward_hooks, test/nn/test_module_hooks.py::TestModuleHooks::test_bw_hook_warning_for_non_tensor_or_tuple, test/nn/test_module_hooks.py::TestModuleHooks::test_forward_hooks_named_tuple_False, test/nn/test_module_hooks.py::TestModuleHooks::test_forward_hooks_named_tuple_True, test/nn/test_module_hooks.py::TestModuleHooks::test_forward_pre_hooks_named_tuple_False, test/nn/test_module_hooks.py::TestModuleHooks::test_forward_pre_hooks_named_tuple_True, test/nn/test_module_hooks.py::TestModuleHooks::test_full_backward_hooks_named_tuple_False, test/nn/test_module_hooks.py::TestModuleHooks::test_full_backward_hooks_named_tuple_True, test/nn/test_module_hooks.py::TestModuleHooks::test_full_backward_pre_hooks_named_tuple_False, test/nn/test_module_hooks.py::TestModuleHooks::test_full_backward_pre_hooks_named_tuple_True, test/nn/test_module_hooks.py::TestModuleHooks::test_kwarg_hooks, 
test/nn/test_module_hooks.py::TestModuleHooks::test_mixed_hooks_named_tuple_False, test/nn/test_module_hooks.py::TestModuleHooks::test_mixed_hooks_named_tuple_True, test/nn/test_module_hooks.py::TestModuleHooks::test_remove_kwarg_hooks, test/nn/test_module_hooks.py::TestStateDictHooks::test_load_state_dict_module_pre_hook_swap_False, test/nn/test_module_hooks.py::TestStateDictHooks::test_load_state_dict_module_pre_hook_swap_True, test/nn/test_module_hooks.py::TestStateDictHooks::test_load_state_dict_post_hook_backward_compatibility_swap_False, test/nn/test_module_hooks.py::TestStateDictHooks::test_load_state_dict_post_hook_backward_compatibility_swap_True, test/nn/test_module_hooks.py::TestStateDictHooks::test_load_state_dict_post_hook_swap_False, test/nn/test_module_hooks.py::TestStateDictHooks::test_load_state_dict_post_hook_swap_True, test/nn/test_module_hooks.py::TestStateDictHooks::test_load_state_dict_pre_hook_swap_False, test/nn/test_module_hooks.py::TestStateDictHooks::test_load_state_dict_pre_hook_swap_True, test/nn/test_module_hooks.py::TestStateDictHooks::test_no_extra_ref_to_module, test/nn/test_module_hooks.py::TestStateDictHooks::test_pickled_hook, test/nn/test_module_hooks.py::TestStateDictHooks::test_register_state_dict_post_hook_private_False, test/nn/test_module_hooks.py::TestStateDictHooks::test_register_state_dict_post_hook_private_True, test/nn/test_module_hooks.py::TestStateDictHooks::test_register_state_dict_pre_hook, test/nn/test_module_hooks.py::TestStateDictHooks::test_register_state_dict_pre_hook_backward_compat, test/nn/test_module_hooks.py::TestStateDictHooks::test_register_state_dict_pre_hook_lazy_module, test/nn/test_module_hooks.py::TestModuleGlobalHooks::test_global_and_local_hooks_order, test/nn/test_module_hooks.py::TestModuleGlobalHooks::test_module_backward_global_hook_writeable, test/nn/test_module_hooks.py::TestModuleGlobalHooks::test_module_forward_forward_hook_removable, 
test/nn/test_module_hooks.py::TestModuleGlobalHooks::test_module_forward_preforward_hook_removable, test/nn/test_module_hooks.py::TestModuleGlobalHooks::test_module_global_forward_preforward_hook_writeable, test/nn/test_module_hooks.py::TestModuleGlobalHooks::test_module_global_hook_invalid_outputs, test/nn/test_module_hooks.py::TestModuleGlobalHooks::test_module_global_hooks, test/nn/test_module_hooks.py::TestModuleGlobalHooks::test_module_global_hooks_with_kwargs, test/nn/test_module_hooks.py::TestModuleHookNN::test_backward_hooks_interaction, test/nn/test_module_hooks.py::TestModuleHookNN::test_hook_backward_size, test/nn/test_module_hooks.py::TestModuleHookNN::test_hook_backward_writeable, test/nn/test_module_hooks.py::TestModuleHookNN::test_hook_buffer_registration, test/nn/test_module_hooks.py::TestModuleHookNN::test_hook_cpp, test/nn/test_module_hooks.py::TestModuleHookNN::test_hook_extra_input, test/nn/test_module_hooks.py::TestModuleHookNN::test_hook_forward_preforward_writable, test/nn/test_module_hooks.py::TestModuleHookNN::test_hook_inplace, test/nn/test_module_hooks.py::TestModuleHookNN::test_hook_invalid_outputs, test/nn/test_module_hooks.py::TestModuleHookNN::test_hook_last_arg_requires_grad, test/nn/test_module_hooks.py::TestModuleHookNN::test_hook_no_requires_grad, test/nn/test_module_hooks.py::TestModuleHookNN::test_hook_non_full_warning, test/nn/test_module_hooks.py::TestModuleHookNN::test_hook_parameter_registration, test/nn/test_module_hooks.py::TestModuleHookNN::test_hook_requires_grad, test/nn/test_module_hooks.py::TestModuleHookNN::test_hook_submodule_registration, test/nn/test_module_hooks.py::TestModuleHookNN::test_hooks 2025-10-10T02:18:47.5171452Z 2025-10-10T02:18:51.4501588Z Running functorch/test_aotdispatch 1/1 ... 
[2025-10-10 02:18:51.449591] 2025-10-10T02:18:51.4502044Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:18:51.4504589Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'functorch/test_aotdispatch.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:18:51.449998] 2025-10-10T02:18:58.1789238Z 2025-10-10T02:18:58.1790264Z functorch/test_aotdispatch 1/1 was successful, full logs can be found in artifacts with path test/test-reports/functorch.test_aotdispatch_1.1_1e441b16cfa8619c_.log 2025-10-10T02:18:58.1995064Z Running 531 items in this shard: test/functorch/test_aotdispatch.py::TestAOTAutograd::test_alias_of_intermediate_detach_backend_aot_eager_view_replay_for_aliased_outputs_False_dynamic_shapes_False, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_alias_of_intermediate_detach_backend_aot_eager_view_replay_for_aliased_outputs_False_dynamic_shapes_True, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_alias_of_intermediate_detach_backend_aot_eager_view_replay_for_aliased_outputs_True_dynamic_shapes_False, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_alias_of_intermediate_detach_backend_aot_eager_view_replay_for_aliased_outputs_True_dynamic_shapes_True, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_alias_of_intermediate_detach_backend_inductor_view_replay_for_aliased_outputs_False_dynamic_shapes_False, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_alias_of_intermediate_detach_backend_inductor_view_replay_for_aliased_outputs_False_dynamic_shapes_True, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_alias_of_intermediate_detach_backend_inductor_view_replay_for_aliased_outputs_True_dynamic_shapes_False, 
test/functorch/test_aotdispatch.py::TestAOTAutograd::test_alias_of_intermediate_detach_backend_inductor_view_replay_for_aliased_outputs_True_dynamic_shapes_True, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_autocast_disable_guard, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_backward_mutation_data, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_backward_mutation_forward_inputs, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_backward_mutation_forward_inputs_create_graph, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_backward_mutation_metadata, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_backward_mutation_on_grad_out, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_backward_pass_autocast_custom, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_backward_pass_autocast_off, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_backward_pass_autocast_on, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_batch_norm_amp, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_batchnorm, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_batchnorm_inference, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_buffer_batch_norm, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_buffer_copied_in_graph, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_buffer_copied_in_graph_with_different_shapes, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_compilation_context, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_complex_linear, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_composite_impl_compile, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_custom_autograd, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_custom_tensor_metadata, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_default_partitioner_saves_symints_not_tensors_for_bw, 
test/functorch/test_aotdispatch.py::TestAOTAutograd::test_dupe_arg, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_dupe_arg_returned_as_output, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_dupe_arg_torture, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_duplicated_arguments_on_tensor_overlap, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_dynamic_output_aliases_input_view_meta_replay, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_dynamic_shape_output_not_in_bw_graph, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_embedding_bag_view_dynamic, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_grad_context, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_inference_mode, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_inner_grad, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_aliased_with_mutation_output_alias, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_data_and_metadata_mutation, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_data_and_metadata_mutation_aliases_other_input, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_inplace_requires_grad_true, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_metadata_mutation_aliases, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_alias_everything, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_aliases_and_none_require_gradients, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_aliases_and_output_alias, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_aliases_bases_out_of_order, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_aliases_other_input, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_aliases_other_input2, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_and_output_view, 
test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_batchnorm, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_false_aliasing, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_hidden_from_autograd_aliasing, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_is_output, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_metadata, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_metadata2, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_modifies_autograd_meta_of_aliases, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_multiple, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_noncontiguous, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_output_view_multiple, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_requires_grad_detach, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_requires_grad_no_grad, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_requires_grad_no_grad_detach_mixed, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_requires_grad_no_grad_inference_graph, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_return, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_set__input_mutation, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_set__nop, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_simple, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_simple_with_none_and_nontensor, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_storage_resize_before_set_, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_storage_resize_down, 
test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_storage_resize_down_and_set_, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_storage_resize_up, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_output_aliase_custom_autograd_function, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_output_view_metadata_mutate_multiple, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_output_view_mutate_multiple, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_output_view_simple, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_invalid_dupe, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_invalid_dupe_fake, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_invalid_dupe_left_bias, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_invalid_requires_grad, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_invalid_requires_grad_fake, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_list_codegen, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_mark_activations_dynamic, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_mark_activations_dynamic_with_nested, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_mark_outputs_dynamic_use_autograd_False, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_mark_outputs_dynamic_use_autograd_True, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_mem_leak_from_save_for_bw, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_module, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_multi_output, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_multi_output_list, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_mutates_input_noncontiguous, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_nested_subclasses, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_nested_subclasses_complicated_inps, 
test/functorch/test_aotdispatch.py::TestAOTAutograd::test_nested_subclasses_complicated_inps_mixed, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_nested_subclasses_non_homogenous, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_nested_subclasses_non_nested_grad, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_new_inp_requires_grad_now, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_no_grad_input_output, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_non_tensor_and_none_inputs, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_nonidempotent_amp, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_input_multi_output_view, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_input_multi_output_view_should_raise_autograd_error, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_input_view_meta_replay, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_intermediate_and_returned, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_intermediate_and_returned_different_grad, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_intermediate_and_returned_flipped, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_intermediate_inplace_view, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_intermediate_inplace_view_and_view, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_intermediate_inplace_view_with_detach, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_intermediate_multi_output_view, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_intermediate_multiple, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_intermediate_multiple_mixed, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_intermediate_mutation_linear, 
test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_intermediate_no_grad, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_intermediate_returned_multiple_times, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_intermediate_single, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_intermediate_view_meta_replay, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_multiple_inputs_get_correct_one, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_output_view_meta_replay, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_all_alias_types, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_dict, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_op_depending_on_symint, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_outputs_are_aliased, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_real_weights_in_symbolic_mode, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_real_weights_in_symbolic_mode_with_inplace_ops, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_saved_tensors_hooks_mutations_raise, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_set__and_data_mutation_bad, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_set__and_data_mutation_good, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_set__not_allowed, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_set__steals_view_chain, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_single_output, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_some_output_requires_grad_input_doesnt, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_some_outputs_dont_require_grad_non_view, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_some_outputs_dont_require_grad_view, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_squeeze_mutation, 
test/functorch/test_aotdispatch.py::TestAOTAutograd::test_subclass_metadata_mutation_req_grad_False, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_subclass_metadata_mutation_req_grad_True, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_subclasses_mixed, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_subclasses_mixed_mode, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_synthetic_base_base_attribute_is_none, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_view_and_inplace_view, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_view_detach, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_ban_dropout_mut_pre_dispatch, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_forward_mutation_multiple_mut, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_forward_mutation_no_buffer_mut, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_functionalized_rng_banned, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_input_dupes_banned, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_input_mutation, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_input_mutation_on_input_requiring_grad_banned, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_input_mutation_on_parameter_banned, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_metadata_mutation_banned, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_module_joint, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_multiple_outputs_require_grad_banned, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_predispatch_buffer_mutation_metadata, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_predispatch_composite_implicit_inplace, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_predispatch_composite_implicit_linear, 
test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_predispatch_contiguous, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_predispatch_conv_and_bn, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_predispatch_func_composite_implicit, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_predispatch_func_simple, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_predispatch_func_view, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_predispatch_map_1, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_predispatch_map_2, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_predispatch_outdtype, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_predispatch_reshape, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_predispatch_with_autograd_op, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_predispatch_with_cond, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_predispatch_with_cond_nested, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_simplified_basic, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_simplified_pytrees_banned, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_synthetic_bases_banned, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_unbacked_arg, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_with_torch_cond, test/functorch/test_aotdispatch.py::TestPartitioning::test_autocast, test/functorch/test_aotdispatch.py::TestPartitioning::test_contiguous, test/functorch/test_aotdispatch.py::TestPartitioning::test_custom_partitioner_fn, test/functorch/test_aotdispatch.py::TestPartitioning::test_default_partitioner_getitem, test/functorch/test_aotdispatch.py::TestPartitioning::test_default_partitioner_output_tensor_shape_tensor, 
test/functorch/test_aotdispatch.py::TestPartitioning::test_generate_gives_inference_graph, test/functorch/test_aotdispatch.py::TestPartitioning::test_meta_tensor_inplace_op, test/functorch/test_aotdispatch.py::TestPartitioning::test_min_cut_partitioner, test/functorch/test_aotdispatch.py::TestPartitioning::test_min_cut_partitioner_output_tensor_shape_tensor, test/functorch/test_aotdispatch.py::TestPartitioning::test_min_cut_partitioner_raise_getitems, test/functorch/test_aotdispatch.py::TestPartitioning::test_min_cut_partitioner_save_shape, test/functorch/test_aotdispatch.py::TestPartitioning::test_preserve_random, test/functorch/test_aotdispatch.py::TestPartitioning::test_quantize_activation_duplicate_nodes, test/functorch/test_aotdispatch.py::TestPartitioning::test_recompute_partitioning, test/functorch/test_aotdispatch.py::TestAOTDispatch::test_aot_dispatch_incorrect_backward, test/functorch/test_aotdispatch.py::TestAOTDispatch::test_aot_dispatch_inference, test/functorch/test_aotdispatch.py::TestAOTDispatch::test_aot_dispatch_input_data_and_metadata_mutation, test/functorch/test_aotdispatch.py::TestAOTDispatch::test_aot_dispatch_input_metadata_mutation, test/functorch/test_aotdispatch.py::TestAOTDispatch::test_aot_dispatch_input_mutation, test/functorch/test_aotdispatch.py::TestAOTDispatch::test_aot_dispatch_input_mutation_and_output_alias, test/functorch/test_aotdispatch.py::TestAOTDispatch::test_aot_dispatch_output_alias, test/functorch/test_aotdispatch.py::TestAOTDispatch::test_aot_dispatch_output_requires_grad_in_no_grad, test/functorch/test_aotdispatch.py::TestAOTDispatch::test_aot_dispatch_output_requires_grad_in_no_grad_views, test/functorch/test_aotdispatch.py::TestAOTDispatch::test_aot_dispatch_simple, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_aot_module_simplified, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_aot_module_simplified_dynamic, 
test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_aot_module_simplified_fake_tensor_gm_raises, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_aot_module_simplified_preserves_stack_trace, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_aot_module_simplified_preserves_stack_trace_from_mutation, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_aot_test_subclasses_with_tensor_factories, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_flex_attn_noncontiguous_tangents, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_grads_no_force_contiguous_dense, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_grads_no_force_contiguous_nested_subclass, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_grads_no_force_contiguous_nested_tensor_tangent, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_grads_no_force_contiguous_subclass, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_inductor_freezing_with_subclasses, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_inference_python_dispatcher, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_layer_norm, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_lift_fresh_copy_in_graph, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_noncontig_nonmemformat_tangents_dynamic_shapes_False_test_subclasses_False_device_cpu, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_noncontig_nonmemformat_tangents_dynamic_shapes_False_test_subclasses_False_device_cuda, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_noncontig_nonmemformat_tangents_dynamic_shapes_False_test_subclasses_True_device_cpu, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_noncontig_nonmemformat_tangents_dynamic_shapes_False_test_subclasses_True_device_cuda, 
test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_noncontig_nonmemformat_tangents_dynamic_shapes_True_test_subclasses_False_device_cpu, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_noncontig_nonmemformat_tangents_dynamic_shapes_True_test_subclasses_False_device_cuda, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_noncontig_nonmemformat_tangents_dynamic_shapes_True_test_subclasses_True_device_cpu, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_noncontig_nonmemformat_tangents_dynamic_shapes_True_test_subclasses_True_device_cuda, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_rms_norm, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_rrelu, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_rrelu_with_noise_mutation, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_saved_tensors_hooks_base_saved_tensors_hooks_filtering_mode_all, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_saved_tensors_hooks_base_saved_tensors_hooks_filtering_mode_donated, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_saved_tensors_hooks_base_saved_tensors_hooks_filtering_mode_no_static, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_saved_tensors_hooks_donated_buffers, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_saved_tensors_hooks_params, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_saved_tensors_hooks_recompile, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_subclass_parameters, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_subclass_parameters_torture_case, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_tangent_type_coercion, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_wrong_guess_tangent_type, 
test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_alias_of_intermediate_detach_backend_aot_eager_view_replay_for_aliased_outputs_False_dynamic_shapes_False, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_alias_of_intermediate_detach_backend_aot_eager_view_replay_for_aliased_outputs_False_dynamic_shapes_True, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_alias_of_intermediate_detach_backend_aot_eager_view_replay_for_aliased_outputs_True_dynamic_shapes_False, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_alias_of_intermediate_detach_backend_aot_eager_view_replay_for_aliased_outputs_True_dynamic_shapes_True, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_alias_of_intermediate_detach_backend_inductor_view_replay_for_aliased_outputs_False_dynamic_shapes_False, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_alias_of_intermediate_detach_backend_inductor_view_replay_for_aliased_outputs_False_dynamic_shapes_True, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_alias_of_intermediate_detach_backend_inductor_view_replay_for_aliased_outputs_True_dynamic_shapes_False, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_alias_of_intermediate_detach_backend_inductor_view_replay_for_aliased_outputs_True_dynamic_shapes_True, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_autocast_disable_guard, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_backward_mutation_data, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_backward_mutation_forward_inputs, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_backward_mutation_forward_inputs_create_graph, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_backward_mutation_metadata, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_backward_mutation_on_grad_out, 
test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_backward_pass_autocast_custom, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_backward_pass_autocast_off, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_backward_pass_autocast_on, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_batch_norm_amp, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_batchnorm, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_batchnorm_inference, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_buffer_batch_norm, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_buffer_copied_in_graph, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_buffer_copied_in_graph_with_different_shapes, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_compilation_context, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_complex_linear, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_composite_impl_compile, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_custom_autograd, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_custom_tensor_metadata, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_default_partitioner_saves_symints_not_tensors_for_bw, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_dupe_arg, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_dupe_arg_returned_as_output, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_dupe_arg_torture, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_duplicated_arguments_on_tensor_overlap, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_dynamic_output_aliases_input_view_meta_replay, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_dynamic_shape_output_not_in_bw_graph, 
test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_embedding_bag_view_dynamic, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_grad_context, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_inference_mode, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_inner_grad, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_aliased_with_mutation_output_alias, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_data_and_metadata_mutation, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_data_and_metadata_mutation_aliases_other_input, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_inplace_requires_grad_true, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_metadata_mutation_aliases, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_alias_everything, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_aliases_and_none_require_gradients, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_aliases_and_output_alias, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_aliases_bases_out_of_order, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_aliases_other_input, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_aliases_other_input2, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_and_output_view, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_batchnorm, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_false_aliasing, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_hidden_from_autograd_aliasing, 
test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_is_output, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_metadata, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_metadata2, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_modifies_autograd_meta_of_aliases, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_multiple, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_noncontiguous, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_output_view_multiple, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_requires_grad_detach, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_requires_grad_no_grad, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_requires_grad_no_grad_detach_mixed, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_requires_grad_no_grad_inference_graph, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_return, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_set__input_mutation, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_set__nop, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_simple, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_simple_with_none_and_nontensor, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_storage_resize_before_set_, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_storage_resize_down, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_storage_resize_down_and_set_, 
test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_storage_resize_up, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_output_aliase_custom_autograd_function, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_output_view_metadata_mutate_multiple, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_output_view_mutate_multiple, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_output_view_simple, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_inputs_overlapping_unsqueeze_with_mutation, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_inputs_overlapping_with_mutation_guard_base, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_invalid_dupe, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_invalid_dupe_fake, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_invalid_dupe_left_bias, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_invalid_requires_grad, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_invalid_requires_grad_fake, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_list_codegen, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_mark_activations_dynamic, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_mark_activations_dynamic_with_nested, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_mark_outputs_dynamic_use_autograd_False, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_mark_outputs_dynamic_use_autograd_True, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_mem_leak_from_save_for_bw, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_module, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_multi_output, 
test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_multi_output_list, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_mutates_input_noncontiguous, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_mutation_of_input_in_fw_and_bw, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_mutations_in_bw_detached_from_tangent, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_nested_subclasses, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_nested_subclasses_complicated_inps, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_nested_subclasses_complicated_inps_mixed, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_nested_subclasses_non_homogenous, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_nested_subclasses_non_nested_grad, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_new_inp_requires_grad_now, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_no_grad_input_output, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_non_tensor_and_none_inputs, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_nonidempotent_amp, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_input_multi_output_view, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_input_multi_output_view_should_raise_autograd_error, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_input_view_meta_replay, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_intermediate_and_returned, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_intermediate_and_returned_different_grad, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_intermediate_and_returned_flipped, 
test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_intermediate_inplace_view, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_intermediate_inplace_view_and_view, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_intermediate_inplace_view_with_detach, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_intermediate_multi_output_view, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_intermediate_multiple, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_intermediate_multiple_mixed, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_intermediate_mutation_linear, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_intermediate_no_grad, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_intermediate_returned_multiple_times, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_intermediate_single, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_intermediate_view_meta_replay, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_multiple_inputs_get_correct_one, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_output_view_meta_replay, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_all_alias_types, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_dict, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_op_depending_on_symint, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_outputs_are_aliased, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_real_weights_in_symbolic_mode, 
test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_real_weights_in_symbolic_mode_with_inplace_ops, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_saved_tensors_hooks_mutations_raise, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_set__and_data_mutation_bad, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_set__and_data_mutation_good, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_set__not_allowed, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_set__steals_view_chain, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_single_output, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_some_output_requires_grad_input_doesnt, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_some_outputs_dont_require_grad_non_view, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_some_outputs_dont_require_grad_view, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_squeeze_mutation, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_subclass_metadata_mutation_req_grad_False, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_subclass_metadata_mutation_req_grad_True, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_subclasses_mixed, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_subclasses_mixed_mode, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_synthetic_base_base_attribute_is_none, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_view_and_inplace_view, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_view_detach, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_alias_of_intermediate_detach_backend_aot_eager_view_replay_for_aliased_outputs_False_dynamic_shapes_False, 
test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_alias_of_intermediate_detach_backend_aot_eager_view_replay_for_aliased_outputs_False_dynamic_shapes_True, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_alias_of_intermediate_detach_backend_aot_eager_view_replay_for_aliased_outputs_True_dynamic_shapes_False, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_alias_of_intermediate_detach_backend_aot_eager_view_replay_for_aliased_outputs_True_dynamic_shapes_True, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_alias_of_intermediate_detach_backend_inductor_view_replay_for_aliased_outputs_False_dynamic_shapes_False, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_alias_of_intermediate_detach_backend_inductor_view_replay_for_aliased_outputs_False_dynamic_shapes_True, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_alias_of_intermediate_detach_backend_inductor_view_replay_for_aliased_outputs_True_dynamic_shapes_False, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_alias_of_intermediate_detach_backend_inductor_view_replay_for_aliased_outputs_True_dynamic_shapes_True, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_autocast_disable_guard, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_backward_mutation_data, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_backward_mutation_forward_inputs, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_backward_mutation_forward_inputs_create_graph, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_backward_mutation_metadata, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_backward_mutation_on_grad_out, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_backward_pass_autocast_custom, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_backward_pass_autocast_off, 
test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_backward_pass_autocast_on, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_batch_norm_amp, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_batchnorm, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_batchnorm_inference, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_buffer_batch_norm, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_buffer_copied_in_graph, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_buffer_copied_in_graph_with_different_shapes, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_compilation_context, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_complex_linear, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_composite_impl_compile, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_custom_autograd, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_custom_tensor_metadata, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_default_partitioner_saves_symints_not_tensors_for_bw, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_dupe_arg, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_dupe_arg_returned_as_output, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_dupe_arg_torture, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_duplicated_arguments_on_tensor_overlap, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_dynamic_output_aliases_input_view_meta_replay, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_dynamic_shape_output_not_in_bw_graph, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_embedding_bag_view_dynamic, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_grad_context, 
test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_inference_mode, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_inner_grad, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_aliased_with_mutation_output_alias, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_data_and_metadata_mutation, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_data_and_metadata_mutation_aliases_other_input, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_inplace_requires_grad_true, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_metadata_mutation_aliases, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_alias_everything, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_aliases_and_none_require_gradients, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_aliases_and_output_alias, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_aliases_bases_out_of_order, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_aliases_other_input, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_aliases_other_input2, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_and_output_view, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_batchnorm, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_false_aliasing, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_hidden_from_autograd_aliasing, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_is_output, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_metadata, 
test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_metadata2, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_modifies_autograd_meta_of_aliases, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_multiple, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_noncontiguous, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_output_view_multiple, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_requires_grad_detach, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_requires_grad_no_grad, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_requires_grad_no_grad_detach_mixed, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_requires_grad_no_grad_inference_graph, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_return, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_set__input_mutation, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_set__nop, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_simple, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_simple_with_none_and_nontensor, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_storage_resize_before_set_, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_storage_resize_down, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_storage_resize_down_and_set_, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_storage_resize_up, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_output_aliase_custom_autograd_function, 
test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_output_view_metadata_mutate_multiple, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_output_view_mutate_multiple, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_output_view_simple, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_inputs_overlapping_unsqueeze_with_mutation, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_inputs_overlapping_with_mutation_guard_base, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_invalid_dupe, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_invalid_dupe_fake, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_invalid_dupe_left_bias, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_invalid_requires_grad, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_invalid_requires_grad_fake, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_list_codegen, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_mark_activations_dynamic, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_mark_activations_dynamic_with_nested, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_mark_outputs_dynamic_use_autograd_False, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_mark_outputs_dynamic_use_autograd_True, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_mem_leak_from_save_for_bw, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_module, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_multi_output, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_multi_output_list, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_mutates_input_noncontiguous, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_mutation_of_input_in_fw_and_bw, 
test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_mutations_in_bw_detached_from_tangent, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_nested_subclasses, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_nested_subclasses_complicated_inps, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_nested_subclasses_complicated_inps_mixed, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_nested_subclasses_non_homogenous, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_nested_subclasses_non_nested_grad, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_new_inp_requires_grad_now, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_no_grad_input_output, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_non_tensor_and_none_inputs, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_nonidempotent_amp, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_input_multi_output_view, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_input_multi_output_view_should_raise_autograd_error, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_input_view_meta_replay, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_intermediate_and_returned, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_intermediate_and_returned_different_grad, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_intermediate_and_returned_flipped, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_intermediate_inplace_view, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_intermediate_inplace_view_and_view, 
test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_intermediate_inplace_view_with_detach, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_intermediate_multi_output_view, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_intermediate_multiple, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_intermediate_multiple_mixed, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_intermediate_mutation_linear, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_intermediate_no_grad, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_intermediate_returned_multiple_times, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_intermediate_single, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_intermediate_view_meta_replay, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_multiple_inputs_get_correct_one, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_output_view_meta_replay, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_all_alias_types, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_dict, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_op_depending_on_symint, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_outputs_are_aliased, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_real_weights_in_symbolic_mode, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_real_weights_in_symbolic_mode_with_inplace_ops, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_saved_tensors_hooks_mutations_raise, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_set__and_data_mutation_bad, 
test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_set__and_data_mutation_good, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_set__not_allowed, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_set__steals_view_chain, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_single_output, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_some_output_requires_grad_input_doesnt, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_some_outputs_dont_require_grad_non_view, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_some_outputs_dont_require_grad_view, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_squeeze_mutation, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_subclass_metadata_mutation_req_grad_False, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_subclass_metadata_mutation_req_grad_True, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_subclasses_mixed, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_subclasses_mixed_mode, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_synthetic_base_base_attribute_is_none, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_view_and_inplace_view, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_view_detach 2025-10-10T02:18:58.2189980Z 2025-10-10T02:19:02.0461204Z Running nn/test_load_state_dict 1/1 ... [2025-10-10 02:19:02.045538] 2025-10-10T02:19:02.0461636Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:19:02.0462667Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'nn/test_load_state_dict.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:19:02.045910] 2025-10-10T02:19:06.0681665Z 2025-10-10T02:19:06.0684010Z nn/test_load_state_dict 1/1 was successful, full logs can be found in artifacts with path test/test-reports/nn.test_load_state_dict_1.1_6cdbff59400f14c3_.log 2025-10-10T02:19:06.0694937Z Running 27 items in this shard: test/nn/test_load_state_dict.py::TestLoadStateDict::test_load_state_dict_BC_swap_False, test/nn/test_load_state_dict.py::TestLoadStateDict::test_load_state_dict_BC_swap_True, test/nn/test_load_state_dict.py::TestLoadStateDict::test_load_state_dict_assign_meta_swap_False_keep_vars_False, test/nn/test_load_state_dict.py::TestLoadStateDict::test_load_state_dict_assign_meta_swap_False_keep_vars_True, test/nn/test_load_state_dict.py::TestLoadStateDict::test_load_state_dict_assign_meta_swap_True_keep_vars_False, test/nn/test_load_state_dict.py::TestLoadStateDict::test_load_state_dict_assign_meta_swap_True_keep_vars_True, test/nn/test_load_state_dict.py::TestLoadStateDict::test_load_state_dict_assign_shape_stride_swap_False, test/nn/test_load_state_dict.py::TestLoadStateDict::test_load_state_dict_assign_shape_stride_swap_True, test/nn/test_load_state_dict.py::TestLoadStateDict::test_load_state_dict_assign_with_optimizer_swap_False, test/nn/test_load_state_dict.py::TestLoadStateDict::test_load_state_dict_assign_with_optimizer_swap_True, test/nn/test_load_state_dict.py::TestLoadStateDict::test_load_state_dict_child_swap_False, test/nn/test_load_state_dict.py::TestLoadStateDict::test_load_state_dict_child_swap_True, test/nn/test_load_state_dict.py::TestLoadStateDict::test_load_state_dict_custom_swap_False, test/nn/test_load_state_dict.py::TestLoadStateDict::test_load_state_dict_custom_swap_True, test/nn/test_load_state_dict.py::TestLoadStateDict::test_load_state_dict_invalid_swap_False, test/nn/test_load_state_dict.py::TestLoadStateDict::test_load_state_dict_invalid_swap_True, test/nn/test_load_state_dict.py::TestLoadStateDict::test_load_state_dict_ref_cycle_swap_False, 
test/nn/test_load_state_dict.py::TestLoadStateDict::test_load_state_dict_swap_False, test/nn/test_load_state_dict.py::TestLoadStateDict::test_load_state_dict_swap_True, test/nn/test_load_state_dict.py::TestLoadStateDict::test_load_state_dict_type_swap_False, test/nn/test_load_state_dict.py::TestLoadStateDict::test_load_state_dict_type_swap_True, test/nn/test_load_state_dict.py::TestLoadStateDict::test_load_state_dict_warn_assign_swap_False, test/nn/test_load_state_dict.py::TestLoadStateDict::test_load_state_dict_warn_assign_swap_True, test/nn/test_load_state_dict.py::TestLoadStateDict::test_load_state_dict_with_unexpected_key_swap_False, test/nn/test_load_state_dict.py::TestLoadStateDict::test_load_state_dict_with_unexpected_key_swap_True, test/nn/test_load_state_dict.py::TestLoadStateDictSwap::test_swap_subclass_swap_True_assign_False, test/nn/test_load_state_dict.py::TestLoadStateDictSwap::test_swap_subclass_swap_True_assign_True 2025-10-10T02:19:06.0704252Z 2025-10-10T02:19:09.9613649Z Running torch_np/numpy_tests/linalg/test_linalg 1/1 ... [2025-10-10 02:19:09.960855] 2025-10-10T02:19:09.9614349Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:19:09.9616257Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'torch_np/numpy_tests/linalg/test_linalg.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:19:09.961266] 2025-10-10T02:19:15.9885778Z 2025-10-10T02:19:15.9887262Z torch_np/numpy_tests/linalg/test_linalg 1/1 was successful, full logs can be found in artifacts with path test/test-reports/torch_np.numpy_tests.linalg.test_linalg_1.1_72b5754c3ec219f3_.log 2025-10-10T02:19:15.9971611Z Running 268 items in this shard: test/torch_np/numpy_tests/linalg/test_linalg.py::TestSolve::test_0_size, test/torch_np/numpy_tests/linalg/test_linalg.py::TestSolve::test_0_size_k, test/torch_np/numpy_tests/linalg/test_linalg.py::TestSolve::test_empty_sq_cases, test/torch_np/numpy_tests/linalg/test_linalg.py::TestSolve::test_generalized_empty_sq_cases, test/torch_np/numpy_tests/linalg/test_linalg.py::TestSolve::test_generalized_sq_cases, test/torch_np/numpy_tests/linalg/test_linalg.py::TestSolve::test_sq_cases, test/torch_np/numpy_tests/linalg/test_linalg.py::TestSolve::test_types_dtype0, test/torch_np/numpy_tests/linalg/test_linalg.py::TestSolve::test_types_dtype1, test/torch_np/numpy_tests/linalg/test_linalg.py::TestSolve::test_types_dtype2, test/torch_np/numpy_tests/linalg/test_linalg.py::TestSolve::test_types_dtype3, test/torch_np/numpy_tests/linalg/test_linalg.py::TestInv::test_0_size, test/torch_np/numpy_tests/linalg/test_linalg.py::TestInv::test_empty_sq_cases, test/torch_np/numpy_tests/linalg/test_linalg.py::TestInv::test_generalized_empty_sq_cases, test/torch_np/numpy_tests/linalg/test_linalg.py::TestInv::test_generalized_sq_cases, test/torch_np/numpy_tests/linalg/test_linalg.py::TestInv::test_sq_cases, test/torch_np/numpy_tests/linalg/test_linalg.py::TestInv::test_types_dtype0, test/torch_np/numpy_tests/linalg/test_linalg.py::TestInv::test_types_dtype1, test/torch_np/numpy_tests/linalg/test_linalg.py::TestInv::test_types_dtype2, test/torch_np/numpy_tests/linalg/test_linalg.py::TestInv::test_types_dtype3, test/torch_np/numpy_tests/linalg/test_linalg.py::TestEigvals::test_0_size, 
test/torch_np/numpy_tests/linalg/test_linalg.py::TestEigvals::test_empty_sq_cases, test/torch_np/numpy_tests/linalg/test_linalg.py::TestEigvals::test_generalized_empty_sq_cases, test/torch_np/numpy_tests/linalg/test_linalg.py::TestEigvals::test_generalized_sq_cases, test/torch_np/numpy_tests/linalg/test_linalg.py::TestEigvals::test_sq_cases, test/torch_np/numpy_tests/linalg/test_linalg.py::TestEigvals::test_types_dtype0, test/torch_np/numpy_tests/linalg/test_linalg.py::TestEigvals::test_types_dtype1, test/torch_np/numpy_tests/linalg/test_linalg.py::TestEigvals::test_types_dtype2, test/torch_np/numpy_tests/linalg/test_linalg.py::TestEigvals::test_types_dtype3, test/torch_np/numpy_tests/linalg/test_linalg.py::TestEig::test_0_size, test/torch_np/numpy_tests/linalg/test_linalg.py::TestEig::test_empty_sq_cases, test/torch_np/numpy_tests/linalg/test_linalg.py::TestEig::test_generalized_empty_sq_cases, test/torch_np/numpy_tests/linalg/test_linalg.py::TestEig::test_generalized_sq_cases, test/torch_np/numpy_tests/linalg/test_linalg.py::TestEig::test_sq_cases, test/torch_np/numpy_tests/linalg/test_linalg.py::TestEig::test_types_dtype0, test/torch_np/numpy_tests/linalg/test_linalg.py::TestEig::test_types_dtype1, test/torch_np/numpy_tests/linalg/test_linalg.py::TestEig::test_types_dtype2, test/torch_np/numpy_tests/linalg/test_linalg.py::TestEig::test_types_dtype3, test/torch_np/numpy_tests/linalg/test_linalg.py::TestSVD::test_empty_identity, test/torch_np/numpy_tests/linalg/test_linalg.py::TestSVD::test_empty_sq_cases, test/torch_np/numpy_tests/linalg/test_linalg.py::TestSVD::test_generalized_empty_sq_cases, test/torch_np/numpy_tests/linalg/test_linalg.py::TestSVD::test_generalized_sq_cases, test/torch_np/numpy_tests/linalg/test_linalg.py::TestSVD::test_sq_cases, test/torch_np/numpy_tests/linalg/test_linalg.py::TestSVD::test_types_dtype0, test/torch_np/numpy_tests/linalg/test_linalg.py::TestSVD::test_types_dtype1, 
test/torch_np/numpy_tests/linalg/test_linalg.py::TestSVD::test_types_dtype2, test/torch_np/numpy_tests/linalg/test_linalg.py::TestSVD::test_types_dtype3, test/torch_np/numpy_tests/linalg/test_linalg.py::TestSVDHermitian::test_empty_herm_cases, test/torch_np/numpy_tests/linalg/test_linalg.py::TestSVDHermitian::test_generalized_empty_herm_cases, test/torch_np/numpy_tests/linalg/test_linalg.py::TestSVDHermitian::test_generalized_herm_cases, test/torch_np/numpy_tests/linalg/test_linalg.py::TestSVDHermitian::test_herm_cases, test/torch_np/numpy_tests/linalg/test_linalg.py::TestSVDHermitian::test_types_dtype0, test/torch_np/numpy_tests/linalg/test_linalg.py::TestSVDHermitian::test_types_dtype1, test/torch_np/numpy_tests/linalg/test_linalg.py::TestSVDHermitian::test_types_dtype2, test/torch_np/numpy_tests/linalg/test_linalg.py::TestSVDHermitian::test_types_dtype3, test/torch_np/numpy_tests/linalg/test_linalg.py::TestCond::test_basic_nonsvd, test/torch_np/numpy_tests/linalg/test_linalg.py::TestCond::test_empty_sq_cases, test/torch_np/numpy_tests/linalg/test_linalg.py::TestCond::test_generalized_empty_sq_cases, test/torch_np/numpy_tests/linalg/test_linalg.py::TestCond::test_generalized_sq_cases, test/torch_np/numpy_tests/linalg/test_linalg.py::TestCond::test_nan, test/torch_np/numpy_tests/linalg/test_linalg.py::TestCond::test_singular, test/torch_np/numpy_tests/linalg/test_linalg.py::TestCond::test_sq_cases, test/torch_np/numpy_tests/linalg/test_linalg.py::TestCond::test_stacked_singular, test/torch_np/numpy_tests/linalg/test_linalg.py::TestPinv::test_empty_nonsq_cases, test/torch_np/numpy_tests/linalg/test_linalg.py::TestPinv::test_empty_sq_cases, test/torch_np/numpy_tests/linalg/test_linalg.py::TestPinv::test_generalized_empty_nonsq_cases, test/torch_np/numpy_tests/linalg/test_linalg.py::TestPinv::test_generalized_empty_sq_cases, test/torch_np/numpy_tests/linalg/test_linalg.py::TestPinv::test_generalized_nonsq_cases, 
test/torch_np/numpy_tests/linalg/test_linalg.py::TestPinv::test_generalized_sq_cases, test/torch_np/numpy_tests/linalg/test_linalg.py::TestPinv::test_nonsq_cases, test/torch_np/numpy_tests/linalg/test_linalg.py::TestPinv::test_sq_cases, test/torch_np/numpy_tests/linalg/test_linalg.py::TestPinvHermitian::test_empty_herm_cases, test/torch_np/numpy_tests/linalg/test_linalg.py::TestPinvHermitian::test_generalized_empty_herm_cases, test/torch_np/numpy_tests/linalg/test_linalg.py::TestPinvHermitian::test_generalized_herm_cases, test/torch_np/numpy_tests/linalg/test_linalg.py::TestPinvHermitian::test_herm_cases, test/torch_np/numpy_tests/linalg/test_linalg.py::TestDet::test_0_size, test/torch_np/numpy_tests/linalg/test_linalg.py::TestDet::test_empty_sq_cases, test/torch_np/numpy_tests/linalg/test_linalg.py::TestDet::test_generalized_empty_sq_cases, test/torch_np/numpy_tests/linalg/test_linalg.py::TestDet::test_generalized_sq_cases, test/torch_np/numpy_tests/linalg/test_linalg.py::TestDet::test_sq_cases, test/torch_np/numpy_tests/linalg/test_linalg.py::TestDet::test_types_dtype0, test/torch_np/numpy_tests/linalg/test_linalg.py::TestDet::test_types_dtype1, test/torch_np/numpy_tests/linalg/test_linalg.py::TestDet::test_types_dtype2, test/torch_np/numpy_tests/linalg/test_linalg.py::TestDet::test_types_dtype3, test/torch_np/numpy_tests/linalg/test_linalg.py::TestDet::test_zero, test/torch_np/numpy_tests/linalg/test_linalg.py::TestLstsq::test_empty_a_b_m_0_n_0_n_rhs_0, test/torch_np/numpy_tests/linalg/test_linalg.py::TestLstsq::test_empty_a_b_m_0_n_4_n_rhs_1, test/torch_np/numpy_tests/linalg/test_linalg.py::TestLstsq::test_empty_a_b_m_0_n_4_n_rhs_2, test/torch_np/numpy_tests/linalg/test_linalg.py::TestLstsq::test_empty_a_b_m_4_n_0_n_rhs_1, test/torch_np/numpy_tests/linalg/test_linalg.py::TestLstsq::test_empty_a_b_m_4_n_0_n_rhs_2, test/torch_np/numpy_tests/linalg/test_linalg.py::TestLstsq::test_empty_a_b_m_4_n_2_n_rhs_2, 
test/torch_np/numpy_tests/linalg/test_linalg.py::TestLstsq::test_empty_nonsq_cases, test/torch_np/numpy_tests/linalg/test_linalg.py::TestLstsq::test_empty_sq_cases, test/torch_np/numpy_tests/linalg/test_linalg.py::TestLstsq::test_future_rcond, test/torch_np/numpy_tests/linalg/test_linalg.py::TestLstsq::test_incompatible_dims, test/torch_np/numpy_tests/linalg/test_linalg.py::TestLstsq::test_nonsq_cases, test/torch_np/numpy_tests/linalg/test_linalg.py::TestLstsq::test_sq_cases, test/torch_np/numpy_tests/linalg/test_linalg.py::TestEigvalshCases::test_generalized_herm_cases, test/torch_np/numpy_tests/linalg/test_linalg.py::TestEigvalshCases::test_generalized_empty_herm_cases, test/torch_np/numpy_tests/linalg/test_linalg.py::TestEigvalshCases::test_herm_cases, test/torch_np/numpy_tests/linalg/test_linalg.py::TestEigvalshCases::test_empty_herm_cases, test/torch_np/numpy_tests/linalg/test_linalg.py::TestEigvalsh::test_0_size, test/torch_np/numpy_tests/linalg/test_linalg.py::TestEigvalsh::test_UPLO, test/torch_np/numpy_tests/linalg/test_linalg.py::TestEigvalsh::test_invalid, test/torch_np/numpy_tests/linalg/test_linalg.py::TestEigvalsh::test_types_dtype0, test/torch_np/numpy_tests/linalg/test_linalg.py::TestEigvalsh::test_types_dtype1, test/torch_np/numpy_tests/linalg/test_linalg.py::TestEigvalsh::test_types_dtype2, test/torch_np/numpy_tests/linalg/test_linalg.py::TestEigvalsh::test_types_dtype3, test/torch_np/numpy_tests/linalg/test_linalg.py::TestEighCases::test_generalized_herm_cases, test/torch_np/numpy_tests/linalg/test_linalg.py::TestEighCases::test_generalized_empty_herm_cases, test/torch_np/numpy_tests/linalg/test_linalg.py::TestEighCases::test_herm_cases, test/torch_np/numpy_tests/linalg/test_linalg.py::TestEighCases::test_empty_herm_cases, test/torch_np/numpy_tests/linalg/test_linalg.py::TestEigh::test_0_size, test/torch_np/numpy_tests/linalg/test_linalg.py::TestEigh::test_UPLO, test/torch_np/numpy_tests/linalg/test_linalg.py::TestEigh::test_invalid, 
test/torch_np/numpy_tests/linalg/test_linalg.py::TestEigh::test_types_dtype0, test/torch_np/numpy_tests/linalg/test_linalg.py::TestEigh::test_types_dtype1, test/torch_np/numpy_tests/linalg/test_linalg.py::TestEigh::test_types_dtype2, test/torch_np/numpy_tests/linalg/test_linalg.py::TestEigh::test_types_dtype3, test/torch_np/numpy_tests/linalg/test_linalg.py::TestNorm_NonSystematic::test_intmin, test/torch_np/numpy_tests/linalg/test_linalg.py::TestNormDouble::test_axis, test/torch_np/numpy_tests/linalg/test_linalg.py::TestNormDouble::test_bad_args, test/torch_np/numpy_tests/linalg/test_linalg.py::TestNormDouble::test_empty, test/torch_np/numpy_tests/linalg/test_linalg.py::TestNormDouble::test_keepdims, test/torch_np/numpy_tests/linalg/test_linalg.py::TestNormDouble::test_matrix_2x2, test/torch_np/numpy_tests/linalg/test_linalg.py::TestNormDouble::test_matrix_3x3, test/torch_np/numpy_tests/linalg/test_linalg.py::TestNormDouble::test_matrix_empty, test/torch_np/numpy_tests/linalg/test_linalg.py::TestNormDouble::test_matrix_return_type, test/torch_np/numpy_tests/linalg/test_linalg.py::TestNormDouble::test_vector, test/torch_np/numpy_tests/linalg/test_linalg.py::TestNormDouble::test_vector_return_type, test/torch_np/numpy_tests/linalg/test_linalg.py::TestNormSingle::test_axis, test/torch_np/numpy_tests/linalg/test_linalg.py::TestNormSingle::test_bad_args, test/torch_np/numpy_tests/linalg/test_linalg.py::TestNormSingle::test_empty, test/torch_np/numpy_tests/linalg/test_linalg.py::TestNormSingle::test_keepdims, test/torch_np/numpy_tests/linalg/test_linalg.py::TestNormSingle::test_matrix_2x2, test/torch_np/numpy_tests/linalg/test_linalg.py::TestNormSingle::test_matrix_3x3, test/torch_np/numpy_tests/linalg/test_linalg.py::TestNormSingle::test_matrix_empty, test/torch_np/numpy_tests/linalg/test_linalg.py::TestNormSingle::test_matrix_return_type, test/torch_np/numpy_tests/linalg/test_linalg.py::TestNormSingle::test_vector, 
test/torch_np/numpy_tests/linalg/test_linalg.py::TestNormSingle::test_vector_return_type, test/torch_np/numpy_tests/linalg/test_linalg.py::TestNormInt64::test_axis, test/torch_np/numpy_tests/linalg/test_linalg.py::TestNormInt64::test_bad_args, test/torch_np/numpy_tests/linalg/test_linalg.py::TestNormInt64::test_empty, test/torch_np/numpy_tests/linalg/test_linalg.py::TestNormInt64::test_keepdims, test/torch_np/numpy_tests/linalg/test_linalg.py::TestNormInt64::test_matrix_2x2, test/torch_np/numpy_tests/linalg/test_linalg.py::TestNormInt64::test_matrix_3x3, test/torch_np/numpy_tests/linalg/test_linalg.py::TestNormInt64::test_matrix_empty, test/torch_np/numpy_tests/linalg/test_linalg.py::TestNormInt64::test_matrix_return_type, test/torch_np/numpy_tests/linalg/test_linalg.py::TestNormInt64::test_vector, test/torch_np/numpy_tests/linalg/test_linalg.py::TestNormInt64::test_vector_return_type, test/torch_np/numpy_tests/linalg/test_linalg.py::TestMatrixRank::test_matrix_rank, test/torch_np/numpy_tests/linalg/test_linalg.py::TestMatrixRank::test_reduced_rank, test/torch_np/numpy_tests/linalg/test_linalg.py::TestMatrixRank::test_symmetric_rank, test/torch_np/numpy_tests/linalg/test_linalg.py::TestQR::test_mode_all_but_economic, test/torch_np/numpy_tests/linalg/test_linalg.py::TestQR::test_mode_raw, test/torch_np/numpy_tests/linalg/test_linalg.py::TestQR::test_qr_empty_m_0_n_0, test/torch_np/numpy_tests/linalg/test_linalg.py::TestQR::test_qr_empty_m_0_n_3, test/torch_np/numpy_tests/linalg/test_linalg.py::TestQR::test_qr_empty_m_3_n_0, test/torch_np/numpy_tests/linalg/test_linalg.py::TestQR::test_stacked_inputs_size0_outer_size0_dt0, test/torch_np/numpy_tests/linalg/test_linalg.py::TestQR::test_stacked_inputs_size0_outer_size0_dt1, test/torch_np/numpy_tests/linalg/test_linalg.py::TestQR::test_stacked_inputs_size0_outer_size0_dt2, test/torch_np/numpy_tests/linalg/test_linalg.py::TestQR::test_stacked_inputs_size0_outer_size0_dt3, 
test/torch_np/numpy_tests/linalg/test_linalg.py::TestQR::test_stacked_inputs_size0_outer_size1_dt0, test/torch_np/numpy_tests/linalg/test_linalg.py::TestQR::test_stacked_inputs_size0_outer_size1_dt1, test/torch_np/numpy_tests/linalg/test_linalg.py::TestQR::test_stacked_inputs_size0_outer_size1_dt2, test/torch_np/numpy_tests/linalg/test_linalg.py::TestQR::test_stacked_inputs_size0_outer_size1_dt3, test/torch_np/numpy_tests/linalg/test_linalg.py::TestQR::test_stacked_inputs_size0_outer_size2_dt0, test/torch_np/numpy_tests/linalg/test_linalg.py::TestQR::test_stacked_inputs_size0_outer_size2_dt1, test/torch_np/numpy_tests/linalg/test_linalg.py::TestQR::test_stacked_inputs_size0_outer_size2_dt2, test/torch_np/numpy_tests/linalg/test_linalg.py::TestQR::test_stacked_inputs_size0_outer_size2_dt3, test/torch_np/numpy_tests/linalg/test_linalg.py::TestQR::test_stacked_inputs_size1_outer_size0_dt0, test/torch_np/numpy_tests/linalg/test_linalg.py::TestQR::test_stacked_inputs_size1_outer_size0_dt1, test/torch_np/numpy_tests/linalg/test_linalg.py::TestQR::test_stacked_inputs_size1_outer_size0_dt2, test/torch_np/numpy_tests/linalg/test_linalg.py::TestQR::test_stacked_inputs_size1_outer_size0_dt3, test/torch_np/numpy_tests/linalg/test_linalg.py::TestQR::test_stacked_inputs_size1_outer_size1_dt0, test/torch_np/numpy_tests/linalg/test_linalg.py::TestQR::test_stacked_inputs_size1_outer_size1_dt1, test/torch_np/numpy_tests/linalg/test_linalg.py::TestQR::test_stacked_inputs_size1_outer_size1_dt2, test/torch_np/numpy_tests/linalg/test_linalg.py::TestQR::test_stacked_inputs_size1_outer_size1_dt3, test/torch_np/numpy_tests/linalg/test_linalg.py::TestQR::test_stacked_inputs_size1_outer_size2_dt0, test/torch_np/numpy_tests/linalg/test_linalg.py::TestQR::test_stacked_inputs_size1_outer_size2_dt1, test/torch_np/numpy_tests/linalg/test_linalg.py::TestQR::test_stacked_inputs_size1_outer_size2_dt2, test/torch_np/numpy_tests/linalg/test_linalg.py::TestQR::test_stacked_inputs_size1_outer_size2_dt3, 
test/torch_np/numpy_tests/linalg/test_linalg.py::TestQR::test_stacked_inputs_size2_outer_size0_dt0, test/torch_np/numpy_tests/linalg/test_linalg.py::TestQR::test_stacked_inputs_size2_outer_size0_dt1, test/torch_np/numpy_tests/linalg/test_linalg.py::TestQR::test_stacked_inputs_size2_outer_size0_dt2, test/torch_np/numpy_tests/linalg/test_linalg.py::TestQR::test_stacked_inputs_size2_outer_size0_dt3, test/torch_np/numpy_tests/linalg/test_linalg.py::TestQR::test_stacked_inputs_size2_outer_size1_dt0, test/torch_np/numpy_tests/linalg/test_linalg.py::TestQR::test_stacked_inputs_size2_outer_size1_dt1, test/torch_np/numpy_tests/linalg/test_linalg.py::TestQR::test_stacked_inputs_size2_outer_size1_dt2, test/torch_np/numpy_tests/linalg/test_linalg.py::TestQR::test_stacked_inputs_size2_outer_size1_dt3, test/torch_np/numpy_tests/linalg/test_linalg.py::TestQR::test_stacked_inputs_size2_outer_size2_dt0, test/torch_np/numpy_tests/linalg/test_linalg.py::TestQR::test_stacked_inputs_size2_outer_size2_dt1, test/torch_np/numpy_tests/linalg/test_linalg.py::TestQR::test_stacked_inputs_size2_outer_size2_dt2, test/torch_np/numpy_tests/linalg/test_linalg.py::TestQR::test_stacked_inputs_size2_outer_size2_dt3, test/torch_np/numpy_tests/linalg/test_linalg.py::TestQR::test_stacked_inputs_size3_outer_size0_dt0, test/torch_np/numpy_tests/linalg/test_linalg.py::TestQR::test_stacked_inputs_size3_outer_size0_dt1, test/torch_np/numpy_tests/linalg/test_linalg.py::TestQR::test_stacked_inputs_size3_outer_size0_dt2, test/torch_np/numpy_tests/linalg/test_linalg.py::TestQR::test_stacked_inputs_size3_outer_size0_dt3, test/torch_np/numpy_tests/linalg/test_linalg.py::TestQR::test_stacked_inputs_size3_outer_size1_dt0, test/torch_np/numpy_tests/linalg/test_linalg.py::TestQR::test_stacked_inputs_size3_outer_size1_dt1, test/torch_np/numpy_tests/linalg/test_linalg.py::TestQR::test_stacked_inputs_size3_outer_size1_dt2, test/torch_np/numpy_tests/linalg/test_linalg.py::TestQR::test_stacked_inputs_size3_outer_size1_dt3, 
test/torch_np/numpy_tests/linalg/test_linalg.py::TestQR::test_stacked_inputs_size3_outer_size2_dt0, test/torch_np/numpy_tests/linalg/test_linalg.py::TestQR::test_stacked_inputs_size3_outer_size2_dt1, test/torch_np/numpy_tests/linalg/test_linalg.py::TestQR::test_stacked_inputs_size3_outer_size2_dt2, test/torch_np/numpy_tests/linalg/test_linalg.py::TestQR::test_stacked_inputs_size3_outer_size2_dt3, test/torch_np/numpy_tests/linalg/test_linalg.py::TestQR::test_stacked_inputs_size4_outer_size0_dt0, test/torch_np/numpy_tests/linalg/test_linalg.py::TestQR::test_stacked_inputs_size4_outer_size0_dt1, test/torch_np/numpy_tests/linalg/test_linalg.py::TestQR::test_stacked_inputs_size4_outer_size0_dt2, test/torch_np/numpy_tests/linalg/test_linalg.py::TestQR::test_stacked_inputs_size4_outer_size0_dt3, test/torch_np/numpy_tests/linalg/test_linalg.py::TestQR::test_stacked_inputs_size4_outer_size1_dt0, test/torch_np/numpy_tests/linalg/test_linalg.py::TestQR::test_stacked_inputs_size4_outer_size1_dt1, test/torch_np/numpy_tests/linalg/test_linalg.py::TestQR::test_stacked_inputs_size4_outer_size1_dt2, test/torch_np/numpy_tests/linalg/test_linalg.py::TestQR::test_stacked_inputs_size4_outer_size1_dt3, test/torch_np/numpy_tests/linalg/test_linalg.py::TestQR::test_stacked_inputs_size4_outer_size2_dt0, test/torch_np/numpy_tests/linalg/test_linalg.py::TestQR::test_stacked_inputs_size4_outer_size2_dt1, test/torch_np/numpy_tests/linalg/test_linalg.py::TestQR::test_stacked_inputs_size4_outer_size2_dt2, test/torch_np/numpy_tests/linalg/test_linalg.py::TestQR::test_stacked_inputs_size4_outer_size2_dt3, test/torch_np/numpy_tests/linalg/test_linalg.py::TestCholesky::test_0_size, test/torch_np/numpy_tests/linalg/test_linalg.py::TestCholesky::test_basic_property_shape0_dtype0, test/torch_np/numpy_tests/linalg/test_linalg.py::TestCholesky::test_basic_property_shape0_dtype1, test/torch_np/numpy_tests/linalg/test_linalg.py::TestCholesky::test_basic_property_shape0_dtype2, 
test/torch_np/numpy_tests/linalg/test_linalg.py::TestCholesky::test_basic_property_shape0_dtype3, test/torch_np/numpy_tests/linalg/test_linalg.py::TestCholesky::test_basic_property_shape1_dtype0, test/torch_np/numpy_tests/linalg/test_linalg.py::TestCholesky::test_basic_property_shape1_dtype1, test/torch_np/numpy_tests/linalg/test_linalg.py::TestCholesky::test_basic_property_shape1_dtype2, test/torch_np/numpy_tests/linalg/test_linalg.py::TestCholesky::test_basic_property_shape1_dtype3, test/torch_np/numpy_tests/linalg/test_linalg.py::TestCholesky::test_basic_property_shape2_dtype0, test/torch_np/numpy_tests/linalg/test_linalg.py::TestCholesky::test_basic_property_shape2_dtype1, test/torch_np/numpy_tests/linalg/test_linalg.py::TestCholesky::test_basic_property_shape2_dtype2, test/torch_np/numpy_tests/linalg/test_linalg.py::TestCholesky::test_basic_property_shape2_dtype3, test/torch_np/numpy_tests/linalg/test_linalg.py::TestCholesky::test_basic_property_shape3_dtype0, test/torch_np/numpy_tests/linalg/test_linalg.py::TestCholesky::test_basic_property_shape3_dtype1, test/torch_np/numpy_tests/linalg/test_linalg.py::TestCholesky::test_basic_property_shape3_dtype2, test/torch_np/numpy_tests/linalg/test_linalg.py::TestCholesky::test_basic_property_shape3_dtype3, test/torch_np/numpy_tests/linalg/test_linalg.py::TestCholesky::test_basic_property_shape4_dtype0, test/torch_np/numpy_tests/linalg/test_linalg.py::TestCholesky::test_basic_property_shape4_dtype1, test/torch_np/numpy_tests/linalg/test_linalg.py::TestCholesky::test_basic_property_shape4_dtype2, test/torch_np/numpy_tests/linalg/test_linalg.py::TestCholesky::test_basic_property_shape4_dtype3, test/torch_np/numpy_tests/linalg/test_linalg.py::TestMisc::test_byteorder_check, test/torch_np/numpy_tests/linalg/test_linalg.py::TestMisc::test_generalized_raise_multiloop, test/torch_np/numpy_tests/linalg/test_linalg.py::TestMisc::test_sdot_bug_8577, test/torch_np/numpy_tests/linalg/test_linalg.py::TestMisc::test_xerbla_override, 
test/torch_np/numpy_tests/linalg/test_linalg.py::TestMultiDot::test_basic_function_with_dynamic_programming_optimization, test/torch_np/numpy_tests/linalg/test_linalg.py::TestMultiDot::test_basic_function_with_three_arguments, test/torch_np/numpy_tests/linalg/test_linalg.py::TestMultiDot::test_basic_function_with_two_arguments, test/torch_np/numpy_tests/linalg/test_linalg.py::TestMultiDot::test_dynamic_programming_logic, test/torch_np/numpy_tests/linalg/test_linalg.py::TestMultiDot::test_dynamic_programming_optimization_and_out, test/torch_np/numpy_tests/linalg/test_linalg.py::TestMultiDot::test_three_arguments_and_out, test/torch_np/numpy_tests/linalg/test_linalg.py::TestMultiDot::test_too_few_input_arrays, test/torch_np/numpy_tests/linalg/test_linalg.py::TestMultiDot::test_two_arguments_and_out, test/torch_np/numpy_tests/linalg/test_linalg.py::TestMultiDot::test_vector_as_first_and_last_argument, test/torch_np/numpy_tests/linalg/test_linalg.py::TestMultiDot::test_vector_as_first_argument, test/torch_np/numpy_tests/linalg/test_linalg.py::TestMultiDot::test_vector_as_last_argument, test/torch_np/numpy_tests/linalg/test_linalg.py::TestTensorinv::test_non_square_handling_arr0_ind_2, test/torch_np/numpy_tests/linalg/test_linalg.py::TestTensorinv::test_non_square_handling_arr1_ind_1, test/torch_np/numpy_tests/linalg/test_linalg.py::TestTensorinv::test_tensorinv_ind_limit_ind_-2, test/torch_np/numpy_tests/linalg/test_linalg.py::TestTensorinv::test_tensorinv_ind_limit_ind_0, test/torch_np/numpy_tests/linalg/test_linalg.py::TestTensorinv::test_tensorinv_result, test/torch_np/numpy_tests/linalg/test_linalg.py::TestTensorinv::test_tensorinv_shape_shape0_ind_2, test/torch_np/numpy_tests/linalg/test_linalg.py::TestTensorinv::test_tensorinv_shape_shape1_ind_1, test/torch_np/numpy_tests/linalg/test_linalg.py::TestTensorsolve::test_non_square_handling_a0_axes0, test/torch_np/numpy_tests/linalg/test_linalg.py::TestTensorsolve::test_non_square_handling_a1_axes1, 
test/torch_np/numpy_tests/linalg/test_linalg.py::TestTensorsolve::test_tensorsolve_result_shape0, test/torch_np/numpy_tests/linalg/test_linalg.py::TestTensorsolve::test_tensorsolve_result_shape1, test/torch_np/numpy_tests/linalg/test_linalg.py::TestTensorsolve::test_tensorsolve_result_shape2, test/torch_np/numpy_tests/linalg/test_linalg.py::TestMisc2::test_blas64_dot, test/torch_np/numpy_tests/linalg/test_linalg.py::TestMisc2::test_blas64_geqrf_lwork_smoketest, test/torch_np/numpy_tests/linalg/test_linalg.py::TestMisc2::test_unsupported_commontype 2025-10-10T02:19:16.0052553Z 2025-10-10T02:19:19.8530304Z Running test_shape_ops 1/1 ... [2025-10-10 02:19:19.852441] 2025-10-10T02:19:19.8530987Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:19:19.8532419Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_shape_ops.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:19:19.852840] 2025-10-10T02:19:23.9753116Z 2025-10-10T02:19:23.9754020Z test_shape_ops 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_shape_ops_1.1_baf0e023aba1b3cf_.log 2025-10-10T02:19:23.9777872Z Running 98 items in this shard: test/test_shape_ops.py::TestShapeOpsCUDA::test_clamp_cuda_float32, test/test_shape_ops.py::TestShapeOpsCUDA::test_clamp_cuda_int64, test/test_shape_ops.py::TestShapeOpsCUDA::test_clamp_propagates_nans_cuda, test/test_shape_ops.py::TestShapeOpsCUDA::test_clamp_raises_arg_errors_cuda, test/test_shape_ops.py::TestShapeOpsCUDA::test_complex_rot90_cuda_complex128, test/test_shape_ops.py::TestShapeOpsCUDA::test_complex_rot90_cuda_complex64, test/test_shape_ops.py::TestShapeOpsCUDA::test_diag_cuda_bool, test/test_shape_ops.py::TestShapeOpsCUDA::test_diag_cuda_float32, test/test_shape_ops.py::TestShapeOpsCUDA::test_diagonal_cuda, test/test_shape_ops.py::TestShapeOpsCUDA::test_diagonal_multidim_cuda_float32, test/test_shape_ops.py::TestShapeOpsCUDA::test_flip_cuda_bfloat16, test/test_shape_ops.py::TestShapeOpsCUDA::test_flip_cuda_bool, test/test_shape_ops.py::TestShapeOpsCUDA::test_flip_cuda_complex128, test/test_shape_ops.py::TestShapeOpsCUDA::test_flip_cuda_complex64, test/test_shape_ops.py::TestShapeOpsCUDA::test_flip_cuda_float16, test/test_shape_ops.py::TestShapeOpsCUDA::test_flip_cuda_float32, test/test_shape_ops.py::TestShapeOpsCUDA::test_flip_cuda_float64, test/test_shape_ops.py::TestShapeOpsCUDA::test_flip_cuda_int16, test/test_shape_ops.py::TestShapeOpsCUDA::test_flip_cuda_int32, test/test_shape_ops.py::TestShapeOpsCUDA::test_flip_cuda_int64, test/test_shape_ops.py::TestShapeOpsCUDA::test_flip_cuda_int8, test/test_shape_ops.py::TestShapeOpsCUDA::test_flip_cuda_uint8, test/test_shape_ops.py::TestShapeOpsCUDA::test_flip_errors_cuda_bfloat16, test/test_shape_ops.py::TestShapeOpsCUDA::test_flip_errors_cuda_bool, 
test/test_shape_ops.py::TestShapeOpsCUDA::test_flip_errors_cuda_complex128, test/test_shape_ops.py::TestShapeOpsCUDA::test_flip_errors_cuda_complex64, test/test_shape_ops.py::TestShapeOpsCUDA::test_flip_errors_cuda_float16, test/test_shape_ops.py::TestShapeOpsCUDA::test_flip_errors_cuda_float32, test/test_shape_ops.py::TestShapeOpsCUDA::test_flip_errors_cuda_float64, test/test_shape_ops.py::TestShapeOpsCUDA::test_flip_errors_cuda_int16, test/test_shape_ops.py::TestShapeOpsCUDA::test_flip_errors_cuda_int32, test/test_shape_ops.py::TestShapeOpsCUDA::test_flip_errors_cuda_int64, test/test_shape_ops.py::TestShapeOpsCUDA::test_flip_errors_cuda_int8, test/test_shape_ops.py::TestShapeOpsCUDA::test_flip_errors_cuda_uint8, test/test_shape_ops.py::TestShapeOpsCUDA::test_flip_large_tensor_cuda, test/test_shape_ops.py::TestShapeOpsCUDA::test_flip_numpy_cuda_bfloat16, test/test_shape_ops.py::TestShapeOpsCUDA::test_flip_numpy_cuda_bool, test/test_shape_ops.py::TestShapeOpsCUDA::test_flip_numpy_cuda_complex128, test/test_shape_ops.py::TestShapeOpsCUDA::test_flip_numpy_cuda_complex64, test/test_shape_ops.py::TestShapeOpsCUDA::test_flip_numpy_cuda_float16, test/test_shape_ops.py::TestShapeOpsCUDA::test_flip_numpy_cuda_float32, test/test_shape_ops.py::TestShapeOpsCUDA::test_flip_numpy_cuda_float64, test/test_shape_ops.py::TestShapeOpsCUDA::test_flip_numpy_cuda_int16, test/test_shape_ops.py::TestShapeOpsCUDA::test_flip_numpy_cuda_int32, test/test_shape_ops.py::TestShapeOpsCUDA::test_flip_numpy_cuda_int64, test/test_shape_ops.py::TestShapeOpsCUDA::test_flip_numpy_cuda_int8, test/test_shape_ops.py::TestShapeOpsCUDA::test_flip_numpy_cuda_uint8, test/test_shape_ops.py::TestShapeOpsCUDA::test_flip_unsupported_dtype_cuda_quint2x4, test/test_shape_ops.py::TestShapeOpsCUDA::test_flip_unsupported_dtype_cuda_quint4x2, test/test_shape_ops.py::TestShapeOpsCUDA::test_fliplr_cuda_complex128, test/test_shape_ops.py::TestShapeOpsCUDA::test_fliplr_cuda_float64, 
test/test_shape_ops.py::TestShapeOpsCUDA::test_fliplr_cuda_int64, test/test_shape_ops.py::TestShapeOpsCUDA::test_fliplr_invalid_cuda_complex128, test/test_shape_ops.py::TestShapeOpsCUDA::test_fliplr_invalid_cuda_float64, test/test_shape_ops.py::TestShapeOpsCUDA::test_fliplr_invalid_cuda_int64, test/test_shape_ops.py::TestShapeOpsCUDA::test_flipud_cuda_complex128, test/test_shape_ops.py::TestShapeOpsCUDA::test_flipud_cuda_float64, test/test_shape_ops.py::TestShapeOpsCUDA::test_flipud_cuda_int64, test/test_shape_ops.py::TestShapeOpsCUDA::test_flipud_invalid_cuda_complex128, test/test_shape_ops.py::TestShapeOpsCUDA::test_flipud_invalid_cuda_float64, test/test_shape_ops.py::TestShapeOpsCUDA::test_flipud_invalid_cuda_int64, test/test_shape_ops.py::TestShapeOpsCUDA::test_movedim_cuda_complex128, test/test_shape_ops.py::TestShapeOpsCUDA::test_movedim_cuda_float32, test/test_shape_ops.py::TestShapeOpsCUDA::test_movedim_cuda_int64, test/test_shape_ops.py::TestShapeOpsCUDA::test_movedim_invalid_cuda_complex128, test/test_shape_ops.py::TestShapeOpsCUDA::test_movedim_invalid_cuda_float32, test/test_shape_ops.py::TestShapeOpsCUDA::test_movedim_invalid_cuda_int64, test/test_shape_ops.py::TestShapeOpsCUDA::test_nonzero_astuple_out_cuda, test/test_shape_ops.py::TestShapeOpsCUDA::test_nonzero_cuda_bfloat16, test/test_shape_ops.py::TestShapeOpsCUDA::test_nonzero_cuda_bool, test/test_shape_ops.py::TestShapeOpsCUDA::test_nonzero_cuda_float16, test/test_shape_ops.py::TestShapeOpsCUDA::test_nonzero_cuda_float32, test/test_shape_ops.py::TestShapeOpsCUDA::test_nonzero_cuda_float64, test/test_shape_ops.py::TestShapeOpsCUDA::test_nonzero_cuda_int16, test/test_shape_ops.py::TestShapeOpsCUDA::test_nonzero_cuda_int32, test/test_shape_ops.py::TestShapeOpsCUDA::test_nonzero_cuda_int64, test/test_shape_ops.py::TestShapeOpsCUDA::test_nonzero_cuda_int8, test/test_shape_ops.py::TestShapeOpsCUDA::test_nonzero_cuda_uint8, test/test_shape_ops.py::TestShapeOpsCUDA::test_nonzero_discontiguous_cuda, 
test/test_shape_ops.py::TestShapeOpsCUDA::test_nonzero_no_warning_cuda, test/test_shape_ops.py::TestShapeOpsCUDA::test_nonzero_non_diff_cuda, test/test_shape_ops.py::TestShapeOpsCUDA::test_rot90_cuda, test/test_shape_ops.py::TestShapeOpsCUDA::test_sparse_dense_dim_cuda_complex128, test/test_shape_ops.py::TestShapeOpsCUDA::test_sparse_dense_dim_cuda_float32, test/test_shape_ops.py::TestShapeOpsCUDA::test_sparse_dense_dim_cuda_int64, test/test_shape_ops.py::TestShapeOpsCUDA::test_tolist_cuda, test/test_shape_ops.py::TestShapeOpsCUDA::test_trace_cuda_float16, test/test_shape_ops.py::TestShapeOpsCUDA::test_trace_cuda_float32, test/test_shape_ops.py::TestShapeOpsCUDA::test_trace_cuda_float64, test/test_shape_ops.py::TestShapeOpsCUDA::test_trace_cuda_int16, test/test_shape_ops.py::TestShapeOpsCUDA::test_trace_cuda_int32, test/test_shape_ops.py::TestShapeOpsCUDA::test_trace_cuda_int64, test/test_shape_ops.py::TestShapeOpsCUDA::test_trace_cuda_int8, test/test_shape_ops.py::TestShapeOpsCUDA::test_trace_cuda_uint8, test/test_shape_ops.py::TestShapeOpsCUDA::test_unbind_cuda, test/test_shape_ops.py::TestShapeOpsCUDA::test_unfold_all_devices_and_dtypes_cuda, test/test_shape_ops.py::TestShapeOpsCUDA::test_unfold_errors_cuda, test/test_shape_ops.py::TestShapeOpsCUDA::test_unfold_scalars_cuda 2025-10-10T02:19:23.9801172Z 2025-10-10T02:19:27.8933983Z Running torch_np/numpy_tests/core/test_shape_base 1/1 ... [2025-10-10 02:19:27.892757] 2025-10-10T02:19:27.8934625Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:19:27.8936305Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'torch_np/numpy_tests/core/test_shape_base.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:19:27.893184] 2025-10-10T02:19:32.0167572Z 2025-10-10T02:19:32.0168629Z torch_np/numpy_tests/core/test_shape_base 1/1 was successful, full logs can be found in artifacts with path test/test-reports/torch_np.numpy_tests.core.test_shape_base_1.1_734bdded1185faf0_.log 2025-10-10T02:19:32.0210464Z Running 119 items in this shard: test/torch_np/numpy_tests/core/test_shape_base.py::TestAtleast1d::test_0D_array, test/torch_np/numpy_tests/core/test_shape_base.py::TestAtleast1d::test_1D_array, test/torch_np/numpy_tests/core/test_shape_base.py::TestAtleast1d::test_2D_array, test/torch_np/numpy_tests/core/test_shape_base.py::TestAtleast1d::test_3D_array, test/torch_np/numpy_tests/core/test_shape_base.py::TestAtleast1d::test_r1array, test/torch_np/numpy_tests/core/test_shape_base.py::TestAtleast2d::test_0D_array, test/torch_np/numpy_tests/core/test_shape_base.py::TestAtleast2d::test_1D_array, test/torch_np/numpy_tests/core/test_shape_base.py::TestAtleast2d::test_2D_array, test/torch_np/numpy_tests/core/test_shape_base.py::TestAtleast2d::test_3D_array, test/torch_np/numpy_tests/core/test_shape_base.py::TestAtleast2d::test_r2array, test/torch_np/numpy_tests/core/test_shape_base.py::TestAtleast3d::test_0D_array, test/torch_np/numpy_tests/core/test_shape_base.py::TestAtleast3d::test_1D_array, test/torch_np/numpy_tests/core/test_shape_base.py::TestAtleast3d::test_2D_array, test/torch_np/numpy_tests/core/test_shape_base.py::TestAtleast3d::test_3D_array, test/torch_np/numpy_tests/core/test_shape_base.py::TestHstack::test_0D_array, test/torch_np/numpy_tests/core/test_shape_base.py::TestHstack::test_1D_array, test/torch_np/numpy_tests/core/test_shape_base.py::TestHstack::test_2D_array, test/torch_np/numpy_tests/core/test_shape_base.py::TestHstack::test_casting_and_dtype, test/torch_np/numpy_tests/core/test_shape_base.py::TestHstack::test_casting_and_dtype_type_error, test/torch_np/numpy_tests/core/test_shape_base.py::TestHstack::test_empty_input, 
test/torch_np/numpy_tests/core/test_shape_base.py::TestHstack::test_generator, test/torch_np/numpy_tests/core/test_shape_base.py::TestHstack::test_non_iterable, test/torch_np/numpy_tests/core/test_shape_base.py::TestVstack::test_0D_array, test/torch_np/numpy_tests/core/test_shape_base.py::TestVstack::test_1D_array, test/torch_np/numpy_tests/core/test_shape_base.py::TestVstack::test_2D_array, test/torch_np/numpy_tests/core/test_shape_base.py::TestVstack::test_2D_array2, test/torch_np/numpy_tests/core/test_shape_base.py::TestVstack::test_casting_and_dtype, test/torch_np/numpy_tests/core/test_shape_base.py::TestVstack::test_casting_and_dtype_type_error, test/torch_np/numpy_tests/core/test_shape_base.py::TestVstack::test_empty_input, test/torch_np/numpy_tests/core/test_shape_base.py::TestVstack::test_generator, test/torch_np/numpy_tests/core/test_shape_base.py::TestVstack::test_non_iterable, test/torch_np/numpy_tests/core/test_shape_base.py::TestConcatenate::test_bad_out_shape, test/torch_np/numpy_tests/core/test_shape_base.py::TestConcatenate::test_concatenate, test/torch_np/numpy_tests/core/test_shape_base.py::TestConcatenate::test_concatenate_axis_None, test/torch_np/numpy_tests/core/test_shape_base.py::TestConcatenate::test_exceptions, test/torch_np/numpy_tests/core/test_shape_base.py::TestConcatenate::test_large_concatenate_axis_None, test/torch_np/numpy_tests/core/test_shape_base.py::TestConcatenate::test_operator_concat, test/torch_np/numpy_tests/core/test_shape_base.py::TestConcatenate::test_out_and_dtype_axis0_out_dtype_c8_casting_equiv, test/torch_np/numpy_tests/core/test_shape_base.py::TestConcatenate::test_out_and_dtype_axis0_out_dtype_c8_casting_no, test/torch_np/numpy_tests/core/test_shape_base.py::TestConcatenate::test_out_and_dtype_axis0_out_dtype_c8_casting_safe, test/torch_np/numpy_tests/core/test_shape_base.py::TestConcatenate::test_out_and_dtype_axis0_out_dtype_c8_casting_same_kind, 
test/torch_np/numpy_tests/core/test_shape_base.py::TestConcatenate::test_out_and_dtype_axis0_out_dtype_c8_casting_unsafe, test/torch_np/numpy_tests/core/test_shape_base.py::TestConcatenate::test_out_and_dtype_axis0_out_dtype_f4_casting_equiv, test/torch_np/numpy_tests/core/test_shape_base.py::TestConcatenate::test_out_and_dtype_axis0_out_dtype_f4_casting_no, test/torch_np/numpy_tests/core/test_shape_base.py::TestConcatenate::test_out_and_dtype_axis0_out_dtype_f4_casting_safe, test/torch_np/numpy_tests/core/test_shape_base.py::TestConcatenate::test_out_and_dtype_axis0_out_dtype_f4_casting_same_kind, test/torch_np/numpy_tests/core/test_shape_base.py::TestConcatenate::test_out_and_dtype_axis0_out_dtype_f4_casting_unsafe, test/torch_np/numpy_tests/core/test_shape_base.py::TestConcatenate::test_out_and_dtype_axis0_out_dtype_f8_casting_equiv, test/torch_np/numpy_tests/core/test_shape_base.py::TestConcatenate::test_out_and_dtype_axis0_out_dtype_f8_casting_no, test/torch_np/numpy_tests/core/test_shape_base.py::TestConcatenate::test_out_and_dtype_axis0_out_dtype_f8_casting_safe, test/torch_np/numpy_tests/core/test_shape_base.py::TestConcatenate::test_out_and_dtype_axis0_out_dtype_f8_casting_same_kind, test/torch_np/numpy_tests/core/test_shape_base.py::TestConcatenate::test_out_and_dtype_axis0_out_dtype_f8_casting_unsafe, test/torch_np/numpy_tests/core/test_shape_base.py::TestConcatenate::test_out_and_dtype_axis0_out_dtype_i8_casting_equiv, test/torch_np/numpy_tests/core/test_shape_base.py::TestConcatenate::test_out_and_dtype_axis0_out_dtype_i8_casting_no, test/torch_np/numpy_tests/core/test_shape_base.py::TestConcatenate::test_out_and_dtype_axis0_out_dtype_i8_casting_safe, test/torch_np/numpy_tests/core/test_shape_base.py::TestConcatenate::test_out_and_dtype_axis0_out_dtype_i8_casting_same_kind, test/torch_np/numpy_tests/core/test_shape_base.py::TestConcatenate::test_out_and_dtype_axis0_out_dtype_i8_casting_unsafe, 
test/torch_np/numpy_tests/core/test_shape_base.py::TestConcatenate::test_out_and_dtype_axis_0_out_dtype_c8_casting_equiv, test/torch_np/numpy_tests/core/test_shape_base.py::TestConcatenate::test_out_and_dtype_axis_0_out_dtype_c8_casting_no, test/torch_np/numpy_tests/core/test_shape_base.py::TestConcatenate::test_out_and_dtype_axis_0_out_dtype_c8_casting_safe, test/torch_np/numpy_tests/core/test_shape_base.py::TestConcatenate::test_out_and_dtype_axis_0_out_dtype_c8_casting_same_kind, test/torch_np/numpy_tests/core/test_shape_base.py::TestConcatenate::test_out_and_dtype_axis_0_out_dtype_c8_casting_unsafe, test/torch_np/numpy_tests/core/test_shape_base.py::TestConcatenate::test_out_and_dtype_axis_0_out_dtype_f4_casting_equiv, test/torch_np/numpy_tests/core/test_shape_base.py::TestConcatenate::test_out_and_dtype_axis_0_out_dtype_f4_casting_no, test/torch_np/numpy_tests/core/test_shape_base.py::TestConcatenate::test_out_and_dtype_axis_0_out_dtype_f4_casting_safe, test/torch_np/numpy_tests/core/test_shape_base.py::TestConcatenate::test_out_and_dtype_axis_0_out_dtype_f4_casting_same_kind, test/torch_np/numpy_tests/core/test_shape_base.py::TestConcatenate::test_out_and_dtype_axis_0_out_dtype_f4_casting_unsafe, test/torch_np/numpy_tests/core/test_shape_base.py::TestConcatenate::test_out_and_dtype_axis_0_out_dtype_f8_casting_equiv, test/torch_np/numpy_tests/core/test_shape_base.py::TestConcatenate::test_out_and_dtype_axis_0_out_dtype_f8_casting_no, test/torch_np/numpy_tests/core/test_shape_base.py::TestConcatenate::test_out_and_dtype_axis_0_out_dtype_f8_casting_safe, test/torch_np/numpy_tests/core/test_shape_base.py::TestConcatenate::test_out_and_dtype_axis_0_out_dtype_f8_casting_same_kind, test/torch_np/numpy_tests/core/test_shape_base.py::TestConcatenate::test_out_and_dtype_axis_0_out_dtype_f8_casting_unsafe, test/torch_np/numpy_tests/core/test_shape_base.py::TestConcatenate::test_out_and_dtype_axis_0_out_dtype_i8_casting_equiv, 
test/torch_np/numpy_tests/core/test_shape_base.py::TestConcatenate::test_out_and_dtype_axis_0_out_dtype_i8_casting_no, test/torch_np/numpy_tests/core/test_shape_base.py::TestConcatenate::test_out_and_dtype_axis_0_out_dtype_i8_casting_safe, test/torch_np/numpy_tests/core/test_shape_base.py::TestConcatenate::test_out_and_dtype_axis_0_out_dtype_i8_casting_same_kind, test/torch_np/numpy_tests/core/test_shape_base.py::TestConcatenate::test_out_and_dtype_axis_0_out_dtype_i8_casting_unsafe, test/torch_np/numpy_tests/core/test_shape_base.py::TestConcatenate::test_out_and_dtype_simple, test/torch_np/numpy_tests/core/test_shape_base.py::TestConcatenate::test_returns_copy, test/torch_np/numpy_tests/core/test_shape_base.py::TestStackMisc::test_stack, test/torch_np/numpy_tests/core/test_shape_base.py::TestStackMisc::test_stack_out_and_dtype_axis_0_out_dtype_c8_casting_equiv, test/torch_np/numpy_tests/core/test_shape_base.py::TestStackMisc::test_stack_out_and_dtype_axis_0_out_dtype_c8_casting_no, test/torch_np/numpy_tests/core/test_shape_base.py::TestStackMisc::test_stack_out_and_dtype_axis_0_out_dtype_c8_casting_safe, test/torch_np/numpy_tests/core/test_shape_base.py::TestStackMisc::test_stack_out_and_dtype_axis_0_out_dtype_c8_casting_same_kind, test/torch_np/numpy_tests/core/test_shape_base.py::TestStackMisc::test_stack_out_and_dtype_axis_0_out_dtype_c8_casting_unsafe, test/torch_np/numpy_tests/core/test_shape_base.py::TestStackMisc::test_stack_out_and_dtype_axis_0_out_dtype_f4_casting_equiv, test/torch_np/numpy_tests/core/test_shape_base.py::TestStackMisc::test_stack_out_and_dtype_axis_0_out_dtype_f4_casting_no, test/torch_np/numpy_tests/core/test_shape_base.py::TestStackMisc::test_stack_out_and_dtype_axis_0_out_dtype_f4_casting_safe, test/torch_np/numpy_tests/core/test_shape_base.py::TestStackMisc::test_stack_out_and_dtype_axis_0_out_dtype_f4_casting_same_kind, 
test/torch_np/numpy_tests/core/test_shape_base.py::TestStackMisc::test_stack_out_and_dtype_axis_0_out_dtype_f4_casting_unsafe, test/torch_np/numpy_tests/core/test_shape_base.py::TestStackMisc::test_stack_out_and_dtype_axis_0_out_dtype_f8_casting_equiv, test/torch_np/numpy_tests/core/test_shape_base.py::TestStackMisc::test_stack_out_and_dtype_axis_0_out_dtype_f8_casting_no, test/torch_np/numpy_tests/core/test_shape_base.py::TestStackMisc::test_stack_out_and_dtype_axis_0_out_dtype_f8_casting_safe, test/torch_np/numpy_tests/core/test_shape_base.py::TestStackMisc::test_stack_out_and_dtype_axis_0_out_dtype_f8_casting_same_kind, test/torch_np/numpy_tests/core/test_shape_base.py::TestStackMisc::test_stack_out_and_dtype_axis_0_out_dtype_f8_casting_unsafe, test/torch_np/numpy_tests/core/test_shape_base.py::TestStackMisc::test_stack_out_and_dtype_axis_0_out_dtype_i8_casting_equiv, test/torch_np/numpy_tests/core/test_shape_base.py::TestStackMisc::test_stack_out_and_dtype_axis_0_out_dtype_i8_casting_no, test/torch_np/numpy_tests/core/test_shape_base.py::TestStackMisc::test_stack_out_and_dtype_axis_0_out_dtype_i8_casting_safe, test/torch_np/numpy_tests/core/test_shape_base.py::TestStackMisc::test_stack_out_and_dtype_axis_0_out_dtype_i8_casting_same_kind, test/torch_np/numpy_tests/core/test_shape_base.py::TestStackMisc::test_stack_out_and_dtype_axis_0_out_dtype_i8_casting_unsafe, test/torch_np/numpy_tests/core/test_shape_base.py::TestBlock::test_3d, test/torch_np/numpy_tests/core/test_shape_base.py::TestBlock::test_block_complicated, test/torch_np/numpy_tests/core/test_shape_base.py::TestBlock::test_block_memory_order, test/torch_np/numpy_tests/core/test_shape_base.py::TestBlock::test_block_mixed_1d_and_2d, test/torch_np/numpy_tests/core/test_shape_base.py::TestBlock::test_block_simple_column_wise, test/torch_np/numpy_tests/core/test_shape_base.py::TestBlock::test_block_simple_row_wise, test/torch_np/numpy_tests/core/test_shape_base.py::TestBlock::test_block_total_size_estimate, 
test/torch_np/numpy_tests/core/test_shape_base.py::TestBlock::test_block_with_1d_arrays_column_wise, test/torch_np/numpy_tests/core/test_shape_base.py::TestBlock::test_block_with_1d_arrays_multiple_rows, test/torch_np/numpy_tests/core/test_shape_base.py::TestBlock::test_block_with_1d_arrays_row_wise, test/torch_np/numpy_tests/core/test_shape_base.py::TestBlock::test_block_with_mismatched_shape, test/torch_np/numpy_tests/core/test_shape_base.py::TestBlock::test_different_ndims, test/torch_np/numpy_tests/core/test_shape_base.py::TestBlock::test_different_ndims_depths, test/torch_np/numpy_tests/core/test_shape_base.py::TestBlock::test_empty_lists, test/torch_np/numpy_tests/core/test_shape_base.py::TestBlock::test_invalid_nesting, test/torch_np/numpy_tests/core/test_shape_base.py::TestBlock::test_nested, test/torch_np/numpy_tests/core/test_shape_base.py::TestBlock::test_no_lists, test/torch_np/numpy_tests/core/test_shape_base.py::TestBlock::test_returns_copy, test/torch_np/numpy_tests/core/test_shape_base.py::TestBlock::test_tuple 2025-10-10T02:19:32.0250958Z 2025-10-10T02:19:35.9038281Z Running torch_np/numpy_tests/core/test_dtype 1/1 ... [2025-10-10 02:19:35.903145] 2025-10-10T02:19:35.9038939Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:19:35.9040357Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'torch_np/numpy_tests/core/test_dtype.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:19:35.903535] 2025-10-10T02:19:40.0267242Z 2025-10-10T02:19:40.0268507Z torch_np/numpy_tests/core/test_dtype 1/1 was successful, full logs can be found in artifacts with path test/test-reports/torch_np.numpy_tests.core.test_dtype_1.1_71068497346246e2_.log 2025-10-10T02:19:40.0305857Z Running 102 items in this shard: test/torch_np/numpy_tests/core/test_dtype.py::TestBuiltin::test_equivalent_dtype_hashing, test/torch_np/numpy_tests/core/test_dtype.py::TestBuiltin::test_invalid_types, test/torch_np/numpy_tests/core/test_dtype.py::TestBuiltin::test_numeric_style_types_are_invalid_dtype_Bool, test/torch_np/numpy_tests/core/test_dtype.py::TestBuiltin::test_numeric_style_types_are_invalid_dtype_Bytes0, test/torch_np/numpy_tests/core/test_dtype.py::TestBuiltin::test_numeric_style_types_are_invalid_dtype_Complex128, test/torch_np/numpy_tests/core/test_dtype.py::TestBuiltin::test_numeric_style_types_are_invalid_dtype_Complex32, test/torch_np/numpy_tests/core/test_dtype.py::TestBuiltin::test_numeric_style_types_are_invalid_dtype_Complex64, test/torch_np/numpy_tests/core/test_dtype.py::TestBuiltin::test_numeric_style_types_are_invalid_dtype_Datetime64, test/torch_np/numpy_tests/core/test_dtype.py::TestBuiltin::test_numeric_style_types_are_invalid_dtype_Float128, test/torch_np/numpy_tests/core/test_dtype.py::TestBuiltin::test_numeric_style_types_are_invalid_dtype_Float16, test/torch_np/numpy_tests/core/test_dtype.py::TestBuiltin::test_numeric_style_types_are_invalid_dtype_Float32, test/torch_np/numpy_tests/core/test_dtype.py::TestBuiltin::test_numeric_style_types_are_invalid_dtype_Float64, test/torch_np/numpy_tests/core/test_dtype.py::TestBuiltin::test_numeric_style_types_are_invalid_dtype_Int16, test/torch_np/numpy_tests/core/test_dtype.py::TestBuiltin::test_numeric_style_types_are_invalid_dtype_Int32, test/torch_np/numpy_tests/core/test_dtype.py::TestBuiltin::test_numeric_style_types_are_invalid_dtype_Int64, 
test/torch_np/numpy_tests/core/test_dtype.py::TestBuiltin::test_numeric_style_types_are_invalid_dtype_Int8, test/torch_np/numpy_tests/core/test_dtype.py::TestBuiltin::test_numeric_style_types_are_invalid_dtype_Object0, test/torch_np/numpy_tests/core/test_dtype.py::TestBuiltin::test_numeric_style_types_are_invalid_dtype_Str0, test/torch_np/numpy_tests/core/test_dtype.py::TestBuiltin::test_numeric_style_types_are_invalid_dtype_Timedelta64, test/torch_np/numpy_tests/core/test_dtype.py::TestBuiltin::test_numeric_style_types_are_invalid_dtype_UInt16, test/torch_np/numpy_tests/core/test_dtype.py::TestBuiltin::test_numeric_style_types_are_invalid_dtype_UInt32, test/torch_np/numpy_tests/core/test_dtype.py::TestBuiltin::test_numeric_style_types_are_invalid_dtype_UInt64, test/torch_np/numpy_tests/core/test_dtype.py::TestBuiltin::test_numeric_style_types_are_invalid_dtype_UInt8, test/torch_np/numpy_tests/core/test_dtype.py::TestBuiltin::test_numeric_style_types_are_invalid_dtype_Uint32, test/torch_np/numpy_tests/core/test_dtype.py::TestBuiltin::test_numeric_style_types_are_invalid_dtype_Uint64, test/torch_np/numpy_tests/core/test_dtype.py::TestBuiltin::test_numeric_style_types_are_invalid_dtype_Void0, test/torch_np/numpy_tests/core/test_dtype.py::TestBuiltin::test_richcompare_invalid_dtype_comparison_operation0, test/torch_np/numpy_tests/core/test_dtype.py::TestBuiltin::test_richcompare_invalid_dtype_comparison_operation1, test/torch_np/numpy_tests/core/test_dtype.py::TestBuiltin::test_richcompare_invalid_dtype_comparison_operation2, test/torch_np/numpy_tests/core/test_dtype.py::TestBuiltin::test_richcompare_invalid_dtype_comparison_operation3, test/torch_np/numpy_tests/core/test_dtype.py::TestBuiltin::test_richcompare_invalid_dtype_equality, test/torch_np/numpy_tests/core/test_dtype.py::TestBuiltin::test_run_t0, test/torch_np/numpy_tests/core/test_dtype.py::TestBuiltin::test_run_t1, test/torch_np/numpy_tests/core/test_dtype.py::TestBuiltin::test_run_t2, 
test/torch_np/numpy_tests/core/test_dtype.py::TestBuiltin::test_run_t3, test/torch_np/numpy_tests/core/test_dtype.py::TestDtypeAttributeDeletion::test_dtype_non_writable_attributes_deletion, test/torch_np/numpy_tests/core/test_dtype.py::TestDtypeAttributeDeletion::test_dtype_writable_attributes_deletion, test/torch_np/numpy_tests/core/test_dtype.py::TestPickling::test_builtin_t0, test/torch_np/numpy_tests/core/test_dtype.py::TestPickling::test_builtin_t1, test/torch_np/numpy_tests/core/test_dtype.py::TestPickling::test_builtin_t2, test/torch_np/numpy_tests/core/test_dtype.py::TestPickling::test_builtin_t3, test/torch_np/numpy_tests/core/test_dtype.py::TestPickling::test_builtin_t4, test/torch_np/numpy_tests/core/test_dtype.py::TestPickling::test_pickle_types_DType11, test/torch_np/numpy_tests/core/test_dtype.py::TestPickling::test_pickle_types_bool__10, test/torch_np/numpy_tests/core/test_dtype.py::TestPickling::test_pickle_types_complex128_4, test/torch_np/numpy_tests/core/test_dtype.py::TestPickling::test_pickle_types_complex64_3, test/torch_np/numpy_tests/core/test_dtype.py::TestPickling::test_pickle_types_float16_0, test/torch_np/numpy_tests/core/test_dtype.py::TestPickling::test_pickle_types_float32_1, test/torch_np/numpy_tests/core/test_dtype.py::TestPickling::test_pickle_types_float64_2, test/torch_np/numpy_tests/core/test_dtype.py::TestPickling::test_pickle_types_int16_7, test/torch_np/numpy_tests/core/test_dtype.py::TestPickling::test_pickle_types_int32_8, test/torch_np/numpy_tests/core/test_dtype.py::TestPickling::test_pickle_types_int64_9, test/torch_np/numpy_tests/core/test_dtype.py::TestPickling::test_pickle_types_int8_6, test/torch_np/numpy_tests/core/test_dtype.py::TestPickling::test_pickle_types_uint8_5, test/torch_np/numpy_tests/core/test_dtype.py::TestPromotion::test_complex_other_value_based_complex64_complex64_None, test/torch_np/numpy_tests/core/test_dtype.py::TestPromotion::test_complex_other_value_based_float16_complex64_None, 
test/torch_np/numpy_tests/core/test_dtype.py::TestPromotion::test_complex_other_value_based_float32_complex64_None, test/torch_np/numpy_tests/core/test_dtype.py::TestPromotion::test_complex_other_value_based_other_4294967295_expected1_expected_weak1, test/torch_np/numpy_tests/core/test_dtype.py::TestPromotion::test_complex_other_value_based_other_65535_expected0_expected_weak0, test/torch_np/numpy_tests/core/test_dtype.py::TestPromotion::test_complex_scalar_value_based_other0_expected0, test/torch_np/numpy_tests/core/test_dtype.py::TestPromotion::test_complex_scalar_value_based_other1_expected1, test/torch_np/numpy_tests/core/test_dtype.py::TestPromotion::test_complex_scalar_value_based_other2_expected2, test/torch_np/numpy_tests/core/test_dtype.py::TestPromotion::test_complex_scalar_value_based_other3_expected3, test/torch_np/numpy_tests/core/test_dtype.py::TestPromotion::test_complex_scalar_value_based_other4_expected4, test/torch_np/numpy_tests/core/test_dtype.py::TestPromotion::test_complex_scalar_value_based_other5_expected5, test/torch_np/numpy_tests/core/test_dtype.py::TestPromotion::test_complex_scalar_value_based_other6_expected6, test/torch_np/numpy_tests/core/test_dtype.py::TestPromotion::test_permutations_do_not_influence_result_dtypes0_expected0, test/torch_np/numpy_tests/core/test_dtype.py::TestPromotion::test_permutations_do_not_influence_result_dtypes1_expected1, test/torch_np/numpy_tests/core/test_dtype.py::TestPromotion::test_permutations_do_not_influence_result_dtypes2_expected2, test/torch_np/numpy_tests/core/test_dtype.py::TestPromotion::test_permutations_do_not_influence_result_dtypes3_expected3, test/torch_np/numpy_tests/core/test_dtype.py::TestPromotion::test_permutations_do_not_influence_result_dtypes4_expected4, test/torch_np/numpy_tests/core/test_dtype.py::TestPromotion::test_permutations_do_not_influence_result_dtypes5_expected5, 
test/torch_np/numpy_tests/core/test_dtype.py::TestPromotion::test_permutations_do_not_influence_result_dtypes6_expected6, test/torch_np/numpy_tests/core/test_dtype.py::TestPromotion::test_permutations_do_not_influence_result_dtypes7_expected7, test/torch_np/numpy_tests/core/test_dtype.py::TestPromotion::test_permutations_do_not_influence_result_dtypes8_expected8, test/torch_np/numpy_tests/core/test_dtype.py::TestPromotion::test_permutations_do_not_influence_result_dtypes9_expected9, test/torch_np/numpy_tests/core/test_dtype.py::TestPromotion::test_python_integer_promotion_val_18446744073709551616, test/torch_np/numpy_tests/core/test_dtype.py::TestPromotion::test_python_integer_promotion_val_2, test/torch_np/numpy_tests/core/test_dtype.py::TestPromotion::test_python_integer_promotion_val_200, test/torch_np/numpy_tests/core/test_dtype.py::TestPromotion::test_python_integer_promotion_val_4294967296, test/torch_np/numpy_tests/core/test_dtype.py::TestPromotion::test_python_integer_promotion_val_9223372036854775808, test/torch_np/numpy_tests/core/test_dtype.py::TestMisc::test_dtypes_are_true, test/torch_np/numpy_tests/core/test_dtype.py::TestMisc::test_keyword_argument, test/torch_np/numpy_tests/core/test_dtype.py::TestFromDTypeAttribute::test_recursion, test/torch_np/numpy_tests/core/test_dtype.py::TestFromDTypeAttribute::test_simple, test/torch_np/numpy_tests/core/test_dtype.py::TestClassGetItem::test_dtype, test/torch_np/numpy_tests/core/test_dtype.py::TestClassGetItem::test_dtype_subclass_code_?, test/torch_np/numpy_tests/core/test_dtype.py::TestClassGetItem::test_dtype_subclass_code_B, test/torch_np/numpy_tests/core/test_dtype.py::TestClassGetItem::test_dtype_subclass_code_D, test/torch_np/numpy_tests/core/test_dtype.py::TestClassGetItem::test_dtype_subclass_code_F, test/torch_np/numpy_tests/core/test_dtype.py::TestClassGetItem::test_dtype_subclass_code_b, test/torch_np/numpy_tests/core/test_dtype.py::TestClassGetItem::test_dtype_subclass_code_d, 
test/torch_np/numpy_tests/core/test_dtype.py::TestClassGetItem::test_dtype_subclass_code_e, test/torch_np/numpy_tests/core/test_dtype.py::TestClassGetItem::test_dtype_subclass_code_f, test/torch_np/numpy_tests/core/test_dtype.py::TestClassGetItem::test_dtype_subclass_code_h, test/torch_np/numpy_tests/core/test_dtype.py::TestClassGetItem::test_dtype_subclass_code_i, test/torch_np/numpy_tests/core/test_dtype.py::TestClassGetItem::test_dtype_subclass_code_l, test/torch_np/numpy_tests/core/test_dtype.py::TestClassGetItem::test_subscript_scalar, test/torch_np/numpy_tests/core/test_dtype.py::TestClassGetItem::test_subscript_tuple_arg_len_0, test/torch_np/numpy_tests/core/test_dtype.py::TestClassGetItem::test_subscript_tuple_arg_len_1, test/torch_np/numpy_tests/core/test_dtype.py::TestClassGetItem::test_subscript_tuple_arg_len_2, test/torch_np/numpy_tests/core/test_dtype.py::TestClassGetItem::test_subscript_tuple_arg_len_3 2025-10-10T02:19:40.0341630Z 2025-10-10T02:19:43.9100366Z Running test_unary_ufuncs 1/1 ... [2025-10-10 02:19:43.909453] 2025-10-10T02:19:43.9100948Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:19:43.9102582Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_unary_ufuncs.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:19:43.909866] 2025-10-10T02:20:34.4825251Z 2025-10-10T02:20:34.4828757Z test_unary_ufuncs 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_unary_ufuncs_1.1_40b935a37851e832_.log 2025-10-10T02:20:35.3769230Z Running 25072 items in this shard: test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_abs_angle_complex_to_float_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_abs_angle_complex_to_float_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_abs_big_number_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_abs_signed_zero_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_abs_signed_zero_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_abs_zero_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_abs_zero_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_bfloat16_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_bfloat16_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_bfloat16_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_bfloat16_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_bfloat16_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_bfloat16_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_bfloat16_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_bfloat16_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_bfloat16_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_bfloat16_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_bfloat16_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_bfloat16_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_bfloat16_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_bool_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_bool_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_bool_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_bool_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_bool_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_bool_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_bool_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_bool_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_bool_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_bool_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_bool_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_bool_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_bool_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_byte_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_byte_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_byte_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_byte_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_byte_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_byte_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_byte_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_byte_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_byte_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_byte_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_byte_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_byte_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_cdouble_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_cdouble_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_cdouble_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_cdouble_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_cdouble_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_cdouble_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_cdouble_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_cdouble_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_cdouble_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_cdouble_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_cdouble_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_cdouble_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_cdouble_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_cfloat_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_cfloat_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_cfloat_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_cfloat_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_cfloat_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_cfloat_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_cfloat_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_cfloat_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_cfloat_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_cfloat_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_cfloat_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_cfloat_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_cfloat_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_chalf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_chalf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_chalf_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_chalf_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_chalf_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_chalf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_chalf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_chalf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_chalf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_chalf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_chalf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_chalf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_chalf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_char_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_char_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_char_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_char_cuda_complex32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_char_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_char_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_char_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_char_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_char_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_char_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_char_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_char_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_char_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_double_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_double_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_double_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_double_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_double_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_double_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_double_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_double_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_double_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_double_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_double_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_double_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_double_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_float_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_float_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_float_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_float_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_float_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_float_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_float_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_float_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_float_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_float_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_float_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_float_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_float_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_half_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_half_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_half_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_half_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_half_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_half_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_half_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_half_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_half_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_half_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_half_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_half_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_int_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_int_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_int_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_int_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_int_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_int_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_int_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_int_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_int_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_int_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_int_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_int_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_long_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_long_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_long_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_long_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_long_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_long_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_long_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_long_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_long_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_long_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_long_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_long_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_long_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_short_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_short_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_short_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_short_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_short_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_short_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_short_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_short_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_short_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_short_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_short_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_short_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_abs_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_abs_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_abs_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_abs_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_abs_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_abs_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_abs_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_abs_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_abs_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_abs_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_abs_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_abs_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_abs_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_acos_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_acos_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_acos_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_acos_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_acos_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_acos_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_acos_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_acos_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_acos_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_acos_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_acos_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_acos_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_acos_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_acosh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_acosh_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_acosh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_acosh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_acosh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_acosh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_acosh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_acosh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_acosh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_acosh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_acosh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_acosh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_acosh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_asin_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_asin_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_asin_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_asin_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_asin_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_asin_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_asin_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_asin_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_asin_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_asin_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_asin_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_asin_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_asin_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_asinh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_asinh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_asinh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_asinh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_asinh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_asinh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_asinh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_asinh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_asinh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_asinh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_asinh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_asinh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_asinh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_atan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_atan_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_atan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_atan_cuda_complex32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_atan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_atan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_atan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_atan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_atan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_atan_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_atan_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_atan_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_atan_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_atanh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_atanh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_atanh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_atanh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_atanh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_atanh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_atanh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_atanh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_atanh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_atanh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_atanh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_atanh_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_atanh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_bitwise_not_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_bitwise_not_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_bitwise_not_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_bitwise_not_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_bitwise_not_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_bitwise_not_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_ceil_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_ceil_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_ceil_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_ceil_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_ceil_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_ceil_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_ceil_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_ceil_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_ceil_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_conj_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_conj_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_conj_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_conj_cuda_complex32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_conj_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_conj_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_conj_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_conj_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_conj_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_conj_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_conj_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_conj_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_conj_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_conj_physical_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_conj_physical_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_conj_physical_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_conj_physical_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_conj_physical_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_conj_physical_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_conj_physical_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_conj_physical_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_conj_physical_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_conj_physical_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_conj_physical_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_conj_physical_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_conj_physical_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_cos_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_cos_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_cos_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_cos_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_cos_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_cos_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_cos_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_cos_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_cos_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_cos_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_cos_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_cos_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_cos_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_cosh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_cosh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_cosh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_cosh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_cosh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_cosh_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_cosh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_cosh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_cosh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_cosh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_cosh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_cosh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_cosh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_deg2rad_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_deg2rad_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_deg2rad_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_deg2rad_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_deg2rad_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_deg2rad_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_deg2rad_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_deg2rad_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_deg2rad_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_deg2rad_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_digamma_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_digamma_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_digamma_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_digamma_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_digamma_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_digamma_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_digamma_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_digamma_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_digamma_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_digamma_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_erf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_erf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_erf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_erf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_erf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_erf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_erf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_erf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_erf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_erf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_erfc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_erfc_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_erfc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_erfc_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_erfc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_erfc_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_erfc_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_erfc_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_erfc_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_erfc_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_erfinv_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_erfinv_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_erfinv_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_erfinv_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_erfinv_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_erfinv_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_erfinv_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_erfinv_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_erfinv_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_erfinv_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_exp2_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_exp2_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_exp2_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_exp2_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_exp2_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_exp2_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_exp2_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_exp2_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_exp2_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_exp2_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_exp2_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_exp2_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_exp_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_exp_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_exp_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_exp_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_exp_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_exp_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_exp_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_exp_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_exp_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_exp_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_exp_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_exp_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_exp_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_expm1_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_expm1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_expm1_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_expm1_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_expm1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_expm1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_expm1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_expm1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_expm1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_expm1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_expm1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_expm1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_fill_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_fill_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_fill_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_fill_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_fill_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_fill_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_fill_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_fill_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_fill_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_fill_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_fill_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_fill_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_fill_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_floor_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_floor_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_floor_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_floor_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_floor_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_floor_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_floor_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_floor_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_floor_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_frac_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_frac_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_frac_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_frac_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_frexp_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_frexp_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_frexp_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_frexp_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_i0_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_i0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_i0_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_i0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_i0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_i0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_i0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_i0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_i0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_i0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_imag_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_imag_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_imag_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isfinite_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isfinite_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isfinite_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isfinite_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isfinite_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isfinite_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isfinite_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isfinite_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isfinite_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isfinite_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isfinite_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isfinite_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isfinite_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isinf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isinf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isinf_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isinf_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isinf_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isinf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isinf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isinf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isinf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isinf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isinf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isinf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isinf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isnan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isnan_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isnan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isnan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isnan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isnan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isnan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isnan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isnan_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isnan_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isnan_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isnan_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isneginf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isneginf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isneginf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isneginf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isneginf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isneginf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isneginf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isneginf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isneginf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isneginf_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isposinf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isposinf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isposinf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isposinf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isposinf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isposinf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isposinf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isposinf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isposinf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isposinf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isreal_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isreal_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isreal_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isreal_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isreal_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isreal_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isreal_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isreal_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isreal_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isreal_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isreal_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isreal_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_isreal_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_lgamma_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_lgamma_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_lgamma_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_lgamma_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_lgamma_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_lgamma_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_lgamma_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_lgamma_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_lgamma_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_lgamma_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_log10_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_log10_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_log10_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_log10_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_log10_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_log10_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_log10_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_log10_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_log10_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_log10_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_log10_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_log10_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_log1p_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_log1p_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_log1p_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_log1p_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_log1p_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_log1p_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_log1p_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_log1p_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_log1p_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_log1p_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_log1p_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_log1p_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_log2_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_log2_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_log2_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_log2_cuda_complex64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_log2_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_log2_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_log2_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_log2_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_log2_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_log2_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_log2_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_log2_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_log_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_log_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_log_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_log_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_log_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_log_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_log_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_log_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_log_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_log_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_log_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_log_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_log_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_not_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_not_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_not_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_not_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_not_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_not_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_not_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_not_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_not_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_not_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_not_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_not_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nan_to_num_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nan_to_num_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nan_to_num_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nan_to_num_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nan_to_num_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nan_to_num_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nan_to_num_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nan_to_num_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nan_to_num_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nan_to_num_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_neg_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_neg_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_neg_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_neg_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_neg_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_neg_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_neg_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_neg_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_neg_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_neg_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_neg_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_neg_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_celu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_celu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_celu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_celu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_elu_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_elu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_elu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_elu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_hardshrink_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_hardshrink_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_hardshrink_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_hardshrink_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_hardtanh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_hardtanh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_hardtanh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_hardtanh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_hardtanh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_hardtanh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_hardtanh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_hardtanh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_mish_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_mish_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_mish_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_mish_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_prelu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_prelu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_prelu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_prelu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_relu6_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_relu6_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_relu6_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_relu6_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_relu6_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_relu6_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_relu6_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_relu6_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_relu6_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_relu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_relu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_relu_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_relu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_relu_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_relu_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_relu_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_relu_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_relu_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_selu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_selu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_selu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_selu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_softplus_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_softplus_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_softplus_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_softplus_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_softshrink_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_softshrink_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_softshrink_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_softshrink_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_tanhshrink_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_tanhshrink_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_tanhshrink_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_tanhshrink_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_tanhshrink_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_tanhshrink_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_tanhshrink_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_tanhshrink_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_tanhshrink_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_tanhshrink_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_tanhshrink_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_threshold_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_threshold_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_threshold_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_threshold_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_threshold_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_threshold_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_threshold_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_threshold_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_nn_functional_threshold_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_positive_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_positive_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_positive_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_positive_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_positive_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_positive_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_positive_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_positive_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_positive_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_positive_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_positive_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_positive_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_rad2deg_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_rad2deg_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_rad2deg_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_rad2deg_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_rad2deg_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_rad2deg_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_rad2deg_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_rad2deg_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_rad2deg_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_rad2deg_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_real_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_real_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_real_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_real_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_real_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_real_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_real_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_real_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_real_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_real_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_real_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_real_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_real_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_reciprocal_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_reciprocal_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_reciprocal_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_reciprocal_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_reciprocal_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_reciprocal_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_reciprocal_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_reciprocal_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_reciprocal_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_reciprocal_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_reciprocal_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_reciprocal_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_round_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_round_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_round_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_round_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_round_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_round_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_round_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_round_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_round_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_rsqrt_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_rsqrt_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_rsqrt_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_rsqrt_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_rsqrt_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_rsqrt_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_rsqrt_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_rsqrt_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_rsqrt_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_rsqrt_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_rsqrt_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_rsqrt_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_rsqrt_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sgn_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sgn_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sgn_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sgn_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sgn_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sgn_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sgn_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sgn_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sgn_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sgn_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sgn_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sgn_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sgn_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sigmoid_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sigmoid_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sigmoid_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sigmoid_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sigmoid_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sigmoid_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sigmoid_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sigmoid_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sigmoid_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sigmoid_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sigmoid_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sigmoid_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sigmoid_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sign_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sign_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sign_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sign_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sign_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sign_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sign_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sign_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sign_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sign_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_signbit_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_signbit_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_signbit_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_signbit_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_signbit_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_signbit_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_signbit_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_signbit_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_signbit_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_signbit_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sin_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sin_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sin_cuda_complex128, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sin_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sin_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sin_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sin_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sin_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sin_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sin_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sin_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sin_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sin_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sinc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sinc_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sinc_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sinc_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sinc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sinc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sinc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sinc_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sinc_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sinc_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sinc_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sinc_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sinh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sinh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sinh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sinh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sinh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sinh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sinh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sinh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sinh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sinh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sinh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sinh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sinh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_bessel_j0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_bessel_j0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_bessel_j0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_bessel_j0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_bessel_j0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_bessel_j0_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_bessel_j0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_bessel_j0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_bessel_j1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_bessel_j1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_bessel_j1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_bessel_j1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_bessel_j1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_bessel_j1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_bessel_j1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_bessel_j1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_entr_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_entr_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_entr_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_entr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_entr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_entr_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_entr_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_entr_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_entr_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_entr_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_erfcx_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_erfcx_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_erfcx_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_erfcx_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_erfcx_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_erfcx_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_erfcx_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_erfcx_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_i0e_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_i0e_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_i0e_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_i0e_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_i0e_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_i0e_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_i0e_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_i0e_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_i0e_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_i0e_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_i1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_i1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_i1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_i1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_i1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_i1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_i1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_i1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_i1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_i1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_i1e_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_i1e_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_i1e_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_i1e_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_i1e_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_i1e_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_i1e_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_i1e_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_i1e_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_i1e_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_log_ndtr_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_log_ndtr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_log_ndtr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_log_ndtr_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_log_ndtr_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_log_ndtr_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_log_ndtr_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_log_ndtr_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_logit_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_logit_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_logit_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_logit_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_logit_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_logit_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_logit_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_logit_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_logit_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_logit_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_multigammaln_mvlgamma_p_1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_multigammaln_mvlgamma_p_1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_multigammaln_mvlgamma_p_1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_multigammaln_mvlgamma_p_1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_multigammaln_mvlgamma_p_1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_multigammaln_mvlgamma_p_1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_multigammaln_mvlgamma_p_1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_multigammaln_mvlgamma_p_1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_multigammaln_mvlgamma_p_1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_multigammaln_mvlgamma_p_3_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_multigammaln_mvlgamma_p_3_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_multigammaln_mvlgamma_p_3_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_multigammaln_mvlgamma_p_3_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_multigammaln_mvlgamma_p_3_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_multigammaln_mvlgamma_p_3_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_multigammaln_mvlgamma_p_3_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_multigammaln_mvlgamma_p_3_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_multigammaln_mvlgamma_p_3_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_multigammaln_mvlgamma_p_5_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_multigammaln_mvlgamma_p_5_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_multigammaln_mvlgamma_p_5_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_multigammaln_mvlgamma_p_5_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_multigammaln_mvlgamma_p_5_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_multigammaln_mvlgamma_p_5_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_multigammaln_mvlgamma_p_5_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_multigammaln_mvlgamma_p_5_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_multigammaln_mvlgamma_p_5_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_ndtr_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_ndtr_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_ndtr_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_ndtr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_ndtr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_ndtr_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_ndtr_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_ndtr_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_ndtr_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_ndtr_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_ndtri_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_ndtri_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_ndtri_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_ndtri_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_ndtri_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_ndtri_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_ndtri_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_ndtri_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_spherical_bessel_j0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_spherical_bessel_j0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_spherical_bessel_j0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_spherical_bessel_j0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_spherical_bessel_j0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_spherical_bessel_j0_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_spherical_bessel_j0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_special_spherical_bessel_j0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sqrt_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sqrt_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sqrt_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sqrt_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sqrt_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sqrt_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sqrt_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sqrt_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sqrt_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sqrt_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sqrt_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sqrt_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_sqrt_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_square_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_square_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_square_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_square_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_square_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_square_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_square_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_square_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_square_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_square_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_square_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_square_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_tan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_tan_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_tan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_tan_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_tan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_tan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_tan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_tan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_tan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_tan_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_tan_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_tan_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_tan_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_tanh_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_tanh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_tanh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_tanh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_tanh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_tanh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_tanh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_tanh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_tanh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_tanh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_tanh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_tanh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_tanh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_trunc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_trunc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_trunc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_trunc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_trunc_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_trunc_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_trunc_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_trunc_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing__refs_trunc_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_abs_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_abs_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_abs_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_abs_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_abs_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_abs_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_abs_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_abs_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_abs_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_abs_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_abs_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_abs_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_abs_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_acos_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_acos_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_acos_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_acos_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_acos_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_acos_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_acos_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_acos_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_acos_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_acos_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_acos_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_acos_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_acos_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_acosh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_acosh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_acosh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_acosh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_acosh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_acosh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_acosh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_acosh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_acosh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_acosh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_acosh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_acosh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_acosh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_angle_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_angle_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_angle_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_angle_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_angle_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_angle_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_angle_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_angle_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_angle_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_angle_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_angle_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_asin_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_asin_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_asin_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_asin_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_asin_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_asin_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_asin_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_asin_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_asin_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_asin_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_asin_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_asin_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_asin_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_asinh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_asinh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_asinh_cuda_complex128, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_asinh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_asinh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_asinh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_asinh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_asinh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_asinh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_asinh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_asinh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_asinh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_asinh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_atan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_atan_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_atan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_atan_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_atan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_atan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_atan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_atan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_atan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_atan_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_atan_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_atan_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_atan_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_atanh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_atanh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_atanh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_atanh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_atanh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_atanh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_atanh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_atanh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_atanh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_atanh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_atanh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_atanh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_atanh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_bfloat16_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_bfloat16_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_bfloat16_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_bfloat16_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_bfloat16_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_bfloat16_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_bfloat16_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_bfloat16_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_bfloat16_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_bfloat16_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_bfloat16_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_bfloat16_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_bfloat16_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_bitwise_not_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_bitwise_not_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_bitwise_not_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_bitwise_not_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_bitwise_not_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_bitwise_not_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_bool_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_bool_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_bool_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_bool_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_bool_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_bool_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_bool_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_bool_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_bool_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_bool_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_bool_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_bool_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_bool_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_byte_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_byte_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_byte_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_byte_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_byte_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_byte_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_byte_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_byte_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_byte_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_byte_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_byte_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_byte_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_cdouble_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_cdouble_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_cdouble_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_cdouble_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_cdouble_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_cdouble_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_cdouble_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_cdouble_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_cdouble_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_cdouble_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_cdouble_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_cdouble_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_cdouble_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_ceil_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_ceil_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_ceil_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_ceil_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_ceil_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_ceil_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_ceil_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_ceil_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_ceil_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_cfloat_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_cfloat_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_cfloat_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_cfloat_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_cfloat_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_cfloat_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_cfloat_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_cfloat_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_cfloat_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_cfloat_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_cfloat_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_cfloat_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_cfloat_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_chalf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_chalf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_chalf_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_chalf_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_chalf_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_chalf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_chalf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_chalf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_chalf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_chalf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_chalf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_chalf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_chalf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_char_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_char_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_char_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_char_cuda_complex32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_char_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_char_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_char_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_char_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_char_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_char_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_char_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_char_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_char_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_conj_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_conj_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_conj_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_conj_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_conj_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_conj_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_conj_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_conj_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_conj_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_conj_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_conj_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_conj_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_conj_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_conj_physical_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_conj_physical_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_conj_physical_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_conj_physical_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_conj_physical_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_conj_physical_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_conj_physical_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_conj_physical_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_conj_physical_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_conj_physical_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_conj_physical_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_conj_physical_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_conj_physical_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_cos_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_cos_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_cos_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_cos_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_cos_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_cos_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_cos_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_cos_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_cos_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_cos_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_cos_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_cos_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_cos_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_cosh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_cosh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_cosh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_cosh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_cosh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_cosh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_cosh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_cosh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_cosh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_cosh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_cosh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_cosh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_cosh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_deg2rad_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_deg2rad_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_deg2rad_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_deg2rad_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_deg2rad_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_deg2rad_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_deg2rad_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_deg2rad_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_deg2rad_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_deg2rad_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_digamma_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_digamma_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_digamma_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_digamma_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_digamma_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_digamma_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_digamma_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_digamma_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_digamma_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_digamma_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_double_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_double_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_double_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_double_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_double_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_double_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_double_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_double_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_double_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_double_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_double_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_double_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_double_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_erf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_erf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_erf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_erf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_erf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_erf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_erf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_erf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_erf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_erf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_erfc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_erfc_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_erfc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_erfc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_erfc_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_erfc_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_erfc_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_erfc_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_erfc_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_erfc_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_erfinv_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_erfinv_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_erfinv_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_erfinv_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_erfinv_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_erfinv_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_erfinv_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_erfinv_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_erfinv_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_erfinv_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_exp2_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_exp2_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_exp2_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_exp2_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_exp2_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_exp2_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_exp2_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_exp2_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_exp2_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_exp2_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_exp2_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_exp2_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_exp_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_exp_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_exp_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_exp_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_exp_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_exp_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_exp_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_exp_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_exp_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_exp_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_exp_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_exp_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_exp_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_expm1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_expm1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_expm1_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_expm1_cuda_complex64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_expm1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_expm1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_expm1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_expm1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_expm1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_expm1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_expm1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_expm1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_fill_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_fill_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_fill_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_fill_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_fill_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_fill_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_fill_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_fill_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_fill_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_fill_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_fill_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_fill_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_fill_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_float_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_float_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_float_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_float_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_float_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_float_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_float_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_float_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_float_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_float_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_float_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_float_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_float_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_floor_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_floor_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_floor_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_floor_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_floor_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_floor_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_floor_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_floor_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_floor_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_frac_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_frac_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_frac_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_frac_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_frexp_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_frexp_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_frexp_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_frexp_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_half_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_half_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_half_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_half_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_half_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_half_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_half_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_half_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_half_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_half_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_half_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_half_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_i0_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_i0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_i0_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_i0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_i0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_i0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_i0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_i0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_i0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_i0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_imag_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_imag_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_imag_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_int_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_int_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_int_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_int_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_int_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_int_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_int_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_int_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_int_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_int_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_int_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_int_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isfinite_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isfinite_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isfinite_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isfinite_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isfinite_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isfinite_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isfinite_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isfinite_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isfinite_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isfinite_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isfinite_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isfinite_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isfinite_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isinf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isinf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isinf_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isinf_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isinf_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isinf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isinf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isinf_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isinf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isinf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isinf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isinf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isinf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isnan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isnan_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isnan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isnan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isnan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isnan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isnan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isnan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isnan_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isnan_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isnan_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isnan_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isneginf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isneginf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isneginf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isneginf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isneginf_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isneginf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isneginf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isneginf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isneginf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isneginf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isposinf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isposinf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isposinf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isposinf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isposinf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isposinf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isposinf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isposinf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isposinf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isposinf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isreal_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isreal_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isreal_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isreal_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isreal_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isreal_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isreal_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isreal_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isreal_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isreal_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isreal_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isreal_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_isreal_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_jiterator_unary_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_jiterator_unary_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_jiterator_unary_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_jiterator_unary_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_jiterator_unary_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_jiterator_unary_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_jiterator_unary_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_jiterator_unary_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_jiterator_unary_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_jiterator_unary_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_jiterator_unary_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_jiterator_unary_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_lgamma_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_lgamma_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_lgamma_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_lgamma_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_lgamma_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_lgamma_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_lgamma_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_lgamma_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_lgamma_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_lgamma_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_log10_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_log10_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_log10_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_log10_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_log10_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_log10_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_log10_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_log10_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_log10_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_log10_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_log10_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_log10_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_log1p_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_log1p_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_log1p_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_log1p_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_log1p_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_log1p_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_log1p_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_log1p_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_log1p_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_log1p_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_log1p_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_log1p_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_log2_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_log2_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_log2_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_log2_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_log2_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_log2_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_log2_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_log2_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_log2_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_log2_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_log2_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_log2_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_log_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_log_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_log_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_log_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_log_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_log_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_log_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_log_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_log_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_log_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_log_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_log_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_log_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_logical_not_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_logical_not_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_logical_not_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_logical_not_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_logical_not_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_logical_not_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_logical_not_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_logical_not_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_logical_not_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_logical_not_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_logical_not_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_logical_not_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_logit_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_logit_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_logit_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_logit_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_logit_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_logit_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_logit_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_logit_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_logit_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_logit_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_long_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_long_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_long_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_long_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_long_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_long_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_long_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_long_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_long_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_long_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_long_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_long_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_long_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_mvlgamma_mvlgamma_p_1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_mvlgamma_mvlgamma_p_1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_mvlgamma_mvlgamma_p_1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_mvlgamma_mvlgamma_p_1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_mvlgamma_mvlgamma_p_1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_mvlgamma_mvlgamma_p_1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_mvlgamma_mvlgamma_p_1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_mvlgamma_mvlgamma_p_1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_mvlgamma_mvlgamma_p_3_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_mvlgamma_mvlgamma_p_3_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_mvlgamma_mvlgamma_p_3_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_mvlgamma_mvlgamma_p_3_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_mvlgamma_mvlgamma_p_3_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_mvlgamma_mvlgamma_p_3_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_mvlgamma_mvlgamma_p_3_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_mvlgamma_mvlgamma_p_3_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_mvlgamma_mvlgamma_p_5_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_mvlgamma_mvlgamma_p_5_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_mvlgamma_mvlgamma_p_5_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_mvlgamma_mvlgamma_p_5_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_mvlgamma_mvlgamma_p_5_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_mvlgamma_mvlgamma_p_5_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_mvlgamma_mvlgamma_p_5_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_mvlgamma_mvlgamma_p_5_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nan_to_num_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nan_to_num_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nan_to_num_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nan_to_num_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nan_to_num_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nan_to_num_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nan_to_num_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nan_to_num_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nan_to_num_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nan_to_num_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_neg_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_neg_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_neg_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_neg_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_neg_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_neg_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_neg_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_neg_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_neg_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_neg_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_neg_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_neg_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_celu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_celu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_celu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_celu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_elu_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_elu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_elu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_elu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_hardshrink_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_hardshrink_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_hardshrink_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_hardshrink_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_hardsigmoid_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_hardsigmoid_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_hardsigmoid_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_hardsigmoid_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_hardtanh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_hardtanh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_hardtanh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_hardtanh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_hardtanh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_hardtanh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_hardtanh_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_hardtanh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_logsigmoid_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_logsigmoid_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_logsigmoid_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_logsigmoid_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_mish_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_mish_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_mish_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_mish_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_prelu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_prelu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_prelu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_prelu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_relu6_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_relu6_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_relu6_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_relu6_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_relu6_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_relu6_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_relu6_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_relu6_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_relu6_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_relu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_relu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_relu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_relu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_relu_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_relu_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_relu_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_relu_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_relu_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_rrelu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_rrelu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_rrelu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_rrelu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_selu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_selu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_selu_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_selu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_silu_complex_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_silu_complex_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_silu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_silu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_silu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_silu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_softplus_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_softplus_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_softplus_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_softplus_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_softshrink_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_softshrink_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_softshrink_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_softshrink_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_softsign_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_softsign_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_softsign_cuda_complex128, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_softsign_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_softsign_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_softsign_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_softsign_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_softsign_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_softsign_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_softsign_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_softsign_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_softsign_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_tanhshrink_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_tanhshrink_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_tanhshrink_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_tanhshrink_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_tanhshrink_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_tanhshrink_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_tanhshrink_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_tanhshrink_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_tanhshrink_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_tanhshrink_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_tanhshrink_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_threshold_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_threshold_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_threshold_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_threshold_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_threshold_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_threshold_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_threshold_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_threshold_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_nn_functional_threshold_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_polygamma_polygamma_n_0_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_polygamma_polygamma_n_0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_polygamma_polygamma_n_0_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_polygamma_polygamma_n_0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_polygamma_polygamma_n_0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_polygamma_polygamma_n_0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_polygamma_polygamma_n_0_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_polygamma_polygamma_n_0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_polygamma_polygamma_n_0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_polygamma_polygamma_n_0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_polygamma_polygamma_n_1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_polygamma_polygamma_n_1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_polygamma_polygamma_n_1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_polygamma_polygamma_n_1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_polygamma_polygamma_n_1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_polygamma_polygamma_n_1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_polygamma_polygamma_n_1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_polygamma_polygamma_n_1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_polygamma_polygamma_n_1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_polygamma_polygamma_n_1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_polygamma_polygamma_n_2_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_polygamma_polygamma_n_2_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_polygamma_polygamma_n_2_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_polygamma_polygamma_n_2_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_polygamma_polygamma_n_2_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_polygamma_polygamma_n_2_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_polygamma_polygamma_n_2_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_polygamma_polygamma_n_2_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_polygamma_polygamma_n_2_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_polygamma_polygamma_n_2_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_polygamma_polygamma_n_3_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_polygamma_polygamma_n_3_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_polygamma_polygamma_n_3_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_polygamma_polygamma_n_3_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_polygamma_polygamma_n_3_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_polygamma_polygamma_n_3_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_polygamma_polygamma_n_3_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_polygamma_polygamma_n_3_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_polygamma_polygamma_n_3_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_polygamma_polygamma_n_3_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_polygamma_polygamma_n_4_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_polygamma_polygamma_n_4_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_polygamma_polygamma_n_4_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_polygamma_polygamma_n_4_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_polygamma_polygamma_n_4_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_polygamma_polygamma_n_4_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_polygamma_polygamma_n_4_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_polygamma_polygamma_n_4_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_polygamma_polygamma_n_4_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_polygamma_polygamma_n_4_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_positive_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_positive_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_positive_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_positive_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_positive_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_positive_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_positive_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_positive_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_positive_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_positive_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_positive_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_positive_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_rad2deg_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_rad2deg_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_rad2deg_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_rad2deg_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_rad2deg_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_rad2deg_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_rad2deg_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_rad2deg_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_rad2deg_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_rad2deg_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_real_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_real_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_real_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_real_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_real_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_real_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_real_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_real_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_real_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_real_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_real_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_real_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_real_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_reciprocal_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_reciprocal_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_reciprocal_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_reciprocal_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_reciprocal_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_reciprocal_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_reciprocal_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_reciprocal_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_reciprocal_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_reciprocal_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_reciprocal_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_reciprocal_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_round_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_round_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_round_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_round_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_round_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_round_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_round_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_round_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_round_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_round_decimals_0_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_round_decimals_0_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_round_decimals_0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_round_decimals_0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_round_decimals_3_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_round_decimals_3_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_round_decimals_3_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_round_decimals_3_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_round_decimals_neg_3_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_round_decimals_neg_3_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_round_decimals_neg_3_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_round_decimals_neg_3_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_rsqrt_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_rsqrt_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_rsqrt_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_rsqrt_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_rsqrt_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_rsqrt_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_rsqrt_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_rsqrt_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_rsqrt_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_rsqrt_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_rsqrt_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_rsqrt_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_rsqrt_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sgn_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sgn_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sgn_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sgn_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sgn_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sgn_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sgn_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sgn_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sgn_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sgn_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sgn_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sgn_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sgn_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_short_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_short_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_short_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_short_cuda_complex64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_short_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_short_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_short_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_short_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_short_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_short_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_short_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_short_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sigmoid_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sigmoid_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sigmoid_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sigmoid_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sigmoid_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sigmoid_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sigmoid_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sigmoid_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sigmoid_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sigmoid_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sigmoid_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sigmoid_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sigmoid_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sign_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sign_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sign_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sign_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sign_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sign_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sign_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sign_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sign_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sign_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_signbit_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_signbit_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_signbit_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_signbit_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_signbit_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_signbit_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_signbit_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_signbit_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_signbit_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_signbit_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sin_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sin_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sin_cuda_complex128, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sin_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sin_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sin_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sin_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sin_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sin_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sin_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sin_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sin_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sin_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sinc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sinc_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sinc_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sinc_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sinc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sinc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sinc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sinc_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sinc_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sinc_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sinc_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sinc_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sinh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sinh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sinh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sinh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sinh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sinh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sinh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sinh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sinh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sinh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sinh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sinh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sinh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_airy_ai_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_airy_ai_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_airy_ai_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_airy_ai_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_airy_ai_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_airy_ai_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_airy_ai_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_airy_ai_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_bessel_j0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_bessel_j0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_bessel_j0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_bessel_j0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_bessel_j0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_bessel_j0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_bessel_j0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_bessel_j0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_bessel_j1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_bessel_j1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_bessel_j1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_bessel_j1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_bessel_j1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_bessel_j1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_bessel_j1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_bessel_j1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_bessel_y0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_bessel_y0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_bessel_y0_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_bessel_y0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_bessel_y0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_bessel_y0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_bessel_y0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_bessel_y0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_bessel_y1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_bessel_y1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_bessel_y1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_bessel_y1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_bessel_y1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_bessel_y1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_bessel_y1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_bessel_y1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_entr_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_entr_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_entr_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_entr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_entr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_entr_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_entr_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_entr_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_entr_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_entr_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_erfcx_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_erfcx_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_erfcx_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_erfcx_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_erfcx_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_erfcx_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_erfcx_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_erfcx_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_i0e_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_i0e_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_i0e_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_i0e_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_i0e_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_i0e_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_i0e_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_i0e_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_i0e_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_i0e_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_i1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_i1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_i1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_i1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_i1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_i1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_i1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_i1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_i1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_i1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_i1e_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_i1e_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_i1e_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_i1e_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_i1e_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_i1e_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_i1e_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_i1e_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_i1e_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_i1e_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_log_ndtr_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_log_ndtr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_log_ndtr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_log_ndtr_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_log_ndtr_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_log_ndtr_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_log_ndtr_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_log_ndtr_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_modified_bessel_i0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_modified_bessel_i0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_modified_bessel_i0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_modified_bessel_i0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_modified_bessel_i0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_modified_bessel_i0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_modified_bessel_i0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_modified_bessel_i0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_modified_bessel_i1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_modified_bessel_i1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_modified_bessel_i1_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_modified_bessel_i1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_modified_bessel_i1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_modified_bessel_i1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_modified_bessel_i1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_modified_bessel_i1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_modified_bessel_k0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_modified_bessel_k0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_modified_bessel_k0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_modified_bessel_k0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_modified_bessel_k0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_modified_bessel_k0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_modified_bessel_k0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_modified_bessel_k0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_modified_bessel_k1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_modified_bessel_k1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_modified_bessel_k1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_modified_bessel_k1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_modified_bessel_k1_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_modified_bessel_k1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_modified_bessel_k1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_modified_bessel_k1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_ndtr_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_ndtr_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_ndtr_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_ndtr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_ndtr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_ndtr_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_ndtr_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_ndtr_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_ndtr_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_ndtr_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_ndtri_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_ndtri_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_ndtri_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_ndtri_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_ndtri_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_ndtri_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_ndtri_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_ndtri_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_polygamma_special_polygamma_n_0_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_polygamma_special_polygamma_n_0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_polygamma_special_polygamma_n_0_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_polygamma_special_polygamma_n_0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_polygamma_special_polygamma_n_0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_polygamma_special_polygamma_n_0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_polygamma_special_polygamma_n_0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_polygamma_special_polygamma_n_0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_polygamma_special_polygamma_n_0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_scaled_modified_bessel_k0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_scaled_modified_bessel_k0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_scaled_modified_bessel_k0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_scaled_modified_bessel_k0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_scaled_modified_bessel_k0_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_scaled_modified_bessel_k0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_scaled_modified_bessel_k0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_scaled_modified_bessel_k0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_scaled_modified_bessel_k1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_scaled_modified_bessel_k1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_scaled_modified_bessel_k1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_scaled_modified_bessel_k1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_scaled_modified_bessel_k1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_scaled_modified_bessel_k1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_scaled_modified_bessel_k1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_scaled_modified_bessel_k1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_spherical_bessel_j0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_spherical_bessel_j0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_spherical_bessel_j0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_spherical_bessel_j0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_spherical_bessel_j0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_spherical_bessel_j0_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_spherical_bessel_j0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_special_spherical_bessel_j0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sqrt_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sqrt_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sqrt_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sqrt_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sqrt_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sqrt_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sqrt_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sqrt_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sqrt_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sqrt_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sqrt_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sqrt_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_sqrt_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_square_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_square_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_square_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_square_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_square_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_square_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_square_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_square_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_square_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_square_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_square_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_square_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_tan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_tan_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_tan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_tan_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_tan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_tan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_tan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_tan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_tan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_tan_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_tan_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_tan_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_tan_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_tanh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_tanh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_tanh_cuda_complex128, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_tanh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_tanh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_tanh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_tanh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_tanh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_tanh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_tanh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_tanh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_tanh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_tanh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_trunc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_trunc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_trunc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_trunc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_trunc_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_trunc_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_trunc_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_trunc_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_batch_vs_slicing_trunc_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_complex_edge_values_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_bfloat16_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_bfloat16_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_bfloat16_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_bfloat16_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_bfloat16_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_bfloat16_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_bfloat16_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_bfloat16_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_bfloat16_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_bfloat16_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_bfloat16_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_bfloat16_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_bfloat16_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_bool_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_bool_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_bool_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_bool_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_bool_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_bool_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_bool_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_bool_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_bool_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_bool_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_bool_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_bool_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_bool_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_byte_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_byte_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_byte_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_byte_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_byte_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_byte_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_byte_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_byte_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_byte_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_byte_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_byte_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_byte_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_cdouble_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_cdouble_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_cdouble_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_cdouble_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_cdouble_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_cdouble_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_cdouble_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_cdouble_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_cdouble_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_cdouble_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_cdouble_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_cdouble_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_cdouble_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_cfloat_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_cfloat_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_cfloat_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_cfloat_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_cfloat_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_cfloat_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_cfloat_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_cfloat_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_cfloat_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_cfloat_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_cfloat_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_cfloat_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_cfloat_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_chalf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_chalf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_chalf_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_chalf_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_chalf_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_chalf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_chalf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_chalf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_chalf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_chalf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_chalf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_chalf_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_chalf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_char_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_char_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_char_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_char_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_char_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_char_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_char_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_char_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_char_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_char_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_char_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_char_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_char_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_double_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_double_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_double_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_double_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_double_cuda_complex64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_double_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_double_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_double_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_double_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_double_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_double_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_double_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_double_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_float_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_float_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_float_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_float_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_float_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_float_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_float_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_float_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_float_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_float_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_float_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_float_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_float_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_half_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_half_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_half_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_half_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_half_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_half_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_half_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_half_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_half_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_half_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_half_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_half_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_int_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_int_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_int_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_int_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_int_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_int_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_int_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_int_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_int_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_int_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_int_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_int_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_long_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_long_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_long_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_long_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_long_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_long_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_long_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_long_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_long_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_long_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_long_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_long_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_long_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_short_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_short_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_short_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_short_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_short_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_short_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_short_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_short_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_short_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_short_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_short_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs__conversions_short_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_abs_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_abs_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_abs_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_abs_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_abs_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_abs_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_abs_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_abs_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_abs_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_abs_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_abs_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_abs_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_abs_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_acos_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_acos_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_acos_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_acos_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_acos_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_acos_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_acos_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_acos_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_acos_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_acos_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_acos_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_acos_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_acos_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_acosh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_acosh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_acosh_cuda_complex128, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_acosh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_acosh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_acosh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_acosh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_acosh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_acosh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_acosh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_acosh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_acosh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_acosh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_asin_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_asin_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_asin_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_asin_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_asin_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_asin_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_asin_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_asin_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_asin_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_asin_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_asin_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_asin_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_asin_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_asinh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_asinh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_asinh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_asinh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_asinh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_asinh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_asinh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_asinh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_asinh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_asinh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_asinh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_asinh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_asinh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_atan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_atan_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_atan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_atan_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_atan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_atan_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_atan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_atan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_atan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_atan_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_atan_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_atan_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_atan_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_atanh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_atanh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_atanh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_atanh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_atanh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_atanh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_atanh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_atanh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_atanh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_atanh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_atanh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_atanh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_atanh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_bitwise_not_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_bitwise_not_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_bitwise_not_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_bitwise_not_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_bitwise_not_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_bitwise_not_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_ceil_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_ceil_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_ceil_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_ceil_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_ceil_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_ceil_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_ceil_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_ceil_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_ceil_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_conj_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_conj_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_conj_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_conj_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_conj_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_conj_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_conj_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_conj_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_conj_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_conj_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_conj_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_conj_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_conj_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_conj_physical_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_conj_physical_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_conj_physical_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_conj_physical_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_conj_physical_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_conj_physical_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_conj_physical_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_conj_physical_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_conj_physical_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_conj_physical_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_conj_physical_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_conj_physical_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_conj_physical_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_cos_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_cos_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_cos_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_cos_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_cos_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_cos_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_cos_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_cos_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_cos_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_cos_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_cos_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_cos_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_cos_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_cosh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_cosh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_cosh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_cosh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_cosh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_cosh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_cosh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_cosh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_cosh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_cosh_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_cosh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_cosh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_cosh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_deg2rad_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_deg2rad_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_deg2rad_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_deg2rad_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_deg2rad_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_deg2rad_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_deg2rad_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_deg2rad_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_deg2rad_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_deg2rad_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_digamma_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_digamma_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_digamma_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_digamma_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_digamma_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_digamma_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_digamma_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_digamma_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_digamma_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_digamma_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_erf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_erf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_erf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_erf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_erf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_erf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_erf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_erf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_erf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_erf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_erfc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_erfc_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_erfc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_erfc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_erfc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_erfc_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_erfc_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_erfc_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_erfc_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_erfc_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_erfinv_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_erfinv_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_erfinv_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_erfinv_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_erfinv_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_erfinv_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_erfinv_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_erfinv_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_erfinv_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_erfinv_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_exp2_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_exp2_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_exp2_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_exp2_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_exp2_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_exp2_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_exp2_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_exp2_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_exp2_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_exp2_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_exp2_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_exp2_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_exp_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_exp_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_exp_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_exp_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_exp_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_exp_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_exp_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_exp_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_exp_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_exp_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_exp_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_exp_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_exp_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_expm1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_expm1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_expm1_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_expm1_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_expm1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_expm1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_expm1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_expm1_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_expm1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_expm1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_expm1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_expm1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_fill_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_fill_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_fill_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_fill_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_fill_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_fill_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_fill_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_fill_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_fill_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_fill_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_fill_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_fill_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_fill_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_floor_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_floor_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_floor_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_floor_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_floor_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_floor_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_floor_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_floor_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_floor_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_frac_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_frac_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_frac_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_frac_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_frexp_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_frexp_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_frexp_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_frexp_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_i0_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_i0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_i0_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_i0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_i0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_i0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_i0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_i0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_i0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_i0_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_imag_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_imag_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_imag_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isfinite_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isfinite_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isfinite_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isfinite_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isfinite_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isfinite_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isfinite_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isfinite_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isfinite_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isfinite_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isfinite_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isfinite_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isfinite_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isinf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isinf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isinf_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isinf_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isinf_cuda_complex64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isinf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isinf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isinf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isinf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isinf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isinf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isinf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isinf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isnan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isnan_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isnan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isnan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isnan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isnan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isnan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isnan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isnan_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isnan_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isnan_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isnan_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isneginf_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isneginf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isneginf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isneginf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isneginf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isneginf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isneginf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isneginf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isneginf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isneginf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isposinf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isposinf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isposinf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isposinf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isposinf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isposinf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isposinf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isposinf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isposinf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isposinf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isreal_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isreal_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isreal_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isreal_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isreal_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isreal_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isreal_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isreal_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isreal_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isreal_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isreal_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isreal_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_isreal_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_lgamma_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_lgamma_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_lgamma_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_lgamma_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_lgamma_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_lgamma_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_lgamma_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_lgamma_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_lgamma_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_lgamma_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_log10_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_log10_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_log10_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_log10_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_log10_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_log10_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_log10_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_log10_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_log10_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_log10_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_log10_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_log10_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_log1p_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_log1p_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_log1p_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_log1p_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_log1p_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_log1p_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_log1p_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_log1p_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_log1p_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_log1p_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_log1p_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_log1p_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_log2_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_log2_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_log2_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_log2_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_log2_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_log2_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_log2_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_log2_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_log2_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_log2_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_log2_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_log2_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_log_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_log_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_log_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_log_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_log_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_log_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_log_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_log_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_log_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_log_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_log_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_log_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_log_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_logical_not_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_logical_not_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_logical_not_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_logical_not_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_logical_not_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_logical_not_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_logical_not_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_logical_not_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_logical_not_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_logical_not_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_logical_not_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_logical_not_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nan_to_num_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nan_to_num_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nan_to_num_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nan_to_num_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nan_to_num_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nan_to_num_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nan_to_num_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nan_to_num_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nan_to_num_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nan_to_num_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_neg_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_neg_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_neg_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_neg_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_neg_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_neg_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_neg_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_neg_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_neg_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_neg_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_neg_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_neg_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_celu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_celu_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_celu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_celu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_elu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_elu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_elu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_elu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_hardshrink_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_hardshrink_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_hardshrink_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_hardshrink_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_hardtanh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_hardtanh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_hardtanh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_hardtanh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_hardtanh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_hardtanh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_hardtanh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_hardtanh_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_mish_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_mish_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_mish_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_mish_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_prelu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_prelu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_prelu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_prelu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_relu6_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_relu6_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_relu6_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_relu6_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_relu6_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_relu6_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_relu6_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_relu6_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_relu6_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_relu_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_relu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_relu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_relu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_relu_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_relu_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_relu_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_relu_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_relu_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_selu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_selu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_selu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_selu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_softplus_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_softplus_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_softplus_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_softplus_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_softshrink_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_softshrink_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_softshrink_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_softshrink_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_tanhshrink_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_tanhshrink_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_tanhshrink_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_tanhshrink_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_tanhshrink_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_tanhshrink_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_tanhshrink_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_tanhshrink_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_tanhshrink_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_tanhshrink_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_tanhshrink_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_threshold_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_threshold_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_threshold_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_threshold_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_threshold_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_threshold_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_threshold_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_threshold_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_nn_functional_threshold_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_positive_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_positive_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_positive_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_positive_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_positive_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_positive_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_positive_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_positive_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_positive_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_positive_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_positive_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_positive_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_rad2deg_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_rad2deg_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_rad2deg_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_rad2deg_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_rad2deg_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_rad2deg_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_rad2deg_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_rad2deg_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_rad2deg_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_rad2deg_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_real_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_real_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_real_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_real_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_real_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_real_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_real_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_real_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_real_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_real_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_real_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_real_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_real_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_reciprocal_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_reciprocal_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_reciprocal_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_reciprocal_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_reciprocal_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_reciprocal_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_reciprocal_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_reciprocal_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_reciprocal_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_reciprocal_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_reciprocal_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_reciprocal_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_round_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_round_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_round_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_round_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_round_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_round_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_round_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_round_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_round_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_rsqrt_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_rsqrt_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_rsqrt_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_rsqrt_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_rsqrt_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_rsqrt_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_rsqrt_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_rsqrt_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_rsqrt_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_rsqrt_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_rsqrt_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_rsqrt_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_rsqrt_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sgn_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sgn_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sgn_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sgn_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sgn_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sgn_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sgn_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sgn_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sgn_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sgn_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sgn_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sgn_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sgn_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sigmoid_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sigmoid_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sigmoid_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sigmoid_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sigmoid_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sigmoid_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sigmoid_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sigmoid_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sigmoid_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sigmoid_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sigmoid_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sigmoid_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sigmoid_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sign_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sign_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sign_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sign_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sign_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sign_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sign_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sign_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sign_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sign_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_signbit_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_signbit_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_signbit_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_signbit_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_signbit_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_signbit_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_signbit_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_signbit_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_signbit_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_signbit_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sin_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sin_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sin_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sin_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sin_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sin_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sin_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sin_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sin_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sin_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sin_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sin_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sin_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sinc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sinc_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sinc_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sinc_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sinc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sinc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sinc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sinc_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sinc_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sinc_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sinc_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sinc_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sinh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sinh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sinh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sinh_cuda_complex32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sinh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sinh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sinh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sinh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sinh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sinh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sinh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sinh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sinh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_bessel_j0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_bessel_j0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_bessel_j0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_bessel_j0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_bessel_j0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_bessel_j0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_bessel_j0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_bessel_j0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_bessel_j1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_bessel_j1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_bessel_j1_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_bessel_j1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_bessel_j1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_bessel_j1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_bessel_j1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_bessel_j1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_entr_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_entr_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_entr_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_entr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_entr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_entr_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_entr_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_entr_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_entr_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_entr_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_erfcx_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_erfcx_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_erfcx_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_erfcx_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_erfcx_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_erfcx_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_erfcx_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_erfcx_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_i0e_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_i0e_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_i0e_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_i0e_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_i0e_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_i0e_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_i0e_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_i0e_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_i0e_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_i0e_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_i1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_i1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_i1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_i1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_i1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_i1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_i1_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_i1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_i1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_i1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_i1e_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_i1e_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_i1e_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_i1e_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_i1e_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_i1e_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_i1e_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_i1e_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_i1e_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_i1e_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_log_ndtr_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_log_ndtr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_log_ndtr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_log_ndtr_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_log_ndtr_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_log_ndtr_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_log_ndtr_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_log_ndtr_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_logit_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_logit_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_logit_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_logit_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_logit_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_logit_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_logit_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_logit_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_logit_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_logit_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_multigammaln_mvlgamma_p_1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_multigammaln_mvlgamma_p_1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_multigammaln_mvlgamma_p_1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_multigammaln_mvlgamma_p_1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_multigammaln_mvlgamma_p_1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_multigammaln_mvlgamma_p_1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_multigammaln_mvlgamma_p_1_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_multigammaln_mvlgamma_p_1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_multigammaln_mvlgamma_p_1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_multigammaln_mvlgamma_p_3_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_multigammaln_mvlgamma_p_3_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_multigammaln_mvlgamma_p_3_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_multigammaln_mvlgamma_p_3_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_multigammaln_mvlgamma_p_3_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_multigammaln_mvlgamma_p_3_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_multigammaln_mvlgamma_p_3_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_multigammaln_mvlgamma_p_3_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_multigammaln_mvlgamma_p_3_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_multigammaln_mvlgamma_p_5_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_multigammaln_mvlgamma_p_5_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_multigammaln_mvlgamma_p_5_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_multigammaln_mvlgamma_p_5_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_multigammaln_mvlgamma_p_5_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_multigammaln_mvlgamma_p_5_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_multigammaln_mvlgamma_p_5_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_multigammaln_mvlgamma_p_5_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_multigammaln_mvlgamma_p_5_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_ndtr_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_ndtr_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_ndtr_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_ndtr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_ndtr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_ndtr_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_ndtr_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_ndtr_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_ndtr_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_ndtr_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_ndtri_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_ndtri_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_ndtri_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_ndtri_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_ndtri_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_ndtri_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_ndtri_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_ndtri_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_spherical_bessel_j0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_spherical_bessel_j0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_spherical_bessel_j0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_spherical_bessel_j0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_spherical_bessel_j0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_spherical_bessel_j0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_spherical_bessel_j0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_special_spherical_bessel_j0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sqrt_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sqrt_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sqrt_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sqrt_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sqrt_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sqrt_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sqrt_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sqrt_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sqrt_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sqrt_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sqrt_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sqrt_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_sqrt_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_square_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_square_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_square_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_square_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_square_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_square_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_square_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_square_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_square_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_square_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_square_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_square_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_tan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_tan_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_tan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_tan_cuda_complex32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_tan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_tan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_tan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_tan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_tan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_tan_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_tan_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_tan_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_tan_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_tanh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_tanh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_tanh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_tanh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_tanh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_tanh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_tanh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_tanh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_tanh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_tanh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_tanh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_tanh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_tanh_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_trunc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_trunc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_trunc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_trunc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_trunc_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_trunc_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_trunc_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_trunc_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1__refs_trunc_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_abs_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_abs_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_abs_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_abs_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_abs_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_abs_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_abs_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_abs_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_abs_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_abs_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_abs_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_abs_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_abs_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_acos_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_acos_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_acos_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_acos_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_acos_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_acos_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_acos_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_acos_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_acos_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_acos_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_acos_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_acos_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_acos_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_acosh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_acosh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_acosh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_acosh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_acosh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_acosh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_acosh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_acosh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_acosh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_acosh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_acosh_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_acosh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_acosh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_angle_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_angle_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_angle_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_angle_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_angle_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_angle_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_angle_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_angle_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_angle_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_angle_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_angle_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_asin_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_asin_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_asin_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_asin_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_asin_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_asin_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_asin_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_asin_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_asin_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_asin_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_asin_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_asin_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_asin_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_asinh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_asinh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_asinh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_asinh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_asinh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_asinh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_asinh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_asinh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_asinh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_asinh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_asinh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_asinh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_asinh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_atan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_atan_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_atan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_atan_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_atan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_atan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_atan_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_atan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_atan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_atan_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_atan_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_atan_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_atan_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_atanh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_atanh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_atanh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_atanh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_atanh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_atanh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_atanh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_atanh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_atanh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_atanh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_atanh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_atanh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_atanh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_bfloat16_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_bfloat16_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_bfloat16_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_bfloat16_cuda_complex32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_bfloat16_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_bfloat16_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_bfloat16_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_bfloat16_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_bfloat16_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_bfloat16_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_bfloat16_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_bfloat16_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_bfloat16_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_bitwise_not_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_bitwise_not_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_bitwise_not_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_bitwise_not_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_bitwise_not_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_bitwise_not_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_bool_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_bool_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_bool_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_bool_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_bool_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_bool_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_bool_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_bool_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_bool_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_bool_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_bool_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_bool_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_bool_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_byte_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_byte_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_byte_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_byte_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_byte_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_byte_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_byte_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_byte_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_byte_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_byte_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_byte_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_byte_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_cdouble_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_cdouble_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_cdouble_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_cdouble_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_cdouble_cuda_complex64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_cdouble_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_cdouble_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_cdouble_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_cdouble_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_cdouble_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_cdouble_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_cdouble_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_cdouble_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_ceil_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_ceil_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_ceil_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_ceil_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_ceil_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_ceil_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_ceil_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_ceil_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_ceil_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_cfloat_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_cfloat_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_cfloat_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_cfloat_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_cfloat_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_cfloat_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_cfloat_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_cfloat_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_cfloat_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_cfloat_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_cfloat_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_cfloat_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_cfloat_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_chalf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_chalf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_chalf_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_chalf_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_chalf_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_chalf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_chalf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_chalf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_chalf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_chalf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_chalf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_chalf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_chalf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_char_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_char_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_char_cuda_complex128, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_char_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_char_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_char_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_char_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_char_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_char_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_char_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_char_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_char_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_char_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_conj_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_conj_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_conj_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_conj_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_conj_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_conj_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_conj_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_conj_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_conj_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_conj_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_conj_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_conj_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_conj_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_conj_physical_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_conj_physical_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_conj_physical_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_conj_physical_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_conj_physical_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_conj_physical_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_conj_physical_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_conj_physical_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_conj_physical_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_conj_physical_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_conj_physical_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_conj_physical_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_conj_physical_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_cos_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_cos_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_cos_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_cos_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_cos_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_cos_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_cos_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_cos_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_cos_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_cos_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_cos_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_cos_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_cos_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_cosh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_cosh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_cosh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_cosh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_cosh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_cosh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_cosh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_cosh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_cosh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_cosh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_cosh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_cosh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_cosh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_deg2rad_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_deg2rad_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_deg2rad_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_deg2rad_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_deg2rad_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_deg2rad_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_deg2rad_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_deg2rad_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_deg2rad_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_deg2rad_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_digamma_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_digamma_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_digamma_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_digamma_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_digamma_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_digamma_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_digamma_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_digamma_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_digamma_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_digamma_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_double_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_double_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_double_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_double_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_double_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_double_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_double_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_double_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_double_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_double_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_double_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_double_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_double_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_erf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_erf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_erf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_erf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_erf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_erf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_erf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_erf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_erf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_erf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_erfc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_erfc_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_erfc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_erfc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_erfc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_erfc_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_erfc_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_erfc_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_erfc_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_erfc_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_erfinv_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_erfinv_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_erfinv_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_erfinv_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_erfinv_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_erfinv_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_erfinv_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_erfinv_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_erfinv_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_erfinv_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_exp2_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_exp2_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_exp2_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_exp2_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_exp2_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_exp2_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_exp2_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_exp2_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_exp2_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_exp2_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_exp2_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_exp2_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_exp_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_exp_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_exp_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_exp_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_exp_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_exp_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_exp_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_exp_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_exp_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_exp_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_exp_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_exp_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_exp_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_expm1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_expm1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_expm1_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_expm1_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_expm1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_expm1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_expm1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_expm1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_expm1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_expm1_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_expm1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_expm1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_fill_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_fill_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_fill_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_fill_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_fill_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_fill_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_fill_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_fill_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_fill_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_fill_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_fill_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_fill_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_fill_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_float_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_float_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_float_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_float_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_float_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_float_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_float_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_float_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_float_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_float_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_float_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_float_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_float_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_floor_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_floor_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_floor_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_floor_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_floor_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_floor_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_floor_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_floor_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_floor_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_frac_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_frac_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_frac_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_frac_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_frexp_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_frexp_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_frexp_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_frexp_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_half_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_half_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_half_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_half_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_half_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_half_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_half_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_half_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_half_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_half_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_half_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_half_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_i0_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_i0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_i0_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_i0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_i0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_i0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_i0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_i0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_i0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_i0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_imag_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_imag_cuda_complex32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_imag_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_int_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_int_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_int_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_int_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_int_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_int_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_int_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_int_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_int_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_int_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_int_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_int_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isfinite_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isfinite_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isfinite_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isfinite_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isfinite_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isfinite_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isfinite_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isfinite_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isfinite_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isfinite_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isfinite_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isfinite_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isfinite_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isinf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isinf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isinf_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isinf_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isinf_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isinf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isinf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isinf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isinf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isinf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isinf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isinf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isinf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isnan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isnan_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isnan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isnan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isnan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isnan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isnan_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isnan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isnan_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isnan_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isnan_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isnan_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isneginf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isneginf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isneginf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isneginf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isneginf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isneginf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isneginf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isneginf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isneginf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isneginf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isposinf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isposinf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isposinf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isposinf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isposinf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isposinf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isposinf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isposinf_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isposinf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isposinf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isreal_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isreal_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isreal_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isreal_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isreal_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isreal_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isreal_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isreal_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isreal_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isreal_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isreal_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isreal_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_isreal_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_jiterator_unary_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_jiterator_unary_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_jiterator_unary_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_jiterator_unary_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_jiterator_unary_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_jiterator_unary_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_jiterator_unary_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_jiterator_unary_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_jiterator_unary_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_jiterator_unary_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_jiterator_unary_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_jiterator_unary_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_bfloat16_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_bfloat16_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_bfloat16_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_bfloat16_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_bfloat16_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_bfloat16_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_bfloat16_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_bfloat16_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_bfloat16_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_bfloat16_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_bfloat16_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_bfloat16_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_bfloat16_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_bool_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_bool_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_bool_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_bool_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_bool_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_bool_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_bool_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_bool_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_bool_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_bool_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_bool_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_bool_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_bool_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_byte_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_byte_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_byte_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_byte_cuda_complex64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_byte_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_byte_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_byte_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_byte_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_byte_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_byte_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_byte_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_byte_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_cdouble_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_cdouble_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_cdouble_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_cdouble_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_cdouble_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_cdouble_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_cdouble_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_cdouble_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_cdouble_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_cdouble_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_cdouble_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_cdouble_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_cdouble_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_cfloat_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_cfloat_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_cfloat_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_cfloat_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_cfloat_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_cfloat_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_cfloat_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_cfloat_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_cfloat_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_cfloat_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_cfloat_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_cfloat_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_cfloat_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_chalf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_chalf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_chalf_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_chalf_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_chalf_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_chalf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_chalf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_chalf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_chalf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_chalf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_chalf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_chalf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_chalf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_char_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_char_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_char_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_char_cuda_complex32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_char_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_char_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_char_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_char_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_char_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_char_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_char_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_char_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_char_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_double_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_double_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_double_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_double_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_double_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_double_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_double_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_double_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_double_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_double_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_double_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_double_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_double_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_float_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_float_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_float_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_float_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_float_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_float_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_float_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_float_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_float_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_float_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_float_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_float_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_float_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_half_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_half_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_half_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_half_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_half_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_half_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_half_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_half_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_half_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_half_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_half_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_half_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_int_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_int_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_int_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_int_cuda_complex64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_int_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_int_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_int_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_int_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_int_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_int_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_int_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_int_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_long_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_long_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_long_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_long_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_long_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_long_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_long_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_long_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_long_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_long_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_long_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_long_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_long_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_short_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_short_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_short_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_short_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_short_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_short_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_short_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_short_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_short_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_short_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_short_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_short_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_abs_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_abs_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_abs_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_abs_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_abs_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_abs_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_abs_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_abs_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_abs_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_abs_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_abs_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_abs_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_abs_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_acos_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_acos_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_acos_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_acos_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_acos_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_acos_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_acos_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_acos_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_acos_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_acos_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_acos_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_acos_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_acos_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_acosh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_acosh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_acosh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_acosh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_acosh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_acosh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_acosh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_acosh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_acosh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_acosh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_acosh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_acosh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_acosh_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_asin_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_asin_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_asin_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_asin_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_asin_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_asin_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_asin_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_asin_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_asin_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_asin_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_asin_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_asin_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_asin_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_asinh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_asinh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_asinh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_asinh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_asinh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_asinh_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_asinh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_asinh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_asinh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_asinh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_asinh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_asinh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_asinh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_atan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_atan_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_atan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_atan_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_atan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_atan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_atan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_atan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_atan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_atan_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_atan_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_atan_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_atan_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_atanh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_atanh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_atanh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_atanh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_atanh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_atanh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_atanh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_atanh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_atanh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_atanh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_atanh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_atanh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_atanh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_bitwise_not_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_bitwise_not_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_bitwise_not_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_bitwise_not_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_bitwise_not_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_bitwise_not_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_ceil_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_ceil_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_ceil_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_ceil_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_ceil_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_ceil_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_ceil_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_ceil_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_ceil_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_conj_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_conj_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_conj_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_conj_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_conj_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_conj_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_conj_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_conj_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_conj_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_conj_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_conj_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_conj_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_conj_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_conj_physical_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_conj_physical_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_conj_physical_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_conj_physical_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_conj_physical_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_conj_physical_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_conj_physical_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_conj_physical_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_conj_physical_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_conj_physical_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_conj_physical_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_conj_physical_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_conj_physical_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_cos_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_cos_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_cos_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_cos_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_cos_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_cos_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_cos_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_cos_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_cos_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_cos_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_cos_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_cos_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_cos_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_cosh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_cosh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_cosh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_cosh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_cosh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_cosh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_cosh_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_cosh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_cosh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_cosh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_cosh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_cosh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_cosh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_deg2rad_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_deg2rad_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_deg2rad_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_deg2rad_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_deg2rad_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_deg2rad_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_deg2rad_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_deg2rad_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_deg2rad_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_deg2rad_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_digamma_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_digamma_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_digamma_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_digamma_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_digamma_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_digamma_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_digamma_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_digamma_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_digamma_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_digamma_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_erf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_erf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_erf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_erf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_erf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_erf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_erf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_erf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_erf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_erf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_erfc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_erfc_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_erfc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_erfc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_erfc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_erfc_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_erfc_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_erfc_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_erfc_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_erfc_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_erfinv_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_erfinv_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_erfinv_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_erfinv_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_erfinv_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_erfinv_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_erfinv_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_erfinv_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_erfinv_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_erfinv_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_exp2_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_exp2_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_exp2_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_exp2_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_exp2_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_exp2_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_exp2_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_exp2_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_exp2_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_exp2_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_exp2_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_exp2_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_exp_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_exp_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_exp_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_exp_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_exp_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_exp_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_exp_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_exp_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_exp_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_exp_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_exp_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_exp_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_exp_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_expm1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_expm1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_expm1_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_expm1_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_expm1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_expm1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_expm1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_expm1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_expm1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_expm1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_expm1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_expm1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_fill_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_fill_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_fill_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_fill_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_fill_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_fill_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_fill_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_fill_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_fill_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_fill_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_fill_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_fill_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_fill_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_floor_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_floor_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_floor_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_floor_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_floor_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_floor_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_floor_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_floor_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_floor_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_frac_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_frac_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_frac_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_frac_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_frexp_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_frexp_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_frexp_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_frexp_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_i0_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_i0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_i0_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_i0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_i0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_i0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_i0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_i0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_i0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_i0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_imag_cuda_complex128, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_imag_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_imag_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isfinite_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isfinite_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isfinite_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isfinite_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isfinite_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isfinite_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isfinite_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isfinite_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isfinite_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isfinite_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isfinite_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isfinite_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isfinite_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isinf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isinf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isinf_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isinf_cuda_complex32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isinf_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isinf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isinf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isinf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isinf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isinf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isinf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isinf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isinf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isnan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isnan_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isnan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isnan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isnan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isnan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isnan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isnan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isnan_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isnan_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isnan_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isnan_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isneginf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isneginf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isneginf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isneginf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isneginf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isneginf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isneginf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isneginf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isneginf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isneginf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isposinf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isposinf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isposinf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isposinf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isposinf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isposinf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isposinf_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isposinf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isposinf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isposinf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isreal_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isreal_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isreal_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isreal_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isreal_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isreal_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isreal_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isreal_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isreal_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isreal_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isreal_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isreal_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_isreal_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_lgamma_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_lgamma_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_lgamma_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_lgamma_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_lgamma_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_lgamma_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_lgamma_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_lgamma_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_lgamma_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_lgamma_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_log10_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_log10_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_log10_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_log10_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_log10_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_log10_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_log10_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_log10_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_log10_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_log10_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_log10_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_log10_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_log1p_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_log1p_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_log1p_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_log1p_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_log1p_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_log1p_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_log1p_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_log1p_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_log1p_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_log1p_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_log1p_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_log1p_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_log2_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_log2_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_log2_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_log2_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_log2_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_log2_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_log2_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_log2_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_log2_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_log2_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_log2_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_log2_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_log_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_log_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_log_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_log_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_log_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_log_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_log_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_log_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_log_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_log_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_log_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_log_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_log_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_not_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_not_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_not_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_not_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_not_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_not_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_not_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_not_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_not_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_not_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_not_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_not_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nan_to_num_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nan_to_num_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nan_to_num_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nan_to_num_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nan_to_num_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nan_to_num_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nan_to_num_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nan_to_num_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nan_to_num_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nan_to_num_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_neg_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_neg_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_neg_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_neg_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_neg_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_neg_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_neg_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_neg_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_neg_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_neg_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_neg_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_neg_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_celu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_celu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_celu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_celu_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_elu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_elu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_elu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_elu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_hardshrink_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_hardshrink_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_hardshrink_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_hardshrink_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_hardtanh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_hardtanh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_hardtanh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_hardtanh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_hardtanh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_hardtanh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_hardtanh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_hardtanh_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_mish_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_mish_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_mish_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_mish_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_prelu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_prelu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_prelu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_prelu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_relu6_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_relu6_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_relu6_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_relu6_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_relu6_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_relu6_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_relu6_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_relu6_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_relu6_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_relu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_relu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_relu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_relu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_relu_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_relu_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_relu_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_relu_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_relu_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_selu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_selu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_selu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_selu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_softplus_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_softplus_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_softplus_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_softplus_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_softshrink_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_softshrink_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_softshrink_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_softshrink_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_tanhshrink_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_tanhshrink_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_tanhshrink_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_tanhshrink_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_tanhshrink_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_tanhshrink_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_tanhshrink_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_tanhshrink_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_tanhshrink_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_tanhshrink_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_tanhshrink_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_threshold_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_threshold_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_threshold_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_threshold_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_threshold_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_threshold_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_threshold_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_threshold_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_nn_functional_threshold_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_positive_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_positive_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_positive_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_positive_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_positive_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_positive_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_positive_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_positive_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_positive_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_positive_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_positive_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_positive_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_rad2deg_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_rad2deg_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_rad2deg_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_rad2deg_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_rad2deg_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_rad2deg_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_rad2deg_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_rad2deg_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_rad2deg_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_rad2deg_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_real_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_real_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_real_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_real_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_real_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_real_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_real_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_real_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_real_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_real_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_real_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_real_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_real_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_reciprocal_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_reciprocal_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_reciprocal_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_reciprocal_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_reciprocal_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_reciprocal_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_reciprocal_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_reciprocal_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_reciprocal_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_reciprocal_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_reciprocal_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_reciprocal_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_round_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_round_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_round_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_round_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_round_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_round_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_round_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_round_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_round_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_rsqrt_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_rsqrt_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_rsqrt_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_rsqrt_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_rsqrt_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_rsqrt_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_rsqrt_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_rsqrt_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_rsqrt_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_rsqrt_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_rsqrt_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_rsqrt_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_rsqrt_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sgn_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sgn_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sgn_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sgn_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sgn_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sgn_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sgn_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sgn_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sgn_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sgn_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sgn_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sgn_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sgn_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sigmoid_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sigmoid_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sigmoid_cuda_complex128, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sigmoid_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sigmoid_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sigmoid_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sigmoid_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sigmoid_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sigmoid_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sigmoid_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sigmoid_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sigmoid_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sigmoid_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sign_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sign_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sign_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sign_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sign_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sign_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sign_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sign_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sign_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sign_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_signbit_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_signbit_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_signbit_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_signbit_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_signbit_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_signbit_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_signbit_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_signbit_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_signbit_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_signbit_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sin_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sin_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sin_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sin_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sin_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sin_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sin_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sin_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sin_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sin_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sin_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sin_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sin_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sinc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sinc_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sinc_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sinc_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sinc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sinc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sinc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sinc_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sinc_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sinc_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sinc_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sinc_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sinh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sinh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sinh_cuda_complex128, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sinh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sinh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sinh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sinh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sinh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sinh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sinh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sinh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sinh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sinh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_bessel_j0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_bessel_j0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_bessel_j0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_bessel_j0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_bessel_j0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_bessel_j0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_bessel_j0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_bessel_j0_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_bessel_j1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_bessel_j1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_bessel_j1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_bessel_j1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_bessel_j1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_bessel_j1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_bessel_j1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_bessel_j1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_entr_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_entr_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_entr_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_entr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_entr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_entr_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_entr_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_entr_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_entr_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_entr_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_erfcx_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_erfcx_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_erfcx_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_erfcx_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_erfcx_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_erfcx_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_erfcx_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_erfcx_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_i0e_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_i0e_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_i0e_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_i0e_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_i0e_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_i0e_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_i0e_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_i0e_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_i0e_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_i0e_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_i1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_i1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_i1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_i1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_i1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_i1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_i1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_i1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_i1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_i1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_i1e_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_i1e_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_i1e_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_i1e_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_i1e_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_i1e_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_i1e_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_i1e_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_i1e_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_i1e_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_log_ndtr_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_log_ndtr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_log_ndtr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_log_ndtr_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_log_ndtr_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_log_ndtr_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_log_ndtr_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_log_ndtr_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_logit_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_logit_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_logit_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_logit_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_logit_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_logit_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_logit_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_logit_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_logit_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_logit_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_multigammaln_mvlgamma_p_1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_multigammaln_mvlgamma_p_1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_multigammaln_mvlgamma_p_1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_multigammaln_mvlgamma_p_1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_multigammaln_mvlgamma_p_1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_multigammaln_mvlgamma_p_1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_multigammaln_mvlgamma_p_1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_multigammaln_mvlgamma_p_1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_multigammaln_mvlgamma_p_1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_multigammaln_mvlgamma_p_3_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_multigammaln_mvlgamma_p_3_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_multigammaln_mvlgamma_p_3_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_multigammaln_mvlgamma_p_3_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_multigammaln_mvlgamma_p_3_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_multigammaln_mvlgamma_p_3_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_multigammaln_mvlgamma_p_3_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_multigammaln_mvlgamma_p_3_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_multigammaln_mvlgamma_p_3_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_multigammaln_mvlgamma_p_5_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_multigammaln_mvlgamma_p_5_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_multigammaln_mvlgamma_p_5_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_multigammaln_mvlgamma_p_5_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_multigammaln_mvlgamma_p_5_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_multigammaln_mvlgamma_p_5_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_multigammaln_mvlgamma_p_5_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_multigammaln_mvlgamma_p_5_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_multigammaln_mvlgamma_p_5_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_ndtr_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_ndtr_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_ndtr_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_ndtr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_ndtr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_ndtr_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_ndtr_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_ndtr_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_ndtr_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_ndtr_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_ndtri_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_ndtri_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_ndtri_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_ndtri_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_ndtri_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_ndtri_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_ndtri_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_ndtri_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_spherical_bessel_j0_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_spherical_bessel_j0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_spherical_bessel_j0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_spherical_bessel_j0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_spherical_bessel_j0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_spherical_bessel_j0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_spherical_bessel_j0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_spherical_bessel_j0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sqrt_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sqrt_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sqrt_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sqrt_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sqrt_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sqrt_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sqrt_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sqrt_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sqrt_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sqrt_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sqrt_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sqrt_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_sqrt_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_square_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_square_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_square_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_square_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_square_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_square_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_square_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_square_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_square_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_square_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_square_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_square_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_tan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_tan_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_tan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_tan_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_tan_cuda_complex64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_tan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_tan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_tan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_tan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_tan_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_tan_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_tan_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_tan_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_tanh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_tanh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_tanh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_tanh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_tanh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_tanh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_tanh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_tanh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_tanh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_tanh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_tanh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_tanh_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_tanh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_trunc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_trunc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_trunc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_trunc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_trunc_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_trunc_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_trunc_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_trunc_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim__refs_trunc_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_abs_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_abs_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_abs_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_abs_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_abs_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_abs_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_abs_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_abs_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_abs_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_abs_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_abs_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_abs_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_abs_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_acos_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_acos_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_acos_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_acos_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_acos_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_acos_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_acos_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_acos_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_acos_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_acos_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_acos_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_acos_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_acos_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_acosh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_acosh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_acosh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_acosh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_acosh_cuda_complex64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_acosh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_acosh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_acosh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_acosh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_acosh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_acosh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_acosh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_acosh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_angle_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_angle_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_angle_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_angle_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_angle_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_angle_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_angle_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_angle_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_angle_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_angle_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_angle_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_asin_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_asin_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_asin_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_asin_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_asin_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_asin_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_asin_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_asin_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_asin_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_asin_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_asin_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_asin_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_asin_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_asinh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_asinh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_asinh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_asinh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_asinh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_asinh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_asinh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_asinh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_asinh_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_asinh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_asinh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_asinh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_asinh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_atan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_atan_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_atan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_atan_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_atan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_atan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_atan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_atan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_atan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_atan_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_atan_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_atan_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_atan_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_atanh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_atanh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_atanh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_atanh_cuda_complex32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_atanh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_atanh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_atanh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_atanh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_atanh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_atanh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_atanh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_atanh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_atanh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_bfloat16_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_bfloat16_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_bfloat16_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_bfloat16_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_bfloat16_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_bfloat16_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_bfloat16_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_bfloat16_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_bfloat16_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_bfloat16_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_bfloat16_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_bfloat16_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_bfloat16_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_bitwise_not_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_bitwise_not_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_bitwise_not_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_bitwise_not_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_bitwise_not_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_bitwise_not_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_bool_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_bool_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_bool_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_bool_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_bool_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_bool_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_bool_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_bool_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_bool_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_bool_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_bool_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_bool_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_bool_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_byte_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_byte_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_byte_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_byte_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_byte_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_byte_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_byte_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_byte_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_byte_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_byte_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_byte_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_byte_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_cdouble_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_cdouble_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_cdouble_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_cdouble_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_cdouble_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_cdouble_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_cdouble_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_cdouble_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_cdouble_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_cdouble_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_cdouble_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_cdouble_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_cdouble_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_ceil_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_ceil_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_ceil_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_ceil_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_ceil_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_ceil_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_ceil_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_ceil_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_ceil_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_cfloat_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_cfloat_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_cfloat_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_cfloat_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_cfloat_cuda_complex64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_cfloat_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_cfloat_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_cfloat_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_cfloat_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_cfloat_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_cfloat_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_cfloat_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_cfloat_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_chalf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_chalf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_chalf_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_chalf_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_chalf_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_chalf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_chalf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_chalf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_chalf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_chalf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_chalf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_chalf_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_chalf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_char_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_char_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_char_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_char_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_char_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_char_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_char_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_char_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_char_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_char_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_char_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_char_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_char_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_conj_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_conj_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_conj_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_conj_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_conj_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_conj_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_conj_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_conj_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_conj_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_conj_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_conj_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_conj_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_conj_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_conj_physical_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_conj_physical_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_conj_physical_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_conj_physical_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_conj_physical_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_conj_physical_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_conj_physical_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_conj_physical_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_conj_physical_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_conj_physical_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_conj_physical_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_conj_physical_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_conj_physical_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_cos_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_cos_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_cos_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_cos_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_cos_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_cos_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_cos_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_cos_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_cos_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_cos_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_cos_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_cos_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_cos_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_cosh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_cosh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_cosh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_cosh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_cosh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_cosh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_cosh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_cosh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_cosh_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_cosh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_cosh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_cosh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_cosh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_deg2rad_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_deg2rad_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_deg2rad_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_deg2rad_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_deg2rad_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_deg2rad_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_deg2rad_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_deg2rad_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_deg2rad_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_deg2rad_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_digamma_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_digamma_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_digamma_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_digamma_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_digamma_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_digamma_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_digamma_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_digamma_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_digamma_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_digamma_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_double_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_double_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_double_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_double_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_double_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_double_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_double_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_double_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_double_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_double_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_double_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_double_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_double_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_erf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_erf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_erf_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_erf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_erf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_erf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_erf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_erf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_erf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_erf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_erfc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_erfc_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_erfc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_erfc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_erfc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_erfc_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_erfc_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_erfc_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_erfc_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_erfc_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_erfinv_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_erfinv_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_erfinv_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_erfinv_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_erfinv_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_erfinv_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_erfinv_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_erfinv_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_erfinv_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_erfinv_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_exp2_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_exp2_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_exp2_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_exp2_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_exp2_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_exp2_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_exp2_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_exp2_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_exp2_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_exp2_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_exp2_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_exp2_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_exp_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_exp_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_exp_cuda_complex128, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_exp_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_exp_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_exp_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_exp_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_exp_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_exp_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_exp_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_exp_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_exp_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_exp_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_expm1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_expm1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_expm1_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_expm1_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_expm1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_expm1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_expm1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_expm1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_expm1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_expm1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_expm1_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_expm1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_fill_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_fill_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_fill_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_fill_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_fill_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_fill_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_fill_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_fill_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_fill_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_fill_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_fill_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_fill_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_fill_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_float_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_float_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_float_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_float_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_float_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_float_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_float_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_float_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_float_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_float_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_float_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_float_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_float_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_floor_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_floor_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_floor_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_floor_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_floor_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_floor_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_floor_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_floor_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_floor_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_frac_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_frac_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_frac_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_frac_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_frexp_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_frexp_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_frexp_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_frexp_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_half_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_half_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_half_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_half_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_half_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_half_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_half_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_half_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_half_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_half_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_half_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_half_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_i0_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_i0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_i0_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_i0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_i0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_i0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_i0_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_i0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_i0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_i0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_imag_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_imag_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_imag_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_int_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_int_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_int_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_int_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_int_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_int_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_int_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_int_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_int_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_int_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_int_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_int_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isfinite_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isfinite_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isfinite_cuda_complex128, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isfinite_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isfinite_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isfinite_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isfinite_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isfinite_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isfinite_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isfinite_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isfinite_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isfinite_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isfinite_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isinf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isinf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isinf_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isinf_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isinf_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isinf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isinf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isinf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isinf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isinf_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isinf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isinf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isinf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isnan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isnan_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isnan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isnan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isnan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isnan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isnan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isnan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isnan_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isnan_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isnan_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isnan_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isneginf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isneginf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isneginf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isneginf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isneginf_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isneginf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isneginf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isneginf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isneginf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isneginf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isposinf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isposinf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isposinf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isposinf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isposinf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isposinf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isposinf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isposinf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isposinf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isposinf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isreal_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isreal_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isreal_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isreal_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isreal_cuda_complex64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isreal_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isreal_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isreal_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isreal_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isreal_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isreal_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isreal_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_isreal_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_jiterator_unary_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_jiterator_unary_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_jiterator_unary_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_jiterator_unary_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_jiterator_unary_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_jiterator_unary_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_jiterator_unary_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_jiterator_unary_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_jiterator_unary_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_jiterator_unary_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_jiterator_unary_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_jiterator_unary_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_lgamma_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_lgamma_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_lgamma_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_lgamma_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_lgamma_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_lgamma_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_lgamma_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_lgamma_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_lgamma_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_lgamma_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_log10_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_log10_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_log10_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_log10_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_log10_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_log10_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_log10_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_log10_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_log10_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_log10_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_log10_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_log10_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_log1p_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_log1p_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_log1p_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_log1p_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_log1p_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_log1p_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_log1p_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_log1p_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_log1p_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_log1p_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_log1p_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_log1p_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_log2_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_log2_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_log2_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_log2_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_log2_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_log2_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_log2_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_log2_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_log2_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_log2_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_log2_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_log2_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_log_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_log_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_log_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_log_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_log_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_log_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_log_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_log_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_log_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_log_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_log_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_log_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_log_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_logical_not_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_logical_not_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_logical_not_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_logical_not_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_logical_not_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_logical_not_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_logical_not_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_logical_not_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_logical_not_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_logical_not_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_logical_not_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_logical_not_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_logit_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_logit_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_logit_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_logit_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_logit_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_logit_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_logit_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_logit_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_logit_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_logit_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_long_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_long_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_long_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_long_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_long_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_long_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_long_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_long_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_long_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_long_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_long_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_long_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_long_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_mvlgamma_mvlgamma_p_1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_mvlgamma_mvlgamma_p_1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_mvlgamma_mvlgamma_p_1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_mvlgamma_mvlgamma_p_1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_mvlgamma_mvlgamma_p_1_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_mvlgamma_mvlgamma_p_1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_mvlgamma_mvlgamma_p_1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_mvlgamma_mvlgamma_p_1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_mvlgamma_mvlgamma_p_3_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_mvlgamma_mvlgamma_p_3_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_mvlgamma_mvlgamma_p_3_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_mvlgamma_mvlgamma_p_3_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_mvlgamma_mvlgamma_p_3_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_mvlgamma_mvlgamma_p_3_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_mvlgamma_mvlgamma_p_3_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_mvlgamma_mvlgamma_p_3_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_mvlgamma_mvlgamma_p_5_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_mvlgamma_mvlgamma_p_5_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_mvlgamma_mvlgamma_p_5_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_mvlgamma_mvlgamma_p_5_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_mvlgamma_mvlgamma_p_5_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_mvlgamma_mvlgamma_p_5_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_mvlgamma_mvlgamma_p_5_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_mvlgamma_mvlgamma_p_5_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nan_to_num_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nan_to_num_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nan_to_num_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nan_to_num_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nan_to_num_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nan_to_num_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nan_to_num_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nan_to_num_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nan_to_num_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nan_to_num_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_neg_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_neg_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_neg_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_neg_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_neg_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_neg_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_neg_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_neg_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_neg_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_neg_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_neg_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_neg_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_celu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_celu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_celu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_celu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_elu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_elu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_elu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_elu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_hardshrink_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_hardshrink_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_hardshrink_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_hardshrink_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_hardsigmoid_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_hardsigmoid_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_hardsigmoid_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_hardsigmoid_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_hardtanh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_hardtanh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_hardtanh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_hardtanh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_hardtanh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_hardtanh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_hardtanh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_hardtanh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_logsigmoid_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_logsigmoid_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_logsigmoid_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_logsigmoid_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_mish_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_mish_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_mish_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_mish_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_prelu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_prelu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_prelu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_prelu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_relu6_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_relu6_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_relu6_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_relu6_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_relu6_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_relu6_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_relu6_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_relu6_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_relu6_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_relu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_relu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_relu_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_relu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_relu_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_relu_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_relu_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_relu_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_relu_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_rrelu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_rrelu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_rrelu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_rrelu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_selu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_selu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_selu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_selu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_silu_complex_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_silu_complex_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_silu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_silu_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_silu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_silu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_softplus_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_softplus_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_softplus_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_softplus_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_softshrink_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_softshrink_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_softshrink_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_softshrink_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_softsign_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_softsign_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_softsign_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_softsign_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_softsign_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_softsign_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_softsign_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_softsign_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_softsign_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_softsign_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_softsign_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_softsign_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_tanhshrink_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_tanhshrink_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_tanhshrink_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_tanhshrink_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_tanhshrink_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_tanhshrink_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_tanhshrink_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_tanhshrink_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_tanhshrink_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_tanhshrink_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_tanhshrink_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_threshold_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_threshold_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_threshold_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_threshold_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_threshold_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_threshold_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_threshold_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_threshold_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_nn_functional_threshold_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_polygamma_polygamma_n_0_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_polygamma_polygamma_n_0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_polygamma_polygamma_n_0_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_polygamma_polygamma_n_0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_polygamma_polygamma_n_0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_polygamma_polygamma_n_0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_polygamma_polygamma_n_0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_polygamma_polygamma_n_0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_polygamma_polygamma_n_0_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_polygamma_polygamma_n_0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_polygamma_polygamma_n_1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_polygamma_polygamma_n_1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_polygamma_polygamma_n_1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_polygamma_polygamma_n_1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_polygamma_polygamma_n_1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_polygamma_polygamma_n_1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_polygamma_polygamma_n_1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_polygamma_polygamma_n_1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_polygamma_polygamma_n_1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_polygamma_polygamma_n_1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_polygamma_polygamma_n_2_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_polygamma_polygamma_n_2_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_polygamma_polygamma_n_2_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_polygamma_polygamma_n_2_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_polygamma_polygamma_n_2_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_polygamma_polygamma_n_2_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_polygamma_polygamma_n_2_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_polygamma_polygamma_n_2_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_polygamma_polygamma_n_2_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_polygamma_polygamma_n_2_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_polygamma_polygamma_n_3_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_polygamma_polygamma_n_3_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_polygamma_polygamma_n_3_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_polygamma_polygamma_n_3_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_polygamma_polygamma_n_3_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_polygamma_polygamma_n_3_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_polygamma_polygamma_n_3_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_polygamma_polygamma_n_3_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_polygamma_polygamma_n_3_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_polygamma_polygamma_n_3_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_polygamma_polygamma_n_4_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_polygamma_polygamma_n_4_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_polygamma_polygamma_n_4_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_polygamma_polygamma_n_4_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_polygamma_polygamma_n_4_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_polygamma_polygamma_n_4_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_polygamma_polygamma_n_4_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_polygamma_polygamma_n_4_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_polygamma_polygamma_n_4_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_polygamma_polygamma_n_4_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_positive_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_positive_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_positive_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_positive_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_positive_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_positive_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_positive_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_positive_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_positive_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_positive_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_positive_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_positive_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_rad2deg_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_rad2deg_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_rad2deg_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_rad2deg_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_rad2deg_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_rad2deg_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_rad2deg_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_rad2deg_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_rad2deg_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_rad2deg_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_real_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_real_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_real_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_real_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_real_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_real_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_real_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_real_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_real_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_real_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_real_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_real_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_real_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_reciprocal_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_reciprocal_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_reciprocal_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_reciprocal_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_reciprocal_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_reciprocal_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_reciprocal_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_reciprocal_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_reciprocal_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_reciprocal_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_reciprocal_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_reciprocal_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_round_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_round_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_round_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_round_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_round_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_round_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_round_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_round_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_round_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_round_decimals_0_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_round_decimals_0_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_round_decimals_0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_round_decimals_0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_round_decimals_3_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_round_decimals_3_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_round_decimals_3_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_round_decimals_3_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_round_decimals_neg_3_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_round_decimals_neg_3_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_round_decimals_neg_3_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_round_decimals_neg_3_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_rsqrt_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_rsqrt_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_rsqrt_cuda_complex128, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_rsqrt_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_rsqrt_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_rsqrt_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_rsqrt_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_rsqrt_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_rsqrt_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_rsqrt_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_rsqrt_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_rsqrt_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_rsqrt_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sgn_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sgn_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sgn_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sgn_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sgn_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sgn_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sgn_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sgn_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sgn_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sgn_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sgn_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sgn_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sgn_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_short_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_short_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_short_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_short_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_short_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_short_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_short_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_short_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_short_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_short_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_short_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_short_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sigmoid_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sigmoid_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sigmoid_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sigmoid_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sigmoid_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sigmoid_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sigmoid_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sigmoid_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sigmoid_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sigmoid_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sigmoid_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sigmoid_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sigmoid_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sign_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sign_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sign_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sign_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sign_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sign_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sign_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sign_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sign_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sign_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_signbit_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_signbit_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_signbit_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_signbit_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_signbit_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_signbit_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_signbit_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_signbit_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_signbit_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_signbit_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sin_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sin_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sin_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sin_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sin_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sin_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sin_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sin_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sin_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sin_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sin_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sin_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sin_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sinc_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sinc_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sinc_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sinc_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sinc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sinc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sinc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sinc_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sinc_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sinc_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sinc_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sinc_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sinh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sinh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sinh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sinh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sinh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sinh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sinh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sinh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sinh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sinh_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sinh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sinh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sinh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_airy_ai_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_airy_ai_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_airy_ai_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_airy_ai_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_airy_ai_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_airy_ai_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_airy_ai_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_airy_ai_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_bessel_j0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_bessel_j0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_bessel_j0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_bessel_j0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_bessel_j0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_bessel_j0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_bessel_j0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_bessel_j0_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_bessel_j1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_bessel_j1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_bessel_j1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_bessel_j1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_bessel_j1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_bessel_j1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_bessel_j1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_bessel_j1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_bessel_y0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_bessel_y0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_bessel_y0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_bessel_y0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_bessel_y0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_bessel_y0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_bessel_y0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_bessel_y0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_bessel_y1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_bessel_y1_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_bessel_y1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_bessel_y1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_bessel_y1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_bessel_y1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_bessel_y1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_bessel_y1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_entr_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_entr_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_entr_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_entr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_entr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_entr_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_entr_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_entr_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_entr_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_entr_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_erfcx_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_erfcx_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_erfcx_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_erfcx_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_erfcx_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_erfcx_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_erfcx_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_erfcx_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_i0e_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_i0e_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_i0e_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_i0e_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_i0e_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_i0e_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_i0e_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_i0e_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_i0e_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_i0e_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_i1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_i1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_i1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_i1_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_i1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_i1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_i1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_i1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_i1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_i1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_i1e_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_i1e_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_i1e_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_i1e_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_i1e_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_i1e_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_i1e_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_i1e_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_i1e_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_i1e_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_log_ndtr_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_log_ndtr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_log_ndtr_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_log_ndtr_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_log_ndtr_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_log_ndtr_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_log_ndtr_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_log_ndtr_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_modified_bessel_i0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_modified_bessel_i0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_modified_bessel_i0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_modified_bessel_i0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_modified_bessel_i0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_modified_bessel_i0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_modified_bessel_i0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_modified_bessel_i0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_modified_bessel_i1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_modified_bessel_i1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_modified_bessel_i1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_modified_bessel_i1_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_modified_bessel_i1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_modified_bessel_i1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_modified_bessel_i1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_modified_bessel_i1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_modified_bessel_k0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_modified_bessel_k0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_modified_bessel_k0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_modified_bessel_k0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_modified_bessel_k0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_modified_bessel_k0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_modified_bessel_k0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_modified_bessel_k0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_modified_bessel_k1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_modified_bessel_k1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_modified_bessel_k1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_modified_bessel_k1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_modified_bessel_k1_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_modified_bessel_k1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_modified_bessel_k1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_modified_bessel_k1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_ndtr_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_ndtr_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_ndtr_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_ndtr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_ndtr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_ndtr_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_ndtr_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_ndtr_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_ndtr_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_ndtr_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_ndtri_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_ndtri_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_ndtri_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_ndtri_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_ndtri_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_ndtri_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_ndtri_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_ndtri_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_polygamma_special_polygamma_n_0_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_polygamma_special_polygamma_n_0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_polygamma_special_polygamma_n_0_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_polygamma_special_polygamma_n_0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_polygamma_special_polygamma_n_0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_polygamma_special_polygamma_n_0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_polygamma_special_polygamma_n_0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_polygamma_special_polygamma_n_0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_polygamma_special_polygamma_n_0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_scaled_modified_bessel_k0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_scaled_modified_bessel_k0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_scaled_modified_bessel_k0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_scaled_modified_bessel_k0_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_scaled_modified_bessel_k0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_scaled_modified_bessel_k0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_scaled_modified_bessel_k0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_scaled_modified_bessel_k0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_scaled_modified_bessel_k1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_scaled_modified_bessel_k1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_scaled_modified_bessel_k1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_scaled_modified_bessel_k1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_scaled_modified_bessel_k1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_scaled_modified_bessel_k1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_scaled_modified_bessel_k1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_scaled_modified_bessel_k1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_spherical_bessel_j0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_spherical_bessel_j0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_spherical_bessel_j0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_spherical_bessel_j0_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_spherical_bessel_j0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_spherical_bessel_j0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_spherical_bessel_j0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_special_spherical_bessel_j0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sqrt_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sqrt_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sqrt_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sqrt_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sqrt_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sqrt_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sqrt_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sqrt_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sqrt_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sqrt_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sqrt_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sqrt_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_sqrt_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_square_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_square_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_square_cuda_complex128, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_square_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_square_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_square_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_square_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_square_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_square_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_square_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_square_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_square_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_tan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_tan_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_tan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_tan_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_tan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_tan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_tan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_tan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_tan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_tan_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_tan_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_tan_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_tan_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_tanh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_tanh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_tanh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_tanh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_tanh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_tanh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_tanh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_tanh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_tanh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_tanh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_tanh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_tanh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_tanh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_trunc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_trunc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_trunc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_trunc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_trunc_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_trunc_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_trunc_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_trunc_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_large_dim_trunc_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_lgamma_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_lgamma_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_lgamma_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_lgamma_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_lgamma_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_lgamma_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_lgamma_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_lgamma_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_lgamma_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_lgamma_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_log10_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_log10_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_log10_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_log10_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_log10_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_log10_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_log10_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_log10_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_log10_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_log10_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_log10_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_log10_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_log1p_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_log1p_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_log1p_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_log1p_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_log1p_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_log1p_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_log1p_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_log1p_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_log1p_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_log1p_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_log1p_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_log1p_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_log2_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_log2_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_log2_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_log2_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_log2_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_log2_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_log2_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_log2_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_log2_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_log2_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_log2_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_log2_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_log_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_log_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_log_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_log_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_log_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_log_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_log_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_log_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_log_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_log_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_log_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_log_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_log_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_logical_not_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_logical_not_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_logical_not_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_logical_not_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_logical_not_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_logical_not_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_logical_not_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_logical_not_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_logical_not_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_logical_not_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_logical_not_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_logical_not_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_logit_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_logit_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_logit_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_logit_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_logit_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_logit_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_logit_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_logit_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_logit_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_logit_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_long_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_long_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_long_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_long_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_long_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_long_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_long_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_long_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_long_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_long_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_long_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_long_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_long_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_mvlgamma_mvlgamma_p_1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_mvlgamma_mvlgamma_p_1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_mvlgamma_mvlgamma_p_1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_mvlgamma_mvlgamma_p_1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_mvlgamma_mvlgamma_p_1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_mvlgamma_mvlgamma_p_1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_mvlgamma_mvlgamma_p_1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_mvlgamma_mvlgamma_p_1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_mvlgamma_mvlgamma_p_3_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_mvlgamma_mvlgamma_p_3_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_mvlgamma_mvlgamma_p_3_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_mvlgamma_mvlgamma_p_3_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_mvlgamma_mvlgamma_p_3_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_mvlgamma_mvlgamma_p_3_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_mvlgamma_mvlgamma_p_3_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_mvlgamma_mvlgamma_p_3_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_mvlgamma_mvlgamma_p_5_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_mvlgamma_mvlgamma_p_5_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_mvlgamma_mvlgamma_p_5_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_mvlgamma_mvlgamma_p_5_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_mvlgamma_mvlgamma_p_5_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_mvlgamma_mvlgamma_p_5_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_mvlgamma_mvlgamma_p_5_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_mvlgamma_mvlgamma_p_5_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nan_to_num_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nan_to_num_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nan_to_num_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nan_to_num_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nan_to_num_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nan_to_num_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nan_to_num_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nan_to_num_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nan_to_num_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nan_to_num_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_neg_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_neg_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_neg_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_neg_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_neg_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_neg_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_neg_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_neg_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_neg_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_neg_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_neg_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_neg_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_celu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_celu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_celu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_celu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_elu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_elu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_elu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_elu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_hardshrink_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_hardshrink_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_hardshrink_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_hardshrink_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_hardsigmoid_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_hardsigmoid_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_hardsigmoid_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_hardsigmoid_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_hardtanh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_hardtanh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_hardtanh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_hardtanh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_hardtanh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_hardtanh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_hardtanh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_hardtanh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_logsigmoid_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_logsigmoid_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_logsigmoid_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_logsigmoid_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_mish_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_mish_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_mish_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_mish_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_prelu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_prelu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_prelu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_prelu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_relu6_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_relu6_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_relu6_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_relu6_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_relu6_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_relu6_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_relu6_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_relu6_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_relu6_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_relu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_relu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_relu_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_relu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_relu_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_relu_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_relu_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_relu_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_relu_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_rrelu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_rrelu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_rrelu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_rrelu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_selu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_selu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_selu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_selu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_silu_complex_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_silu_complex_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_silu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_silu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_silu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_silu_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_softplus_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_softplus_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_softplus_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_softplus_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_softshrink_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_softshrink_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_softshrink_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_softshrink_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_softsign_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_softsign_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_softsign_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_softsign_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_softsign_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_softsign_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_softsign_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_softsign_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_softsign_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_softsign_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_softsign_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_softsign_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_tanhshrink_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_tanhshrink_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_tanhshrink_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_tanhshrink_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_tanhshrink_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_tanhshrink_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_tanhshrink_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_tanhshrink_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_tanhshrink_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_tanhshrink_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_tanhshrink_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_threshold_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_threshold_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_threshold_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_threshold_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_threshold_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_threshold_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_threshold_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_threshold_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_nn_functional_threshold_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_polygamma_polygamma_n_0_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_polygamma_polygamma_n_0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_polygamma_polygamma_n_0_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_polygamma_polygamma_n_0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_polygamma_polygamma_n_0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_polygamma_polygamma_n_0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_polygamma_polygamma_n_0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_polygamma_polygamma_n_0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_polygamma_polygamma_n_0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_polygamma_polygamma_n_0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_polygamma_polygamma_n_1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_polygamma_polygamma_n_1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_polygamma_polygamma_n_1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_polygamma_polygamma_n_1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_polygamma_polygamma_n_1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_polygamma_polygamma_n_1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_polygamma_polygamma_n_1_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_polygamma_polygamma_n_1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_polygamma_polygamma_n_1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_polygamma_polygamma_n_1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_polygamma_polygamma_n_2_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_polygamma_polygamma_n_2_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_polygamma_polygamma_n_2_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_polygamma_polygamma_n_2_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_polygamma_polygamma_n_2_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_polygamma_polygamma_n_2_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_polygamma_polygamma_n_2_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_polygamma_polygamma_n_2_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_polygamma_polygamma_n_2_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_polygamma_polygamma_n_2_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_polygamma_polygamma_n_3_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_polygamma_polygamma_n_3_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_polygamma_polygamma_n_3_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_polygamma_polygamma_n_3_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_polygamma_polygamma_n_3_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_polygamma_polygamma_n_3_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_polygamma_polygamma_n_3_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_polygamma_polygamma_n_3_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_polygamma_polygamma_n_3_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_polygamma_polygamma_n_3_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_polygamma_polygamma_n_4_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_polygamma_polygamma_n_4_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_polygamma_polygamma_n_4_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_polygamma_polygamma_n_4_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_polygamma_polygamma_n_4_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_polygamma_polygamma_n_4_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_polygamma_polygamma_n_4_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_polygamma_polygamma_n_4_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_polygamma_polygamma_n_4_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_polygamma_polygamma_n_4_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_positive_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_positive_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_positive_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_positive_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_positive_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_positive_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_positive_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_positive_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_positive_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_positive_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_positive_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_positive_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_rad2deg_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_rad2deg_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_rad2deg_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_rad2deg_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_rad2deg_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_rad2deg_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_rad2deg_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_rad2deg_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_rad2deg_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_rad2deg_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_real_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_real_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_real_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_real_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_real_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_real_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_real_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_real_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_real_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_real_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_real_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_real_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_real_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_reciprocal_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_reciprocal_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_reciprocal_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_reciprocal_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_reciprocal_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_reciprocal_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_reciprocal_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_reciprocal_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_reciprocal_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_reciprocal_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_reciprocal_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_reciprocal_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_round_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_round_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_round_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_round_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_round_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_round_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_round_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_round_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_round_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_round_decimals_0_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_round_decimals_0_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_round_decimals_0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_round_decimals_0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_round_decimals_3_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_round_decimals_3_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_round_decimals_3_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_round_decimals_3_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_round_decimals_neg_3_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_round_decimals_neg_3_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_round_decimals_neg_3_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_round_decimals_neg_3_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_rsqrt_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_rsqrt_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_rsqrt_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_rsqrt_cuda_complex32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_rsqrt_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_rsqrt_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_rsqrt_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_rsqrt_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_rsqrt_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_rsqrt_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_rsqrt_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_rsqrt_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_rsqrt_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sgn_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sgn_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sgn_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sgn_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sgn_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sgn_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sgn_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sgn_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sgn_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sgn_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sgn_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sgn_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sgn_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_short_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_short_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_short_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_short_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_short_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_short_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_short_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_short_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_short_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_short_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_short_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_short_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sigmoid_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sigmoid_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sigmoid_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sigmoid_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sigmoid_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sigmoid_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sigmoid_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sigmoid_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sigmoid_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sigmoid_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sigmoid_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sigmoid_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sigmoid_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sign_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sign_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sign_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sign_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sign_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sign_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sign_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sign_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sign_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sign_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_signbit_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_signbit_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_signbit_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_signbit_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_signbit_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_signbit_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_signbit_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_signbit_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_signbit_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_signbit_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sin_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sin_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sin_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sin_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sin_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sin_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sin_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sin_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sin_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sin_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sin_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sin_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sin_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sinc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sinc_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sinc_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sinc_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sinc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sinc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sinc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sinc_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sinc_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sinc_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sinc_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sinc_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sinh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sinh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sinh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sinh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sinh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sinh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sinh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sinh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sinh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sinh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sinh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sinh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sinh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_airy_ai_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_airy_ai_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_airy_ai_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_airy_ai_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_airy_ai_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_airy_ai_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_airy_ai_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_airy_ai_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_bessel_j0_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_bessel_j0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_bessel_j0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_bessel_j0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_bessel_j0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_bessel_j0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_bessel_j0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_bessel_j0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_bessel_j1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_bessel_j1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_bessel_j1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_bessel_j1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_bessel_j1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_bessel_j1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_bessel_j1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_bessel_j1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_bessel_y0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_bessel_y0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_bessel_y0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_bessel_y0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_bessel_y0_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_bessel_y0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_bessel_y0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_bessel_y0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_bessel_y1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_bessel_y1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_bessel_y1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_bessel_y1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_bessel_y1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_bessel_y1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_bessel_y1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_bessel_y1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_entr_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_entr_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_entr_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_entr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_entr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_entr_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_entr_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_entr_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_entr_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_entr_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_erfcx_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_erfcx_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_erfcx_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_erfcx_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_erfcx_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_erfcx_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_erfcx_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_erfcx_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_i0e_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_i0e_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_i0e_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_i0e_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_i0e_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_i0e_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_i0e_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_i0e_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_i0e_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_i0e_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_i1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_i1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_i1_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_i1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_i1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_i1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_i1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_i1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_i1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_i1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_i1e_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_i1e_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_i1e_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_i1e_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_i1e_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_i1e_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_i1e_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_i1e_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_i1e_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_i1e_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_log_ndtr_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_log_ndtr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_log_ndtr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_log_ndtr_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_log_ndtr_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_log_ndtr_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_log_ndtr_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_log_ndtr_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_modified_bessel_i0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_modified_bessel_i0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_modified_bessel_i0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_modified_bessel_i0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_modified_bessel_i0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_modified_bessel_i0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_modified_bessel_i0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_modified_bessel_i0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_modified_bessel_i1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_modified_bessel_i1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_modified_bessel_i1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_modified_bessel_i1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_modified_bessel_i1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_modified_bessel_i1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_modified_bessel_i1_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_modified_bessel_i1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_modified_bessel_k0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_modified_bessel_k0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_modified_bessel_k0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_modified_bessel_k0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_modified_bessel_k0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_modified_bessel_k0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_modified_bessel_k0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_modified_bessel_k0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_modified_bessel_k1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_modified_bessel_k1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_modified_bessel_k1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_modified_bessel_k1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_modified_bessel_k1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_modified_bessel_k1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_modified_bessel_k1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_modified_bessel_k1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_ndtr_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_ndtr_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_ndtr_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_ndtr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_ndtr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_ndtr_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_ndtr_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_ndtr_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_ndtr_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_ndtr_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_ndtri_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_ndtri_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_ndtri_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_ndtri_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_ndtri_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_ndtri_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_ndtri_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_ndtri_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_polygamma_special_polygamma_n_0_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_polygamma_special_polygamma_n_0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_polygamma_special_polygamma_n_0_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_polygamma_special_polygamma_n_0_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_polygamma_special_polygamma_n_0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_polygamma_special_polygamma_n_0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_polygamma_special_polygamma_n_0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_polygamma_special_polygamma_n_0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_polygamma_special_polygamma_n_0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_polygamma_special_polygamma_n_0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_scaled_modified_bessel_k0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_scaled_modified_bessel_k0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_scaled_modified_bessel_k0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_scaled_modified_bessel_k0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_scaled_modified_bessel_k0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_scaled_modified_bessel_k0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_scaled_modified_bessel_k0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_scaled_modified_bessel_k0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_scaled_modified_bessel_k1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_scaled_modified_bessel_k1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_scaled_modified_bessel_k1_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_scaled_modified_bessel_k1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_scaled_modified_bessel_k1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_scaled_modified_bessel_k1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_scaled_modified_bessel_k1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_scaled_modified_bessel_k1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_spherical_bessel_j0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_spherical_bessel_j0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_spherical_bessel_j0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_spherical_bessel_j0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_spherical_bessel_j0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_spherical_bessel_j0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_spherical_bessel_j0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_special_spherical_bessel_j0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sqrt_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sqrt_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sqrt_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sqrt_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sqrt_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sqrt_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sqrt_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sqrt_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sqrt_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sqrt_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sqrt_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sqrt_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_sqrt_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_square_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_square_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_square_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_square_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_square_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_square_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_square_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_square_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_square_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_square_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_square_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_square_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_tan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_tan_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_tan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_tan_cuda_complex32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_tan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_tan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_tan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_tan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_tan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_tan_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_tan_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_tan_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_tan_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_tanh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_tanh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_tanh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_tanh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_tanh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_tanh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_tanh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_tanh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_tanh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_tanh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_tanh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_tanh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_tanh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_trunc_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_trunc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_trunc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_trunc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_trunc_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_trunc_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_trunc_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_trunc_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_size1_trunc_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_bfloat16_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_bfloat16_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_bfloat16_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_bfloat16_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_bfloat16_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_bfloat16_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_bfloat16_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_bfloat16_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_bfloat16_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_bfloat16_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_bfloat16_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_bfloat16_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_bfloat16_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_bool_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_bool_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_bool_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_bool_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_bool_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_bool_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_bool_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_bool_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_bool_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_bool_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_bool_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_bool_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_bool_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_byte_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_byte_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_byte_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_byte_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_byte_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_byte_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_byte_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_byte_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_byte_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_byte_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_byte_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_byte_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_cdouble_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_cdouble_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_cdouble_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_cdouble_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_cdouble_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_cdouble_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_cdouble_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_cdouble_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_cdouble_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_cdouble_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_cdouble_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_cdouble_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_cdouble_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_cfloat_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_cfloat_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_cfloat_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_cfloat_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_cfloat_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_cfloat_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_cfloat_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_cfloat_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_cfloat_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_cfloat_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_cfloat_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_cfloat_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_cfloat_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_chalf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_chalf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_chalf_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_chalf_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_chalf_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_chalf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_chalf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_chalf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_chalf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_chalf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_chalf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_chalf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_chalf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_char_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_char_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_char_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_char_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_char_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_char_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_char_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_char_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_char_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_char_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_char_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_char_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_char_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_double_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_double_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_double_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_double_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_double_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_double_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_double_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_double_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_double_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_double_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_double_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_double_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_double_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_float_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_float_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_float_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_float_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_float_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_float_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_float_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_float_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_float_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_float_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_float_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_float_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_float_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_half_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_half_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_half_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_half_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_half_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_half_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_half_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_half_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_half_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_half_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_half_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_half_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_int_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_int_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_int_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_int_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_int_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_int_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_int_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_int_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_int_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_int_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_int_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_int_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_long_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_long_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_long_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_long_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_long_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_long_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_long_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_long_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_long_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_long_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_long_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_long_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_long_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_short_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_short_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_short_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_short_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_short_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_short_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_short_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_short_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_short_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_short_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_short_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_short_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_abs_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_abs_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_abs_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_abs_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_abs_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_abs_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_abs_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_abs_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_abs_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_abs_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_abs_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_abs_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_abs_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_acos_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_acos_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_acos_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_acos_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_acos_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_acos_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_acos_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_acos_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_acos_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_acos_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_acos_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_acos_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_acos_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_acosh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_acosh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_acosh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_acosh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_acosh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_acosh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_acosh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_acosh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_acosh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_acosh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_acosh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_acosh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_acosh_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_asin_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_asin_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_asin_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_asin_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_asin_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_asin_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_asin_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_asin_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_asin_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_asin_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_asin_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_asin_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_asin_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_asinh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_asinh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_asinh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_asinh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_asinh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_asinh_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_asinh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_asinh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_asinh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_asinh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_asinh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_asinh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_asinh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_atan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_atan_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_atan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_atan_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_atan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_atan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_atan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_atan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_atan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_atan_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_atan_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_atan_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_atan_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_atanh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_atanh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_atanh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_atanh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_atanh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_atanh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_atanh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_atanh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_atanh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_atanh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_atanh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_atanh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_atanh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_bitwise_not_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_bitwise_not_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_bitwise_not_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_bitwise_not_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_bitwise_not_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_bitwise_not_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_ceil_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_ceil_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_ceil_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_ceil_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_ceil_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_ceil_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_ceil_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_ceil_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_ceil_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_conj_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_conj_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_conj_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_conj_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_conj_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_conj_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_conj_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_conj_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_conj_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_conj_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_conj_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_conj_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_conj_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_conj_physical_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_conj_physical_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_conj_physical_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_conj_physical_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_conj_physical_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_conj_physical_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_conj_physical_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_conj_physical_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_conj_physical_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_conj_physical_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_conj_physical_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_conj_physical_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_conj_physical_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_cos_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_cos_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_cos_cuda_complex128, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_cos_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_cos_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_cos_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_cos_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_cos_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_cos_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_cos_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_cos_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_cos_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_cos_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_cosh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_cosh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_cosh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_cosh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_cosh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_cosh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_cosh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_cosh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_cosh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_cosh_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_cosh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_cosh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_cosh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_deg2rad_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_deg2rad_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_deg2rad_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_deg2rad_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_deg2rad_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_deg2rad_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_deg2rad_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_deg2rad_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_deg2rad_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_deg2rad_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_digamma_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_digamma_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_digamma_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_digamma_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_digamma_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_digamma_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_digamma_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_digamma_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_digamma_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_digamma_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_erf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_erf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_erf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_erf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_erf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_erf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_erf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_erf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_erf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_erf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_erfc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_erfc_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_erfc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_erfc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_erfc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_erfc_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_erfc_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_erfc_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_erfc_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_erfc_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_erfinv_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_erfinv_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_erfinv_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_erfinv_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_erfinv_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_erfinv_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_erfinv_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_erfinv_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_erfinv_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_erfinv_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_exp2_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_exp2_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_exp2_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_exp2_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_exp2_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_exp2_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_exp2_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_exp2_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_exp2_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_exp2_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_exp2_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_exp2_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_exp_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_exp_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_exp_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_exp_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_exp_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_exp_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_exp_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_exp_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_exp_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_exp_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_exp_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_exp_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_exp_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_expm1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_expm1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_expm1_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_expm1_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_expm1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_expm1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_expm1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_expm1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_expm1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_expm1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_expm1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_expm1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_fill_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_fill_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_fill_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_fill_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_fill_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_fill_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_fill_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_fill_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_fill_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_fill_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_fill_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_fill_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_fill_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_floor_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_floor_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_floor_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_floor_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_floor_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_floor_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_floor_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_floor_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_floor_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_frac_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_frac_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_frac_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_frac_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_frexp_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_frexp_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_frexp_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_frexp_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_i0_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_i0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_i0_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_i0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_i0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_i0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_i0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_i0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_i0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_i0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_imag_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_imag_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_imag_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isfinite_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isfinite_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isfinite_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isfinite_cuda_complex32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isfinite_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isfinite_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isfinite_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isfinite_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isfinite_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isfinite_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isfinite_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isfinite_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isfinite_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isinf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isinf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isinf_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isinf_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isinf_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isinf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isinf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isinf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isinf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isinf_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isinf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isinf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isinf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isnan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isnan_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isnan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isnan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isnan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isnan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isnan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isnan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isnan_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isnan_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isnan_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isnan_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isneginf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isneginf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isneginf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isneginf_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isneginf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isneginf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isneginf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isneginf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isneginf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isneginf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isposinf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isposinf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isposinf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isposinf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isposinf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isposinf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isposinf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isposinf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isposinf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isposinf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isreal_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isreal_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isreal_cuda_complex128, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isreal_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isreal_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isreal_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isreal_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isreal_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isreal_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isreal_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isreal_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isreal_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_isreal_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_lgamma_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_lgamma_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_lgamma_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_lgamma_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_lgamma_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_lgamma_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_lgamma_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_lgamma_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_lgamma_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_lgamma_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_log10_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_log10_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_log10_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_log10_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_log10_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_log10_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_log10_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_log10_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_log10_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_log10_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_log10_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_log10_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_log1p_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_log1p_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_log1p_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_log1p_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_log1p_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_log1p_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_log1p_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_log1p_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_log1p_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_log1p_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_log1p_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_log1p_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_log2_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_log2_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_log2_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_log2_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_log2_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_log2_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_log2_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_log2_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_log2_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_log2_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_log2_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_log2_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_log_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_log_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_log_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_log_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_log_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_log_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_log_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_log_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_log_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_log_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_log_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_log_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_log_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_not_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_not_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_not_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_not_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_not_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_not_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_not_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_not_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_not_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_not_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_not_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_not_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nan_to_num_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nan_to_num_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nan_to_num_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nan_to_num_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nan_to_num_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nan_to_num_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nan_to_num_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nan_to_num_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nan_to_num_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nan_to_num_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_neg_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_neg_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_neg_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_neg_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_neg_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_neg_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_neg_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_neg_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_neg_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_neg_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_neg_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_neg_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_celu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_celu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_celu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_celu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_elu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_elu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_elu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_elu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_hardshrink_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_hardshrink_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_hardshrink_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_hardshrink_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_hardtanh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_hardtanh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_hardtanh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_hardtanh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_hardtanh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_hardtanh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_hardtanh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_hardtanh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_mish_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_mish_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_mish_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_mish_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_prelu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_prelu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_prelu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_prelu_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_relu6_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_relu6_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_relu6_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_relu6_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_relu6_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_relu6_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_relu6_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_relu6_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_relu6_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_relu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_relu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_relu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_relu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_relu_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_relu_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_relu_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_relu_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_relu_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_selu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_selu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_selu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_selu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_softplus_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_softplus_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_softplus_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_softplus_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_softshrink_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_softshrink_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_softshrink_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_softshrink_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_tanhshrink_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_tanhshrink_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_tanhshrink_cuda_complex64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_tanhshrink_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_tanhshrink_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_tanhshrink_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_tanhshrink_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_tanhshrink_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_tanhshrink_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_tanhshrink_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_tanhshrink_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_threshold_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_threshold_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_threshold_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_threshold_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_threshold_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_threshold_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_threshold_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_threshold_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_nn_functional_threshold_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_positive_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_positive_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_positive_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_positive_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_positive_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_positive_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_positive_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_positive_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_positive_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_positive_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_positive_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_positive_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_rad2deg_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_rad2deg_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_rad2deg_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_rad2deg_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_rad2deg_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_rad2deg_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_rad2deg_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_rad2deg_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_rad2deg_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_rad2deg_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_real_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_real_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_real_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_real_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_real_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_real_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_real_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_real_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_real_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_real_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_real_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_real_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_real_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_reciprocal_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_reciprocal_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_reciprocal_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_reciprocal_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_reciprocal_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_reciprocal_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_reciprocal_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_reciprocal_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_reciprocal_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_reciprocal_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_reciprocal_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_reciprocal_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_round_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_round_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_round_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_round_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_round_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_round_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_round_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_round_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_round_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_rsqrt_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_rsqrt_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_rsqrt_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_rsqrt_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_rsqrt_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_rsqrt_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_rsqrt_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_rsqrt_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_rsqrt_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_rsqrt_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_rsqrt_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_rsqrt_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_rsqrt_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sgn_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sgn_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sgn_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sgn_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sgn_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sgn_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sgn_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sgn_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sgn_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sgn_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sgn_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sgn_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sgn_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sigmoid_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sigmoid_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sigmoid_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sigmoid_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sigmoid_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sigmoid_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sigmoid_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sigmoid_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sigmoid_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sigmoid_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sigmoid_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sigmoid_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sigmoid_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sign_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sign_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sign_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sign_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sign_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sign_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sign_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sign_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sign_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sign_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_signbit_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_signbit_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_signbit_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_signbit_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_signbit_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_signbit_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_signbit_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_signbit_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_signbit_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_signbit_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sin_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sin_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sin_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sin_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sin_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sin_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sin_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sin_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sin_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sin_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sin_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sin_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sin_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sinc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sinc_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sinc_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sinc_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sinc_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sinc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sinc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sinc_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sinc_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sinc_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sinc_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sinc_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sinh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sinh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sinh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sinh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sinh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sinh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sinh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sinh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sinh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sinh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sinh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sinh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sinh_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_bessel_j0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_bessel_j0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_bessel_j0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_bessel_j0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_bessel_j0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_bessel_j0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_bessel_j0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_bessel_j0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_bessel_j1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_bessel_j1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_bessel_j1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_bessel_j1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_bessel_j1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_bessel_j1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_bessel_j1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_bessel_j1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_entr_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_entr_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_entr_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_entr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_entr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_entr_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_entr_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_entr_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_entr_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_entr_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_erfcx_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_erfcx_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_erfcx_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_erfcx_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_erfcx_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_erfcx_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_erfcx_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_erfcx_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_i0e_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_i0e_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_i0e_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_i0e_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_i0e_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_i0e_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_i0e_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_i0e_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_i0e_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_i0e_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_i1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_i1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_i1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_i1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_i1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_i1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_i1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_i1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_i1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_i1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_i1e_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_i1e_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_i1e_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_i1e_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_i1e_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_i1e_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_i1e_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_i1e_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_i1e_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_i1e_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_log_ndtr_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_log_ndtr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_log_ndtr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_log_ndtr_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_log_ndtr_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_log_ndtr_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_log_ndtr_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_log_ndtr_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_logit_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_logit_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_logit_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_logit_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_logit_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_logit_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_logit_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_logit_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_logit_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_logit_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_multigammaln_mvlgamma_p_1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_multigammaln_mvlgamma_p_1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_multigammaln_mvlgamma_p_1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_multigammaln_mvlgamma_p_1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_multigammaln_mvlgamma_p_1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_multigammaln_mvlgamma_p_1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_multigammaln_mvlgamma_p_1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_multigammaln_mvlgamma_p_1_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_multigammaln_mvlgamma_p_1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_multigammaln_mvlgamma_p_3_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_multigammaln_mvlgamma_p_3_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_multigammaln_mvlgamma_p_3_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_multigammaln_mvlgamma_p_3_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_multigammaln_mvlgamma_p_3_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_multigammaln_mvlgamma_p_3_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_multigammaln_mvlgamma_p_3_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_multigammaln_mvlgamma_p_3_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_multigammaln_mvlgamma_p_3_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_multigammaln_mvlgamma_p_5_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_multigammaln_mvlgamma_p_5_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_multigammaln_mvlgamma_p_5_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_multigammaln_mvlgamma_p_5_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_multigammaln_mvlgamma_p_5_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_multigammaln_mvlgamma_p_5_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_multigammaln_mvlgamma_p_5_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_multigammaln_mvlgamma_p_5_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_multigammaln_mvlgamma_p_5_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_ndtr_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_ndtr_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_ndtr_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_ndtr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_ndtr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_ndtr_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_ndtr_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_ndtr_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_ndtr_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_ndtr_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_ndtri_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_ndtri_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_ndtri_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_ndtri_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_ndtri_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_ndtri_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_ndtri_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_ndtri_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_spherical_bessel_j0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_spherical_bessel_j0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_spherical_bessel_j0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_spherical_bessel_j0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_spherical_bessel_j0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_spherical_bessel_j0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_spherical_bessel_j0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_special_spherical_bessel_j0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sqrt_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sqrt_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sqrt_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sqrt_cuda_complex32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sqrt_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sqrt_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sqrt_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sqrt_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sqrt_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sqrt_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sqrt_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sqrt_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_sqrt_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_square_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_square_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_square_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_square_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_square_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_square_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_square_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_square_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_square_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_square_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_square_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_square_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_tan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_tan_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_tan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_tan_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_tan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_tan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_tan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_tan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_tan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_tan_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_tan_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_tan_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_tan_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_tanh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_tanh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_tanh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_tanh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_tanh_cuda_complex64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_tanh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_tanh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_tanh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_tanh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_tanh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_tanh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_tanh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_tanh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_trunc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_trunc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_trunc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_trunc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_trunc_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_trunc_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_trunc_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_trunc_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other__refs_trunc_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_abs_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_abs_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_abs_cuda_complex128, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_abs_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_abs_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_abs_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_abs_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_abs_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_abs_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_abs_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_abs_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_abs_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_abs_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_acos_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_acos_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_acos_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_acos_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_acos_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_acos_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_acos_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_acos_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_acos_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_acos_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_acos_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_acos_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_acos_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_acosh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_acosh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_acosh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_acosh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_acosh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_acosh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_acosh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_acosh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_acosh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_acosh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_acosh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_acosh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_acosh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_angle_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_angle_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_angle_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_angle_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_angle_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_angle_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_angle_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_angle_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_angle_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_angle_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_angle_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_asin_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_asin_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_asin_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_asin_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_asin_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_asin_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_asin_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_asin_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_asin_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_asin_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_asin_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_asin_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_asin_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_asinh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_asinh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_asinh_cuda_complex128, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_asinh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_asinh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_asinh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_asinh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_asinh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_asinh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_asinh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_asinh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_asinh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_asinh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_atan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_atan_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_atan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_atan_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_atan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_atan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_atan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_atan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_atan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_atan_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_atan_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_atan_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_atan_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_atanh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_atanh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_atanh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_atanh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_atanh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_atanh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_atanh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_atanh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_atanh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_atanh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_atanh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_atanh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_atanh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_bfloat16_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_bfloat16_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_bfloat16_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_bfloat16_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_bfloat16_cuda_complex64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_bfloat16_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_bfloat16_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_bfloat16_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_bfloat16_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_bfloat16_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_bfloat16_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_bfloat16_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_bfloat16_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_bitwise_not_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_bitwise_not_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_bitwise_not_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_bitwise_not_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_bitwise_not_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_bitwise_not_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_bool_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_bool_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_bool_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_bool_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_bool_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_bool_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_bool_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_bool_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_bool_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_bool_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_bool_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_bool_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_bool_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_byte_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_byte_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_byte_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_byte_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_byte_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_byte_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_byte_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_byte_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_byte_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_byte_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_byte_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_byte_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_cdouble_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_cdouble_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_cdouble_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_cdouble_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_cdouble_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_cdouble_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_cdouble_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_cdouble_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_cdouble_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_cdouble_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_cdouble_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_cdouble_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_cdouble_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_ceil_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_ceil_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_ceil_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_ceil_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_ceil_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_ceil_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_ceil_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_ceil_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_ceil_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_cfloat_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_cfloat_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_cfloat_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_cfloat_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_cfloat_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_cfloat_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_cfloat_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_cfloat_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_cfloat_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_cfloat_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_cfloat_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_cfloat_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_cfloat_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_chalf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_chalf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_chalf_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_chalf_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_chalf_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_chalf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_chalf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_chalf_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_chalf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_chalf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_chalf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_chalf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_chalf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_char_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_char_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_char_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_char_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_char_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_char_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_char_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_char_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_char_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_char_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_char_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_char_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_char_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_conj_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_conj_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_conj_cuda_complex128, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_conj_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_conj_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_conj_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_conj_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_conj_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_conj_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_conj_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_conj_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_conj_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_conj_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_conj_physical_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_conj_physical_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_conj_physical_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_conj_physical_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_conj_physical_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_conj_physical_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_conj_physical_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_conj_physical_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_conj_physical_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_conj_physical_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_conj_physical_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_conj_physical_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_conj_physical_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_cos_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_cos_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_cos_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_cos_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_cos_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_cos_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_cos_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_cos_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_cos_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_cos_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_cos_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_cos_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_cos_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_cosh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_cosh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_cosh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_cosh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_cosh_cuda_complex64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_cosh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_cosh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_cosh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_cosh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_cosh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_cosh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_cosh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_cosh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_deg2rad_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_deg2rad_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_deg2rad_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_deg2rad_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_deg2rad_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_deg2rad_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_deg2rad_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_deg2rad_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_deg2rad_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_deg2rad_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_digamma_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_digamma_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_digamma_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_digamma_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_digamma_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_digamma_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_digamma_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_digamma_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_digamma_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_digamma_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_double_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_double_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_double_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_double_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_double_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_double_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_double_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_double_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_double_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_double_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_double_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_double_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_double_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_erf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_erf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_erf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_erf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_erf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_erf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_erf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_erf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_erf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_erf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_erfc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_erfc_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_erfc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_erfc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_erfc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_erfc_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_erfc_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_erfc_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_erfc_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_erfc_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_erfinv_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_erfinv_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_erfinv_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_erfinv_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_erfinv_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_erfinv_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_erfinv_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_erfinv_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_erfinv_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_erfinv_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_exp2_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_exp2_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_exp2_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_exp2_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_exp2_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_exp2_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_exp2_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_exp2_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_exp2_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_exp2_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_exp2_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_exp2_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_exp_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_exp_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_exp_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_exp_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_exp_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_exp_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_exp_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_exp_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_exp_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_exp_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_exp_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_exp_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_exp_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_expm1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_expm1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_expm1_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_expm1_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_expm1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_expm1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_expm1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_expm1_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_expm1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_expm1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_expm1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_expm1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_fill_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_fill_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_fill_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_fill_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_fill_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_fill_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_fill_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_fill_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_fill_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_fill_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_fill_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_fill_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_fill_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_float_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_float_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_float_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_float_cuda_complex32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_float_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_float_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_float_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_float_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_float_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_float_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_float_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_float_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_float_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_floor_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_floor_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_floor_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_floor_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_floor_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_floor_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_floor_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_floor_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_floor_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_frac_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_frac_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_frac_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_frac_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_frexp_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_frexp_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_frexp_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_frexp_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_half_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_half_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_half_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_half_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_half_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_half_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_half_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_half_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_half_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_half_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_half_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_half_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_i0_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_i0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_i0_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_i0_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_i0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_i0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_i0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_i0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_i0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_i0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_imag_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_imag_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_imag_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_int_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_int_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_int_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_int_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_int_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_int_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_int_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_int_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_int_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_int_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_int_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_int_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isfinite_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isfinite_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isfinite_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isfinite_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isfinite_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isfinite_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isfinite_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isfinite_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isfinite_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isfinite_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isfinite_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isfinite_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isfinite_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isinf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isinf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isinf_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isinf_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isinf_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isinf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isinf_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isinf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isinf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isinf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isinf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isinf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isinf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isnan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isnan_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isnan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isnan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isnan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isnan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isnan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isnan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isnan_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isnan_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isnan_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isnan_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isneginf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isneginf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isneginf_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isneginf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isneginf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isneginf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isneginf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isneginf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isneginf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isneginf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isposinf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isposinf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isposinf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isposinf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isposinf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isposinf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isposinf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isposinf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isposinf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isposinf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isreal_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isreal_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isreal_cuda_complex128, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isreal_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isreal_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isreal_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isreal_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isreal_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isreal_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isreal_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isreal_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isreal_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_isreal_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_jiterator_unary_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_jiterator_unary_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_jiterator_unary_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_jiterator_unary_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_jiterator_unary_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_jiterator_unary_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_jiterator_unary_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_jiterator_unary_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_jiterator_unary_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_jiterator_unary_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_jiterator_unary_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_jiterator_unary_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_lgamma_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_lgamma_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_lgamma_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_lgamma_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_lgamma_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_lgamma_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_lgamma_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_lgamma_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_lgamma_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_lgamma_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_log10_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_log10_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_log10_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_log10_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_log10_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_log10_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_log10_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_log10_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_log10_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_log10_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_log10_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_log10_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_log1p_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_log1p_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_log1p_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_log1p_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_log1p_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_log1p_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_log1p_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_log1p_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_log1p_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_log1p_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_log1p_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_log1p_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_log2_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_log2_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_log2_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_log2_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_log2_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_log2_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_log2_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_log2_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_log2_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_log2_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_log2_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_log2_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_log_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_log_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_log_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_log_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_log_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_log_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_log_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_log_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_log_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_log_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_log_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_log_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_log_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_logical_not_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_logical_not_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_logical_not_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_logical_not_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_logical_not_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_logical_not_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_logical_not_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_logical_not_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_logical_not_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_logical_not_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_logical_not_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_logical_not_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_logit_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_logit_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_logit_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_logit_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_logit_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_logit_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_logit_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_logit_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_logit_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_logit_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_long_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_long_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_long_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_long_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_long_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_long_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_long_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_long_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_long_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_long_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_long_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_long_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_long_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_mvlgamma_mvlgamma_p_1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_mvlgamma_mvlgamma_p_1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_mvlgamma_mvlgamma_p_1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_mvlgamma_mvlgamma_p_1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_mvlgamma_mvlgamma_p_1_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_mvlgamma_mvlgamma_p_1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_mvlgamma_mvlgamma_p_1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_mvlgamma_mvlgamma_p_1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_mvlgamma_mvlgamma_p_3_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_mvlgamma_mvlgamma_p_3_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_mvlgamma_mvlgamma_p_3_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_mvlgamma_mvlgamma_p_3_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_mvlgamma_mvlgamma_p_3_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_mvlgamma_mvlgamma_p_3_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_mvlgamma_mvlgamma_p_3_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_mvlgamma_mvlgamma_p_3_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_mvlgamma_mvlgamma_p_5_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_mvlgamma_mvlgamma_p_5_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_mvlgamma_mvlgamma_p_5_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_mvlgamma_mvlgamma_p_5_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_mvlgamma_mvlgamma_p_5_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_mvlgamma_mvlgamma_p_5_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_mvlgamma_mvlgamma_p_5_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_mvlgamma_mvlgamma_p_5_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nan_to_num_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nan_to_num_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nan_to_num_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nan_to_num_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nan_to_num_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nan_to_num_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nan_to_num_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nan_to_num_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nan_to_num_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nan_to_num_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_neg_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_neg_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_neg_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_neg_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_neg_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_neg_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_neg_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_neg_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_neg_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_neg_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_neg_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_neg_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_celu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_celu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_celu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_celu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_elu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_elu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_elu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_elu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_hardshrink_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_hardshrink_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_hardshrink_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_hardshrink_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_hardsigmoid_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_hardsigmoid_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_hardsigmoid_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_hardsigmoid_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_hardtanh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_hardtanh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_hardtanh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_hardtanh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_hardtanh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_hardtanh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_hardtanh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_hardtanh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_logsigmoid_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_logsigmoid_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_logsigmoid_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_logsigmoid_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_mish_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_mish_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_mish_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_mish_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_prelu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_prelu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_prelu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_prelu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_relu6_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_relu6_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_relu6_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_relu6_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_relu6_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_relu6_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_relu6_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_relu6_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_relu6_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_relu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_relu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_relu_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_relu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_relu_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_relu_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_relu_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_relu_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_relu_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_rrelu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_rrelu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_rrelu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_rrelu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_selu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_selu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_selu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_selu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_silu_complex_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_silu_complex_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_silu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_silu_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_silu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_silu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_softplus_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_softplus_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_softplus_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_softplus_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_softshrink_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_softshrink_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_softshrink_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_softshrink_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_softsign_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_softsign_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_softsign_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_softsign_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_softsign_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_softsign_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_softsign_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_softsign_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_softsign_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_softsign_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_softsign_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_softsign_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_tanhshrink_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_tanhshrink_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_tanhshrink_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_tanhshrink_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_tanhshrink_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_tanhshrink_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_tanhshrink_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_tanhshrink_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_tanhshrink_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_tanhshrink_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_tanhshrink_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_threshold_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_threshold_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_threshold_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_threshold_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_threshold_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_threshold_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_threshold_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_threshold_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_nn_functional_threshold_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_polygamma_polygamma_n_0_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_polygamma_polygamma_n_0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_polygamma_polygamma_n_0_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_polygamma_polygamma_n_0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_polygamma_polygamma_n_0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_polygamma_polygamma_n_0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_polygamma_polygamma_n_0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_polygamma_polygamma_n_0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_polygamma_polygamma_n_0_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_polygamma_polygamma_n_0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_polygamma_polygamma_n_1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_polygamma_polygamma_n_1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_polygamma_polygamma_n_1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_polygamma_polygamma_n_1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_polygamma_polygamma_n_1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_polygamma_polygamma_n_1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_polygamma_polygamma_n_1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_polygamma_polygamma_n_1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_polygamma_polygamma_n_1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_polygamma_polygamma_n_1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_polygamma_polygamma_n_2_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_polygamma_polygamma_n_2_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_polygamma_polygamma_n_2_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_polygamma_polygamma_n_2_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_polygamma_polygamma_n_2_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_polygamma_polygamma_n_2_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_polygamma_polygamma_n_2_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_polygamma_polygamma_n_2_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_polygamma_polygamma_n_2_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_polygamma_polygamma_n_2_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_polygamma_polygamma_n_3_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_polygamma_polygamma_n_3_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_polygamma_polygamma_n_3_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_polygamma_polygamma_n_3_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_polygamma_polygamma_n_3_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_polygamma_polygamma_n_3_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_polygamma_polygamma_n_3_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_polygamma_polygamma_n_3_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_polygamma_polygamma_n_3_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_polygamma_polygamma_n_3_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_polygamma_polygamma_n_4_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_polygamma_polygamma_n_4_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_polygamma_polygamma_n_4_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_polygamma_polygamma_n_4_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_polygamma_polygamma_n_4_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_polygamma_polygamma_n_4_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_polygamma_polygamma_n_4_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_polygamma_polygamma_n_4_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_polygamma_polygamma_n_4_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_polygamma_polygamma_n_4_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_positive_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_positive_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_positive_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_positive_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_positive_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_positive_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_positive_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_positive_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_positive_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_positive_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_positive_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_positive_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_rad2deg_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_rad2deg_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_rad2deg_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_rad2deg_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_rad2deg_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_rad2deg_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_rad2deg_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_rad2deg_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_rad2deg_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_rad2deg_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_real_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_real_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_real_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_real_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_real_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_real_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_real_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_real_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_real_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_real_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_real_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_real_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_real_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_reciprocal_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_reciprocal_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_reciprocal_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_reciprocal_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_reciprocal_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_reciprocal_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_reciprocal_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_reciprocal_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_reciprocal_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_reciprocal_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_reciprocal_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_reciprocal_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_round_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_round_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_round_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_round_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_round_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_round_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_round_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_round_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_round_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_round_decimals_0_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_round_decimals_0_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_round_decimals_0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_round_decimals_0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_round_decimals_3_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_round_decimals_3_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_round_decimals_3_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_round_decimals_3_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_round_decimals_neg_3_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_round_decimals_neg_3_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_round_decimals_neg_3_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_round_decimals_neg_3_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_rsqrt_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_rsqrt_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_rsqrt_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_rsqrt_cuda_complex32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_rsqrt_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_rsqrt_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_rsqrt_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_rsqrt_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_rsqrt_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_rsqrt_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_rsqrt_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_rsqrt_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_rsqrt_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sgn_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sgn_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sgn_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sgn_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sgn_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sgn_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sgn_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sgn_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sgn_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sgn_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sgn_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sgn_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sgn_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_short_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_short_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_short_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_short_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_short_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_short_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_short_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_short_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_short_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_short_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_short_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_short_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sigmoid_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sigmoid_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sigmoid_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sigmoid_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sigmoid_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sigmoid_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sigmoid_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sigmoid_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sigmoid_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sigmoid_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sigmoid_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sigmoid_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sigmoid_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sign_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sign_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sign_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sign_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sign_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sign_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sign_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sign_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sign_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sign_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_signbit_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_signbit_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_signbit_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_signbit_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_signbit_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_signbit_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_signbit_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_signbit_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_signbit_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_signbit_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sin_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sin_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sin_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sin_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sin_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sin_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sin_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sin_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sin_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sin_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sin_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sin_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sin_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sinc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sinc_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sinc_cuda_complex128, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sinc_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sinc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sinc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sinc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sinc_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sinc_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sinc_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sinc_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sinc_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sinh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sinh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sinh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sinh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sinh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sinh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sinh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sinh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sinh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sinh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sinh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sinh_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sinh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_airy_ai_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_airy_ai_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_airy_ai_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_airy_ai_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_airy_ai_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_airy_ai_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_airy_ai_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_airy_ai_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_bessel_j0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_bessel_j0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_bessel_j0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_bessel_j0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_bessel_j0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_bessel_j0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_bessel_j0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_bessel_j0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_bessel_j1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_bessel_j1_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_bessel_j1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_bessel_j1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_bessel_j1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_bessel_j1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_bessel_j1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_bessel_j1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_bessel_y0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_bessel_y0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_bessel_y0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_bessel_y0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_bessel_y0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_bessel_y0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_bessel_y0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_bessel_y0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_bessel_y1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_bessel_y1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_bessel_y1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_bessel_y1_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_bessel_y1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_bessel_y1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_bessel_y1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_bessel_y1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_entr_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_entr_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_entr_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_entr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_entr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_entr_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_entr_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_entr_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_entr_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_entr_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_erfcx_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_erfcx_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_erfcx_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_erfcx_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_erfcx_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_erfcx_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_erfcx_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_erfcx_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_i0e_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_i0e_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_i0e_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_i0e_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_i0e_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_i0e_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_i0e_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_i0e_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_i0e_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_i0e_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_i1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_i1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_i1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_i1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_i1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_i1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_i1_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_i1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_i1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_i1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_i1e_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_i1e_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_i1e_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_i1e_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_i1e_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_i1e_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_i1e_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_i1e_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_i1e_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_i1e_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_log_ndtr_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_log_ndtr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_log_ndtr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_log_ndtr_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_log_ndtr_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_log_ndtr_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_log_ndtr_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_log_ndtr_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_modified_bessel_i0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_modified_bessel_i0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_modified_bessel_i0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_modified_bessel_i0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_modified_bessel_i0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_modified_bessel_i0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_modified_bessel_i0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_modified_bessel_i0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_modified_bessel_i1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_modified_bessel_i1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_modified_bessel_i1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_modified_bessel_i1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_modified_bessel_i1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_modified_bessel_i1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_modified_bessel_i1_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_modified_bessel_i1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_modified_bessel_k0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_modified_bessel_k0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_modified_bessel_k0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_modified_bessel_k0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_modified_bessel_k0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_modified_bessel_k0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_modified_bessel_k0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_modified_bessel_k0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_modified_bessel_k1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_modified_bessel_k1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_modified_bessel_k1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_modified_bessel_k1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_modified_bessel_k1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_modified_bessel_k1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_modified_bessel_k1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_modified_bessel_k1_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_ndtr_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_ndtr_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_ndtr_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_ndtr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_ndtr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_ndtr_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_ndtr_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_ndtr_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_ndtr_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_ndtr_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_ndtri_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_ndtri_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_ndtri_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_ndtri_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_ndtri_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_ndtri_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_ndtri_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_ndtri_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_polygamma_special_polygamma_n_0_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_polygamma_special_polygamma_n_0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_polygamma_special_polygamma_n_0_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_polygamma_special_polygamma_n_0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_polygamma_special_polygamma_n_0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_polygamma_special_polygamma_n_0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_polygamma_special_polygamma_n_0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_polygamma_special_polygamma_n_0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_polygamma_special_polygamma_n_0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_scaled_modified_bessel_k0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_scaled_modified_bessel_k0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_scaled_modified_bessel_k0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_scaled_modified_bessel_k0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_scaled_modified_bessel_k0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_scaled_modified_bessel_k0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_scaled_modified_bessel_k0_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_scaled_modified_bessel_k0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_scaled_modified_bessel_k1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_scaled_modified_bessel_k1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_scaled_modified_bessel_k1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_scaled_modified_bessel_k1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_scaled_modified_bessel_k1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_scaled_modified_bessel_k1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_scaled_modified_bessel_k1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_scaled_modified_bessel_k1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_spherical_bessel_j0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_spherical_bessel_j0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_spherical_bessel_j0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_spherical_bessel_j0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_spherical_bessel_j0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_spherical_bessel_j0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_spherical_bessel_j0_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_special_spherical_bessel_j0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sqrt_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sqrt_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sqrt_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sqrt_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sqrt_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sqrt_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sqrt_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sqrt_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sqrt_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sqrt_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sqrt_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sqrt_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_sqrt_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_square_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_square_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_square_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_square_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_square_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_square_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_square_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_square_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_square_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_square_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_square_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_square_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_tan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_tan_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_tan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_tan_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_tan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_tan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_tan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_tan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_tan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_tan_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_tan_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_tan_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_tan_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_tanh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_tanh_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_tanh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_tanh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_tanh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_tanh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_tanh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_tanh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_tanh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_tanh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_tanh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_tanh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_tanh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_trunc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_trunc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_trunc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_trunc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_trunc_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_trunc_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_trunc_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_trunc_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_every_other_trunc_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_bfloat16_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_bfloat16_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_bfloat16_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_bfloat16_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_bfloat16_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_bfloat16_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_bfloat16_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_bfloat16_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_bfloat16_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_bfloat16_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_bfloat16_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_bfloat16_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_bfloat16_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_bool_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_bool_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_bool_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_bool_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_bool_cuda_complex64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_bool_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_bool_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_bool_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_bool_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_bool_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_bool_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_bool_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_bool_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_byte_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_byte_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_byte_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_byte_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_byte_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_byte_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_byte_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_byte_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_byte_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_byte_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_byte_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_byte_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_cdouble_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_cdouble_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_cdouble_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_cdouble_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_cdouble_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_cdouble_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_cdouble_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_cdouble_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_cdouble_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_cdouble_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_cdouble_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_cdouble_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_cdouble_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_cfloat_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_cfloat_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_cfloat_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_cfloat_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_cfloat_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_cfloat_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_cfloat_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_cfloat_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_cfloat_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_cfloat_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_cfloat_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_cfloat_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_cfloat_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_chalf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_chalf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_chalf_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_chalf_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_chalf_cuda_complex64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_chalf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_chalf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_chalf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_chalf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_chalf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_chalf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_chalf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_chalf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_char_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_char_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_char_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_char_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_char_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_char_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_char_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_char_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_char_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_char_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_char_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_char_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_char_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_double_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_double_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_double_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_double_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_double_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_double_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_double_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_double_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_double_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_double_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_double_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_double_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_double_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_float_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_float_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_float_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_float_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_float_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_float_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_float_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_float_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_float_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_float_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_float_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_float_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_float_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_half_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_half_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_half_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_half_cuda_complex64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_half_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_half_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_half_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_half_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_half_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_half_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_half_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_half_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_int_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_int_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_int_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_int_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_int_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_int_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_int_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_int_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_int_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_int_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_int_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_int_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_long_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_long_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_long_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_long_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_long_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_long_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_long_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_long_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_long_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_long_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_long_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_long_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_long_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_short_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_short_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_short_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_short_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_short_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_short_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_short_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_short_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_short_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_short_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_short_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_short_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_abs_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_abs_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_abs_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_abs_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_abs_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_abs_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_abs_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_abs_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_abs_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_abs_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_abs_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_abs_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_abs_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_acos_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_acos_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_acos_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_acos_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_acos_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_acos_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_acos_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_acos_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_acos_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_acos_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_acos_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_acos_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_acos_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_acosh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_acosh_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_acosh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_acosh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_acosh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_acosh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_acosh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_acosh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_acosh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_acosh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_acosh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_acosh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_acosh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_asin_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_asin_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_asin_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_asin_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_asin_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_asin_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_asin_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_asin_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_asin_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_asin_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_asin_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_asin_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_asin_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_asinh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_asinh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_asinh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_asinh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_asinh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_asinh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_asinh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_asinh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_asinh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_asinh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_asinh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_asinh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_asinh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_atan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_atan_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_atan_cuda_complex128, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_atan_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_atan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_atan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_atan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_atan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_atan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_atan_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_atan_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_atan_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_atan_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_atanh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_atanh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_atanh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_atanh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_atanh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_atanh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_atanh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_atanh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_atanh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_atanh_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_atanh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_atanh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_atanh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_bitwise_not_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_bitwise_not_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_bitwise_not_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_bitwise_not_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_bitwise_not_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_bitwise_not_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_ceil_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_ceil_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_ceil_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_ceil_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_ceil_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_ceil_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_ceil_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_ceil_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_ceil_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_conj_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_conj_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_conj_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_conj_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_conj_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_conj_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_conj_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_conj_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_conj_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_conj_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_conj_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_conj_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_conj_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_conj_physical_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_conj_physical_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_conj_physical_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_conj_physical_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_conj_physical_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_conj_physical_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_conj_physical_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_conj_physical_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_conj_physical_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_conj_physical_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_conj_physical_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_conj_physical_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_conj_physical_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_cos_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_cos_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_cos_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_cos_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_cos_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_cos_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_cos_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_cos_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_cos_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_cos_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_cos_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_cos_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_cos_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_cosh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_cosh_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_cosh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_cosh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_cosh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_cosh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_cosh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_cosh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_cosh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_cosh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_cosh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_cosh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_cosh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_deg2rad_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_deg2rad_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_deg2rad_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_deg2rad_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_deg2rad_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_deg2rad_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_deg2rad_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_deg2rad_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_deg2rad_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_deg2rad_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_digamma_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_digamma_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_digamma_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_digamma_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_digamma_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_digamma_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_digamma_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_digamma_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_digamma_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_digamma_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_erf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_erf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_erf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_erf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_erf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_erf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_erf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_erf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_erf_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_erf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_erfc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_erfc_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_erfc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_erfc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_erfc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_erfc_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_erfc_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_erfc_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_erfc_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_erfc_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_erfinv_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_erfinv_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_erfinv_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_erfinv_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_erfinv_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_erfinv_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_erfinv_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_erfinv_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_erfinv_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_erfinv_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_exp2_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_exp2_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_exp2_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_exp2_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_exp2_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_exp2_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_exp2_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_exp2_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_exp2_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_exp2_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_exp2_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_exp2_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_exp_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_exp_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_exp_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_exp_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_exp_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_exp_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_exp_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_exp_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_exp_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_exp_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_exp_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_exp_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_exp_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_expm1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_expm1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_expm1_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_expm1_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_expm1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_expm1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_expm1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_expm1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_expm1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_expm1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_expm1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_expm1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_fill_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_fill_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_fill_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_fill_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_fill_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_fill_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_fill_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_fill_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_fill_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_fill_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_fill_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_fill_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_fill_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_floor_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_floor_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_floor_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_floor_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_floor_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_floor_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_floor_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_floor_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_floor_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_frac_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_frac_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_frac_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_frac_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_frexp_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_frexp_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_frexp_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_frexp_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_i0_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_i0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_i0_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_i0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_i0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_i0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_i0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_i0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_i0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_i0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_imag_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_imag_cuda_complex32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_imag_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isfinite_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isfinite_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isfinite_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isfinite_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isfinite_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isfinite_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isfinite_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isfinite_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isfinite_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isfinite_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isfinite_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isfinite_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isfinite_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isinf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isinf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isinf_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isinf_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isinf_cuda_complex64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isinf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isinf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isinf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isinf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isinf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isinf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isinf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isinf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isnan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isnan_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isnan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isnan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isnan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isnan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isnan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isnan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isnan_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isnan_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isnan_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isnan_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isneginf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isneginf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isneginf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isneginf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isneginf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isneginf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isneginf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isneginf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isneginf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isneginf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isposinf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isposinf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isposinf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isposinf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isposinf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isposinf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isposinf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isposinf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isposinf_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isposinf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isreal_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isreal_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isreal_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isreal_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isreal_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isreal_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isreal_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isreal_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isreal_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isreal_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isreal_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isreal_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_isreal_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_lgamma_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_lgamma_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_lgamma_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_lgamma_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_lgamma_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_lgamma_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_lgamma_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_lgamma_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_lgamma_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_lgamma_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_log10_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_log10_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_log10_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_log10_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_log10_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_log10_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_log10_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_log10_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_log10_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_log10_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_log10_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_log10_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_log1p_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_log1p_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_log1p_cuda_complex128, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_log1p_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_log1p_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_log1p_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_log1p_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_log1p_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_log1p_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_log1p_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_log1p_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_log1p_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_log2_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_log2_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_log2_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_log2_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_log2_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_log2_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_log2_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_log2_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_log2_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_log2_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_log2_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_log2_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_log_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_log_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_log_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_log_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_log_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_log_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_log_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_log_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_log_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_log_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_log_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_log_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_log_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_not_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_not_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_not_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_not_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_not_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_not_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_not_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_not_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_not_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_not_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_not_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_not_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nan_to_num_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nan_to_num_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nan_to_num_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nan_to_num_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nan_to_num_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nan_to_num_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nan_to_num_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nan_to_num_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nan_to_num_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nan_to_num_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_neg_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_neg_cuda_complex128, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_neg_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_neg_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_neg_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_neg_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_neg_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_neg_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_neg_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_neg_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_neg_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_neg_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_celu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_celu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_celu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_celu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_elu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_elu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_elu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_elu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_hardshrink_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_hardshrink_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_hardshrink_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_hardshrink_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_hardtanh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_hardtanh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_hardtanh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_hardtanh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_hardtanh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_hardtanh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_hardtanh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_hardtanh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_mish_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_mish_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_mish_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_mish_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_prelu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_prelu_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_prelu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_prelu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_relu6_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_relu6_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_relu6_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_relu6_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_relu6_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_relu6_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_relu6_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_relu6_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_relu6_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_relu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_relu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_relu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_relu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_relu_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_relu_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_relu_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_relu_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_relu_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_selu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_selu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_selu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_selu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_softplus_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_softplus_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_softplus_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_softplus_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_softshrink_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_softshrink_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_softshrink_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_softshrink_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_tanhshrink_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_tanhshrink_cuda_complex128, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_tanhshrink_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_tanhshrink_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_tanhshrink_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_tanhshrink_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_tanhshrink_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_tanhshrink_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_tanhshrink_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_tanhshrink_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_tanhshrink_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_threshold_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_threshold_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_threshold_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_threshold_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_threshold_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_threshold_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_threshold_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_threshold_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_nn_functional_threshold_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_positive_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_positive_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_positive_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_positive_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_positive_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_positive_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_positive_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_positive_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_positive_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_positive_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_positive_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_positive_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_rad2deg_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_rad2deg_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_rad2deg_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_rad2deg_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_rad2deg_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_rad2deg_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_rad2deg_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_rad2deg_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_rad2deg_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_rad2deg_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_real_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_real_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_real_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_real_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_real_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_real_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_real_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_real_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_real_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_real_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_real_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_real_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_real_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_reciprocal_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_reciprocal_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_reciprocal_cuda_complex128, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_reciprocal_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_reciprocal_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_reciprocal_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_reciprocal_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_reciprocal_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_reciprocal_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_reciprocal_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_reciprocal_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_reciprocal_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_round_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_round_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_round_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_round_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_round_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_round_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_round_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_round_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_round_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_rsqrt_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_rsqrt_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_rsqrt_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_rsqrt_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_rsqrt_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_rsqrt_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_rsqrt_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_rsqrt_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_rsqrt_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_rsqrt_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_rsqrt_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_rsqrt_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_rsqrt_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sgn_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sgn_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sgn_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sgn_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sgn_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sgn_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sgn_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sgn_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sgn_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sgn_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sgn_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sgn_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sgn_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sigmoid_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sigmoid_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sigmoid_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sigmoid_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sigmoid_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sigmoid_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sigmoid_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sigmoid_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sigmoid_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sigmoid_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sigmoid_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sigmoid_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sigmoid_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sign_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sign_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sign_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sign_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sign_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sign_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sign_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sign_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sign_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sign_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_signbit_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_signbit_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_signbit_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_signbit_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_signbit_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_signbit_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_signbit_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_signbit_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_signbit_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_signbit_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sin_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sin_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sin_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sin_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sin_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sin_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sin_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sin_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sin_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sin_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sin_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sin_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sin_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sinc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sinc_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sinc_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sinc_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sinc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sinc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sinc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sinc_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sinc_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sinc_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sinc_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sinc_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sinh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sinh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sinh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sinh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sinh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sinh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sinh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sinh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sinh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sinh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sinh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sinh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sinh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_bessel_j0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_bessel_j0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_bessel_j0_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_bessel_j0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_bessel_j0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_bessel_j0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_bessel_j0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_bessel_j0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_bessel_j1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_bessel_j1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_bessel_j1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_bessel_j1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_bessel_j1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_bessel_j1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_bessel_j1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_bessel_j1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_entr_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_entr_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_entr_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_entr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_entr_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_entr_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_entr_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_entr_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_entr_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_entr_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_erfcx_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_erfcx_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_erfcx_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_erfcx_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_erfcx_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_erfcx_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_erfcx_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_erfcx_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_i0e_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_i0e_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_i0e_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_i0e_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_i0e_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_i0e_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_i0e_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_i0e_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_i0e_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_i0e_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_i1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_i1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_i1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_i1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_i1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_i1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_i1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_i1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_i1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_i1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_i1e_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_i1e_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_i1e_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_i1e_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_i1e_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_i1e_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_i1e_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_i1e_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_i1e_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_i1e_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_log_ndtr_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_log_ndtr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_log_ndtr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_log_ndtr_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_log_ndtr_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_log_ndtr_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_log_ndtr_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_log_ndtr_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_logit_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_logit_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_logit_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_logit_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_logit_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_logit_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_logit_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_logit_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_logit_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_logit_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_multigammaln_mvlgamma_p_1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_multigammaln_mvlgamma_p_1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_multigammaln_mvlgamma_p_1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_multigammaln_mvlgamma_p_1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_multigammaln_mvlgamma_p_1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_multigammaln_mvlgamma_p_1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_multigammaln_mvlgamma_p_1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_multigammaln_mvlgamma_p_1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_multigammaln_mvlgamma_p_1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_multigammaln_mvlgamma_p_3_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_multigammaln_mvlgamma_p_3_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_multigammaln_mvlgamma_p_3_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_multigammaln_mvlgamma_p_3_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_multigammaln_mvlgamma_p_3_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_multigammaln_mvlgamma_p_3_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_multigammaln_mvlgamma_p_3_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_multigammaln_mvlgamma_p_3_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_multigammaln_mvlgamma_p_3_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_multigammaln_mvlgamma_p_5_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_multigammaln_mvlgamma_p_5_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_multigammaln_mvlgamma_p_5_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_multigammaln_mvlgamma_p_5_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_multigammaln_mvlgamma_p_5_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_multigammaln_mvlgamma_p_5_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_multigammaln_mvlgamma_p_5_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_multigammaln_mvlgamma_p_5_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_multigammaln_mvlgamma_p_5_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_ndtr_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_ndtr_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_ndtr_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_ndtr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_ndtr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_ndtr_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_ndtr_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_ndtr_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_ndtr_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_ndtr_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_ndtri_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_ndtri_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_ndtri_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_ndtri_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_ndtri_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_ndtri_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_ndtri_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_ndtri_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_spherical_bessel_j0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_spherical_bessel_j0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_spherical_bessel_j0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_spherical_bessel_j0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_spherical_bessel_j0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_spherical_bessel_j0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_spherical_bessel_j0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_special_spherical_bessel_j0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sqrt_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sqrt_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sqrt_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sqrt_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sqrt_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sqrt_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sqrt_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sqrt_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sqrt_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sqrt_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sqrt_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sqrt_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_sqrt_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_square_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_square_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_square_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_square_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_square_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_square_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_square_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_square_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_square_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_square_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_square_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_square_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_tan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_tan_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_tan_cuda_complex128, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_tan_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_tan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_tan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_tan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_tan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_tan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_tan_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_tan_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_tan_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_tan_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_tanh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_tanh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_tanh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_tanh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_tanh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_tanh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_tanh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_tanh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_tanh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_tanh_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_tanh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_tanh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_tanh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_trunc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_trunc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_trunc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_trunc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_trunc_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_trunc_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_trunc_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_trunc_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed__refs_trunc_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_abs_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_abs_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_abs_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_abs_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_abs_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_abs_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_abs_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_abs_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_abs_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_abs_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_abs_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_abs_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_abs_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_acos_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_acos_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_acos_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_acos_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_acos_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_acos_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_acos_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_acos_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_acos_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_acos_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_acos_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_acos_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_acos_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_acosh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_acosh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_acosh_cuda_complex128, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_acosh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_acosh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_acosh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_acosh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_acosh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_acosh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_acosh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_acosh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_acosh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_acosh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_angle_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_angle_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_angle_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_angle_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_angle_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_angle_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_angle_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_angle_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_angle_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_angle_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_angle_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_asin_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_asin_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_asin_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_asin_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_asin_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_asin_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_asin_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_asin_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_asin_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_asin_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_asin_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_asin_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_asin_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_asinh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_asinh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_asinh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_asinh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_asinh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_asinh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_asinh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_asinh_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_asinh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_asinh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_asinh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_asinh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_asinh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_atan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_atan_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_atan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_atan_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_atan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_atan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_atan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_atan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_atan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_atan_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_atan_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_atan_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_atan_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_atanh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_atanh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_atanh_cuda_complex128, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_atanh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_atanh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_atanh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_atanh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_atanh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_atanh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_atanh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_atanh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_atanh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_atanh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_bfloat16_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_bfloat16_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_bfloat16_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_bfloat16_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_bfloat16_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_bfloat16_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_bfloat16_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_bfloat16_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_bfloat16_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_bfloat16_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_bfloat16_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_bfloat16_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_bfloat16_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_bitwise_not_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_bitwise_not_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_bitwise_not_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_bitwise_not_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_bitwise_not_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_bitwise_not_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_bool_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_bool_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_bool_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_bool_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_bool_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_bool_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_bool_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_bool_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_bool_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_bool_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_bool_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_bool_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_bool_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_byte_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_byte_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_byte_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_byte_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_byte_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_byte_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_byte_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_byte_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_byte_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_byte_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_byte_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_byte_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_cdouble_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_cdouble_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_cdouble_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_cdouble_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_cdouble_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_cdouble_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_cdouble_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_cdouble_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_cdouble_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_cdouble_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_cdouble_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_cdouble_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_cdouble_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_ceil_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_ceil_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_ceil_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_ceil_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_ceil_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_ceil_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_ceil_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_ceil_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_ceil_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_cfloat_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_cfloat_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_cfloat_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_cfloat_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_cfloat_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_cfloat_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_cfloat_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_cfloat_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_cfloat_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_cfloat_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_cfloat_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_cfloat_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_cfloat_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_chalf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_chalf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_chalf_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_chalf_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_chalf_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_chalf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_chalf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_chalf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_chalf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_chalf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_chalf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_chalf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_chalf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_char_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_char_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_char_cuda_complex128, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_char_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_char_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_char_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_char_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_char_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_char_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_char_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_char_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_char_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_char_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_conj_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_conj_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_conj_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_conj_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_conj_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_conj_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_conj_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_conj_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_conj_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_conj_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_conj_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_conj_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_conj_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_conj_physical_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_conj_physical_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_conj_physical_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_conj_physical_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_conj_physical_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_conj_physical_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_conj_physical_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_conj_physical_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_conj_physical_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_conj_physical_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_conj_physical_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_conj_physical_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_conj_physical_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_cos_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_cos_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_cos_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_cos_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_cos_cuda_complex64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_cos_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_cos_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_cos_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_cos_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_cos_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_cos_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_cos_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_cos_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_cosh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_cosh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_cosh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_cosh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_cosh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_cosh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_cosh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_cosh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_cosh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_cosh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_cosh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_cosh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_cosh_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_deg2rad_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_deg2rad_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_deg2rad_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_deg2rad_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_deg2rad_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_deg2rad_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_deg2rad_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_deg2rad_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_deg2rad_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_deg2rad_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_digamma_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_digamma_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_digamma_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_digamma_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_digamma_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_digamma_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_digamma_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_digamma_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_digamma_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_digamma_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_double_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_double_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_double_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_double_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_double_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_double_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_double_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_double_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_double_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_double_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_double_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_double_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_double_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_erf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_erf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_erf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_erf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_erf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_erf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_erf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_erf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_erf_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_erf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_erfc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_erfc_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_erfc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_erfc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_erfc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_erfc_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_erfc_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_erfc_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_erfc_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_erfc_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_erfinv_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_erfinv_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_erfinv_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_erfinv_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_erfinv_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_erfinv_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_erfinv_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_erfinv_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_erfinv_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_erfinv_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_exp2_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_exp2_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_exp2_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_exp2_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_exp2_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_exp2_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_exp2_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_exp2_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_exp2_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_exp2_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_exp2_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_exp2_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_exp_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_exp_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_exp_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_exp_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_exp_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_exp_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_exp_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_exp_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_exp_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_exp_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_exp_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_exp_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_exp_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_expm1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_expm1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_expm1_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_expm1_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_expm1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_expm1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_expm1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_expm1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_expm1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_expm1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_expm1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_expm1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_fill_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_fill_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_fill_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_fill_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_fill_cuda_complex64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_fill_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_fill_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_fill_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_fill_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_fill_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_fill_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_fill_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_fill_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_float_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_float_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_float_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_float_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_float_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_float_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_float_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_float_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_float_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_float_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_float_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_float_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_float_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_floor_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_floor_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_floor_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_floor_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_floor_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_floor_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_floor_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_floor_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_floor_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_frac_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_frac_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_frac_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_frac_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_frexp_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_frexp_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_frexp_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_frexp_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_half_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_half_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_half_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_half_cuda_complex64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_half_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_half_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_half_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_half_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_half_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_half_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_half_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_half_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_i0_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_i0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_i0_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_i0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_i0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_i0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_i0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_i0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_i0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_i0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_imag_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_imag_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_imag_cuda_complex64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_int_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_int_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_int_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_int_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_int_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_int_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_int_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_int_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_int_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_int_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_int_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_int_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isfinite_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isfinite_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isfinite_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isfinite_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isfinite_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isfinite_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isfinite_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isfinite_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isfinite_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isfinite_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isfinite_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isfinite_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isfinite_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isinf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isinf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isinf_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isinf_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isinf_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isinf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isinf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isinf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isinf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isinf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isinf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isinf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isinf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isnan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isnan_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isnan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isnan_cuda_complex64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isnan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isnan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isnan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isnan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isnan_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isnan_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isnan_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isnan_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isneginf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isneginf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isneginf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isneginf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isneginf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isneginf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isneginf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isneginf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isneginf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isneginf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isposinf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isposinf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isposinf_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isposinf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isposinf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isposinf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isposinf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isposinf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isposinf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isposinf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isreal_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isreal_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isreal_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isreal_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isreal_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isreal_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isreal_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isreal_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isreal_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isreal_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isreal_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isreal_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_isreal_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_jiterator_unary_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_jiterator_unary_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_jiterator_unary_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_jiterator_unary_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_jiterator_unary_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_jiterator_unary_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_jiterator_unary_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_jiterator_unary_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_jiterator_unary_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_jiterator_unary_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_jiterator_unary_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_jiterator_unary_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_lgamma_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_lgamma_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_lgamma_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_lgamma_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_lgamma_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_lgamma_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_lgamma_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_lgamma_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_lgamma_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_lgamma_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_log10_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_log10_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_log10_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_log10_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_log10_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_log10_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_log10_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_log10_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_log10_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_log10_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_log10_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_log10_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_log1p_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_log1p_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_log1p_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_log1p_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_log1p_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_log1p_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_log1p_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_log1p_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_log1p_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_log1p_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_log1p_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_log1p_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_log2_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_log2_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_log2_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_log2_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_log2_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_log2_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_log2_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_log2_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_log2_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_log2_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_log2_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_log2_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_log_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_log_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_log_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_log_cuda_complex32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_log_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_log_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_log_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_log_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_log_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_log_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_log_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_log_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_log_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_logical_not_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_logical_not_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_logical_not_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_logical_not_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_logical_not_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_logical_not_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_logical_not_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_logical_not_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_logical_not_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_logical_not_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_logical_not_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_logical_not_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_logit_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_logit_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_logit_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_logit_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_logit_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_logit_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_logit_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_logit_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_logit_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_logit_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_long_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_long_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_long_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_long_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_long_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_long_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_long_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_long_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_long_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_long_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_long_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_long_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_long_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_mvlgamma_mvlgamma_p_1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_mvlgamma_mvlgamma_p_1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_mvlgamma_mvlgamma_p_1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_mvlgamma_mvlgamma_p_1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_mvlgamma_mvlgamma_p_1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_mvlgamma_mvlgamma_p_1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_mvlgamma_mvlgamma_p_1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_mvlgamma_mvlgamma_p_1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_mvlgamma_mvlgamma_p_3_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_mvlgamma_mvlgamma_p_3_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_mvlgamma_mvlgamma_p_3_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_mvlgamma_mvlgamma_p_3_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_mvlgamma_mvlgamma_p_3_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_mvlgamma_mvlgamma_p_3_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_mvlgamma_mvlgamma_p_3_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_mvlgamma_mvlgamma_p_3_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_mvlgamma_mvlgamma_p_5_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_mvlgamma_mvlgamma_p_5_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_mvlgamma_mvlgamma_p_5_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_mvlgamma_mvlgamma_p_5_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_mvlgamma_mvlgamma_p_5_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_mvlgamma_mvlgamma_p_5_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_mvlgamma_mvlgamma_p_5_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_mvlgamma_mvlgamma_p_5_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nan_to_num_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nan_to_num_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nan_to_num_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nan_to_num_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nan_to_num_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nan_to_num_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nan_to_num_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nan_to_num_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nan_to_num_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nan_to_num_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_neg_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_neg_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_neg_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_neg_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_neg_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_neg_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_neg_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_neg_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_neg_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_neg_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_neg_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_neg_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_celu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_celu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_celu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_celu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_elu_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_elu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_elu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_elu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_hardshrink_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_hardshrink_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_hardshrink_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_hardshrink_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_hardsigmoid_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_hardsigmoid_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_hardsigmoid_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_hardsigmoid_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_hardtanh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_hardtanh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_hardtanh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_hardtanh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_hardtanh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_hardtanh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_hardtanh_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_hardtanh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_logsigmoid_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_logsigmoid_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_logsigmoid_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_logsigmoid_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_mish_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_mish_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_mish_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_mish_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_prelu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_prelu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_prelu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_prelu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_relu6_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_relu6_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_relu6_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_relu6_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_relu6_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_relu6_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_relu6_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_relu6_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_relu6_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_relu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_relu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_relu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_relu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_relu_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_relu_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_relu_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_relu_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_relu_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_rrelu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_rrelu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_rrelu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_rrelu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_selu_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_selu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_selu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_selu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_silu_complex_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_silu_complex_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_silu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_silu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_silu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_silu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_softplus_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_softplus_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_softplus_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_softplus_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_softshrink_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_softshrink_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_softshrink_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_softshrink_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_softsign_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_softsign_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_softsign_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_softsign_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_softsign_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_softsign_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_softsign_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_softsign_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_softsign_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_softsign_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_softsign_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_softsign_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_tanhshrink_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_tanhshrink_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_tanhshrink_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_tanhshrink_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_tanhshrink_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_tanhshrink_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_tanhshrink_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_tanhshrink_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_tanhshrink_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_tanhshrink_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_tanhshrink_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_threshold_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_threshold_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_threshold_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_threshold_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_threshold_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_threshold_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_threshold_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_threshold_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_nn_functional_threshold_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_polygamma_polygamma_n_0_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_polygamma_polygamma_n_0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_polygamma_polygamma_n_0_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_polygamma_polygamma_n_0_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_polygamma_polygamma_n_0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_polygamma_polygamma_n_0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_polygamma_polygamma_n_0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_polygamma_polygamma_n_0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_polygamma_polygamma_n_0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_polygamma_polygamma_n_0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_polygamma_polygamma_n_1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_polygamma_polygamma_n_1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_polygamma_polygamma_n_1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_polygamma_polygamma_n_1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_polygamma_polygamma_n_1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_polygamma_polygamma_n_1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_polygamma_polygamma_n_1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_polygamma_polygamma_n_1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_polygamma_polygamma_n_1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_polygamma_polygamma_n_1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_polygamma_polygamma_n_2_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_polygamma_polygamma_n_2_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_polygamma_polygamma_n_2_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_polygamma_polygamma_n_2_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_polygamma_polygamma_n_2_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_polygamma_polygamma_n_2_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_polygamma_polygamma_n_2_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_polygamma_polygamma_n_2_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_polygamma_polygamma_n_2_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_polygamma_polygamma_n_2_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_polygamma_polygamma_n_3_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_polygamma_polygamma_n_3_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_polygamma_polygamma_n_3_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_polygamma_polygamma_n_3_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_polygamma_polygamma_n_3_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_polygamma_polygamma_n_3_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_polygamma_polygamma_n_3_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_polygamma_polygamma_n_3_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_polygamma_polygamma_n_3_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_polygamma_polygamma_n_3_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_polygamma_polygamma_n_4_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_polygamma_polygamma_n_4_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_polygamma_polygamma_n_4_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_polygamma_polygamma_n_4_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_polygamma_polygamma_n_4_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_polygamma_polygamma_n_4_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_polygamma_polygamma_n_4_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_polygamma_polygamma_n_4_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_polygamma_polygamma_n_4_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_polygamma_polygamma_n_4_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_positive_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_positive_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_positive_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_positive_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_positive_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_positive_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_positive_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_positive_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_positive_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_positive_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_positive_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_positive_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_rad2deg_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_rad2deg_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_rad2deg_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_rad2deg_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_rad2deg_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_rad2deg_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_rad2deg_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_rad2deg_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_rad2deg_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_rad2deg_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_real_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_real_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_real_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_real_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_real_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_real_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_real_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_real_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_real_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_real_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_real_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_real_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_real_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_reciprocal_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_reciprocal_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_reciprocal_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_reciprocal_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_reciprocal_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_reciprocal_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_reciprocal_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_reciprocal_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_reciprocal_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_reciprocal_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_reciprocal_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_reciprocal_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_round_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_round_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_round_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_round_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_round_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_round_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_round_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_round_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_round_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_round_decimals_0_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_round_decimals_0_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_round_decimals_0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_round_decimals_0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_round_decimals_3_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_round_decimals_3_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_round_decimals_3_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_round_decimals_3_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_round_decimals_neg_3_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_round_decimals_neg_3_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_round_decimals_neg_3_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_round_decimals_neg_3_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_rsqrt_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_rsqrt_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_rsqrt_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_rsqrt_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_rsqrt_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_rsqrt_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_rsqrt_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_rsqrt_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_rsqrt_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_rsqrt_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_rsqrt_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_rsqrt_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_rsqrt_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sgn_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sgn_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sgn_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sgn_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sgn_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sgn_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sgn_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sgn_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sgn_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sgn_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sgn_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sgn_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sgn_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_short_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_short_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_short_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_short_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_short_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_short_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_short_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_short_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_short_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_short_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_short_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_short_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sigmoid_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sigmoid_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sigmoid_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sigmoid_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sigmoid_cuda_complex64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sigmoid_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sigmoid_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sigmoid_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sigmoid_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sigmoid_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sigmoid_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sigmoid_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sigmoid_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sign_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sign_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sign_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sign_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sign_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sign_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sign_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sign_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sign_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sign_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_signbit_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_signbit_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_signbit_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_signbit_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_signbit_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_signbit_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_signbit_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_signbit_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_signbit_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_signbit_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sin_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sin_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sin_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sin_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sin_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sin_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sin_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sin_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sin_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sin_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sin_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sin_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sin_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sinc_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sinc_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sinc_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sinc_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sinc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sinc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sinc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sinc_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sinc_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sinc_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sinc_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sinc_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sinh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sinh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sinh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sinh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sinh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sinh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sinh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sinh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sinh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sinh_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sinh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sinh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sinh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_airy_ai_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_airy_ai_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_airy_ai_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_airy_ai_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_airy_ai_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_airy_ai_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_airy_ai_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_airy_ai_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_bessel_j0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_bessel_j0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_bessel_j0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_bessel_j0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_bessel_j0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_bessel_j0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_bessel_j0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_bessel_j0_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_bessel_j1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_bessel_j1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_bessel_j1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_bessel_j1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_bessel_j1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_bessel_j1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_bessel_j1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_bessel_j1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_bessel_y0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_bessel_y0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_bessel_y0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_bessel_y0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_bessel_y0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_bessel_y0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_bessel_y0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_bessel_y0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_bessel_y1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_bessel_y1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_bessel_y1_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_bessel_y1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_bessel_y1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_bessel_y1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_bessel_y1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_bessel_y1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_entr_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_entr_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_entr_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_entr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_entr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_entr_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_entr_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_entr_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_entr_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_entr_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_erfcx_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_erfcx_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_erfcx_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_erfcx_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_erfcx_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_erfcx_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_erfcx_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_erfcx_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_i0e_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_i0e_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_i0e_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_i0e_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_i0e_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_i0e_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_i0e_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_i0e_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_i0e_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_i0e_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_i1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_i1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_i1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_i1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_i1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_i1_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_i1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_i1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_i1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_i1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_i1e_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_i1e_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_i1e_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_i1e_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_i1e_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_i1e_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_i1e_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_i1e_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_i1e_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_i1e_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_log_ndtr_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_log_ndtr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_log_ndtr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_log_ndtr_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_log_ndtr_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_log_ndtr_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_log_ndtr_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_log_ndtr_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_modified_bessel_i0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_modified_bessel_i0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_modified_bessel_i0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_modified_bessel_i0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_modified_bessel_i0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_modified_bessel_i0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_modified_bessel_i0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_modified_bessel_i0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_modified_bessel_i1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_modified_bessel_i1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_modified_bessel_i1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_modified_bessel_i1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_modified_bessel_i1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_modified_bessel_i1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_modified_bessel_i1_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_modified_bessel_i1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_modified_bessel_k0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_modified_bessel_k0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_modified_bessel_k0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_modified_bessel_k0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_modified_bessel_k0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_modified_bessel_k0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_modified_bessel_k0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_modified_bessel_k0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_modified_bessel_k1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_modified_bessel_k1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_modified_bessel_k1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_modified_bessel_k1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_modified_bessel_k1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_modified_bessel_k1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_modified_bessel_k1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_modified_bessel_k1_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_ndtr_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_ndtr_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_ndtr_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_ndtr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_ndtr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_ndtr_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_ndtr_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_ndtr_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_ndtr_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_ndtr_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_ndtri_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_ndtri_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_ndtri_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_ndtri_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_ndtri_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_ndtri_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_ndtri_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_ndtri_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_polygamma_special_polygamma_n_0_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_polygamma_special_polygamma_n_0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_polygamma_special_polygamma_n_0_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_polygamma_special_polygamma_n_0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_polygamma_special_polygamma_n_0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_polygamma_special_polygamma_n_0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_polygamma_special_polygamma_n_0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_polygamma_special_polygamma_n_0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_polygamma_special_polygamma_n_0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_scaled_modified_bessel_k0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_scaled_modified_bessel_k0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_scaled_modified_bessel_k0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_scaled_modified_bessel_k0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_scaled_modified_bessel_k0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_scaled_modified_bessel_k0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_scaled_modified_bessel_k0_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_scaled_modified_bessel_k0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_scaled_modified_bessel_k1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_scaled_modified_bessel_k1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_scaled_modified_bessel_k1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_scaled_modified_bessel_k1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_scaled_modified_bessel_k1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_scaled_modified_bessel_k1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_scaled_modified_bessel_k1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_scaled_modified_bessel_k1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_spherical_bessel_j0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_spherical_bessel_j0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_spherical_bessel_j0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_spherical_bessel_j0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_spherical_bessel_j0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_spherical_bessel_j0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_spherical_bessel_j0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_special_spherical_bessel_j0_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sqrt_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sqrt_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sqrt_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sqrt_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sqrt_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sqrt_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sqrt_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sqrt_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sqrt_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sqrt_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sqrt_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sqrt_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_sqrt_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_square_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_square_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_square_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_square_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_square_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_square_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_square_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_square_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_square_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_square_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_square_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_square_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_tan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_tan_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_tan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_tan_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_tan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_tan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_tan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_tan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_tan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_tan_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_tan_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_tan_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_tan_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_tanh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_tanh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_tanh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_tanh_cuda_complex32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_tanh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_tanh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_tanh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_tanh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_tanh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_tanh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_tanh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_tanh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_tanh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_trunc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_trunc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_trunc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_trunc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_trunc_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_trunc_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_trunc_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_trunc_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_contig_vs_transposed_trunc_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_digamma_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_digamma_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_digamma_special_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_digamma_special_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_exp_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_exp_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_exp_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_exp_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_exp_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_exp_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_exp_slow_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains__refs_acos_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains__refs_acos_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains__refs_acos_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains__refs_acos_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains__refs_acosh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains__refs_acosh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains__refs_acosh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains__refs_acosh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains__refs_asin_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains__refs_asin_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains__refs_asin_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains__refs_asin_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains__refs_atanh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains__refs_atanh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains__refs_atanh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains__refs_atanh_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains__refs_erfinv_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains__refs_erfinv_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains__refs_erfinv_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains__refs_erfinv_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains__refs_log10_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains__refs_log10_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains__refs_log10_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains__refs_log10_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains__refs_log1p_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains__refs_log1p_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains__refs_log1p_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains__refs_log1p_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains__refs_log2_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains__refs_log2_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains__refs_log2_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains__refs_log2_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains__refs_log_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains__refs_log_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains__refs_log_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains__refs_log_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains__refs_rsqrt_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains__refs_rsqrt_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains__refs_rsqrt_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains__refs_rsqrt_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains__refs_special_logit_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains__refs_special_logit_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains__refs_special_logit_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains__refs_special_logit_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains__refs_special_multigammaln_mvlgamma_p_1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains__refs_special_multigammaln_mvlgamma_p_1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains__refs_special_multigammaln_mvlgamma_p_1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains__refs_special_multigammaln_mvlgamma_p_1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains__refs_special_multigammaln_mvlgamma_p_3_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains__refs_special_multigammaln_mvlgamma_p_3_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains__refs_special_multigammaln_mvlgamma_p_3_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains__refs_special_multigammaln_mvlgamma_p_3_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains__refs_special_multigammaln_mvlgamma_p_5_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains__refs_special_multigammaln_mvlgamma_p_5_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains__refs_special_multigammaln_mvlgamma_p_5_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains__refs_special_multigammaln_mvlgamma_p_5_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains__refs_special_ndtri_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains__refs_special_ndtri_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains__refs_sqrt_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains__refs_sqrt_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains__refs_sqrt_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains__refs_sqrt_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains_acos_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains_acos_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains_acos_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains_acos_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains_acosh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains_acosh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains_acosh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains_acosh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains_asin_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains_asin_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains_asin_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains_asin_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains_atanh_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains_atanh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains_atanh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains_atanh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains_erfinv_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains_erfinv_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains_erfinv_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains_erfinv_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains_log10_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains_log10_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains_log10_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains_log10_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains_log1p_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains_log1p_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains_log1p_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains_log1p_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains_log2_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains_log2_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains_log2_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains_log2_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains_log_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains_log_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains_log_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains_log_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains_logit_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains_logit_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains_logit_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains_logit_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains_mvlgamma_mvlgamma_p_1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains_mvlgamma_mvlgamma_p_1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains_mvlgamma_mvlgamma_p_1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains_mvlgamma_mvlgamma_p_3_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains_mvlgamma_mvlgamma_p_3_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains_mvlgamma_mvlgamma_p_3_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains_mvlgamma_mvlgamma_p_5_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains_mvlgamma_mvlgamma_p_5_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains_mvlgamma_mvlgamma_p_5_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains_rsqrt_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains_rsqrt_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains_rsqrt_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains_rsqrt_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains_special_ndtri_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains_special_ndtri_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains_sqrt_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains_sqrt_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains_sqrt_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_float_domains_sqrt_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_frexp_assert_raises_cuda, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_frexp_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_frexp_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_frexp_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_hardshrink_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_hardshrink_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_hardshrink_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_hardshrink_edge_cases_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_hardshrink_edge_cases_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_hardshrink_edge_cases_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_hardsigmoid_backward_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_hardsigmoid_backward_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_hardsigmoid_backward_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_hardsigmoid_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_hardsigmoid_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_hardsigmoid_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_hardswish_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_hardswish_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_hardswish_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_i0_range1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_i0_range1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_i0_range1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_i0_range1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_i0_range2_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_i0_range2_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_i0_range2_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_i0_range2_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_i0_range3_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_i0_special_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_i0_special_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_i0_special_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_i0_special_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_igamma_common_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_igamma_common_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_igamma_edge_cases_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_igamma_edge_cases_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_igammac_common_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_igammac_common_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_igammac_edge_cases_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_igammac_edge_cases_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_isposinf_isneginf_non_boolean_output_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_isposinf_isneginf_non_boolean_output_cuda_complex128, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_isposinf_isneginf_non_boolean_output_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_isposinf_isneginf_non_boolean_output_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_isposinf_isneginf_non_boolean_output_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_isposinf_isneginf_non_boolean_output_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_isposinf_isneginf_non_boolean_output_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_isposinf_isneginf_non_boolean_output_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_isposinf_isneginf_non_boolean_output_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_isposinf_isneginf_non_boolean_output_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_isposinf_isneginf_non_boolean_output_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_log1p_complex_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_log1p_complex_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_mish_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_mish_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_nan_to_num_bfloat16_cuda, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_nan_to_num_complex_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_nan_to_num_complex_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_nan_to_num_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_nan_to_num_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_nan_to_num_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_nan_to_num_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_nan_to_num_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_nan_to_num_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_nan_to_num_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_nan_to_num_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_nan_to_num_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_nan_to_num_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_nan_to_num_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_narrow_dtypes_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_narrow_dtypes_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_bfloat16_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_bfloat16_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_bfloat16_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_bfloat16_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_bfloat16_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_bfloat16_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_bfloat16_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_bfloat16_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_bfloat16_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_bfloat16_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_bfloat16_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_bfloat16_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_bfloat16_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_bool_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_bool_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_bool_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_bool_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_bool_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_bool_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_bool_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_bool_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_bool_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_bool_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_bool_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_bool_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_bool_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_byte_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_byte_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_byte_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_byte_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_byte_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_byte_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_byte_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_byte_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_byte_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_byte_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_byte_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_byte_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_cdouble_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_cdouble_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_cdouble_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_cdouble_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_cdouble_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_cdouble_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_cdouble_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_cdouble_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_cdouble_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_cdouble_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_cdouble_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_cdouble_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_cdouble_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_cfloat_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_cfloat_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_cfloat_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_cfloat_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_cfloat_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_cfloat_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_cfloat_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_cfloat_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_cfloat_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_cfloat_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_cfloat_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_cfloat_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_cfloat_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_chalf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_chalf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_chalf_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_chalf_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_chalf_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_chalf_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_chalf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_chalf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_chalf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_chalf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_chalf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_chalf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_chalf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_char_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_char_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_char_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_char_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_char_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_char_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_char_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_char_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_char_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_char_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_char_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_char_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_char_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_double_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_double_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_double_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_double_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_double_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_double_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_double_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_double_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_double_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_double_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_double_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_double_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_double_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_float_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_float_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_float_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_float_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_float_cuda_complex64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_float_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_float_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_float_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_float_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_float_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_float_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_float_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_float_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_half_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_half_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_half_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_half_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_half_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_half_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_half_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_half_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_half_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_half_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_half_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_half_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_int_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_int_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_int_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_int_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_int_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_int_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_int_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_int_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_int_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_int_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_int_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_int_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_long_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_long_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_long_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_long_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_long_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_long_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_long_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_long_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_long_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_long_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_long_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_long_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_long_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_short_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_short_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_short_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_short_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_short_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_short_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_short_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_short_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_short_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_short_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_short_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs__conversions_short_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_abs_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_abs_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_abs_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_abs_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_abs_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_abs_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_abs_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_abs_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_abs_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_abs_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_abs_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_abs_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_abs_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_acos_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_acos_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_acos_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_acos_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_acos_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_acos_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_acos_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_acos_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_acos_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_acos_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_acos_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_acos_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_acos_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_acosh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_acosh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_acosh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_acosh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_acosh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_acosh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_acosh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_acosh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_acosh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_acosh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_acosh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_acosh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_acosh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_asin_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_asin_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_asin_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_asin_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_asin_cuda_complex64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_asin_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_asin_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_asin_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_asin_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_asin_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_asin_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_asin_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_asin_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_asinh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_asinh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_asinh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_asinh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_asinh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_asinh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_asinh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_asinh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_asinh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_asinh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_asinh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_asinh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_asinh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_atan_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_atan_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_atan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_atan_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_atan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_atan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_atan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_atan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_atan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_atan_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_atan_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_atan_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_atan_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_atanh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_atanh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_atanh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_atanh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_atanh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_atanh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_atanh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_atanh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_atanh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_atanh_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_atanh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_atanh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_atanh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_bitwise_not_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_bitwise_not_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_bitwise_not_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_bitwise_not_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_bitwise_not_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_bitwise_not_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_ceil_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_ceil_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_ceil_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_ceil_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_ceil_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_ceil_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_ceil_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_ceil_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_ceil_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_conj_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_conj_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_conj_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_conj_cuda_complex32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_conj_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_conj_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_conj_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_conj_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_conj_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_conj_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_conj_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_conj_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_conj_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_conj_physical_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_conj_physical_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_conj_physical_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_conj_physical_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_conj_physical_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_conj_physical_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_conj_physical_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_conj_physical_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_conj_physical_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_conj_physical_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_conj_physical_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_conj_physical_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_conj_physical_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_cos_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_cos_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_cos_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_cos_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_cos_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_cos_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_cos_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_cos_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_cos_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_cos_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_cos_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_cos_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_cos_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_cosh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_cosh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_cosh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_cosh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_cosh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_cosh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_cosh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_cosh_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_cosh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_cosh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_cosh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_cosh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_cosh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_deg2rad_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_deg2rad_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_deg2rad_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_deg2rad_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_deg2rad_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_deg2rad_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_deg2rad_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_deg2rad_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_deg2rad_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_deg2rad_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_digamma_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_digamma_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_digamma_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_digamma_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_digamma_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_digamma_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_digamma_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_digamma_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_digamma_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_digamma_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_erf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_erf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_erf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_erf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_erf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_erf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_erf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_erf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_erf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_erf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_erfc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_erfc_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_erfc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_erfc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_erfc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_erfc_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_erfc_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_erfc_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_erfc_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_erfc_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_erfinv_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_erfinv_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_erfinv_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_erfinv_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_erfinv_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_erfinv_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_erfinv_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_erfinv_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_erfinv_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_erfinv_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_exp2_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_exp2_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_exp2_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_exp2_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_exp2_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_exp2_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_exp2_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_exp2_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_exp2_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_exp2_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_exp2_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_exp2_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_exp_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_exp_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_exp_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_exp_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_exp_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_exp_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_exp_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_exp_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_exp_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_exp_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_exp_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_exp_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_exp_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_expm1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_expm1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_expm1_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_expm1_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_expm1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_expm1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_expm1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_expm1_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_expm1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_expm1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_expm1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_expm1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_fill_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_fill_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_fill_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_fill_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_fill_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_fill_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_fill_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_fill_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_fill_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_fill_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_fill_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_fill_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_fill_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_floor_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_floor_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_floor_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_floor_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_floor_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_floor_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_floor_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_floor_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_floor_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_frac_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_frac_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_frac_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_frac_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_frexp_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_frexp_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_frexp_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_frexp_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_i0_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_i0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_i0_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_i0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_i0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_i0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_i0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_i0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_i0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_i0_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_imag_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_imag_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_imag_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isfinite_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isfinite_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isfinite_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isfinite_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isfinite_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isfinite_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isfinite_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isfinite_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isfinite_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isfinite_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isfinite_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isfinite_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isfinite_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isinf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isinf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isinf_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isinf_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isinf_cuda_complex64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isinf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isinf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isinf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isinf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isinf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isinf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isinf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isinf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isnan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isnan_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isnan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isnan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isnan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isnan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isnan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isnan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isnan_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isnan_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isnan_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isnan_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isneginf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isneginf_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isneginf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isneginf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isneginf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isneginf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isneginf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isneginf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isneginf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isneginf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isposinf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isposinf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isposinf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isposinf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isposinf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isposinf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isposinf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isposinf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isposinf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isposinf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isreal_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isreal_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isreal_cuda_complex128, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isreal_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isreal_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isreal_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isreal_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isreal_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isreal_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isreal_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isreal_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isreal_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_isreal_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_lgamma_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_lgamma_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_lgamma_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_lgamma_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_lgamma_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_lgamma_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_lgamma_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_lgamma_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_lgamma_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_lgamma_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_log10_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_log10_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_log10_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_log10_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_log10_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_log10_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_log10_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_log10_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_log10_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_log10_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_log10_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_log10_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_log1p_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_log1p_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_log1p_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_log1p_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_log1p_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_log1p_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_log1p_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_log1p_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_log1p_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_log1p_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_log1p_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_log1p_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_log2_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_log2_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_log2_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_log2_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_log2_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_log2_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_log2_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_log2_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_log2_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_log2_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_log2_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_log2_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_log_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_log_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_log_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_log_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_log_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_log_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_log_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_log_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_log_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_log_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_log_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_log_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_log_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_logical_not_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_logical_not_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_logical_not_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_logical_not_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_logical_not_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_logical_not_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_logical_not_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_logical_not_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_logical_not_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_logical_not_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_logical_not_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_logical_not_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nan_to_num_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nan_to_num_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nan_to_num_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nan_to_num_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nan_to_num_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nan_to_num_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nan_to_num_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nan_to_num_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nan_to_num_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nan_to_num_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_neg_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_neg_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_neg_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_neg_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_neg_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_neg_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_neg_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_neg_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_neg_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_neg_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_neg_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_neg_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_celu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_celu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_celu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_celu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_elu_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_elu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_elu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_elu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_hardshrink_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_hardshrink_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_hardshrink_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_hardshrink_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_hardtanh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_hardtanh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_hardtanh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_hardtanh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_hardtanh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_hardtanh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_hardtanh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_hardtanh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_mish_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_mish_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_mish_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_mish_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_prelu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_prelu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_prelu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_prelu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_relu6_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_relu6_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_relu6_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_relu6_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_relu6_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_relu6_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_relu6_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_relu6_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_relu6_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_relu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_relu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_relu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_relu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_relu_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_relu_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_relu_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_relu_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_relu_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_selu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_selu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_selu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_selu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_softplus_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_softplus_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_softplus_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_softplus_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_softshrink_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_softshrink_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_softshrink_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_softshrink_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_tanhshrink_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_tanhshrink_cuda_complex128, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_tanhshrink_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_tanhshrink_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_tanhshrink_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_tanhshrink_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_tanhshrink_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_tanhshrink_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_tanhshrink_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_tanhshrink_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_tanhshrink_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_threshold_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_threshold_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_threshold_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_threshold_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_threshold_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_threshold_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_threshold_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_threshold_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_nn_functional_threshold_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_positive_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_positive_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_positive_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_positive_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_positive_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_positive_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_positive_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_positive_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_positive_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_positive_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_positive_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_positive_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_rad2deg_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_rad2deg_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_rad2deg_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_rad2deg_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_rad2deg_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_rad2deg_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_rad2deg_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_rad2deg_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_rad2deg_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_rad2deg_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_real_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_real_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_real_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_real_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_real_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_real_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_real_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_real_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_real_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_real_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_real_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_real_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_real_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_reciprocal_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_reciprocal_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_reciprocal_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_reciprocal_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_reciprocal_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_reciprocal_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_reciprocal_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_reciprocal_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_reciprocal_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_reciprocal_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_reciprocal_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_reciprocal_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_round_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_round_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_round_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_round_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_round_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_round_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_round_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_round_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_round_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_rsqrt_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_rsqrt_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_rsqrt_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_rsqrt_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_rsqrt_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_rsqrt_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_rsqrt_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_rsqrt_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_rsqrt_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_rsqrt_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_rsqrt_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_rsqrt_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_rsqrt_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sgn_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sgn_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sgn_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sgn_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sgn_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sgn_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sgn_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sgn_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sgn_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sgn_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sgn_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sgn_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sgn_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sigmoid_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sigmoid_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sigmoid_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sigmoid_cuda_complex32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sigmoid_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sigmoid_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sigmoid_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sigmoid_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sigmoid_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sigmoid_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sigmoid_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sigmoid_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sigmoid_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sign_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sign_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sign_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sign_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sign_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sign_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sign_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sign_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sign_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sign_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_signbit_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_signbit_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_signbit_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_signbit_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_signbit_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_signbit_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_signbit_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_signbit_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_signbit_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_signbit_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sin_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sin_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sin_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sin_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sin_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sin_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sin_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sin_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sin_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sin_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sin_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sin_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sin_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sinc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sinc_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sinc_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sinc_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sinc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sinc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sinc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sinc_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sinc_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sinc_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sinc_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sinc_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sinh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sinh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sinh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sinh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sinh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sinh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sinh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sinh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sinh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sinh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sinh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sinh_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sinh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_bessel_j0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_bessel_j0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_bessel_j0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_bessel_j0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_bessel_j0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_bessel_j0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_bessel_j0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_bessel_j0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_bessel_j1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_bessel_j1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_bessel_j1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_bessel_j1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_bessel_j1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_bessel_j1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_bessel_j1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_bessel_j1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_entr_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_entr_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_entr_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_entr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_entr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_entr_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_entr_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_entr_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_entr_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_entr_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_erfcx_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_erfcx_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_erfcx_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_erfcx_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_erfcx_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_erfcx_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_erfcx_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_erfcx_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_i0e_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_i0e_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_i0e_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_i0e_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_i0e_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_i0e_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_i0e_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_i0e_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_i0e_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_i0e_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_i1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_i1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_i1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_i1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_i1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_i1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_i1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_i1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_i1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_i1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_i1e_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_i1e_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_i1e_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_i1e_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_i1e_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_i1e_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_i1e_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_i1e_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_i1e_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_i1e_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_log_ndtr_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_log_ndtr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_log_ndtr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_log_ndtr_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_log_ndtr_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_log_ndtr_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_log_ndtr_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_log_ndtr_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_logit_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_logit_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_logit_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_logit_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_logit_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_logit_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_logit_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_logit_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_logit_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_logit_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_multigammaln_mvlgamma_p_1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_multigammaln_mvlgamma_p_1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_multigammaln_mvlgamma_p_1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_multigammaln_mvlgamma_p_1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_multigammaln_mvlgamma_p_1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_multigammaln_mvlgamma_p_1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_multigammaln_mvlgamma_p_1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_multigammaln_mvlgamma_p_1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_multigammaln_mvlgamma_p_1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_multigammaln_mvlgamma_p_3_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_multigammaln_mvlgamma_p_3_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_multigammaln_mvlgamma_p_3_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_multigammaln_mvlgamma_p_3_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_multigammaln_mvlgamma_p_3_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_multigammaln_mvlgamma_p_3_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_multigammaln_mvlgamma_p_3_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_multigammaln_mvlgamma_p_3_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_multigammaln_mvlgamma_p_3_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_multigammaln_mvlgamma_p_5_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_multigammaln_mvlgamma_p_5_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_multigammaln_mvlgamma_p_5_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_multigammaln_mvlgamma_p_5_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_multigammaln_mvlgamma_p_5_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_multigammaln_mvlgamma_p_5_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_multigammaln_mvlgamma_p_5_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_multigammaln_mvlgamma_p_5_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_multigammaln_mvlgamma_p_5_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_ndtr_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_ndtr_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_ndtr_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_ndtr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_ndtr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_ndtr_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_ndtr_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_ndtr_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_ndtr_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_ndtr_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_ndtri_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_ndtri_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_ndtri_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_ndtri_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_ndtri_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_ndtri_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_ndtri_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_ndtri_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_spherical_bessel_j0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_spherical_bessel_j0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_spherical_bessel_j0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_spherical_bessel_j0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_spherical_bessel_j0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_spherical_bessel_j0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_spherical_bessel_j0_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_special_spherical_bessel_j0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sqrt_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sqrt_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sqrt_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sqrt_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sqrt_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sqrt_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sqrt_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sqrt_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sqrt_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sqrt_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sqrt_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sqrt_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_sqrt_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_square_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_square_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_square_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_square_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_square_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_square_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_square_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_square_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_square_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_square_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_square_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_square_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_tan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_tan_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_tan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_tan_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_tan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_tan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_tan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_tan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_tan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_tan_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_tan_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_tan_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_tan_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_tanh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_tanh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_tanh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_tanh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_tanh_cuda_complex64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_tanh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_tanh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_tanh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_tanh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_tanh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_tanh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_tanh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_tanh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_trunc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_trunc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_trunc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_trunc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_trunc_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_trunc_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_trunc_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_trunc_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig__refs_trunc_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_abs_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_abs_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_abs_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_abs_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_abs_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_abs_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_abs_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_abs_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_abs_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_abs_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_abs_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_abs_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_abs_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_acos_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_acos_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_acos_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_acos_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_acos_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_acos_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_acos_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_acos_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_acos_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_acos_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_acos_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_acos_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_acos_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_acosh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_acosh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_acosh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_acosh_cuda_complex32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_acosh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_acosh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_acosh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_acosh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_acosh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_acosh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_acosh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_acosh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_acosh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_angle_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_angle_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_angle_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_angle_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_angle_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_angle_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_angle_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_angle_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_angle_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_angle_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_angle_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_asin_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_asin_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_asin_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_asin_cuda_complex32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_asin_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_asin_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_asin_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_asin_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_asin_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_asin_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_asin_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_asin_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_asin_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_asinh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_asinh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_asinh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_asinh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_asinh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_asinh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_asinh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_asinh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_asinh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_asinh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_asinh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_asinh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_asinh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_atan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_atan_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_atan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_atan_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_atan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_atan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_atan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_atan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_atan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_atan_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_atan_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_atan_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_atan_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_atanh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_atanh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_atanh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_atanh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_atanh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_atanh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_atanh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_atanh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_atanh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_atanh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_atanh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_atanh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_atanh_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_bfloat16_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_bfloat16_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_bfloat16_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_bfloat16_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_bfloat16_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_bfloat16_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_bfloat16_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_bfloat16_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_bfloat16_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_bfloat16_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_bfloat16_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_bfloat16_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_bfloat16_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_bitwise_not_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_bitwise_not_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_bitwise_not_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_bitwise_not_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_bitwise_not_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_bitwise_not_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_bool_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_bool_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_bool_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_bool_cuda_complex32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_bool_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_bool_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_bool_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_bool_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_bool_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_bool_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_bool_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_bool_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_bool_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_byte_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_byte_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_byte_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_byte_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_byte_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_byte_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_byte_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_byte_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_byte_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_byte_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_byte_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_byte_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_cdouble_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_cdouble_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_cdouble_cuda_complex128, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_cdouble_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_cdouble_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_cdouble_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_cdouble_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_cdouble_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_cdouble_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_cdouble_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_cdouble_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_cdouble_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_cdouble_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_ceil_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_ceil_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_ceil_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_ceil_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_ceil_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_ceil_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_ceil_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_ceil_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_ceil_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_cfloat_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_cfloat_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_cfloat_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_cfloat_cuda_complex32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_cfloat_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_cfloat_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_cfloat_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_cfloat_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_cfloat_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_cfloat_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_cfloat_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_cfloat_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_cfloat_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_chalf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_chalf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_chalf_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_chalf_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_chalf_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_chalf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_chalf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_chalf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_chalf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_chalf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_chalf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_chalf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_chalf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_char_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_char_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_char_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_char_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_char_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_char_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_char_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_char_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_char_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_char_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_char_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_char_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_char_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_conj_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_conj_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_conj_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_conj_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_conj_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_conj_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_conj_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_conj_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_conj_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_conj_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_conj_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_conj_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_conj_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_conj_physical_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_conj_physical_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_conj_physical_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_conj_physical_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_conj_physical_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_conj_physical_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_conj_physical_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_conj_physical_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_conj_physical_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_conj_physical_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_conj_physical_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_conj_physical_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_conj_physical_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_cos_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_cos_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_cos_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_cos_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_cos_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_cos_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_cos_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_cos_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_cos_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_cos_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_cos_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_cos_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_cos_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_cosh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_cosh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_cosh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_cosh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_cosh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_cosh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_cosh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_cosh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_cosh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_cosh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_cosh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_cosh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_cosh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_deg2rad_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_deg2rad_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_deg2rad_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_deg2rad_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_deg2rad_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_deg2rad_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_deg2rad_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_deg2rad_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_deg2rad_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_deg2rad_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_digamma_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_digamma_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_digamma_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_digamma_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_digamma_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_digamma_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_digamma_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_digamma_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_digamma_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_digamma_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_double_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_double_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_double_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_double_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_double_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_double_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_double_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_double_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_double_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_double_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_double_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_double_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_double_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_erf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_erf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_erf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_erf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_erf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_erf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_erf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_erf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_erf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_erf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_erfc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_erfc_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_erfc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_erfc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_erfc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_erfc_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_erfc_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_erfc_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_erfc_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_erfc_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_erfinv_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_erfinv_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_erfinv_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_erfinv_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_erfinv_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_erfinv_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_erfinv_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_erfinv_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_erfinv_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_erfinv_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_exp2_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_exp2_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_exp2_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_exp2_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_exp2_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_exp2_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_exp2_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_exp2_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_exp2_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_exp2_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_exp2_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_exp2_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_exp_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_exp_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_exp_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_exp_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_exp_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_exp_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_exp_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_exp_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_exp_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_exp_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_exp_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_exp_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_exp_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_bfloat16_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_bfloat16_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_bfloat16_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_bfloat16_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_bfloat16_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_bfloat16_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_bfloat16_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_bfloat16_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_bfloat16_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_bfloat16_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_bfloat16_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_bfloat16_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_bfloat16_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_bool_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_bool_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_bool_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_bool_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_bool_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_bool_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_bool_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_bool_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_bool_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_bool_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_bool_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_bool_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_bool_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_byte_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_byte_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_byte_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_byte_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_byte_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_byte_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_byte_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_byte_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_byte_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_byte_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_byte_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_byte_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_cdouble_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_cdouble_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_cdouble_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_cdouble_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_cdouble_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_cdouble_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_cdouble_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_cdouble_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_cdouble_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_cdouble_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_cdouble_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_cdouble_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_cdouble_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_cfloat_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_cfloat_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_cfloat_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_cfloat_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_cfloat_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_cfloat_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_cfloat_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_cfloat_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_cfloat_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_cfloat_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_cfloat_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_cfloat_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_cfloat_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_chalf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_chalf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_chalf_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_chalf_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_chalf_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_chalf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_chalf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_chalf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_chalf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_chalf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_chalf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_chalf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_chalf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_char_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_char_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_char_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_char_cuda_complex32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_char_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_char_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_char_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_char_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_char_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_char_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_char_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_char_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_char_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_double_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_double_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_double_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_double_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_double_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_double_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_double_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_double_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_double_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_double_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_double_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_double_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_double_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_float_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_float_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_float_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_float_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_float_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_float_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_float_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_float_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_float_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_float_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_float_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_float_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_float_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_half_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_half_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_half_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_half_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_half_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_half_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_half_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_half_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_half_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_half_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_half_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_half_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_int_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_int_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_int_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_int_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_int_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_int_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_int_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_int_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_int_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_int_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_int_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_int_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_long_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_long_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_long_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_long_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_long_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_long_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_long_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_long_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_long_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_long_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_long_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_long_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_long_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_short_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_short_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_short_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_short_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_short_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_short_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_short_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_short_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_short_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_short_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_short_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs__conversions_short_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_abs_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_abs_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_abs_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_abs_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_abs_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_abs_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_abs_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_abs_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_abs_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_abs_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_abs_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_abs_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_abs_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_acos_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_acos_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_acos_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_acos_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_acos_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_acos_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_acos_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_acos_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_acos_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_acos_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_acos_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_acos_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_acos_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_acosh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_acosh_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_acosh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_acosh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_acosh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_acosh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_acosh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_acosh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_acosh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_acosh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_acosh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_acosh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_acosh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_asin_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_asin_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_asin_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_asin_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_asin_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_asin_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_asin_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_asin_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_asin_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_asin_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_asin_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_asin_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_asin_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_asinh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_asinh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_asinh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_asinh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_asinh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_asinh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_asinh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_asinh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_asinh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_asinh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_asinh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_asinh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_asinh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_atan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_atan_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_atan_cuda_complex128, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_atan_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_atan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_atan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_atan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_atan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_atan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_atan_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_atan_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_atan_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_atan_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_atanh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_atanh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_atanh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_atanh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_atanh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_atanh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_atanh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_atanh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_atanh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_atanh_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_atanh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_atanh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_atanh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_bitwise_not_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_bitwise_not_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_bitwise_not_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_bitwise_not_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_bitwise_not_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_bitwise_not_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_ceil_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_ceil_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_ceil_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_ceil_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_ceil_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_ceil_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_ceil_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_ceil_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_ceil_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_conj_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_conj_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_conj_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_conj_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_conj_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_conj_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_conj_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_conj_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_conj_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_conj_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_conj_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_conj_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_conj_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_conj_physical_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_conj_physical_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_conj_physical_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_conj_physical_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_conj_physical_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_conj_physical_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_conj_physical_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_conj_physical_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_conj_physical_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_conj_physical_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_conj_physical_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_conj_physical_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_conj_physical_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_cos_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_cos_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_cos_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_cos_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_cos_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_cos_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_cos_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_cos_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_cos_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_cos_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_cos_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_cos_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_cos_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_cosh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_cosh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_cosh_cuda_complex128, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_cosh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_cosh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_cosh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_cosh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_cosh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_cosh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_cosh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_cosh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_cosh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_cosh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_deg2rad_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_deg2rad_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_deg2rad_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_deg2rad_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_deg2rad_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_deg2rad_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_deg2rad_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_deg2rad_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_deg2rad_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_deg2rad_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_digamma_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_digamma_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_digamma_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_digamma_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_digamma_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_digamma_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_digamma_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_digamma_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_digamma_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_digamma_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_erf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_erf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_erf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_erf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_erf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_erf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_erf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_erf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_erf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_erf_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_erfc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_erfc_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_erfc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_erfc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_erfc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_erfc_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_erfc_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_erfc_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_erfc_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_erfc_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_erfinv_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_erfinv_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_erfinv_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_erfinv_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_erfinv_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_erfinv_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_erfinv_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_erfinv_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_erfinv_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_erfinv_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_exp2_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_exp2_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_exp2_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_exp2_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_exp2_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_exp2_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_exp2_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_exp2_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_exp2_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_exp2_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_exp2_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_exp2_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_exp_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_exp_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_exp_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_exp_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_exp_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_exp_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_exp_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_exp_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_exp_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_exp_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_exp_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_exp_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_exp_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_expm1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_expm1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_expm1_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_expm1_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_expm1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_expm1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_expm1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_expm1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_expm1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_expm1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_expm1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_expm1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_fill_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_fill_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_fill_cuda_complex128, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_fill_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_fill_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_fill_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_fill_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_fill_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_fill_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_fill_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_fill_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_fill_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_fill_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_floor_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_floor_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_floor_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_floor_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_floor_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_floor_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_floor_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_floor_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_floor_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_frac_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_frac_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_frac_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_frac_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_frexp_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_frexp_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_frexp_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_frexp_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_i0_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_i0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_i0_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_i0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_i0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_i0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_i0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_i0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_i0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_i0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_imag_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_imag_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_imag_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isfinite_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isfinite_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isfinite_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isfinite_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isfinite_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isfinite_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isfinite_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isfinite_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isfinite_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isfinite_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isfinite_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isfinite_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isfinite_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isinf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isinf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isinf_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isinf_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isinf_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isinf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isinf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isinf_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isinf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isinf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isinf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isinf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isinf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isnan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isnan_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isnan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isnan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isnan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isnan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isnan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isnan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isnan_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isnan_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isnan_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isnan_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isneginf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isneginf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isneginf_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isneginf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isneginf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isneginf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isneginf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isneginf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isneginf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isneginf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isposinf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isposinf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isposinf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isposinf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isposinf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isposinf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isposinf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isposinf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isposinf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isposinf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isreal_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isreal_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isreal_cuda_complex128, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isreal_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isreal_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isreal_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isreal_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isreal_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isreal_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isreal_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isreal_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isreal_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_isreal_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_lgamma_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_lgamma_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_lgamma_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_lgamma_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_lgamma_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_lgamma_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_lgamma_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_lgamma_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_lgamma_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_lgamma_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_log10_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_log10_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_log10_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_log10_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_log10_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_log10_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_log10_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_log10_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_log10_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_log10_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_log10_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_log10_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_log1p_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_log1p_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_log1p_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_log1p_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_log1p_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_log1p_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_log1p_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_log1p_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_log1p_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_log1p_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_log1p_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_log1p_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_log2_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_log2_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_log2_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_log2_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_log2_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_log2_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_log2_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_log2_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_log2_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_log2_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_log2_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_log2_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_log_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_log_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_log_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_log_cuda_complex32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_log_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_log_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_log_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_log_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_log_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_log_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_log_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_log_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_log_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_logical_not_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_logical_not_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_logical_not_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_logical_not_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_logical_not_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_logical_not_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_logical_not_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_logical_not_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_logical_not_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_logical_not_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_logical_not_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_logical_not_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nan_to_num_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nan_to_num_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nan_to_num_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nan_to_num_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nan_to_num_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nan_to_num_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nan_to_num_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nan_to_num_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nan_to_num_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nan_to_num_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_neg_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_neg_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_neg_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_neg_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_neg_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_neg_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_neg_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_neg_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_neg_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_neg_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_neg_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_neg_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_celu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_celu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_celu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_celu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_elu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_elu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_elu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_elu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_hardshrink_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_hardshrink_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_hardshrink_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_hardshrink_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_hardtanh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_hardtanh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_hardtanh_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_hardtanh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_hardtanh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_hardtanh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_hardtanh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_hardtanh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_mish_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_mish_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_mish_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_mish_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_prelu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_prelu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_prelu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_prelu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_relu6_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_relu6_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_relu6_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_relu6_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_relu6_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_relu6_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_relu6_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_relu6_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_relu6_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_relu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_relu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_relu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_relu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_relu_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_relu_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_relu_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_relu_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_relu_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_selu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_selu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_selu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_selu_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_softplus_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_softplus_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_softplus_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_softplus_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_softshrink_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_softshrink_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_softshrink_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_softshrink_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_tanhshrink_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_tanhshrink_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_tanhshrink_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_tanhshrink_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_tanhshrink_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_tanhshrink_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_tanhshrink_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_tanhshrink_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_tanhshrink_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_tanhshrink_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_tanhshrink_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_threshold_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_threshold_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_threshold_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_threshold_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_threshold_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_threshold_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_threshold_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_threshold_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_nn_functional_threshold_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_positive_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_positive_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_positive_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_positive_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_positive_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_positive_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_positive_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_positive_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_positive_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_positive_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_positive_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_positive_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_rad2deg_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_rad2deg_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_rad2deg_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_rad2deg_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_rad2deg_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_rad2deg_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_rad2deg_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_rad2deg_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_rad2deg_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_rad2deg_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_real_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_real_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_real_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_real_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_real_cuda_complex64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_real_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_real_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_real_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_real_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_real_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_real_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_real_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_real_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_reciprocal_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_reciprocal_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_reciprocal_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_reciprocal_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_reciprocal_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_reciprocal_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_reciprocal_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_reciprocal_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_reciprocal_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_reciprocal_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_reciprocal_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_reciprocal_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_round_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_round_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_round_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_round_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_round_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_round_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_round_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_round_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_round_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_rsqrt_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_rsqrt_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_rsqrt_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_rsqrt_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_rsqrt_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_rsqrt_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_rsqrt_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_rsqrt_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_rsqrt_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_rsqrt_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_rsqrt_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_rsqrt_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_rsqrt_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sgn_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sgn_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sgn_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sgn_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sgn_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sgn_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sgn_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sgn_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sgn_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sgn_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sgn_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sgn_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sgn_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sigmoid_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sigmoid_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sigmoid_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sigmoid_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sigmoid_cuda_complex64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sigmoid_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sigmoid_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sigmoid_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sigmoid_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sigmoid_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sigmoid_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sigmoid_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sigmoid_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sign_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sign_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sign_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sign_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sign_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sign_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sign_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sign_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sign_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sign_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_signbit_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_signbit_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_signbit_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_signbit_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_signbit_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_signbit_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_signbit_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_signbit_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_signbit_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_signbit_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sin_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sin_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sin_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sin_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sin_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sin_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sin_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sin_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sin_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sin_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sin_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sin_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sin_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sinc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sinc_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sinc_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sinc_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sinc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sinc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sinc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sinc_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sinc_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sinc_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sinc_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sinc_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sinh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sinh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sinh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sinh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sinh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sinh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sinh_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sinh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sinh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sinh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sinh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sinh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sinh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_bessel_j0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_bessel_j0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_bessel_j0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_bessel_j0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_bessel_j0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_bessel_j0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_bessel_j0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_bessel_j0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_bessel_j1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_bessel_j1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_bessel_j1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_bessel_j1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_bessel_j1_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_bessel_j1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_bessel_j1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_bessel_j1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_entr_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_entr_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_entr_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_entr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_entr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_entr_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_entr_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_entr_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_entr_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_entr_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_erfcx_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_erfcx_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_erfcx_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_erfcx_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_erfcx_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_erfcx_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_erfcx_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_erfcx_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_i0e_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_i0e_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_i0e_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_i0e_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_i0e_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_i0e_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_i0e_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_i0e_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_i0e_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_i0e_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_i1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_i1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_i1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_i1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_i1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_i1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_i1_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_i1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_i1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_i1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_i1e_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_i1e_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_i1e_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_i1e_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_i1e_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_i1e_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_i1e_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_i1e_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_i1e_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_i1e_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_log_ndtr_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_log_ndtr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_log_ndtr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_log_ndtr_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_log_ndtr_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_log_ndtr_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_log_ndtr_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_log_ndtr_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_logit_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_logit_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_logit_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_logit_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_logit_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_logit_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_logit_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_logit_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_logit_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_logit_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_multigammaln_mvlgamma_p_1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_multigammaln_mvlgamma_p_1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_multigammaln_mvlgamma_p_1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_multigammaln_mvlgamma_p_1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_multigammaln_mvlgamma_p_1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_multigammaln_mvlgamma_p_1_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_multigammaln_mvlgamma_p_1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_multigammaln_mvlgamma_p_1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_multigammaln_mvlgamma_p_1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_multigammaln_mvlgamma_p_3_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_multigammaln_mvlgamma_p_3_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_multigammaln_mvlgamma_p_3_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_multigammaln_mvlgamma_p_3_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_multigammaln_mvlgamma_p_3_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_multigammaln_mvlgamma_p_3_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_multigammaln_mvlgamma_p_3_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_multigammaln_mvlgamma_p_3_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_multigammaln_mvlgamma_p_3_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_multigammaln_mvlgamma_p_5_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_multigammaln_mvlgamma_p_5_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_multigammaln_mvlgamma_p_5_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_multigammaln_mvlgamma_p_5_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_multigammaln_mvlgamma_p_5_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_multigammaln_mvlgamma_p_5_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_multigammaln_mvlgamma_p_5_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_multigammaln_mvlgamma_p_5_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_multigammaln_mvlgamma_p_5_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_ndtr_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_ndtr_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_ndtr_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_ndtr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_ndtr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_ndtr_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_ndtr_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_ndtr_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_ndtr_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_ndtr_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_ndtri_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_ndtri_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_ndtri_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_ndtri_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_ndtri_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_ndtri_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_ndtri_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_ndtri_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_spherical_bessel_j0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_spherical_bessel_j0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_spherical_bessel_j0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_spherical_bessel_j0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_spherical_bessel_j0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_spherical_bessel_j0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_spherical_bessel_j0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_special_spherical_bessel_j0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sqrt_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sqrt_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sqrt_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sqrt_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sqrt_cuda_complex64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sqrt_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sqrt_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sqrt_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sqrt_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sqrt_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sqrt_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sqrt_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_sqrt_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_square_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_square_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_square_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_square_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_square_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_square_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_square_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_square_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_square_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_square_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_square_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_square_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_tan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_tan_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_tan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_tan_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_tan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_tan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_tan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_tan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_tan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_tan_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_tan_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_tan_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_tan_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_tanh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_tanh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_tanh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_tanh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_tanh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_tanh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_tanh_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_tanh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_tanh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_tanh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_tanh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_tanh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_tanh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_trunc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_trunc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_trunc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_trunc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_trunc_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_trunc_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_trunc_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_trunc_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand__refs_trunc_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_abs_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_abs_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_abs_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_abs_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_abs_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_abs_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_abs_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_abs_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_abs_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_abs_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_abs_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_abs_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_abs_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_acos_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_acos_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_acos_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_acos_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_acos_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_acos_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_acos_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_acos_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_acos_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_acos_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_acos_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_acos_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_acos_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_acosh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_acosh_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_acosh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_acosh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_acosh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_acosh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_acosh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_acosh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_acosh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_acosh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_acosh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_acosh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_acosh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_angle_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_angle_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_angle_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_angle_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_angle_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_angle_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_angle_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_angle_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_angle_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_angle_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_angle_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_asin_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_asin_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_asin_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_asin_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_asin_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_asin_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_asin_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_asin_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_asin_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_asin_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_asin_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_asin_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_asin_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_asinh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_asinh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_asinh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_asinh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_asinh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_asinh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_asinh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_asinh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_asinh_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_asinh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_asinh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_asinh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_asinh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_atan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_atan_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_atan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_atan_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_atan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_atan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_atan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_atan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_atan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_atan_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_atan_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_atan_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_atan_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_atanh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_atanh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_atanh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_atanh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_atanh_cuda_complex64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_atanh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_atanh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_atanh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_atanh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_atanh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_atanh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_atanh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_atanh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_bfloat16_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_bfloat16_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_bfloat16_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_bfloat16_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_bfloat16_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_bfloat16_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_bfloat16_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_bfloat16_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_bfloat16_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_bfloat16_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_bfloat16_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_bfloat16_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_bfloat16_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_bitwise_not_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_bitwise_not_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_bitwise_not_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_bitwise_not_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_bitwise_not_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_bitwise_not_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_bool_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_bool_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_bool_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_bool_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_bool_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_bool_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_bool_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_bool_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_bool_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_bool_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_bool_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_bool_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_bool_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_byte_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_byte_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_byte_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_byte_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_byte_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_byte_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_byte_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_byte_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_byte_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_byte_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_byte_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_byte_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_cdouble_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_cdouble_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_cdouble_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_cdouble_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_cdouble_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_cdouble_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_cdouble_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_cdouble_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_cdouble_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_cdouble_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_cdouble_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_cdouble_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_cdouble_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_ceil_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_ceil_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_ceil_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_ceil_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_ceil_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_ceil_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_ceil_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_ceil_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_ceil_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_cfloat_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_cfloat_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_cfloat_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_cfloat_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_cfloat_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_cfloat_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_cfloat_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_cfloat_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_cfloat_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_cfloat_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_cfloat_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_cfloat_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_cfloat_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_chalf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_chalf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_chalf_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_chalf_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_chalf_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_chalf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_chalf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_chalf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_chalf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_chalf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_chalf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_chalf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_chalf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_char_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_char_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_char_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_char_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_char_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_char_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_char_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_char_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_char_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_char_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_char_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_char_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_char_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_conj_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_conj_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_conj_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_conj_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_conj_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_conj_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_conj_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_conj_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_conj_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_conj_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_conj_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_conj_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_conj_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_conj_physical_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_conj_physical_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_conj_physical_cuda_complex128, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_conj_physical_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_conj_physical_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_conj_physical_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_conj_physical_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_conj_physical_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_conj_physical_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_conj_physical_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_conj_physical_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_conj_physical_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_conj_physical_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_cos_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_cos_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_cos_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_cos_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_cos_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_cos_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_cos_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_cos_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_cos_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_cos_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_cos_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_cos_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_cos_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_cosh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_cosh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_cosh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_cosh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_cosh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_cosh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_cosh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_cosh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_cosh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_cosh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_cosh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_cosh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_cosh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_deg2rad_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_deg2rad_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_deg2rad_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_deg2rad_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_deg2rad_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_deg2rad_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_deg2rad_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_deg2rad_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_deg2rad_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_deg2rad_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_digamma_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_digamma_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_digamma_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_digamma_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_digamma_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_digamma_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_digamma_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_digamma_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_digamma_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_digamma_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_double_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_double_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_double_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_double_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_double_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_double_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_double_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_double_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_double_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_double_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_double_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_double_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_double_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_erf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_erf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_erf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_erf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_erf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_erf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_erf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_erf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_erf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_erf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_erfc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_erfc_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_erfc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_erfc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_erfc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_erfc_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_erfc_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_erfc_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_erfc_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_erfc_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_erfinv_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_erfinv_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_erfinv_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_erfinv_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_erfinv_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_erfinv_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_erfinv_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_erfinv_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_erfinv_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_erfinv_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_exp2_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_exp2_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_exp2_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_exp2_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_exp2_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_exp2_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_exp2_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_exp2_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_exp2_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_exp2_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_exp2_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_exp2_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_exp_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_exp_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_exp_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_exp_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_exp_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_exp_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_exp_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_exp_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_exp_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_exp_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_exp_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_exp_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_exp_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_expm1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_expm1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_expm1_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_expm1_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_expm1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_expm1_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_expm1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_expm1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_expm1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_expm1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_expm1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_expm1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_fill_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_fill_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_fill_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_fill_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_fill_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_fill_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_fill_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_fill_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_fill_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_fill_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_fill_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_fill_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_fill_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_float_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_float_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_float_cuda_complex128, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_float_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_float_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_float_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_float_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_float_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_float_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_float_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_float_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_float_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_float_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_floor_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_floor_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_floor_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_floor_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_floor_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_floor_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_floor_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_floor_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_floor_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_frac_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_frac_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_frac_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_frac_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_frexp_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_frexp_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_frexp_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_frexp_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_half_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_half_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_half_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_half_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_half_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_half_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_half_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_half_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_half_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_half_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_half_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_half_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_i0_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_i0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_i0_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_i0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_i0_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_i0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_i0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_i0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_i0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_i0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_imag_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_imag_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_imag_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_int_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_int_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_int_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_int_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_int_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_int_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_int_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_int_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_int_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_int_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_int_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_int_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isfinite_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isfinite_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isfinite_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isfinite_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isfinite_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isfinite_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isfinite_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isfinite_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isfinite_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isfinite_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isfinite_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isfinite_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isfinite_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isinf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isinf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isinf_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isinf_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isinf_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isinf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isinf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isinf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isinf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isinf_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isinf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isinf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isinf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isnan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isnan_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isnan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isnan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isnan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isnan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isnan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isnan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isnan_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isnan_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isnan_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isnan_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isneginf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isneginf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isneginf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isneginf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isneginf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isneginf_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isneginf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isneginf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isneginf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isneginf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isposinf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isposinf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isposinf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isposinf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isposinf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isposinf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isposinf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isposinf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isposinf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isposinf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isreal_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isreal_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isreal_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isreal_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isreal_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isreal_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isreal_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isreal_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isreal_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isreal_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isreal_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isreal_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_isreal_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_jiterator_unary_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_jiterator_unary_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_jiterator_unary_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_jiterator_unary_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_jiterator_unary_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_jiterator_unary_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_jiterator_unary_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_jiterator_unary_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_jiterator_unary_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_jiterator_unary_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_jiterator_unary_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_jiterator_unary_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_lgamma_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_lgamma_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_lgamma_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_lgamma_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_lgamma_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_lgamma_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_lgamma_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_lgamma_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_lgamma_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_lgamma_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_log10_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_log10_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_log10_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_log10_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_log10_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_log10_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_log10_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_log10_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_log10_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_log10_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_log10_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_log10_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_log1p_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_log1p_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_log1p_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_log1p_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_log1p_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_log1p_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_log1p_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_log1p_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_log1p_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_log1p_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_log1p_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_log1p_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_log2_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_log2_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_log2_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_log2_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_log2_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_log2_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_log2_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_log2_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_log2_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_log2_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_log2_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_log2_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_log_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_log_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_log_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_log_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_log_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_log_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_log_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_log_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_log_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_log_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_log_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_log_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_log_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_logical_not_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_logical_not_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_logical_not_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_logical_not_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_logical_not_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_logical_not_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_logical_not_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_logical_not_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_logical_not_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_logical_not_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_logical_not_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_logical_not_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_logit_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_logit_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_logit_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_logit_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_logit_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_logit_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_logit_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_logit_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_logit_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_logit_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_long_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_long_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_long_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_long_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_long_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_long_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_long_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_long_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_long_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_long_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_long_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_long_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_long_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_mvlgamma_mvlgamma_p_1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_mvlgamma_mvlgamma_p_1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_mvlgamma_mvlgamma_p_1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_mvlgamma_mvlgamma_p_1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_mvlgamma_mvlgamma_p_1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_mvlgamma_mvlgamma_p_1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_mvlgamma_mvlgamma_p_1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_mvlgamma_mvlgamma_p_1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_mvlgamma_mvlgamma_p_3_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_mvlgamma_mvlgamma_p_3_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_mvlgamma_mvlgamma_p_3_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_mvlgamma_mvlgamma_p_3_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_mvlgamma_mvlgamma_p_3_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_mvlgamma_mvlgamma_p_3_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_mvlgamma_mvlgamma_p_3_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_mvlgamma_mvlgamma_p_3_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_mvlgamma_mvlgamma_p_5_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_mvlgamma_mvlgamma_p_5_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_mvlgamma_mvlgamma_p_5_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_mvlgamma_mvlgamma_p_5_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_mvlgamma_mvlgamma_p_5_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_mvlgamma_mvlgamma_p_5_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_mvlgamma_mvlgamma_p_5_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_mvlgamma_mvlgamma_p_5_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nan_to_num_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nan_to_num_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nan_to_num_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nan_to_num_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nan_to_num_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nan_to_num_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nan_to_num_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nan_to_num_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nan_to_num_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nan_to_num_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_neg_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_neg_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_neg_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_neg_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_neg_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_neg_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_neg_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_neg_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_neg_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_neg_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_neg_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_neg_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_celu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_celu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_celu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_celu_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_elu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_elu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_elu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_elu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_hardshrink_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_hardshrink_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_hardshrink_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_hardshrink_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_hardsigmoid_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_hardsigmoid_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_hardsigmoid_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_hardsigmoid_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_hardtanh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_hardtanh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_hardtanh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_hardtanh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_hardtanh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_hardtanh_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_hardtanh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_hardtanh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_logsigmoid_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_logsigmoid_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_logsigmoid_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_logsigmoid_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_mish_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_mish_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_mish_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_mish_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_prelu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_prelu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_prelu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_prelu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_relu6_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_relu6_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_relu6_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_relu6_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_relu6_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_relu6_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_relu6_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_relu6_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_relu6_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_relu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_relu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_relu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_relu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_relu_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_relu_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_relu_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_relu_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_relu_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_rrelu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_rrelu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_rrelu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_rrelu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_selu_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_selu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_selu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_selu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_silu_complex_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_silu_complex_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_silu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_silu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_silu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_silu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_softplus_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_softplus_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_softplus_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_softplus_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_softshrink_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_softshrink_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_softshrink_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_softshrink_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_softsign_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_softsign_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_softsign_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_softsign_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_softsign_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_softsign_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_softsign_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_softsign_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_softsign_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_softsign_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_softsign_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_softsign_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_tanhshrink_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_tanhshrink_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_tanhshrink_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_tanhshrink_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_tanhshrink_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_tanhshrink_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_tanhshrink_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_tanhshrink_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_tanhshrink_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_tanhshrink_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_tanhshrink_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_threshold_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_threshold_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_threshold_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_threshold_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_threshold_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_threshold_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_threshold_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_threshold_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_nn_functional_threshold_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_polygamma_polygamma_n_0_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_polygamma_polygamma_n_0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_polygamma_polygamma_n_0_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_polygamma_polygamma_n_0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_polygamma_polygamma_n_0_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_polygamma_polygamma_n_0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_polygamma_polygamma_n_0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_polygamma_polygamma_n_0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_polygamma_polygamma_n_0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_polygamma_polygamma_n_0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_polygamma_polygamma_n_1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_polygamma_polygamma_n_1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_polygamma_polygamma_n_1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_polygamma_polygamma_n_1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_polygamma_polygamma_n_1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_polygamma_polygamma_n_1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_polygamma_polygamma_n_1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_polygamma_polygamma_n_1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_polygamma_polygamma_n_1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_polygamma_polygamma_n_1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_polygamma_polygamma_n_2_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_polygamma_polygamma_n_2_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_polygamma_polygamma_n_2_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_polygamma_polygamma_n_2_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_polygamma_polygamma_n_2_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_polygamma_polygamma_n_2_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_polygamma_polygamma_n_2_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_polygamma_polygamma_n_2_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_polygamma_polygamma_n_2_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_polygamma_polygamma_n_2_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_polygamma_polygamma_n_3_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_polygamma_polygamma_n_3_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_polygamma_polygamma_n_3_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_polygamma_polygamma_n_3_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_polygamma_polygamma_n_3_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_polygamma_polygamma_n_3_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_polygamma_polygamma_n_3_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_polygamma_polygamma_n_3_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_polygamma_polygamma_n_3_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_polygamma_polygamma_n_3_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_polygamma_polygamma_n_4_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_polygamma_polygamma_n_4_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_polygamma_polygamma_n_4_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_polygamma_polygamma_n_4_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_polygamma_polygamma_n_4_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_polygamma_polygamma_n_4_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_polygamma_polygamma_n_4_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_polygamma_polygamma_n_4_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_polygamma_polygamma_n_4_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_polygamma_polygamma_n_4_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_positive_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_positive_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_positive_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_positive_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_positive_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_positive_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_positive_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_positive_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_positive_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_positive_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_positive_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_positive_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_rad2deg_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_rad2deg_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_rad2deg_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_rad2deg_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_rad2deg_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_rad2deg_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_rad2deg_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_rad2deg_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_rad2deg_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_rad2deg_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_real_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_real_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_real_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_real_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_real_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_real_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_real_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_real_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_real_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_real_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_real_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_real_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_real_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_reciprocal_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_reciprocal_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_reciprocal_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_reciprocal_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_reciprocal_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_reciprocal_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_reciprocal_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_reciprocal_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_reciprocal_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_reciprocal_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_reciprocal_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_reciprocal_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_round_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_round_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_round_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_round_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_round_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_round_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_round_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_round_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_round_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_round_decimals_0_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_round_decimals_0_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_round_decimals_0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_round_decimals_0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_round_decimals_3_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_round_decimals_3_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_round_decimals_3_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_round_decimals_3_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_round_decimals_neg_3_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_round_decimals_neg_3_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_round_decimals_neg_3_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_round_decimals_neg_3_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_rsqrt_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_rsqrt_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_rsqrt_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_rsqrt_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_rsqrt_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_rsqrt_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_rsqrt_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_rsqrt_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_rsqrt_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_rsqrt_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_rsqrt_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_rsqrt_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_rsqrt_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sgn_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sgn_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sgn_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sgn_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sgn_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sgn_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sgn_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sgn_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sgn_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sgn_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sgn_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sgn_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sgn_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_short_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_short_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_short_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_short_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_short_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_short_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_short_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_short_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_short_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_short_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_short_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_short_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sigmoid_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sigmoid_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sigmoid_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sigmoid_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sigmoid_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sigmoid_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sigmoid_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sigmoid_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sigmoid_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sigmoid_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sigmoid_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sigmoid_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sigmoid_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sign_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sign_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sign_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sign_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sign_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sign_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sign_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sign_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sign_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sign_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_signbit_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_signbit_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_signbit_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_signbit_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_signbit_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_signbit_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_signbit_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_signbit_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_signbit_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_signbit_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sin_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sin_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sin_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sin_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sin_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sin_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sin_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sin_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sin_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sin_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sin_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sin_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sin_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sinc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sinc_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sinc_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sinc_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sinc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sinc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sinc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sinc_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sinc_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sinc_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sinc_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sinc_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sinh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sinh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sinh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sinh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sinh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sinh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sinh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sinh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sinh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sinh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sinh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sinh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sinh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_airy_ai_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_airy_ai_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_airy_ai_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_airy_ai_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_airy_ai_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_airy_ai_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_airy_ai_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_airy_ai_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_bessel_j0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_bessel_j0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_bessel_j0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_bessel_j0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_bessel_j0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_bessel_j0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_bessel_j0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_bessel_j0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_bessel_j1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_bessel_j1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_bessel_j1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_bessel_j1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_bessel_j1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_bessel_j1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_bessel_j1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_bessel_j1_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_bessel_y0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_bessel_y0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_bessel_y0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_bessel_y0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_bessel_y0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_bessel_y0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_bessel_y0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_bessel_y0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_bessel_y1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_bessel_y1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_bessel_y1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_bessel_y1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_bessel_y1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_bessel_y1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_bessel_y1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_bessel_y1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_entr_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_entr_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_entr_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_entr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_entr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_entr_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_entr_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_entr_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_entr_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_entr_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_erfcx_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_erfcx_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_erfcx_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_erfcx_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_erfcx_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_erfcx_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_erfcx_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_erfcx_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_i0e_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_i0e_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_i0e_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_i0e_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_i0e_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_i0e_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_i0e_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_i0e_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_i0e_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_i0e_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_i1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_i1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_i1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_i1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_i1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_i1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_i1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_i1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_i1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_i1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_i1e_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_i1e_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_i1e_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_i1e_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_i1e_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_i1e_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_i1e_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_i1e_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_i1e_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_i1e_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_log_ndtr_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_log_ndtr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_log_ndtr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_log_ndtr_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_log_ndtr_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_log_ndtr_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_log_ndtr_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_log_ndtr_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_modified_bessel_i0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_modified_bessel_i0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_modified_bessel_i0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_modified_bessel_i0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_modified_bessel_i0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_modified_bessel_i0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_modified_bessel_i0_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_modified_bessel_i0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_modified_bessel_i1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_modified_bessel_i1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_modified_bessel_i1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_modified_bessel_i1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_modified_bessel_i1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_modified_bessel_i1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_modified_bessel_i1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_modified_bessel_i1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_modified_bessel_k0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_modified_bessel_k0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_modified_bessel_k0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_modified_bessel_k0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_modified_bessel_k0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_modified_bessel_k0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_modified_bessel_k0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_modified_bessel_k0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_modified_bessel_k1_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_modified_bessel_k1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_modified_bessel_k1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_modified_bessel_k1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_modified_bessel_k1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_modified_bessel_k1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_modified_bessel_k1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_modified_bessel_k1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_ndtr_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_ndtr_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_ndtr_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_ndtr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_ndtr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_ndtr_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_ndtr_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_ndtr_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_ndtr_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_ndtr_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_ndtri_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_ndtri_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_ndtri_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_ndtri_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_ndtri_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_ndtri_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_ndtri_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_ndtri_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_polygamma_special_polygamma_n_0_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_polygamma_special_polygamma_n_0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_polygamma_special_polygamma_n_0_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_polygamma_special_polygamma_n_0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_polygamma_special_polygamma_n_0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_polygamma_special_polygamma_n_0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_polygamma_special_polygamma_n_0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_polygamma_special_polygamma_n_0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_polygamma_special_polygamma_n_0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_scaled_modified_bessel_k0_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_scaled_modified_bessel_k0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_scaled_modified_bessel_k0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_scaled_modified_bessel_k0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_scaled_modified_bessel_k0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_scaled_modified_bessel_k0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_scaled_modified_bessel_k0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_scaled_modified_bessel_k0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_scaled_modified_bessel_k1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_scaled_modified_bessel_k1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_scaled_modified_bessel_k1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_scaled_modified_bessel_k1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_scaled_modified_bessel_k1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_scaled_modified_bessel_k1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_scaled_modified_bessel_k1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_scaled_modified_bessel_k1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_spherical_bessel_j0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_spherical_bessel_j0_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_spherical_bessel_j0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_spherical_bessel_j0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_spherical_bessel_j0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_spherical_bessel_j0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_spherical_bessel_j0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_special_spherical_bessel_j0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sqrt_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sqrt_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sqrt_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sqrt_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sqrt_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sqrt_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sqrt_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sqrt_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sqrt_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sqrt_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sqrt_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sqrt_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_sqrt_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_square_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_square_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_square_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_square_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_square_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_square_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_square_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_square_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_square_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_square_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_square_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_square_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_tan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_tan_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_tan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_tan_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_tan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_tan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_tan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_tan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_tan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_tan_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_tan_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_tan_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_tan_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_tanh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_tanh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_tanh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_tanh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_tanh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_tanh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_tanh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_tanh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_tanh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_tanh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_tanh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_tanh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_tanh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_trunc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_trunc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_trunc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_trunc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_trunc_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_trunc_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_trunc_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_trunc_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expand_trunc_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expm1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expm1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expm1_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expm1_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expm1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expm1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expm1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expm1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expm1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expm1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expm1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_expm1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_fill_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_fill_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_fill_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_fill_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_fill_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_fill_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_fill_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_fill_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_fill_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_fill_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_fill_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_fill_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_fill_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_float_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_float_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_float_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_float_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_float_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_float_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_float_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_float_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_float_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_float_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_float_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_float_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_float_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_floor_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_floor_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_floor_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_floor_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_floor_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_floor_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_floor_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_floor_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_floor_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_frac_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_frac_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_frac_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_frac_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_frexp_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_frexp_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_frexp_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_frexp_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_half_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_half_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_half_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_half_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_half_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_half_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_half_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_half_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_half_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_half_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_half_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_half_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_i0_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_i0_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_i0_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_i0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_i0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_i0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_i0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_i0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_i0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_i0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_imag_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_imag_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_imag_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_bfloat16_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_bfloat16_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_bfloat16_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_bfloat16_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_bfloat16_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_bfloat16_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_bfloat16_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_bfloat16_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_bfloat16_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_bfloat16_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_bfloat16_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_bfloat16_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_bfloat16_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_bool_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_bool_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_bool_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_bool_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_bool_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_bool_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_bool_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_bool_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_bool_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_bool_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_bool_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_bool_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_bool_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_byte_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_byte_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_byte_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_byte_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_byte_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_byte_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_byte_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_byte_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_byte_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_byte_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_byte_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_byte_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_cdouble_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_cdouble_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_cdouble_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_cdouble_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_cdouble_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_cdouble_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_cdouble_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_cdouble_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_cdouble_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_cdouble_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_cdouble_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_cdouble_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_cdouble_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_cfloat_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_cfloat_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_cfloat_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_cfloat_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_cfloat_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_cfloat_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_cfloat_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_cfloat_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_cfloat_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_cfloat_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_cfloat_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_cfloat_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_cfloat_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_chalf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_chalf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_chalf_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_chalf_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_chalf_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_chalf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_chalf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_chalf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_chalf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_chalf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_chalf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_chalf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_chalf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_char_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_char_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_char_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_char_cuda_complex32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_char_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_char_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_char_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_char_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_char_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_char_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_char_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_char_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_char_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_double_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_double_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_double_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_double_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_double_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_double_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_double_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_double_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_double_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_double_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_double_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_double_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_double_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_float_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_float_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_float_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_float_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_float_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_float_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_float_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_float_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_float_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_float_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_float_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_float_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_float_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_half_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_half_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_half_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_half_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_half_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_half_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_half_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_half_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_half_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_half_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_half_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_half_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_int_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_int_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_int_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_int_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_int_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_int_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_int_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_int_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_int_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_int_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_int_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_int_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_long_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_long_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_long_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_long_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_long_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_long_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_long_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_long_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_long_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_long_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_long_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_long_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_long_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_short_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_short_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_short_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_short_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_short_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_short_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_short_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_short_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_short_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_short_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_short_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs__conversions_short_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_abs_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_abs_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_abs_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_abs_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_abs_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_abs_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_abs_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_abs_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_abs_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_abs_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_abs_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_abs_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_abs_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_acos_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_acos_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_acos_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_acos_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_acos_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_acos_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_acos_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_acos_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_acos_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_acos_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_acos_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_acos_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_acos_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_acosh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_acosh_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_acosh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_acosh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_acosh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_acosh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_acosh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_acosh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_acosh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_acosh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_acosh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_acosh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_acosh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_asin_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_asin_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_asin_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_asin_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_asin_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_asin_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_asin_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_asin_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_asin_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_asin_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_asin_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_asin_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_asin_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_asinh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_asinh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_asinh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_asinh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_asinh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_asinh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_asinh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_asinh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_asinh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_asinh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_asinh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_asinh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_asinh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_atan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_atan_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_atan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_atan_cuda_complex32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_atan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_atan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_atan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_atan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_atan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_atan_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_atan_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_atan_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_atan_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_atanh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_atanh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_atanh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_atanh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_atanh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_atanh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_atanh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_atanh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_atanh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_atanh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_atanh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_atanh_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_atanh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_bitwise_not_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_bitwise_not_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_bitwise_not_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_bitwise_not_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_bitwise_not_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_bitwise_not_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_ceil_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_ceil_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_ceil_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_ceil_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_ceil_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_ceil_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_ceil_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_ceil_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_ceil_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_conj_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_conj_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_conj_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_conj_cuda_complex32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_conj_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_conj_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_conj_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_conj_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_conj_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_conj_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_conj_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_conj_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_conj_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_conj_physical_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_conj_physical_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_conj_physical_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_conj_physical_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_conj_physical_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_conj_physical_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_conj_physical_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_conj_physical_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_conj_physical_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_conj_physical_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_conj_physical_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_conj_physical_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_conj_physical_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_cos_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_cos_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_cos_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_cos_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_cos_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_cos_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_cos_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_cos_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_cos_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_cos_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_cos_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_cos_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_cos_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_cosh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_cosh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_cosh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_cosh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_cosh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_cosh_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_cosh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_cosh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_cosh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_cosh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_cosh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_cosh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_cosh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_deg2rad_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_deg2rad_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_deg2rad_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_deg2rad_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_deg2rad_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_deg2rad_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_deg2rad_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_deg2rad_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_deg2rad_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_deg2rad_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_digamma_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_digamma_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_digamma_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_digamma_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_digamma_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_digamma_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_digamma_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_digamma_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_digamma_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_digamma_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_erf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_erf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_erf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_erf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_erf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_erf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_erf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_erf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_erf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_erf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_erfc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_erfc_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_erfc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_erfc_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_erfc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_erfc_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_erfc_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_erfc_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_erfc_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_erfc_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_erfinv_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_erfinv_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_erfinv_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_erfinv_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_erfinv_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_erfinv_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_erfinv_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_erfinv_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_erfinv_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_erfinv_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_exp2_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_exp2_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_exp2_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_exp2_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_exp2_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_exp2_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_exp2_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_exp2_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_exp2_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_exp2_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_exp2_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_exp2_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_exp_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_exp_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_exp_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_exp_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_exp_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_exp_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_exp_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_exp_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_exp_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_exp_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_exp_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_exp_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_exp_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_expm1_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_expm1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_expm1_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_expm1_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_expm1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_expm1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_expm1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_expm1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_expm1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_expm1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_expm1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_expm1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_fill_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_fill_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_fill_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_fill_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_fill_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_fill_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_fill_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_fill_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_fill_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_fill_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_fill_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_fill_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_fill_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_floor_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_floor_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_floor_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_floor_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_floor_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_floor_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_floor_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_floor_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_floor_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_frac_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_frac_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_frac_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_frac_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_frexp_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_frexp_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_frexp_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_frexp_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_i0_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_i0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_i0_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_i0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_i0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_i0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_i0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_i0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_i0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_i0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_imag_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_imag_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_imag_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isfinite_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isfinite_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isfinite_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isfinite_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isfinite_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isfinite_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isfinite_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isfinite_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isfinite_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isfinite_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isfinite_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isfinite_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isfinite_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isinf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isinf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isinf_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isinf_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isinf_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isinf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isinf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isinf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isinf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isinf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isinf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isinf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isinf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isnan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isnan_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isnan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isnan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isnan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isnan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isnan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isnan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isnan_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isnan_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isnan_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isnan_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isneginf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isneginf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isneginf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isneginf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isneginf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isneginf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isneginf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isneginf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isneginf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isneginf_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isposinf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isposinf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isposinf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isposinf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isposinf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isposinf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isposinf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isposinf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isposinf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isposinf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isreal_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isreal_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isreal_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isreal_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isreal_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isreal_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isreal_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isreal_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isreal_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isreal_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isreal_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isreal_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_isreal_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_lgamma_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_lgamma_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_lgamma_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_lgamma_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_lgamma_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_lgamma_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_lgamma_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_lgamma_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_lgamma_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_lgamma_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_log10_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_log10_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_log10_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_log10_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_log10_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_log10_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_log10_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_log10_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_log10_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_log10_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_log10_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_log10_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_log1p_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_log1p_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_log1p_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_log1p_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_log1p_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_log1p_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_log1p_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_log1p_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_log1p_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_log1p_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_log1p_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_log1p_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_log2_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_log2_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_log2_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_log2_cuda_complex64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_log2_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_log2_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_log2_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_log2_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_log2_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_log2_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_log2_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_log2_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_log_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_log_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_log_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_log_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_log_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_log_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_log_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_log_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_log_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_log_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_log_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_log_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_log_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_logical_not_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_logical_not_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_logical_not_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_logical_not_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_logical_not_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_logical_not_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_logical_not_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_logical_not_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_logical_not_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_logical_not_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_logical_not_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_logical_not_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nan_to_num_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nan_to_num_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nan_to_num_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nan_to_num_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nan_to_num_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nan_to_num_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nan_to_num_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nan_to_num_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nan_to_num_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nan_to_num_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_neg_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_neg_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_neg_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_neg_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_neg_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_neg_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_neg_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_neg_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_neg_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_neg_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_neg_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_neg_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_celu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_celu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_celu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_celu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_elu_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_elu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_elu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_elu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_hardshrink_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_hardshrink_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_hardshrink_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_hardshrink_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_hardtanh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_hardtanh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_hardtanh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_hardtanh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_hardtanh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_hardtanh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_hardtanh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_hardtanh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_mish_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_mish_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_mish_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_mish_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_prelu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_prelu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_prelu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_prelu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_relu6_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_relu6_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_relu6_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_relu6_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_relu6_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_relu6_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_relu6_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_relu6_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_relu6_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_relu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_relu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_relu_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_relu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_relu_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_relu_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_relu_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_relu_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_relu_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_selu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_selu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_selu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_selu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_softplus_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_softplus_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_softplus_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_softplus_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_softshrink_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_softshrink_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_softshrink_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_softshrink_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_tanhshrink_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_tanhshrink_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_tanhshrink_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_tanhshrink_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_tanhshrink_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_tanhshrink_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_tanhshrink_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_tanhshrink_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_tanhshrink_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_tanhshrink_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_tanhshrink_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_threshold_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_threshold_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_threshold_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_threshold_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_threshold_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_threshold_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_threshold_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_threshold_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_nn_functional_threshold_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_positive_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_positive_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_positive_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_positive_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_positive_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_positive_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_positive_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_positive_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_positive_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_positive_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_positive_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_positive_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_rad2deg_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_rad2deg_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_rad2deg_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_rad2deg_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_rad2deg_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_rad2deg_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_rad2deg_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_rad2deg_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_rad2deg_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_rad2deg_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_real_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_real_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_real_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_real_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_real_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_real_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_real_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_real_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_real_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_real_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_real_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_real_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_real_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_reciprocal_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_reciprocal_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_reciprocal_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_reciprocal_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_reciprocal_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_reciprocal_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_reciprocal_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_reciprocal_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_reciprocal_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_reciprocal_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_reciprocal_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_reciprocal_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_round_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_round_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_round_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_round_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_round_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_round_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_round_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_round_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_round_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_rsqrt_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_rsqrt_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_rsqrt_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_rsqrt_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_rsqrt_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_rsqrt_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_rsqrt_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_rsqrt_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_rsqrt_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_rsqrt_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_rsqrt_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_rsqrt_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_rsqrt_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sgn_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sgn_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sgn_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sgn_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sgn_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sgn_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sgn_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sgn_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sgn_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sgn_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sgn_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sgn_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sgn_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sigmoid_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sigmoid_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sigmoid_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sigmoid_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sigmoid_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sigmoid_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sigmoid_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sigmoid_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sigmoid_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sigmoid_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sigmoid_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sigmoid_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sigmoid_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sign_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sign_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sign_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sign_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sign_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sign_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sign_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sign_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sign_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sign_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_signbit_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_signbit_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_signbit_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_signbit_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_signbit_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_signbit_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_signbit_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_signbit_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_signbit_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_signbit_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sin_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sin_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sin_cuda_complex128, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sin_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sin_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sin_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sin_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sin_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sin_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sin_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sin_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sin_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sin_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sinc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sinc_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sinc_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sinc_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sinc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sinc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sinc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sinc_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sinc_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sinc_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sinc_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sinc_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sinh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sinh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sinh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sinh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sinh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sinh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sinh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sinh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sinh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sinh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sinh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sinh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sinh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_bessel_j0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_bessel_j0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_bessel_j0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_bessel_j0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_bessel_j0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_bessel_j0_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_bessel_j0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_bessel_j0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_bessel_j1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_bessel_j1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_bessel_j1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_bessel_j1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_bessel_j1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_bessel_j1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_bessel_j1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_bessel_j1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_entr_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_entr_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_entr_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_entr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_entr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_entr_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_entr_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_entr_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_entr_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_entr_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_erfcx_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_erfcx_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_erfcx_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_erfcx_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_erfcx_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_erfcx_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_erfcx_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_erfcx_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_i0e_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_i0e_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_i0e_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_i0e_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_i0e_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_i0e_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_i0e_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_i0e_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_i0e_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_i0e_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_i1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_i1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_i1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_i1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_i1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_i1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_i1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_i1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_i1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_i1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_i1e_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_i1e_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_i1e_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_i1e_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_i1e_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_i1e_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_i1e_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_i1e_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_i1e_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_i1e_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_log_ndtr_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_log_ndtr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_log_ndtr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_log_ndtr_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_log_ndtr_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_log_ndtr_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_log_ndtr_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_log_ndtr_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_logit_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_logit_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_logit_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_logit_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_logit_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_logit_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_logit_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_logit_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_logit_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_logit_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_multigammaln_mvlgamma_p_1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_multigammaln_mvlgamma_p_1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_multigammaln_mvlgamma_p_1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_multigammaln_mvlgamma_p_1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_multigammaln_mvlgamma_p_1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_multigammaln_mvlgamma_p_1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_multigammaln_mvlgamma_p_1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_multigammaln_mvlgamma_p_1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_multigammaln_mvlgamma_p_1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_multigammaln_mvlgamma_p_3_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_multigammaln_mvlgamma_p_3_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_multigammaln_mvlgamma_p_3_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_multigammaln_mvlgamma_p_3_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_multigammaln_mvlgamma_p_3_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_multigammaln_mvlgamma_p_3_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_multigammaln_mvlgamma_p_3_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_multigammaln_mvlgamma_p_3_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_multigammaln_mvlgamma_p_3_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_multigammaln_mvlgamma_p_5_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_multigammaln_mvlgamma_p_5_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_multigammaln_mvlgamma_p_5_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_multigammaln_mvlgamma_p_5_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_multigammaln_mvlgamma_p_5_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_multigammaln_mvlgamma_p_5_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_multigammaln_mvlgamma_p_5_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_multigammaln_mvlgamma_p_5_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_multigammaln_mvlgamma_p_5_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_ndtr_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_ndtr_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_ndtr_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_ndtr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_ndtr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_ndtr_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_ndtr_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_ndtr_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_ndtr_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_ndtr_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_ndtri_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_ndtri_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_ndtri_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_ndtri_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_ndtri_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_ndtri_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_ndtri_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_ndtri_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_spherical_bessel_j0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_spherical_bessel_j0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_spherical_bessel_j0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_spherical_bessel_j0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_spherical_bessel_j0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_spherical_bessel_j0_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_spherical_bessel_j0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_special_spherical_bessel_j0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sqrt_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sqrt_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sqrt_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sqrt_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sqrt_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sqrt_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sqrt_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sqrt_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sqrt_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sqrt_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sqrt_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sqrt_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_sqrt_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_square_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_square_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_square_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_square_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_square_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_square_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_square_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_square_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_square_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_square_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_square_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_square_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_tan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_tan_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_tan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_tan_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_tan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_tan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_tan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_tan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_tan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_tan_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_tan_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_tan_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_tan_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_tanh_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_tanh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_tanh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_tanh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_tanh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_tanh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_tanh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_tanh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_tanh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_tanh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_tanh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_tanh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_tanh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_trunc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_trunc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_trunc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_trunc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_trunc_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_trunc_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_trunc_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_trunc_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index__refs_trunc_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_abs_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_abs_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_abs_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_abs_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_abs_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_abs_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_abs_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_abs_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_abs_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_abs_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_abs_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_abs_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_abs_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_acos_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_acos_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_acos_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_acos_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_acos_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_acos_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_acos_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_acos_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_acos_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_acos_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_acos_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_acos_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_acos_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_acosh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_acosh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_acosh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_acosh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_acosh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_acosh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_acosh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_acosh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_acosh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_acosh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_acosh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_acosh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_acosh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_angle_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_angle_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_angle_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_angle_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_angle_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_angle_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_angle_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_angle_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_angle_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_angle_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_angle_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_asin_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_asin_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_asin_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_asin_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_asin_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_asin_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_asin_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_asin_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_asin_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_asin_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_asin_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_asin_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_asin_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_asinh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_asinh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_asinh_cuda_complex128, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_asinh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_asinh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_asinh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_asinh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_asinh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_asinh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_asinh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_asinh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_asinh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_asinh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_atan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_atan_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_atan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_atan_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_atan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_atan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_atan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_atan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_atan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_atan_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_atan_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_atan_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_atan_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_atanh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_atanh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_atanh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_atanh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_atanh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_atanh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_atanh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_atanh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_atanh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_atanh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_atanh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_atanh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_atanh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_bfloat16_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_bfloat16_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_bfloat16_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_bfloat16_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_bfloat16_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_bfloat16_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_bfloat16_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_bfloat16_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_bfloat16_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_bfloat16_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_bfloat16_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_bfloat16_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_bfloat16_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_bitwise_not_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_bitwise_not_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_bitwise_not_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_bitwise_not_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_bitwise_not_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_bitwise_not_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_bool_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_bool_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_bool_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_bool_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_bool_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_bool_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_bool_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_bool_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_bool_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_bool_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_bool_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_bool_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_bool_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_byte_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_byte_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_byte_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_byte_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_byte_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_byte_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_byte_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_byte_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_byte_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_byte_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_byte_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_byte_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_cdouble_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_cdouble_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_cdouble_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_cdouble_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_cdouble_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_cdouble_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_cdouble_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_cdouble_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_cdouble_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_cdouble_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_cdouble_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_cdouble_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_cdouble_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_ceil_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_ceil_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_ceil_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_ceil_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_ceil_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_ceil_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_ceil_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_ceil_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_ceil_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_cfloat_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_cfloat_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_cfloat_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_cfloat_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_cfloat_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_cfloat_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_cfloat_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_cfloat_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_cfloat_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_cfloat_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_cfloat_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_cfloat_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_cfloat_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_chalf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_chalf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_chalf_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_chalf_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_chalf_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_chalf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_chalf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_chalf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_chalf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_chalf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_chalf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_chalf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_chalf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_char_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_char_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_char_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_char_cuda_complex32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_char_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_char_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_char_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_char_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_char_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_char_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_char_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_char_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_char_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_conj_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_conj_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_conj_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_conj_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_conj_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_conj_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_conj_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_conj_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_conj_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_conj_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_conj_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_conj_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_conj_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_conj_physical_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_conj_physical_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_conj_physical_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_conj_physical_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_conj_physical_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_conj_physical_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_conj_physical_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_conj_physical_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_conj_physical_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_conj_physical_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_conj_physical_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_conj_physical_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_conj_physical_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_cos_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_cos_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_cos_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_cos_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_cos_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_cos_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_cos_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_cos_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_cos_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_cos_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_cos_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_cos_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_cos_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_cosh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_cosh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_cosh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_cosh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_cosh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_cosh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_cosh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_cosh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_cosh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_cosh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_cosh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_cosh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_cosh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_deg2rad_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_deg2rad_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_deg2rad_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_deg2rad_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_deg2rad_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_deg2rad_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_deg2rad_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_deg2rad_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_deg2rad_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_deg2rad_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_digamma_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_digamma_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_digamma_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_digamma_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_digamma_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_digamma_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_digamma_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_digamma_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_digamma_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_digamma_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_double_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_double_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_double_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_double_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_double_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_double_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_double_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_double_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_double_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_double_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_double_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_double_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_double_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_erf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_erf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_erf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_erf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_erf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_erf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_erf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_erf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_erf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_erf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_erfc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_erfc_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_erfc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_erfc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_erfc_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_erfc_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_erfc_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_erfc_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_erfc_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_erfc_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_erfinv_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_erfinv_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_erfinv_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_erfinv_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_erfinv_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_erfinv_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_erfinv_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_erfinv_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_erfinv_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_erfinv_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_exp2_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_exp2_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_exp2_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_exp2_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_exp2_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_exp2_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_exp2_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_exp2_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_exp2_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_exp2_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_exp2_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_exp2_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_exp_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_exp_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_exp_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_exp_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_exp_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_exp_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_exp_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_exp_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_exp_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_exp_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_exp_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_exp_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_exp_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_expm1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_expm1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_expm1_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_expm1_cuda_complex64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_expm1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_expm1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_expm1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_expm1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_expm1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_expm1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_expm1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_expm1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_fill_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_fill_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_fill_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_fill_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_fill_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_fill_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_fill_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_fill_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_fill_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_fill_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_fill_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_fill_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_fill_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_float_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_float_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_float_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_float_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_float_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_float_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_float_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_float_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_float_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_float_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_float_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_float_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_float_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_floor_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_floor_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_floor_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_floor_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_floor_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_floor_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_floor_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_floor_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_floor_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_frac_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_frac_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_frac_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_frac_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_frexp_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_frexp_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_frexp_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_frexp_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_half_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_half_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_half_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_half_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_half_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_half_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_half_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_half_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_half_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_half_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_half_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_half_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_i0_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_i0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_i0_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_i0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_i0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_i0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_i0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_i0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_i0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_i0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_imag_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_imag_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_imag_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_int_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_int_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_int_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_int_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_int_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_int_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_int_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_int_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_int_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_int_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_int_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_int_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isfinite_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isfinite_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isfinite_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isfinite_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isfinite_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isfinite_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isfinite_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isfinite_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isfinite_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isfinite_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isfinite_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isfinite_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isfinite_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isinf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isinf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isinf_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isinf_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isinf_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isinf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isinf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isinf_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isinf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isinf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isinf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isinf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isinf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isnan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isnan_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isnan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isnan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isnan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isnan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isnan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isnan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isnan_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isnan_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isnan_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isnan_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isneginf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isneginf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isneginf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isneginf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isneginf_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isneginf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isneginf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isneginf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isneginf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isneginf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isposinf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isposinf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isposinf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isposinf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isposinf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isposinf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isposinf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isposinf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isposinf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isposinf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isreal_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isreal_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isreal_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isreal_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isreal_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isreal_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isreal_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isreal_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isreal_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isreal_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isreal_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isreal_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_isreal_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_jiterator_unary_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_jiterator_unary_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_jiterator_unary_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_jiterator_unary_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_jiterator_unary_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_jiterator_unary_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_jiterator_unary_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_jiterator_unary_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_jiterator_unary_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_jiterator_unary_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_jiterator_unary_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_jiterator_unary_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_lgamma_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_lgamma_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_lgamma_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_lgamma_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_lgamma_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_lgamma_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_lgamma_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_lgamma_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_lgamma_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_lgamma_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_log10_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_log10_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_log10_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_log10_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_log10_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_log10_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_log10_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_log10_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_log10_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_log10_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_log10_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_log10_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_log1p_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_log1p_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_log1p_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_log1p_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_log1p_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_log1p_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_log1p_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_log1p_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_log1p_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_log1p_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_log1p_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_log1p_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_log2_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_log2_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_log2_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_log2_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_log2_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_log2_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_log2_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_log2_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_log2_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_log2_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_log2_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_log2_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_log_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_log_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_log_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_log_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_log_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_log_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_log_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_log_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_log_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_log_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_log_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_log_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_log_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_logical_not_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_logical_not_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_logical_not_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_logical_not_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_logical_not_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_logical_not_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_logical_not_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_logical_not_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_logical_not_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_logical_not_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_logical_not_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_logical_not_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_logit_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_logit_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_logit_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_logit_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_logit_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_logit_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_logit_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_logit_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_logit_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_logit_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_long_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_long_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_long_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_long_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_long_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_long_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_long_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_long_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_long_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_long_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_long_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_long_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_long_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_mvlgamma_mvlgamma_p_1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_mvlgamma_mvlgamma_p_1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_mvlgamma_mvlgamma_p_1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_mvlgamma_mvlgamma_p_1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_mvlgamma_mvlgamma_p_1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_mvlgamma_mvlgamma_p_1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_mvlgamma_mvlgamma_p_1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_mvlgamma_mvlgamma_p_1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_mvlgamma_mvlgamma_p_3_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_mvlgamma_mvlgamma_p_3_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_mvlgamma_mvlgamma_p_3_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_mvlgamma_mvlgamma_p_3_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_mvlgamma_mvlgamma_p_3_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_mvlgamma_mvlgamma_p_3_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_mvlgamma_mvlgamma_p_3_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_mvlgamma_mvlgamma_p_3_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_mvlgamma_mvlgamma_p_5_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_mvlgamma_mvlgamma_p_5_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_mvlgamma_mvlgamma_p_5_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_mvlgamma_mvlgamma_p_5_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_mvlgamma_mvlgamma_p_5_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_mvlgamma_mvlgamma_p_5_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_mvlgamma_mvlgamma_p_5_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_mvlgamma_mvlgamma_p_5_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nan_to_num_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nan_to_num_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nan_to_num_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nan_to_num_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nan_to_num_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nan_to_num_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nan_to_num_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nan_to_num_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nan_to_num_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nan_to_num_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_neg_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_neg_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_neg_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_neg_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_neg_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_neg_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_neg_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_neg_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_neg_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_neg_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_neg_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_neg_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_celu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_celu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_celu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_celu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_elu_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_elu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_elu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_elu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_hardshrink_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_hardshrink_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_hardshrink_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_hardshrink_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_hardsigmoid_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_hardsigmoid_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_hardsigmoid_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_hardsigmoid_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_hardtanh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_hardtanh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_hardtanh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_hardtanh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_hardtanh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_hardtanh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_hardtanh_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_hardtanh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_logsigmoid_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_logsigmoid_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_logsigmoid_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_logsigmoid_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_mish_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_mish_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_mish_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_mish_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_prelu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_prelu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_prelu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_prelu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_relu6_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_relu6_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_relu6_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_relu6_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_relu6_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_relu6_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_relu6_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_relu6_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_relu6_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_relu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_relu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_relu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_relu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_relu_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_relu_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_relu_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_relu_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_relu_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_rrelu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_rrelu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_rrelu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_rrelu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_selu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_selu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_selu_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_selu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_silu_complex_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_silu_complex_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_silu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_silu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_silu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_silu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_softplus_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_softplus_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_softplus_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_softplus_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_softshrink_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_softshrink_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_softshrink_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_softshrink_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_softsign_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_softsign_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_softsign_cuda_complex128, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_softsign_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_softsign_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_softsign_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_softsign_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_softsign_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_softsign_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_softsign_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_softsign_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_softsign_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_tanhshrink_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_tanhshrink_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_tanhshrink_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_tanhshrink_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_tanhshrink_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_tanhshrink_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_tanhshrink_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_tanhshrink_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_tanhshrink_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_tanhshrink_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_tanhshrink_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_threshold_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_threshold_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_threshold_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_threshold_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_threshold_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_threshold_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_threshold_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_threshold_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_nn_functional_threshold_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_polygamma_polygamma_n_0_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_polygamma_polygamma_n_0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_polygamma_polygamma_n_0_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_polygamma_polygamma_n_0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_polygamma_polygamma_n_0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_polygamma_polygamma_n_0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_polygamma_polygamma_n_0_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_polygamma_polygamma_n_0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_polygamma_polygamma_n_0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_polygamma_polygamma_n_0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_polygamma_polygamma_n_1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_polygamma_polygamma_n_1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_polygamma_polygamma_n_1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_polygamma_polygamma_n_1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_polygamma_polygamma_n_1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_polygamma_polygamma_n_1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_polygamma_polygamma_n_1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_polygamma_polygamma_n_1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_polygamma_polygamma_n_1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_polygamma_polygamma_n_1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_polygamma_polygamma_n_2_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_polygamma_polygamma_n_2_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_polygamma_polygamma_n_2_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_polygamma_polygamma_n_2_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_polygamma_polygamma_n_2_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_polygamma_polygamma_n_2_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_polygamma_polygamma_n_2_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_polygamma_polygamma_n_2_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_polygamma_polygamma_n_2_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_polygamma_polygamma_n_2_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_polygamma_polygamma_n_3_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_polygamma_polygamma_n_3_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_polygamma_polygamma_n_3_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_polygamma_polygamma_n_3_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_polygamma_polygamma_n_3_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_polygamma_polygamma_n_3_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_polygamma_polygamma_n_3_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_polygamma_polygamma_n_3_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_polygamma_polygamma_n_3_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_polygamma_polygamma_n_3_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_polygamma_polygamma_n_4_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_polygamma_polygamma_n_4_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_polygamma_polygamma_n_4_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_polygamma_polygamma_n_4_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_polygamma_polygamma_n_4_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_polygamma_polygamma_n_4_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_polygamma_polygamma_n_4_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_polygamma_polygamma_n_4_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_polygamma_polygamma_n_4_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_polygamma_polygamma_n_4_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_positive_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_positive_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_positive_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_positive_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_positive_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_positive_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_positive_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_positive_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_positive_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_positive_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_positive_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_positive_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_rad2deg_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_rad2deg_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_rad2deg_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_rad2deg_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_rad2deg_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_rad2deg_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_rad2deg_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_rad2deg_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_rad2deg_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_rad2deg_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_real_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_real_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_real_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_real_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_real_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_real_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_real_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_real_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_real_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_real_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_real_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_real_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_real_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_reciprocal_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_reciprocal_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_reciprocal_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_reciprocal_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_reciprocal_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_reciprocal_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_reciprocal_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_reciprocal_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_reciprocal_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_reciprocal_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_reciprocal_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_reciprocal_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_round_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_round_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_round_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_round_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_round_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_round_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_round_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_round_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_round_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_round_decimals_0_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_round_decimals_0_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_round_decimals_0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_round_decimals_0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_round_decimals_3_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_round_decimals_3_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_round_decimals_3_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_round_decimals_3_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_round_decimals_neg_3_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_round_decimals_neg_3_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_round_decimals_neg_3_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_round_decimals_neg_3_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_rsqrt_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_rsqrt_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_rsqrt_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_rsqrt_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_rsqrt_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_rsqrt_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_rsqrt_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_rsqrt_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_rsqrt_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_rsqrt_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_rsqrt_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_rsqrt_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_rsqrt_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sgn_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sgn_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sgn_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sgn_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sgn_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sgn_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sgn_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sgn_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sgn_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sgn_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sgn_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sgn_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sgn_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_short_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_short_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_short_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_short_cuda_complex64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_short_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_short_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_short_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_short_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_short_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_short_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_short_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_short_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sigmoid_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sigmoid_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sigmoid_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sigmoid_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sigmoid_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sigmoid_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sigmoid_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sigmoid_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sigmoid_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sigmoid_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sigmoid_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sigmoid_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sigmoid_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sign_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sign_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sign_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sign_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sign_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sign_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sign_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sign_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sign_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sign_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_signbit_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_signbit_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_signbit_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_signbit_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_signbit_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_signbit_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_signbit_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_signbit_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_signbit_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_signbit_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sin_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sin_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sin_cuda_complex128, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sin_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sin_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sin_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sin_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sin_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sin_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sin_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sin_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sin_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sin_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sinc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sinc_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sinc_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sinc_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sinc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sinc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sinc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sinc_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sinc_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sinc_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sinc_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sinc_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sinh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sinh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sinh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sinh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sinh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sinh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sinh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sinh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sinh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sinh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sinh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sinh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sinh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_airy_ai_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_airy_ai_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_airy_ai_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_airy_ai_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_airy_ai_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_airy_ai_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_airy_ai_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_airy_ai_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_bessel_j0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_bessel_j0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_bessel_j0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_bessel_j0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_bessel_j0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_bessel_j0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_bessel_j0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_bessel_j0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_bessel_j1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_bessel_j1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_bessel_j1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_bessel_j1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_bessel_j1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_bessel_j1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_bessel_j1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_bessel_j1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_bessel_y0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_bessel_y0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_bessel_y0_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_bessel_y0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_bessel_y0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_bessel_y0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_bessel_y0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_bessel_y0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_bessel_y1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_bessel_y1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_bessel_y1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_bessel_y1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_bessel_y1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_bessel_y1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_bessel_y1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_bessel_y1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_entr_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_entr_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_entr_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_entr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_entr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_entr_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_entr_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_entr_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_entr_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_entr_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_erfcx_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_erfcx_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_erfcx_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_erfcx_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_erfcx_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_erfcx_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_erfcx_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_erfcx_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_i0e_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_i0e_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_i0e_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_i0e_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_i0e_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_i0e_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_i0e_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_i0e_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_i0e_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_i0e_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_i1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_i1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_i1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_i1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_i1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_i1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_i1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_i1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_i1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_i1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_i1e_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_i1e_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_i1e_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_i1e_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_i1e_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_i1e_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_i1e_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_i1e_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_i1e_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_i1e_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_log_ndtr_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_log_ndtr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_log_ndtr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_log_ndtr_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_log_ndtr_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_log_ndtr_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_log_ndtr_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_log_ndtr_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_modified_bessel_i0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_modified_bessel_i0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_modified_bessel_i0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_modified_bessel_i0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_modified_bessel_i0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_modified_bessel_i0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_modified_bessel_i0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_modified_bessel_i0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_modified_bessel_i1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_modified_bessel_i1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_modified_bessel_i1_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_modified_bessel_i1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_modified_bessel_i1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_modified_bessel_i1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_modified_bessel_i1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_modified_bessel_i1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_modified_bessel_k0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_modified_bessel_k0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_modified_bessel_k0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_modified_bessel_k0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_modified_bessel_k0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_modified_bessel_k0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_modified_bessel_k0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_modified_bessel_k0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_modified_bessel_k1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_modified_bessel_k1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_modified_bessel_k1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_modified_bessel_k1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_modified_bessel_k1_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_modified_bessel_k1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_modified_bessel_k1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_modified_bessel_k1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_ndtr_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_ndtr_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_ndtr_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_ndtr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_ndtr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_ndtr_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_ndtr_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_ndtr_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_ndtr_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_ndtr_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_ndtri_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_ndtri_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_ndtri_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_ndtri_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_ndtri_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_ndtri_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_ndtri_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_ndtri_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_polygamma_special_polygamma_n_0_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_polygamma_special_polygamma_n_0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_polygamma_special_polygamma_n_0_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_polygamma_special_polygamma_n_0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_polygamma_special_polygamma_n_0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_polygamma_special_polygamma_n_0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_polygamma_special_polygamma_n_0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_polygamma_special_polygamma_n_0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_polygamma_special_polygamma_n_0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_scaled_modified_bessel_k0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_scaled_modified_bessel_k0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_scaled_modified_bessel_k0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_scaled_modified_bessel_k0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_scaled_modified_bessel_k0_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_scaled_modified_bessel_k0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_scaled_modified_bessel_k0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_scaled_modified_bessel_k0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_scaled_modified_bessel_k1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_scaled_modified_bessel_k1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_scaled_modified_bessel_k1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_scaled_modified_bessel_k1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_scaled_modified_bessel_k1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_scaled_modified_bessel_k1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_scaled_modified_bessel_k1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_scaled_modified_bessel_k1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_spherical_bessel_j0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_spherical_bessel_j0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_spherical_bessel_j0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_spherical_bessel_j0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_spherical_bessel_j0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_spherical_bessel_j0_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_spherical_bessel_j0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_special_spherical_bessel_j0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sqrt_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sqrt_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sqrt_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sqrt_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sqrt_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sqrt_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sqrt_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sqrt_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sqrt_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sqrt_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sqrt_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sqrt_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_sqrt_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_square_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_square_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_square_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_square_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_square_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_square_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_square_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_square_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_square_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_square_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_square_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_square_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_tan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_tan_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_tan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_tan_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_tan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_tan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_tan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_tan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_tan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_tan_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_tan_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_tan_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_tan_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_tanh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_tanh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_tanh_cuda_complex128, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_tanh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_tanh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_tanh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_tanh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_tanh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_tanh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_tanh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_tanh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_tanh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_tanh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_trunc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_trunc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_trunc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_trunc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_trunc_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_trunc_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_trunc_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_trunc_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_index_trunc_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_int_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_int_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_int_cuda_complex128, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_int_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_int_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_int_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_int_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_int_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_int_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_int_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_int_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_int_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isfinite_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isfinite_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isfinite_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isfinite_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isfinite_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isfinite_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isfinite_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isfinite_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isfinite_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isfinite_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isfinite_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isfinite_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isfinite_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isinf_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isinf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isinf_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isinf_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isinf_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isinf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isinf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isinf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isinf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isinf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isinf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isinf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isinf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isnan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isnan_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isnan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isnan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isnan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isnan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isnan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isnan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isnan_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isnan_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isnan_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isnan_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isneginf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isneginf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isneginf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isneginf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isneginf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isneginf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isneginf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isneginf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isneginf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isneginf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isposinf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isposinf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isposinf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isposinf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isposinf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isposinf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isposinf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isposinf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isposinf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isposinf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isreal_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isreal_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isreal_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isreal_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isreal_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isreal_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isreal_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isreal_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isreal_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isreal_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isreal_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isreal_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_isreal_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_jiterator_unary_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_jiterator_unary_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_jiterator_unary_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_jiterator_unary_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_jiterator_unary_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_jiterator_unary_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_jiterator_unary_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_jiterator_unary_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_jiterator_unary_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_jiterator_unary_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_jiterator_unary_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_jiterator_unary_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_lgamma_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_lgamma_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_lgamma_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_lgamma_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_lgamma_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_lgamma_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_lgamma_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_lgamma_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_lgamma_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_lgamma_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_log10_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_log10_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_log10_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_log10_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_log10_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_log10_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_log10_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_log10_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_log10_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_log10_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_log10_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_log10_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_log1p_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_log1p_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_log1p_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_log1p_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_log1p_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_log1p_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_log1p_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_log1p_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_log1p_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_log1p_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_log1p_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_log1p_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_log2_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_log2_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_log2_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_log2_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_log2_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_log2_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_log2_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_log2_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_log2_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_log2_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_log2_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_log2_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_log_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_log_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_log_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_log_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_log_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_log_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_log_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_log_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_log_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_log_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_log_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_log_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_log_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_logical_not_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_logical_not_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_logical_not_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_logical_not_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_logical_not_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_logical_not_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_logical_not_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_logical_not_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_logical_not_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_logical_not_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_logical_not_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_logical_not_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_logit_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_logit_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_logit_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_logit_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_logit_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_logit_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_logit_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_logit_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_logit_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_logit_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_long_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_long_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_long_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_long_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_long_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_long_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_long_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_long_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_long_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_long_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_long_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_long_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_long_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_mvlgamma_mvlgamma_p_1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_mvlgamma_mvlgamma_p_1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_mvlgamma_mvlgamma_p_1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_mvlgamma_mvlgamma_p_1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_mvlgamma_mvlgamma_p_1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_mvlgamma_mvlgamma_p_1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_mvlgamma_mvlgamma_p_1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_mvlgamma_mvlgamma_p_1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_mvlgamma_mvlgamma_p_3_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_mvlgamma_mvlgamma_p_3_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_mvlgamma_mvlgamma_p_3_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_mvlgamma_mvlgamma_p_3_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_mvlgamma_mvlgamma_p_3_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_mvlgamma_mvlgamma_p_3_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_mvlgamma_mvlgamma_p_3_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_mvlgamma_mvlgamma_p_3_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_mvlgamma_mvlgamma_p_5_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_mvlgamma_mvlgamma_p_5_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_mvlgamma_mvlgamma_p_5_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_mvlgamma_mvlgamma_p_5_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_mvlgamma_mvlgamma_p_5_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_mvlgamma_mvlgamma_p_5_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_mvlgamma_mvlgamma_p_5_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_mvlgamma_mvlgamma_p_5_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nan_to_num_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nan_to_num_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nan_to_num_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nan_to_num_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nan_to_num_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nan_to_num_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nan_to_num_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nan_to_num_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nan_to_num_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nan_to_num_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_neg_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_neg_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_neg_cuda_complex32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_neg_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_neg_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_neg_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_neg_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_neg_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_neg_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_neg_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_neg_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_neg_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_celu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_celu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_celu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_celu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_elu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_elu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_elu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_elu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_hardshrink_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_hardshrink_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_hardshrink_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_hardshrink_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_hardsigmoid_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_hardsigmoid_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_hardsigmoid_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_hardsigmoid_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_hardtanh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_hardtanh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_hardtanh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_hardtanh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_hardtanh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_hardtanh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_hardtanh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_hardtanh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_logsigmoid_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_logsigmoid_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_logsigmoid_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_logsigmoid_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_mish_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_mish_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_mish_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_mish_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_prelu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_prelu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_prelu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_prelu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_relu6_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_relu6_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_relu6_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_relu6_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_relu6_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_relu6_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_relu6_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_relu6_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_relu6_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_relu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_relu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_relu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_relu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_relu_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_relu_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_relu_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_relu_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_relu_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_rrelu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_rrelu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_rrelu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_rrelu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_selu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_selu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_selu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_selu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_silu_complex_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_silu_complex_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_silu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_silu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_silu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_silu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_softplus_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_softplus_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_softplus_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_softplus_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_softshrink_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_softshrink_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_softshrink_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_softshrink_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_softsign_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_softsign_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_softsign_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_softsign_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_softsign_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_softsign_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_softsign_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_softsign_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_softsign_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_softsign_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_softsign_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_softsign_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_tanhshrink_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_tanhshrink_cuda_complex128, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_tanhshrink_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_tanhshrink_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_tanhshrink_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_tanhshrink_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_tanhshrink_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_tanhshrink_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_tanhshrink_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_tanhshrink_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_tanhshrink_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_threshold_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_threshold_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_threshold_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_threshold_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_threshold_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_threshold_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_threshold_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_threshold_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_nn_functional_threshold_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_polygamma_polygamma_n_0_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_polygamma_polygamma_n_0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_polygamma_polygamma_n_0_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_polygamma_polygamma_n_0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_polygamma_polygamma_n_0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_polygamma_polygamma_n_0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_polygamma_polygamma_n_0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_polygamma_polygamma_n_0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_polygamma_polygamma_n_0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_polygamma_polygamma_n_0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_polygamma_polygamma_n_1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_polygamma_polygamma_n_1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_polygamma_polygamma_n_1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_polygamma_polygamma_n_1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_polygamma_polygamma_n_1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_polygamma_polygamma_n_1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_polygamma_polygamma_n_1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_polygamma_polygamma_n_1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_polygamma_polygamma_n_1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_polygamma_polygamma_n_1_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_polygamma_polygamma_n_2_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_polygamma_polygamma_n_2_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_polygamma_polygamma_n_2_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_polygamma_polygamma_n_2_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_polygamma_polygamma_n_2_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_polygamma_polygamma_n_2_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_polygamma_polygamma_n_2_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_polygamma_polygamma_n_2_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_polygamma_polygamma_n_2_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_polygamma_polygamma_n_2_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_polygamma_polygamma_n_3_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_polygamma_polygamma_n_3_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_polygamma_polygamma_n_3_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_polygamma_polygamma_n_3_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_polygamma_polygamma_n_3_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_polygamma_polygamma_n_3_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_polygamma_polygamma_n_3_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_polygamma_polygamma_n_3_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_polygamma_polygamma_n_3_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_polygamma_polygamma_n_3_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_polygamma_polygamma_n_4_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_polygamma_polygamma_n_4_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_polygamma_polygamma_n_4_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_polygamma_polygamma_n_4_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_polygamma_polygamma_n_4_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_polygamma_polygamma_n_4_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_polygamma_polygamma_n_4_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_polygamma_polygamma_n_4_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_polygamma_polygamma_n_4_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_polygamma_polygamma_n_4_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_positive_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_positive_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_positive_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_positive_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_positive_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_positive_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_positive_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_positive_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_positive_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_positive_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_positive_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_positive_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_rad2deg_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_rad2deg_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_rad2deg_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_rad2deg_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_rad2deg_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_rad2deg_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_rad2deg_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_rad2deg_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_rad2deg_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_rad2deg_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_real_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_real_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_real_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_real_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_real_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_real_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_real_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_real_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_real_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_real_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_real_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_real_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_real_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_reciprocal_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_reciprocal_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_reciprocal_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_reciprocal_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_reciprocal_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_reciprocal_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_reciprocal_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_reciprocal_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_reciprocal_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_reciprocal_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_reciprocal_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_reciprocal_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_round_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_round_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_round_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_round_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_round_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_round_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_round_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_round_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_round_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_round_decimals_0_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_round_decimals_0_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_round_decimals_0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_round_decimals_0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_round_decimals_3_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_round_decimals_3_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_round_decimals_3_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_round_decimals_3_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_round_decimals_neg_3_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_round_decimals_neg_3_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_round_decimals_neg_3_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_round_decimals_neg_3_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_rsqrt_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_rsqrt_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_rsqrt_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_rsqrt_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_rsqrt_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_rsqrt_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_rsqrt_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_rsqrt_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_rsqrt_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_rsqrt_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_rsqrt_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_rsqrt_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_rsqrt_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sgn_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sgn_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sgn_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sgn_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sgn_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sgn_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sgn_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sgn_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sgn_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sgn_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sgn_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sgn_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sgn_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_short_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_short_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_short_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_short_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_short_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_short_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_short_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_short_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_short_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_short_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_short_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_short_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sigmoid_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sigmoid_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sigmoid_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sigmoid_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sigmoid_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sigmoid_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sigmoid_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sigmoid_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sigmoid_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sigmoid_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sigmoid_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sigmoid_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sigmoid_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sign_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sign_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sign_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sign_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sign_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sign_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sign_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sign_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sign_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sign_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_signbit_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_signbit_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_signbit_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_signbit_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_signbit_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_signbit_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_signbit_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_signbit_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_signbit_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_signbit_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sin_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sin_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sin_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sin_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sin_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sin_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sin_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sin_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sin_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sin_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sin_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sin_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sin_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sinc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sinc_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sinc_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sinc_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sinc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sinc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sinc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sinc_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sinc_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sinc_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sinc_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sinc_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sinh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sinh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sinh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sinh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sinh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sinh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sinh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sinh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sinh_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sinh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sinh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sinh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sinh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_airy_ai_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_airy_ai_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_airy_ai_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_airy_ai_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_airy_ai_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_airy_ai_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_airy_ai_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_airy_ai_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_bessel_j0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_bessel_j0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_bessel_j0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_bessel_j0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_bessel_j0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_bessel_j0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_bessel_j0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_bessel_j0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_bessel_j1_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_bessel_j1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_bessel_j1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_bessel_j1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_bessel_j1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_bessel_j1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_bessel_j1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_bessel_j1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_bessel_y0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_bessel_y0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_bessel_y0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_bessel_y0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_bessel_y0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_bessel_y0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_bessel_y0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_bessel_y0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_bessel_y1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_bessel_y1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_bessel_y1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_bessel_y1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_bessel_y1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_bessel_y1_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_bessel_y1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_bessel_y1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_entr_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_entr_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_entr_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_entr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_entr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_entr_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_entr_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_entr_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_entr_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_entr_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_erfcx_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_erfcx_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_erfcx_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_erfcx_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_erfcx_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_erfcx_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_erfcx_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_erfcx_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_i0e_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_i0e_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_i0e_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_i0e_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_i0e_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_i0e_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_i0e_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_i0e_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_i0e_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_i0e_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_i1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_i1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_i1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_i1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_i1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_i1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_i1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_i1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_i1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_i1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_i1e_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_i1e_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_i1e_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_i1e_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_i1e_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_i1e_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_i1e_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_i1e_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_i1e_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_i1e_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_log_ndtr_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_log_ndtr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_log_ndtr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_log_ndtr_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_log_ndtr_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_log_ndtr_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_log_ndtr_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_log_ndtr_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_modified_bessel_i0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_modified_bessel_i0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_modified_bessel_i0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_modified_bessel_i0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_modified_bessel_i0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_modified_bessel_i0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_modified_bessel_i0_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_modified_bessel_i0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_modified_bessel_i1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_modified_bessel_i1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_modified_bessel_i1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_modified_bessel_i1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_modified_bessel_i1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_modified_bessel_i1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_modified_bessel_i1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_modified_bessel_i1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_modified_bessel_k0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_modified_bessel_k0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_modified_bessel_k0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_modified_bessel_k0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_modified_bessel_k0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_modified_bessel_k0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_modified_bessel_k0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_modified_bessel_k0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_modified_bessel_k1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_modified_bessel_k1_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_modified_bessel_k1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_modified_bessel_k1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_modified_bessel_k1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_modified_bessel_k1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_modified_bessel_k1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_modified_bessel_k1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_ndtr_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_ndtr_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_ndtr_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_ndtr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_ndtr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_ndtr_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_ndtr_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_ndtr_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_ndtr_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_ndtr_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_ndtri_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_ndtri_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_ndtri_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_ndtri_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_ndtri_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_ndtri_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_ndtri_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_ndtri_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_polygamma_special_polygamma_n_0_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_polygamma_special_polygamma_n_0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_polygamma_special_polygamma_n_0_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_polygamma_special_polygamma_n_0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_polygamma_special_polygamma_n_0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_polygamma_special_polygamma_n_0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_polygamma_special_polygamma_n_0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_polygamma_special_polygamma_n_0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_polygamma_special_polygamma_n_0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_scaled_modified_bessel_k0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_scaled_modified_bessel_k0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_scaled_modified_bessel_k0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_scaled_modified_bessel_k0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_scaled_modified_bessel_k0_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_scaled_modified_bessel_k0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_scaled_modified_bessel_k0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_scaled_modified_bessel_k0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_scaled_modified_bessel_k1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_scaled_modified_bessel_k1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_scaled_modified_bessel_k1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_scaled_modified_bessel_k1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_scaled_modified_bessel_k1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_scaled_modified_bessel_k1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_scaled_modified_bessel_k1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_scaled_modified_bessel_k1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_spherical_bessel_j0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_spherical_bessel_j0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_spherical_bessel_j0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_spherical_bessel_j0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_spherical_bessel_j0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_spherical_bessel_j0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_spherical_bessel_j0_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_special_spherical_bessel_j0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sqrt_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sqrt_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sqrt_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sqrt_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sqrt_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sqrt_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sqrt_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sqrt_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sqrt_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sqrt_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sqrt_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sqrt_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_sqrt_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_square_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_square_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_square_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_square_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_square_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_square_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_square_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_square_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_square_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_square_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_square_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_square_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_tan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_tan_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_tan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_tan_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_tan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_tan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_tan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_tan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_tan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_tan_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_tan_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_tan_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_tan_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_tanh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_tanh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_tanh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_tanh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_tanh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_tanh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_tanh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_tanh_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_tanh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_tanh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_tanh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_tanh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_tanh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_trunc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_trunc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_trunc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_trunc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_trunc_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_trunc_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_trunc_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_trunc_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_non_contig_trunc_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_nonzero_empty_cuda, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_nonzero_large_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_nonzero_static_cuda, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_nonzero_static_large_cuda, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_op_invert_cuda, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_polygamma_neg_cuda, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_abs_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_abs_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_abs_cuda_complex64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_abs_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_abs_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_abs_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_acos_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_acos_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_acos_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_acos_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_acos_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_acos_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_acosh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_acosh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_acosh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_acosh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_acosh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_acosh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_asin_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_asin_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_asin_cuda_complex64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_asin_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_asin_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_asin_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_asinh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_asinh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_asinh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_asinh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_asinh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_asinh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_atan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_atan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_atan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_atan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_atan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_atan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_atanh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_atanh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_atanh_cuda_complex64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_atanh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_atanh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_atanh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_ceil_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_ceil_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_ceil_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_ceil_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_conj_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_conj_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_conj_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_conj_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_conj_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_conj_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_conj_physical_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_conj_physical_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_conj_physical_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_conj_physical_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_conj_physical_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_conj_physical_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_cos_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_cos_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_cos_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_cos_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_cos_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_cos_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_cosh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_cosh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_cosh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_cosh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_cosh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_cosh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_deg2rad_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_deg2rad_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_deg2rad_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_deg2rad_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_digamma_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_digamma_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_digamma_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_digamma_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_erf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_erf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_erf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_erf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_erfc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_erfc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_erfc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_erfc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_erfinv_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_erfinv_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_erfinv_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_erfinv_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_exp2_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_exp2_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_exp2_cuda_complex64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_exp2_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_exp2_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_exp2_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_exp_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_exp_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_exp_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_exp_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_exp_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_exp_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_expm1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_expm1_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_expm1_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_expm1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_expm1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_expm1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_fill_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_fill_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_fill_cuda_complex64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_fill_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_fill_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_fill_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_floor_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_floor_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_floor_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_floor_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_frac_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_frac_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_frac_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_frac_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_frexp_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_frexp_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_frexp_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_frexp_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_i0_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_i0_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_i0_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_i0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_imag_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_imag_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_isfinite_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_isfinite_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_isfinite_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_isfinite_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_isfinite_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_isfinite_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_isinf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_isinf_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_isinf_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_isinf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_isinf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_isinf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_isnan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_isnan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_isnan_cuda_complex64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_isnan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_isnan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_isnan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_isneginf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_isneginf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_isneginf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_isneginf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_isposinf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_isposinf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_isposinf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_isposinf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_isreal_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_isreal_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_isreal_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_isreal_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_isreal_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_isreal_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_lgamma_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_lgamma_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_lgamma_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_lgamma_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_log10_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_log10_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_log10_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_log10_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_log10_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_log10_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_log1p_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_log1p_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_log1p_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_log1p_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_log1p_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_log1p_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_log2_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_log2_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_log2_cuda_complex64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_log2_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_log2_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_log2_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_log_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_log_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_log_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_log_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_log_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_log_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_logical_not_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_logical_not_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_logical_not_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_logical_not_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_logical_not_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_logical_not_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_nan_to_num_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_nan_to_num_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_nan_to_num_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_nan_to_num_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_neg_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_neg_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_neg_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_neg_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_neg_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_neg_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_nn_functional_celu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_nn_functional_celu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_nn_functional_celu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_nn_functional_celu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_nn_functional_elu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_nn_functional_elu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_nn_functional_elu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_nn_functional_elu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_nn_functional_mish_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_nn_functional_mish_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_nn_functional_mish_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_nn_functional_mish_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_nn_functional_prelu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_nn_functional_prelu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_nn_functional_prelu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_nn_functional_prelu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_nn_functional_relu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_nn_functional_relu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_nn_functional_relu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_nn_functional_relu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_nn_functional_selu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_nn_functional_selu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_nn_functional_selu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_nn_functional_selu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_nn_functional_softplus_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_nn_functional_softplus_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_nn_functional_softplus_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_nn_functional_softplus_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_nn_functional_tanhshrink_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_nn_functional_tanhshrink_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_nn_functional_tanhshrink_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_nn_functional_tanhshrink_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_nn_functional_tanhshrink_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_nn_functional_tanhshrink_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_nn_functional_threshold_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_nn_functional_threshold_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_nn_functional_threshold_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_nn_functional_threshold_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_positive_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_positive_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_positive_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_positive_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_positive_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_positive_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_rad2deg_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_rad2deg_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_rad2deg_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_rad2deg_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_real_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_real_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_real_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_real_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_real_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_real_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_reciprocal_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_reciprocal_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_reciprocal_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_reciprocal_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_reciprocal_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_reciprocal_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_round_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_round_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_round_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_round_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_rsqrt_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_rsqrt_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_rsqrt_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_rsqrt_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_rsqrt_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_rsqrt_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_sgn_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_sgn_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_sgn_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_sgn_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_sgn_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_sgn_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_sigmoid_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_sigmoid_cuda_complex128, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_sigmoid_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_sigmoid_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_sigmoid_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_sigmoid_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_sign_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_sign_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_sign_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_sign_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_signbit_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_signbit_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_signbit_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_signbit_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_sin_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_sin_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_sin_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_sin_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_sin_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_sin_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_sinc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_sinc_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_sinc_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_sinc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_sinc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_sinc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_sinh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_sinh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_sinh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_sinh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_sinh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_sinh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_special_bessel_j0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_special_bessel_j0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_special_bessel_j1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_special_bessel_j1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_special_entr_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_special_entr_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_special_entr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_special_entr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_special_erfcx_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_special_erfcx_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_special_i0e_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_special_i0e_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_special_i0e_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_special_i0e_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_special_i1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_special_i1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_special_i1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_special_i1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_special_i1e_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_special_i1e_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_special_i1e_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_special_i1e_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_special_log_ndtr_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_special_log_ndtr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_special_logit_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_special_logit_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_special_logit_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_special_logit_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_special_multigammaln_mvlgamma_p_1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_special_multigammaln_mvlgamma_p_1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_special_multigammaln_mvlgamma_p_1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_special_multigammaln_mvlgamma_p_1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_special_multigammaln_mvlgamma_p_3_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_special_multigammaln_mvlgamma_p_3_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_special_multigammaln_mvlgamma_p_3_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_special_multigammaln_mvlgamma_p_3_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_special_multigammaln_mvlgamma_p_5_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_special_multigammaln_mvlgamma_p_5_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_special_multigammaln_mvlgamma_p_5_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_special_multigammaln_mvlgamma_p_5_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_special_ndtr_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_special_ndtr_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_special_ndtr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_special_ndtr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_special_ndtri_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_special_ndtri_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_special_spherical_bessel_j0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_special_spherical_bessel_j0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_sqrt_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_sqrt_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_sqrt_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_sqrt_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_sqrt_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_sqrt_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_square_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_square_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_square_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_square_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_square_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_square_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_tan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_tan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_tan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_tan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_tan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_tan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_tanh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_tanh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_tanh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_tanh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_tanh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_tanh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_trunc_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_trunc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_trunc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal__refs_trunc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_abs_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_abs_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_abs_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_abs_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_abs_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_abs_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_acos_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_acos_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_acos_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_acos_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_acos_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_acos_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_acosh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_acosh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_acosh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_acosh_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_acosh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_acosh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_angle_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_angle_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_angle_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_angle_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_asin_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_asin_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_asin_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_asin_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_asin_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_asin_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_asinh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_asinh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_asinh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_asinh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_asinh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_asinh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_atan_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_atan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_atan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_atan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_atan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_atan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_atanh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_atanh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_atanh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_atanh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_atanh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_atanh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_ceil_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_ceil_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_ceil_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_ceil_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_conj_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_conj_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_conj_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_conj_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_conj_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_conj_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_conj_physical_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_conj_physical_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_conj_physical_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_conj_physical_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_conj_physical_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_conj_physical_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_cos_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_cos_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_cos_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_cos_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_cos_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_cos_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_cosh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_cosh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_cosh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_cosh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_cosh_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_cosh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_deg2rad_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_deg2rad_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_deg2rad_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_deg2rad_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_digamma_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_digamma_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_digamma_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_digamma_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_erf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_erf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_erf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_erf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_erfc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_erfc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_erfc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_erfc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_erfinv_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_erfinv_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_erfinv_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_erfinv_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_exp2_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_exp2_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_exp2_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_exp2_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_exp2_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_exp2_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_exp_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_exp_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_exp_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_exp_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_exp_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_exp_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_expm1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_expm1_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_expm1_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_expm1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_expm1_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_expm1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_fill_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_fill_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_fill_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_fill_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_fill_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_fill_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_floor_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_floor_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_floor_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_floor_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_frac_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_frac_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_frac_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_frac_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_frexp_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_frexp_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_frexp_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_frexp_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_i0_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_i0_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_i0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_i0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_imag_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_imag_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_isfinite_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_isfinite_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_isfinite_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_isfinite_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_isfinite_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_isfinite_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_isinf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_isinf_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_isinf_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_isinf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_isinf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_isinf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_isnan_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_isnan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_isnan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_isnan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_isnan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_isnan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_isneginf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_isneginf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_isneginf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_isneginf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_isposinf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_isposinf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_isposinf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_isposinf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_isreal_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_isreal_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_isreal_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_isreal_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_isreal_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_isreal_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_jiterator_unary_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_jiterator_unary_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_jiterator_unary_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_jiterator_unary_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_jiterator_unary_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_jiterator_unary_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_lgamma_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_lgamma_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_lgamma_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_lgamma_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_log10_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_log10_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_log10_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_log10_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_log10_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_log10_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_log1p_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_log1p_cuda_complex128, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_log1p_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_log1p_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_log1p_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_log1p_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_log2_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_log2_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_log2_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_log2_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_log2_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_log2_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_log_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_log_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_log_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_log_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_log_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_log_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_logical_not_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_logical_not_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_logical_not_cuda_complex64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_logical_not_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_logical_not_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_logical_not_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_logit_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_logit_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_logit_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_logit_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_mvlgamma_mvlgamma_p_1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_mvlgamma_mvlgamma_p_1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_mvlgamma_mvlgamma_p_1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_mvlgamma_mvlgamma_p_3_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_mvlgamma_mvlgamma_p_3_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_mvlgamma_mvlgamma_p_3_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_mvlgamma_mvlgamma_p_5_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_mvlgamma_mvlgamma_p_5_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_mvlgamma_mvlgamma_p_5_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_nan_to_num_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_nan_to_num_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_nan_to_num_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_nan_to_num_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_neg_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_neg_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_neg_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_neg_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_neg_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_neg_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_nn_functional_celu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_nn_functional_celu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_nn_functional_celu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_nn_functional_celu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_nn_functional_elu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_nn_functional_elu_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_nn_functional_elu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_nn_functional_elu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_nn_functional_hardsigmoid_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_nn_functional_hardsigmoid_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_nn_functional_hardsigmoid_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_nn_functional_hardsigmoid_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_nn_functional_logsigmoid_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_nn_functional_logsigmoid_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_nn_functional_logsigmoid_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_nn_functional_logsigmoid_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_nn_functional_mish_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_nn_functional_mish_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_nn_functional_mish_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_nn_functional_mish_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_nn_functional_prelu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_nn_functional_prelu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_nn_functional_prelu_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_nn_functional_prelu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_nn_functional_relu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_nn_functional_relu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_nn_functional_relu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_nn_functional_relu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_nn_functional_selu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_nn_functional_selu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_nn_functional_selu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_nn_functional_selu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_nn_functional_silu_complex_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_nn_functional_silu_complex_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_nn_functional_silu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_nn_functional_silu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_nn_functional_silu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_nn_functional_silu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_nn_functional_softplus_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_nn_functional_softplus_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_nn_functional_softplus_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_nn_functional_softplus_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_nn_functional_softsign_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_nn_functional_softsign_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_nn_functional_softsign_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_nn_functional_softsign_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_nn_functional_softsign_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_nn_functional_softsign_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_nn_functional_tanhshrink_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_nn_functional_tanhshrink_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_nn_functional_tanhshrink_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_nn_functional_tanhshrink_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_nn_functional_tanhshrink_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_nn_functional_tanhshrink_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_nn_functional_threshold_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_nn_functional_threshold_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_nn_functional_threshold_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_nn_functional_threshold_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_polygamma_polygamma_n_0_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_polygamma_polygamma_n_0_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_polygamma_polygamma_n_0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_polygamma_polygamma_n_0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_polygamma_polygamma_n_1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_polygamma_polygamma_n_1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_polygamma_polygamma_n_1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_polygamma_polygamma_n_1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_polygamma_polygamma_n_2_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_polygamma_polygamma_n_2_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_polygamma_polygamma_n_2_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_polygamma_polygamma_n_2_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_polygamma_polygamma_n_3_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_polygamma_polygamma_n_3_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_polygamma_polygamma_n_3_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_polygamma_polygamma_n_3_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_polygamma_polygamma_n_4_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_polygamma_polygamma_n_4_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_polygamma_polygamma_n_4_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_polygamma_polygamma_n_4_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_positive_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_positive_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_positive_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_positive_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_positive_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_positive_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_rad2deg_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_rad2deg_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_rad2deg_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_rad2deg_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_real_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_real_cuda_complex128, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_real_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_real_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_real_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_real_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_reciprocal_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_reciprocal_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_reciprocal_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_reciprocal_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_reciprocal_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_reciprocal_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_round_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_round_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_round_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_round_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_round_decimals_0_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_round_decimals_0_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_round_decimals_0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_round_decimals_0_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_round_decimals_3_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_round_decimals_3_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_round_decimals_3_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_round_decimals_3_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_round_decimals_neg_3_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_round_decimals_neg_3_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_round_decimals_neg_3_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_round_decimals_neg_3_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_rsqrt_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_rsqrt_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_rsqrt_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_rsqrt_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_rsqrt_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_rsqrt_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_sgn_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_sgn_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_sgn_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_sgn_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_sgn_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_sgn_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_sigmoid_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_sigmoid_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_sigmoid_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_sigmoid_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_sigmoid_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_sigmoid_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_sign_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_sign_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_sign_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_sign_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_signbit_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_signbit_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_signbit_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_signbit_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_sin_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_sin_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_sin_cuda_complex64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_sin_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_sin_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_sin_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_sinc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_sinc_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_sinc_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_sinc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_sinc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_sinc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_sinh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_sinh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_sinh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_sinh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_sinh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_sinh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_special_airy_ai_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_special_airy_ai_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_special_bessel_j0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_special_bessel_j0_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_special_bessel_j1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_special_bessel_j1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_special_bessel_y0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_special_bessel_y0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_special_bessel_y1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_special_bessel_y1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_special_entr_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_special_entr_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_special_entr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_special_entr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_special_erfcx_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_special_erfcx_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_special_i0e_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_special_i0e_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_special_i0e_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_special_i0e_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_special_i1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_special_i1_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_special_i1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_special_i1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_special_i1e_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_special_i1e_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_special_i1e_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_special_i1e_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_special_log_ndtr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_special_log_ndtr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_special_modified_bessel_i0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_special_modified_bessel_i0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_special_modified_bessel_i1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_special_modified_bessel_i1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_special_modified_bessel_k0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_special_modified_bessel_k0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_special_modified_bessel_k1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_special_modified_bessel_k1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_special_ndtr_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_special_ndtr_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_special_ndtr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_special_ndtr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_special_ndtri_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_special_ndtri_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_special_polygamma_special_polygamma_n_0_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_special_polygamma_special_polygamma_n_0_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_special_polygamma_special_polygamma_n_0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_special_scaled_modified_bessel_k0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_special_scaled_modified_bessel_k0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_special_scaled_modified_bessel_k1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_special_scaled_modified_bessel_k1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_special_spherical_bessel_j0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_special_spherical_bessel_j0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_sqrt_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_sqrt_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_sqrt_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_sqrt_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_sqrt_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_sqrt_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_square_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_square_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_square_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_square_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_square_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_square_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_tan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_tan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_tan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_tan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_tan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_tan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_tanh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_tanh_cuda_complex128, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_tanh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_tanh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_tanh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_tanh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_trunc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_trunc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_trunc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_extremal_trunc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_abs_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_abs_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_abs_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_abs_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_abs_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_abs_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_abs_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_abs_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_abs_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_abs_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_abs_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_abs_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_abs_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_acos_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_acos_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_acos_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_acos_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_acos_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_acos_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_acos_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_acos_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_acos_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_acos_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_acos_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_acos_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_acos_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_acosh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_acosh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_acosh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_acosh_cuda_complex32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_acosh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_acosh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_acosh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_acosh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_acosh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_acosh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_acosh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_acosh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_acosh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_asin_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_asin_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_asin_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_asin_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_asin_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_asin_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_asin_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_asin_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_asin_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_asin_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_asin_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_asin_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_asin_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_asinh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_asinh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_asinh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_asinh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_asinh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_asinh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_asinh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_asinh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_asinh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_asinh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_asinh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_asinh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_asinh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_atan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_atan_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_atan_cuda_complex128, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_atan_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_atan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_atan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_atan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_atan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_atan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_atan_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_atan_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_atan_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_atan_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_atanh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_atanh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_atanh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_atanh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_atanh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_atanh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_atanh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_atanh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_atanh_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_atanh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_atanh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_atanh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_atanh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_bitwise_not_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_bitwise_not_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_bitwise_not_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_bitwise_not_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_bitwise_not_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_bitwise_not_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_ceil_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_ceil_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_ceil_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_ceil_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_ceil_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_ceil_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_ceil_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_ceil_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_ceil_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_conj_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_conj_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_conj_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_conj_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_conj_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_conj_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_conj_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_conj_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_conj_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_conj_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_conj_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_conj_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_conj_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_conj_physical_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_conj_physical_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_conj_physical_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_conj_physical_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_conj_physical_cuda_complex64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_conj_physical_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_conj_physical_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_conj_physical_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_conj_physical_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_conj_physical_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_conj_physical_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_conj_physical_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_conj_physical_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_cos_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_cos_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_cos_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_cos_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_cos_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_cos_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_cos_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_cos_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_cos_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_cos_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_cos_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_cos_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_cos_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_cosh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_cosh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_cosh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_cosh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_cosh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_cosh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_cosh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_cosh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_cosh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_cosh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_cosh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_cosh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_cosh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_deg2rad_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_deg2rad_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_deg2rad_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_deg2rad_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_deg2rad_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_deg2rad_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_deg2rad_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_deg2rad_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_deg2rad_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_deg2rad_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_digamma_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_digamma_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_digamma_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_digamma_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_digamma_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_digamma_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_digamma_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_digamma_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_digamma_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_digamma_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_erf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_erf_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_erf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_erf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_erf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_erf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_erf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_erf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_erf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_erf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_erfc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_erfc_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_erfc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_erfc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_erfc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_erfc_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_erfc_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_erfc_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_erfc_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_erfc_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_erfinv_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_erfinv_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_erfinv_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_erfinv_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_erfinv_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_erfinv_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_erfinv_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_erfinv_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_erfinv_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_erfinv_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_exp2_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_exp2_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_exp2_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_exp2_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_exp2_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_exp2_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_exp2_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_exp2_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_exp2_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_exp2_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_exp2_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_exp2_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_exp_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_exp_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_exp_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_exp_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_exp_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_exp_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_exp_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_exp_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_exp_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_exp_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_exp_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_exp_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_exp_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_expm1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_expm1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_expm1_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_expm1_cuda_complex64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_expm1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_expm1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_expm1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_expm1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_expm1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_expm1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_expm1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_expm1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_fill_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_fill_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_fill_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_fill_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_fill_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_fill_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_fill_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_fill_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_fill_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_fill_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_fill_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_fill_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_fill_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_floor_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_floor_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_floor_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_floor_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_floor_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_floor_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_floor_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_floor_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_floor_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_frac_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_frac_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_frac_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_frac_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_frexp_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_frexp_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_frexp_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_frexp_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_i0_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_i0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_i0_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_i0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_i0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_i0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_i0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_i0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_i0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_i0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_imag_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_imag_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_imag_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isfinite_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isfinite_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isfinite_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isfinite_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isfinite_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isfinite_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isfinite_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isfinite_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isfinite_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isfinite_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isfinite_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isfinite_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isfinite_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isinf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isinf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isinf_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isinf_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isinf_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isinf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isinf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isinf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isinf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isinf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isinf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isinf_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isinf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isnan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isnan_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isnan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isnan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isnan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isnan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isnan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isnan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isnan_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isnan_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isnan_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isnan_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isneginf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isneginf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isneginf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isneginf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isneginf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isneginf_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isneginf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isneginf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isneginf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isneginf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isposinf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isposinf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isposinf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isposinf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isposinf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isposinf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isposinf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isposinf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isposinf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isposinf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isreal_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isreal_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isreal_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isreal_cuda_complex32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isreal_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isreal_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isreal_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isreal_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isreal_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isreal_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isreal_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isreal_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_isreal_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_lgamma_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_lgamma_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_lgamma_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_lgamma_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_lgamma_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_lgamma_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_lgamma_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_lgamma_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_lgamma_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_lgamma_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_log10_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_log10_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_log10_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_log10_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_log10_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_log10_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_log10_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_log10_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_log10_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_log10_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_log10_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_log10_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_log1p_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_log1p_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_log1p_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_log1p_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_log1p_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_log1p_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_log1p_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_log1p_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_log1p_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_log1p_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_log1p_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_log1p_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_log2_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_log2_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_log2_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_log2_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_log2_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_log2_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_log2_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_log2_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_log2_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_log2_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_log2_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_log2_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_log_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_log_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_log_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_log_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_log_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_log_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_log_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_log_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_log_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_log_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_log_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_log_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_log_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_logical_not_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_logical_not_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_logical_not_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_logical_not_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_logical_not_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_logical_not_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_logical_not_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_logical_not_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_logical_not_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_logical_not_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_logical_not_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_logical_not_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_nan_to_num_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_nan_to_num_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_nan_to_num_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_nan_to_num_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_nan_to_num_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_nan_to_num_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_nan_to_num_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_nan_to_num_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_nan_to_num_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_nan_to_num_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_neg_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_neg_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_neg_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_neg_cuda_complex64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_neg_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_neg_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_neg_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_neg_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_neg_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_neg_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_neg_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_neg_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_nn_functional_celu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_nn_functional_celu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_nn_functional_celu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_nn_functional_celu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_nn_functional_elu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_nn_functional_elu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_nn_functional_elu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_nn_functional_elu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_nn_functional_mish_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_nn_functional_mish_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_nn_functional_mish_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_nn_functional_mish_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_nn_functional_prelu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_nn_functional_prelu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_nn_functional_prelu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_nn_functional_prelu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_nn_functional_relu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_nn_functional_relu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_nn_functional_relu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_nn_functional_relu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_nn_functional_relu_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_nn_functional_relu_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_nn_functional_relu_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_nn_functional_relu_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_nn_functional_relu_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_nn_functional_selu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_nn_functional_selu_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_nn_functional_selu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_nn_functional_selu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_nn_functional_softplus_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_nn_functional_softplus_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_nn_functional_softplus_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_nn_functional_softplus_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_nn_functional_tanhshrink_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_nn_functional_tanhshrink_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_nn_functional_tanhshrink_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_nn_functional_tanhshrink_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_nn_functional_tanhshrink_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_nn_functional_tanhshrink_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_nn_functional_tanhshrink_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_nn_functional_tanhshrink_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_nn_functional_tanhshrink_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_nn_functional_tanhshrink_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_nn_functional_tanhshrink_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_nn_functional_threshold_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_nn_functional_threshold_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_nn_functional_threshold_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_nn_functional_threshold_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_nn_functional_threshold_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_nn_functional_threshold_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_nn_functional_threshold_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_nn_functional_threshold_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_nn_functional_threshold_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_positive_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_positive_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_positive_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_positive_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_positive_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_positive_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_positive_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_positive_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_positive_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_positive_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_positive_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_positive_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_rad2deg_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_rad2deg_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_rad2deg_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_rad2deg_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_rad2deg_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_rad2deg_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_rad2deg_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_rad2deg_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_rad2deg_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_rad2deg_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_real_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_real_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_real_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_real_cuda_complex32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_real_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_real_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_real_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_real_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_real_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_real_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_real_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_real_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_real_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_reciprocal_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_reciprocal_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_reciprocal_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_reciprocal_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_reciprocal_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_reciprocal_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_reciprocal_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_reciprocal_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_reciprocal_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_reciprocal_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_reciprocal_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_reciprocal_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_round_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_round_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_round_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_round_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_round_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_round_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_round_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_round_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_round_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_rsqrt_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_rsqrt_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_rsqrt_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_rsqrt_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_rsqrt_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_rsqrt_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_rsqrt_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_rsqrt_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_rsqrt_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_rsqrt_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_rsqrt_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_rsqrt_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_rsqrt_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sgn_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sgn_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sgn_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sgn_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sgn_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sgn_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sgn_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sgn_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sgn_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sgn_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sgn_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sgn_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sgn_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sigmoid_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sigmoid_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sigmoid_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sigmoid_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sigmoid_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sigmoid_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sigmoid_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sigmoid_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sigmoid_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sigmoid_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sigmoid_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sigmoid_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sigmoid_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sign_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sign_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sign_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sign_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sign_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sign_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sign_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sign_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sign_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sign_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_signbit_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_signbit_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_signbit_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_signbit_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_signbit_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_signbit_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_signbit_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_signbit_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_signbit_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_signbit_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sin_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sin_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sin_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sin_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sin_cuda_complex64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sin_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sin_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sin_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sin_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sin_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sin_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sin_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sin_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sinc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sinc_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sinc_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sinc_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sinc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sinc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sinc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sinc_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sinc_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sinc_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sinc_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sinc_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sinh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sinh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sinh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sinh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sinh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sinh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sinh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sinh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sinh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sinh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sinh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sinh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sinh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_bessel_j0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_bessel_j0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_bessel_j0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_bessel_j0_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_bessel_j0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_bessel_j0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_bessel_j0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_bessel_j0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_bessel_j1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_bessel_j1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_bessel_j1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_bessel_j1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_bessel_j1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_bessel_j1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_bessel_j1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_bessel_j1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_entr_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_entr_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_entr_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_entr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_entr_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_entr_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_entr_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_entr_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_entr_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_entr_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_erfcx_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_erfcx_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_erfcx_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_erfcx_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_erfcx_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_erfcx_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_erfcx_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_erfcx_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_i0e_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_i0e_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_i0e_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_i0e_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_i0e_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_i0e_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_i0e_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_i0e_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_i0e_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_i0e_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_i1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_i1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_i1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_i1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_i1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_i1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_i1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_i1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_i1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_i1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_i1e_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_i1e_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_i1e_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_i1e_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_i1e_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_i1e_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_i1e_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_i1e_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_i1e_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_i1e_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_log_ndtr_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_log_ndtr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_log_ndtr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_log_ndtr_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_log_ndtr_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_log_ndtr_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_log_ndtr_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_log_ndtr_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_logit_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_logit_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_logit_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_logit_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_logit_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_logit_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_logit_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_logit_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_logit_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_logit_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_multigammaln_mvlgamma_p_1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_multigammaln_mvlgamma_p_1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_multigammaln_mvlgamma_p_1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_multigammaln_mvlgamma_p_1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_multigammaln_mvlgamma_p_1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_multigammaln_mvlgamma_p_1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_multigammaln_mvlgamma_p_1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_multigammaln_mvlgamma_p_1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_multigammaln_mvlgamma_p_1_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_multigammaln_mvlgamma_p_3_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_multigammaln_mvlgamma_p_3_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_multigammaln_mvlgamma_p_3_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_multigammaln_mvlgamma_p_3_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_multigammaln_mvlgamma_p_3_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_multigammaln_mvlgamma_p_3_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_multigammaln_mvlgamma_p_3_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_multigammaln_mvlgamma_p_3_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_multigammaln_mvlgamma_p_3_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_multigammaln_mvlgamma_p_5_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_multigammaln_mvlgamma_p_5_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_multigammaln_mvlgamma_p_5_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_multigammaln_mvlgamma_p_5_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_multigammaln_mvlgamma_p_5_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_multigammaln_mvlgamma_p_5_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_multigammaln_mvlgamma_p_5_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_multigammaln_mvlgamma_p_5_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_multigammaln_mvlgamma_p_5_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_ndtr_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_ndtr_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_ndtr_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_ndtr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_ndtr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_ndtr_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_ndtr_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_ndtr_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_ndtr_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_ndtr_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_ndtri_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_ndtri_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_ndtri_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_ndtri_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_ndtri_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_ndtri_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_ndtri_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_ndtri_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_spherical_bessel_j0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_spherical_bessel_j0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_spherical_bessel_j0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_spherical_bessel_j0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_spherical_bessel_j0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_spherical_bessel_j0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_spherical_bessel_j0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_special_spherical_bessel_j0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sqrt_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sqrt_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sqrt_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sqrt_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sqrt_cuda_complex64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sqrt_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sqrt_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sqrt_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sqrt_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sqrt_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sqrt_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sqrt_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_sqrt_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_square_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_square_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_square_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_square_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_square_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_square_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_square_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_square_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_square_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_square_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_square_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_square_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_tan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_tan_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_tan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_tan_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_tan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_tan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_tan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_tan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_tan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_tan_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_tan_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_tan_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_tan_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_tanh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_tanh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_tanh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_tanh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_tanh_cuda_complex64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_tanh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_tanh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_tanh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_tanh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_tanh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_tanh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_tanh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_tanh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_trunc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_trunc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_trunc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_trunc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_trunc_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_trunc_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_trunc_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_trunc_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large__refs_trunc_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_abs_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_abs_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_abs_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_abs_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_abs_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_abs_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_abs_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_abs_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_abs_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_abs_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_abs_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_abs_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_abs_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_acos_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_acos_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_acos_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_acos_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_acos_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_acos_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_acos_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_acos_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_acos_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_acos_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_acos_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_acos_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_acos_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_acosh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_acosh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_acosh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_acosh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_acosh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_acosh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_acosh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_acosh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_acosh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_acosh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_acosh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_acosh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_acosh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_angle_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_angle_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_angle_cuda_complex32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_angle_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_angle_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_angle_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_angle_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_angle_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_angle_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_angle_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_angle_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_asin_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_asin_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_asin_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_asin_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_asin_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_asin_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_asin_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_asin_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_asin_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_asin_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_asin_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_asin_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_asin_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_asinh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_asinh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_asinh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_asinh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_asinh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_asinh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_asinh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_asinh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_asinh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_asinh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_asinh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_asinh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_asinh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_atan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_atan_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_atan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_atan_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_atan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_atan_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_atan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_atan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_atan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_atan_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_atan_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_atan_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_atan_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_atanh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_atanh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_atanh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_atanh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_atanh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_atanh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_atanh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_atanh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_atanh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_atanh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_atanh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_atanh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_atanh_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_bitwise_not_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_bitwise_not_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_bitwise_not_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_bitwise_not_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_bitwise_not_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_bitwise_not_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_ceil_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_ceil_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_ceil_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_ceil_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_ceil_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_ceil_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_ceil_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_ceil_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_ceil_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_conj_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_conj_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_conj_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_conj_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_conj_cuda_complex64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_conj_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_conj_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_conj_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_conj_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_conj_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_conj_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_conj_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_conj_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_conj_physical_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_conj_physical_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_conj_physical_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_conj_physical_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_conj_physical_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_conj_physical_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_conj_physical_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_conj_physical_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_conj_physical_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_conj_physical_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_conj_physical_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_conj_physical_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_conj_physical_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_cos_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_cos_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_cos_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_cos_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_cos_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_cos_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_cos_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_cos_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_cos_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_cos_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_cos_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_cos_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_cos_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_cosh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_cosh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_cosh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_cosh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_cosh_cuda_complex64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_cosh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_cosh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_cosh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_cosh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_cosh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_cosh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_cosh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_cosh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_deg2rad_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_deg2rad_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_deg2rad_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_deg2rad_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_deg2rad_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_deg2rad_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_deg2rad_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_deg2rad_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_deg2rad_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_deg2rad_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_digamma_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_digamma_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_digamma_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_digamma_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_digamma_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_digamma_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_digamma_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_digamma_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_digamma_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_digamma_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_erf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_erf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_erf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_erf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_erf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_erf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_erf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_erf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_erf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_erf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_erfc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_erfc_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_erfc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_erfc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_erfc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_erfc_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_erfc_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_erfc_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_erfc_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_erfc_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_erfinv_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_erfinv_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_erfinv_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_erfinv_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_erfinv_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_erfinv_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_erfinv_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_erfinv_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_erfinv_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_erfinv_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_exp2_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_exp2_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_exp2_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_exp2_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_exp2_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_exp2_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_exp2_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_exp2_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_exp2_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_exp2_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_exp2_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_exp2_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_exp_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_exp_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_exp_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_exp_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_exp_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_exp_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_exp_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_exp_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_exp_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_exp_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_exp_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_exp_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_exp_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_expm1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_expm1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_expm1_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_expm1_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_expm1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_expm1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_expm1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_expm1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_expm1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_expm1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_expm1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_expm1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_fill_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_fill_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_fill_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_fill_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_fill_cuda_complex64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_fill_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_fill_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_fill_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_fill_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_fill_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_fill_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_fill_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_fill_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_floor_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_floor_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_floor_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_floor_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_floor_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_floor_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_floor_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_floor_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_floor_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_frac_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_frac_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_frac_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_frac_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_frexp_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_frexp_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_frexp_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_frexp_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_i0_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_i0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_i0_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_i0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_i0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_i0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_i0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_i0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_i0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_i0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_imag_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_imag_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_imag_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isfinite_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isfinite_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isfinite_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isfinite_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isfinite_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isfinite_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isfinite_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isfinite_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isfinite_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isfinite_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isfinite_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isfinite_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isfinite_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isinf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isinf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isinf_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isinf_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isinf_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isinf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isinf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isinf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isinf_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isinf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isinf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isinf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isinf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isnan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isnan_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isnan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isnan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isnan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isnan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isnan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isnan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isnan_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isnan_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isnan_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isnan_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isneginf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isneginf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isneginf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isneginf_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isneginf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isneginf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isneginf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isneginf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isneginf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isneginf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isposinf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isposinf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isposinf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isposinf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isposinf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isposinf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isposinf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isposinf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isposinf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isposinf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isreal_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isreal_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isreal_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isreal_cuda_complex32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isreal_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isreal_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isreal_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isreal_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isreal_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isreal_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isreal_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isreal_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_isreal_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_jiterator_unary_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_jiterator_unary_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_jiterator_unary_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_jiterator_unary_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_jiterator_unary_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_jiterator_unary_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_jiterator_unary_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_jiterator_unary_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_jiterator_unary_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_jiterator_unary_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_jiterator_unary_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_jiterator_unary_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_lgamma_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_lgamma_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_lgamma_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_lgamma_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_lgamma_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_lgamma_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_lgamma_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_lgamma_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_lgamma_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_lgamma_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_log10_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_log10_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_log10_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_log10_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_log10_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_log10_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_log10_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_log10_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_log10_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_log10_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_log10_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_log10_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_log1p_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_log1p_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_log1p_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_log1p_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_log1p_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_log1p_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_log1p_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_log1p_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_log1p_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_log1p_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_log1p_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_log1p_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_log2_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_log2_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_log2_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_log2_cuda_complex64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_log2_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_log2_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_log2_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_log2_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_log2_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_log2_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_log2_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_log2_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_log_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_log_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_log_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_log_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_log_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_log_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_log_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_log_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_log_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_log_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_log_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_log_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_log_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_logical_not_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_logical_not_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_logical_not_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_logical_not_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_logical_not_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_logical_not_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_logical_not_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_logical_not_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_logical_not_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_logical_not_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_logical_not_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_logical_not_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_logit_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_logit_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_logit_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_logit_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_logit_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_logit_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_logit_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_logit_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_logit_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_logit_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_mvlgamma_mvlgamma_p_1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_mvlgamma_mvlgamma_p_1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_mvlgamma_mvlgamma_p_1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_mvlgamma_mvlgamma_p_1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_mvlgamma_mvlgamma_p_1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_mvlgamma_mvlgamma_p_1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_mvlgamma_mvlgamma_p_1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_mvlgamma_mvlgamma_p_1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_mvlgamma_mvlgamma_p_3_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_mvlgamma_mvlgamma_p_3_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_mvlgamma_mvlgamma_p_3_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_mvlgamma_mvlgamma_p_3_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_mvlgamma_mvlgamma_p_3_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_mvlgamma_mvlgamma_p_3_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_mvlgamma_mvlgamma_p_3_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_mvlgamma_mvlgamma_p_3_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_mvlgamma_mvlgamma_p_5_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_mvlgamma_mvlgamma_p_5_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_mvlgamma_mvlgamma_p_5_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_mvlgamma_mvlgamma_p_5_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_mvlgamma_mvlgamma_p_5_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_mvlgamma_mvlgamma_p_5_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_mvlgamma_mvlgamma_p_5_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_mvlgamma_mvlgamma_p_5_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nan_to_num_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nan_to_num_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nan_to_num_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nan_to_num_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nan_to_num_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nan_to_num_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nan_to_num_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nan_to_num_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nan_to_num_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nan_to_num_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_neg_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_neg_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_neg_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_neg_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_neg_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_neg_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_neg_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_neg_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_neg_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_neg_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_neg_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_neg_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_celu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_celu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_celu_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_celu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_elu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_elu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_elu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_elu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_hardsigmoid_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_hardsigmoid_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_hardsigmoid_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_hardsigmoid_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_logsigmoid_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_logsigmoid_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_logsigmoid_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_logsigmoid_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_mish_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_mish_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_mish_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_mish_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_prelu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_prelu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_prelu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_prelu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_relu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_relu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_relu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_relu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_relu_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_relu_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_relu_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_relu_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_relu_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_selu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_selu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_selu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_selu_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_silu_complex_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_silu_complex_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_silu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_silu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_silu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_silu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_softplus_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_softplus_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_softplus_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_softplus_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_softsign_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_softsign_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_softsign_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_softsign_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_softsign_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_softsign_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_softsign_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_softsign_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_softsign_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_softsign_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_softsign_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_softsign_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_tanhshrink_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_tanhshrink_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_tanhshrink_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_tanhshrink_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_tanhshrink_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_tanhshrink_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_tanhshrink_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_tanhshrink_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_tanhshrink_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_tanhshrink_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_tanhshrink_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_threshold_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_threshold_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_threshold_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_threshold_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_threshold_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_threshold_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_threshold_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_threshold_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_nn_functional_threshold_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_polygamma_polygamma_n_0_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_polygamma_polygamma_n_0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_polygamma_polygamma_n_0_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_polygamma_polygamma_n_0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_polygamma_polygamma_n_0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_polygamma_polygamma_n_0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_polygamma_polygamma_n_0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_polygamma_polygamma_n_0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_polygamma_polygamma_n_0_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_polygamma_polygamma_n_0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_polygamma_polygamma_n_1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_polygamma_polygamma_n_1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_polygamma_polygamma_n_1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_polygamma_polygamma_n_1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_polygamma_polygamma_n_1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_polygamma_polygamma_n_1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_polygamma_polygamma_n_1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_polygamma_polygamma_n_1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_polygamma_polygamma_n_1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_polygamma_polygamma_n_1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_polygamma_polygamma_n_2_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_polygamma_polygamma_n_2_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_polygamma_polygamma_n_2_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_polygamma_polygamma_n_2_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_polygamma_polygamma_n_2_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_polygamma_polygamma_n_2_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_polygamma_polygamma_n_2_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_polygamma_polygamma_n_2_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_polygamma_polygamma_n_2_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_polygamma_polygamma_n_2_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_polygamma_polygamma_n_3_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_polygamma_polygamma_n_3_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_polygamma_polygamma_n_3_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_polygamma_polygamma_n_3_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_polygamma_polygamma_n_3_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_polygamma_polygamma_n_3_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_polygamma_polygamma_n_3_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_polygamma_polygamma_n_3_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_polygamma_polygamma_n_3_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_polygamma_polygamma_n_3_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_polygamma_polygamma_n_4_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_polygamma_polygamma_n_4_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_polygamma_polygamma_n_4_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_polygamma_polygamma_n_4_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_polygamma_polygamma_n_4_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_polygamma_polygamma_n_4_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_polygamma_polygamma_n_4_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_polygamma_polygamma_n_4_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_polygamma_polygamma_n_4_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_polygamma_polygamma_n_4_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_positive_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_positive_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_positive_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_positive_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_positive_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_positive_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_positive_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_positive_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_positive_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_positive_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_positive_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_positive_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_rad2deg_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_rad2deg_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_rad2deg_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_rad2deg_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_rad2deg_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_rad2deg_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_rad2deg_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_rad2deg_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_rad2deg_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_rad2deg_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_real_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_real_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_real_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_real_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_real_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_real_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_real_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_real_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_real_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_real_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_real_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_real_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_real_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_reciprocal_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_reciprocal_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_reciprocal_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_reciprocal_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_reciprocal_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_reciprocal_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_reciprocal_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_reciprocal_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_reciprocal_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_reciprocal_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_reciprocal_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_reciprocal_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_round_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_round_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_round_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_round_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_round_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_round_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_round_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_round_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_round_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_round_decimals_0_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_round_decimals_0_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_round_decimals_0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_round_decimals_0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_round_decimals_3_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_round_decimals_3_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_round_decimals_3_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_round_decimals_3_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_round_decimals_neg_3_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_round_decimals_neg_3_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_round_decimals_neg_3_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_round_decimals_neg_3_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_rsqrt_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_rsqrt_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_rsqrt_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_rsqrt_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_rsqrt_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_rsqrt_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_rsqrt_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_rsqrt_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_rsqrt_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_rsqrt_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_rsqrt_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_rsqrt_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_rsqrt_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sgn_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sgn_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sgn_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sgn_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sgn_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sgn_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sgn_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sgn_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sgn_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sgn_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sgn_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sgn_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sgn_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sigmoid_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sigmoid_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sigmoid_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sigmoid_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sigmoid_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sigmoid_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sigmoid_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sigmoid_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sigmoid_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sigmoid_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sigmoid_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sigmoid_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sigmoid_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sign_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sign_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sign_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sign_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sign_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sign_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sign_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sign_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sign_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sign_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_signbit_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_signbit_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_signbit_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_signbit_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_signbit_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_signbit_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_signbit_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_signbit_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_signbit_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_signbit_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sin_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sin_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sin_cuda_complex128, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sin_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sin_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sin_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sin_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sin_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sin_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sin_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sin_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sin_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sin_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sinc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sinc_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sinc_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sinc_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sinc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sinc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sinc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sinc_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sinc_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sinc_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sinc_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sinc_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sinh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sinh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sinh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sinh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sinh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sinh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sinh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sinh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sinh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sinh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sinh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sinh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sinh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_airy_ai_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_airy_ai_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_airy_ai_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_airy_ai_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_airy_ai_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_airy_ai_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_airy_ai_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_airy_ai_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_bessel_j0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_bessel_j0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_bessel_j0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_bessel_j0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_bessel_j0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_bessel_j0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_bessel_j0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_bessel_j0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_bessel_j1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_bessel_j1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_bessel_j1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_bessel_j1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_bessel_j1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_bessel_j1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_bessel_j1_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_bessel_j1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_bessel_y0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_bessel_y0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_bessel_y0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_bessel_y0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_bessel_y0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_bessel_y0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_bessel_y0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_bessel_y0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_bessel_y1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_bessel_y1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_bessel_y1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_bessel_y1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_bessel_y1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_bessel_y1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_bessel_y1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_bessel_y1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_entr_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_entr_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_entr_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_entr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_entr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_entr_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_entr_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_entr_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_entr_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_entr_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_erfcx_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_erfcx_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_erfcx_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_erfcx_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_erfcx_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_erfcx_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_erfcx_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_erfcx_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_i0e_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_i0e_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_i0e_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_i0e_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_i0e_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_i0e_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_i0e_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_i0e_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_i0e_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_i0e_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_i1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_i1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_i1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_i1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_i1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_i1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_i1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_i1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_i1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_i1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_i1e_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_i1e_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_i1e_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_i1e_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_i1e_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_i1e_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_i1e_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_i1e_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_i1e_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_i1e_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_log_ndtr_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_log_ndtr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_log_ndtr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_log_ndtr_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_log_ndtr_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_log_ndtr_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_log_ndtr_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_log_ndtr_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_modified_bessel_i0_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_modified_bessel_i0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_modified_bessel_i0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_modified_bessel_i0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_modified_bessel_i0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_modified_bessel_i0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_modified_bessel_i0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_modified_bessel_i0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_modified_bessel_i1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_modified_bessel_i1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_modified_bessel_i1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_modified_bessel_i1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_modified_bessel_i1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_modified_bessel_i1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_modified_bessel_i1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_modified_bessel_i1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_modified_bessel_k0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_modified_bessel_k0_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_modified_bessel_k0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_modified_bessel_k0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_modified_bessel_k0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_modified_bessel_k0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_modified_bessel_k0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_modified_bessel_k0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_modified_bessel_k1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_modified_bessel_k1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_modified_bessel_k1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_modified_bessel_k1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_modified_bessel_k1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_modified_bessel_k1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_modified_bessel_k1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_modified_bessel_k1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_ndtr_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_ndtr_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_ndtr_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_ndtr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_ndtr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_ndtr_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_ndtr_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_ndtr_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_ndtr_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_ndtr_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_ndtri_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_ndtri_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_ndtri_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_ndtri_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_ndtri_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_ndtri_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_ndtri_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_ndtri_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_polygamma_special_polygamma_n_0_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_polygamma_special_polygamma_n_0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_polygamma_special_polygamma_n_0_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_polygamma_special_polygamma_n_0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_polygamma_special_polygamma_n_0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_polygamma_special_polygamma_n_0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_polygamma_special_polygamma_n_0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_polygamma_special_polygamma_n_0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_polygamma_special_polygamma_n_0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_scaled_modified_bessel_k0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_scaled_modified_bessel_k0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_scaled_modified_bessel_k0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_scaled_modified_bessel_k0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_scaled_modified_bessel_k0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_scaled_modified_bessel_k0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_scaled_modified_bessel_k0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_scaled_modified_bessel_k0_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_scaled_modified_bessel_k1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_scaled_modified_bessel_k1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_scaled_modified_bessel_k1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_scaled_modified_bessel_k1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_scaled_modified_bessel_k1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_scaled_modified_bessel_k1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_scaled_modified_bessel_k1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_scaled_modified_bessel_k1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_spherical_bessel_j0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_spherical_bessel_j0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_spherical_bessel_j0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_spherical_bessel_j0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_spherical_bessel_j0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_spherical_bessel_j0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_spherical_bessel_j0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_special_spherical_bessel_j0_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sqrt_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sqrt_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sqrt_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sqrt_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sqrt_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sqrt_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sqrt_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sqrt_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sqrt_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sqrt_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sqrt_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sqrt_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_sqrt_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_square_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_square_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_square_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_square_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_square_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_square_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_square_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_square_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_square_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_square_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_square_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_square_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_tan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_tan_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_tan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_tan_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_tan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_tan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_tan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_tan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_tan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_tan_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_tan_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_tan_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_tan_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_tanh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_tanh_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_tanh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_tanh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_tanh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_tanh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_tanh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_tanh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_tanh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_tanh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_tanh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_tanh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_tanh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_trunc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_trunc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_trunc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_trunc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_trunc_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_trunc_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_trunc_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_trunc_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_large_trunc_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_abs_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_abs_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_abs_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_abs_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_abs_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_abs_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_abs_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_abs_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_abs_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_abs_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_abs_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_abs_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_abs_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_acos_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_acos_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_acos_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_acos_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_acos_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_acos_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_acos_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_acos_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_acos_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_acos_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_acos_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_acos_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_acos_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_acosh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_acosh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_acosh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_acosh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_acosh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_acosh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_acosh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_acosh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_acosh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_acosh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_acosh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_acosh_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_acosh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_asin_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_asin_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_asin_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_asin_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_asin_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_asin_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_asin_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_asin_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_asin_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_asin_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_asin_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_asin_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_asin_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_asinh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_asinh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_asinh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_asinh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_asinh_cuda_complex64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_asinh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_asinh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_asinh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_asinh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_asinh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_asinh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_asinh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_asinh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_atan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_atan_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_atan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_atan_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_atan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_atan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_atan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_atan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_atan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_atan_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_atan_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_atan_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_atan_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_atanh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_atanh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_atanh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_atanh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_atanh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_atanh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_atanh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_atanh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_atanh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_atanh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_atanh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_atanh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_atanh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_bitwise_not_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_bitwise_not_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_bitwise_not_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_bitwise_not_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_bitwise_not_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_bitwise_not_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_ceil_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_ceil_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_ceil_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_ceil_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_ceil_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_ceil_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_ceil_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_ceil_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_ceil_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_conj_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_conj_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_conj_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_conj_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_conj_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_conj_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_conj_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_conj_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_conj_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_conj_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_conj_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_conj_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_conj_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_conj_physical_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_conj_physical_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_conj_physical_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_conj_physical_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_conj_physical_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_conj_physical_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_conj_physical_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_conj_physical_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_conj_physical_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_conj_physical_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_conj_physical_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_conj_physical_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_conj_physical_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_cos_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_cos_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_cos_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_cos_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_cos_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_cos_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_cos_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_cos_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_cos_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_cos_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_cos_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_cos_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_cos_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_cosh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_cosh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_cosh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_cosh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_cosh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_cosh_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_cosh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_cosh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_cosh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_cosh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_cosh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_cosh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_cosh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_deg2rad_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_deg2rad_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_deg2rad_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_deg2rad_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_deg2rad_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_deg2rad_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_deg2rad_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_deg2rad_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_deg2rad_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_deg2rad_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_digamma_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_digamma_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_digamma_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_digamma_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_digamma_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_digamma_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_digamma_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_digamma_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_digamma_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_digamma_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_erf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_erf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_erf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_erf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_erf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_erf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_erf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_erf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_erf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_erf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_erfc_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_erfc_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_erfc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_erfc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_erfc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_erfc_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_erfc_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_erfc_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_erfc_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_erfc_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_erfinv_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_erfinv_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_erfinv_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_erfinv_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_erfinv_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_erfinv_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_erfinv_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_erfinv_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_erfinv_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_erfinv_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_exp2_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_exp2_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_exp2_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_exp2_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_exp2_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_exp2_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_exp2_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_exp2_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_exp2_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_exp2_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_exp2_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_exp2_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_exp_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_exp_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_exp_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_exp_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_exp_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_exp_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_exp_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_exp_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_exp_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_exp_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_exp_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_exp_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_exp_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_expm1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_expm1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_expm1_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_expm1_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_expm1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_expm1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_expm1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_expm1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_expm1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_expm1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_expm1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_expm1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_fill_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_fill_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_fill_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_fill_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_fill_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_fill_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_fill_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_fill_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_fill_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_fill_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_fill_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_fill_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_fill_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_floor_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_floor_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_floor_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_floor_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_floor_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_floor_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_floor_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_floor_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_floor_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_frac_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_frac_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_frac_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_frac_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_frexp_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_frexp_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_frexp_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_frexp_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_i0_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_i0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_i0_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_i0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_i0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_i0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_i0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_i0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_i0_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_i0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_imag_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_imag_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_imag_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isfinite_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isfinite_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isfinite_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isfinite_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isfinite_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isfinite_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isfinite_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isfinite_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isfinite_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isfinite_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isfinite_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isfinite_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isfinite_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isinf_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isinf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isinf_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isinf_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isinf_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isinf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isinf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isinf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isinf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isinf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isinf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isinf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isinf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isnan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isnan_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isnan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isnan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isnan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isnan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isnan_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isnan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isnan_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isnan_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isnan_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isnan_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isneginf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isneginf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isneginf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isneginf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isneginf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isneginf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isneginf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isneginf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isneginf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isneginf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isposinf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isposinf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isposinf_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isposinf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isposinf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isposinf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isposinf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isposinf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isposinf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isposinf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isreal_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isreal_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isreal_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isreal_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isreal_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isreal_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isreal_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isreal_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isreal_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isreal_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isreal_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isreal_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_isreal_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_lgamma_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_lgamma_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_lgamma_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_lgamma_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_lgamma_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_lgamma_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_lgamma_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_lgamma_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_lgamma_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_lgamma_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_log10_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_log10_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_log10_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_log10_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_log10_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_log10_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_log10_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_log10_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_log10_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_log10_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_log10_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_log10_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_log1p_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_log1p_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_log1p_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_log1p_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_log1p_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_log1p_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_log1p_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_log1p_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_log1p_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_log1p_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_log1p_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_log1p_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_log2_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_log2_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_log2_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_log2_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_log2_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_log2_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_log2_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_log2_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_log2_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_log2_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_log2_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_log2_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_log_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_log_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_log_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_log_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_log_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_log_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_log_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_log_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_log_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_log_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_log_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_log_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_log_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_logical_not_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_logical_not_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_logical_not_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_logical_not_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_logical_not_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_logical_not_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_logical_not_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_logical_not_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_logical_not_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_logical_not_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_logical_not_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_logical_not_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_nan_to_num_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_nan_to_num_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_nan_to_num_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_nan_to_num_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_nan_to_num_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_nan_to_num_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_nan_to_num_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_nan_to_num_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_nan_to_num_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_nan_to_num_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_neg_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_neg_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_neg_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_neg_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_neg_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_neg_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_neg_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_neg_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_neg_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_neg_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_neg_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_neg_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_nn_functional_celu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_nn_functional_celu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_nn_functional_celu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_nn_functional_celu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_nn_functional_elu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_nn_functional_elu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_nn_functional_elu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_nn_functional_elu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_nn_functional_mish_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_nn_functional_mish_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_nn_functional_mish_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_nn_functional_mish_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_nn_functional_prelu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_nn_functional_prelu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_nn_functional_prelu_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_nn_functional_prelu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_nn_functional_relu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_nn_functional_relu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_nn_functional_relu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_nn_functional_relu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_nn_functional_relu_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_nn_functional_relu_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_nn_functional_relu_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_nn_functional_relu_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_nn_functional_relu_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_nn_functional_selu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_nn_functional_selu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_nn_functional_selu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_nn_functional_selu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_nn_functional_softplus_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_nn_functional_softplus_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_nn_functional_softplus_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_nn_functional_softplus_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_nn_functional_tanhshrink_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_nn_functional_tanhshrink_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_nn_functional_tanhshrink_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_nn_functional_tanhshrink_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_nn_functional_tanhshrink_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_nn_functional_tanhshrink_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_nn_functional_tanhshrink_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_nn_functional_tanhshrink_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_nn_functional_tanhshrink_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_nn_functional_tanhshrink_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_nn_functional_tanhshrink_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_nn_functional_threshold_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_nn_functional_threshold_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_nn_functional_threshold_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_nn_functional_threshold_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_nn_functional_threshold_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_nn_functional_threshold_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_nn_functional_threshold_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_nn_functional_threshold_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_nn_functional_threshold_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_positive_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_positive_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_positive_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_positive_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_positive_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_positive_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_positive_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_positive_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_positive_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_positive_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_positive_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_positive_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_rad2deg_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_rad2deg_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_rad2deg_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_rad2deg_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_rad2deg_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_rad2deg_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_rad2deg_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_rad2deg_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_rad2deg_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_rad2deg_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_real_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_real_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_real_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_real_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_real_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_real_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_real_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_real_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_real_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_real_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_real_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_real_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_real_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_reciprocal_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_reciprocal_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_reciprocal_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_reciprocal_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_reciprocal_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_reciprocal_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_reciprocal_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_reciprocal_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_reciprocal_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_reciprocal_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_reciprocal_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_reciprocal_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_round_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_round_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_round_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_round_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_round_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_round_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_round_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_round_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_round_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_rsqrt_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_rsqrt_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_rsqrt_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_rsqrt_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_rsqrt_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_rsqrt_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_rsqrt_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_rsqrt_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_rsqrt_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_rsqrt_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_rsqrt_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_rsqrt_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_rsqrt_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sgn_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sgn_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sgn_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sgn_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sgn_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sgn_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sgn_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sgn_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sgn_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sgn_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sgn_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sgn_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sgn_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sigmoid_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sigmoid_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sigmoid_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sigmoid_cuda_complex32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sigmoid_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sigmoid_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sigmoid_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sigmoid_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sigmoid_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sigmoid_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sigmoid_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sigmoid_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sigmoid_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sign_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sign_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sign_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sign_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sign_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sign_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sign_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sign_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sign_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sign_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_signbit_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_signbit_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_signbit_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_signbit_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_signbit_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_signbit_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_signbit_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_signbit_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_signbit_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_signbit_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sin_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sin_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sin_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sin_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sin_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sin_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sin_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sin_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sin_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sin_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sin_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sin_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sin_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sinc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sinc_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sinc_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sinc_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sinc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sinc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sinc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sinc_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sinc_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sinc_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sinc_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sinc_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sinh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sinh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sinh_cuda_complex128, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sinh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sinh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sinh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sinh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sinh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sinh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sinh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sinh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sinh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sinh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_bessel_j0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_bessel_j0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_bessel_j0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_bessel_j0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_bessel_j0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_bessel_j0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_bessel_j0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_bessel_j0_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_bessel_j1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_bessel_j1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_bessel_j1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_bessel_j1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_bessel_j1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_bessel_j1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_bessel_j1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_bessel_j1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_entr_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_entr_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_entr_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_entr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_entr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_entr_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_entr_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_entr_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_entr_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_entr_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_erfcx_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_erfcx_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_erfcx_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_erfcx_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_erfcx_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_erfcx_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_erfcx_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_erfcx_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_i0e_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_i0e_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_i0e_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_i0e_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_i0e_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_i0e_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_i0e_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_i0e_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_i0e_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_i0e_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_i1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_i1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_i1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_i1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_i1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_i1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_i1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_i1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_i1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_i1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_i1e_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_i1e_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_i1e_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_i1e_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_i1e_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_i1e_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_i1e_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_i1e_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_i1e_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_i1e_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_log_ndtr_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_log_ndtr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_log_ndtr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_log_ndtr_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_log_ndtr_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_log_ndtr_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_log_ndtr_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_log_ndtr_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_logit_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_logit_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_logit_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_logit_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_logit_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_logit_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_logit_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_logit_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_logit_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_logit_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_multigammaln_mvlgamma_p_1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_multigammaln_mvlgamma_p_1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_multigammaln_mvlgamma_p_1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_multigammaln_mvlgamma_p_1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_multigammaln_mvlgamma_p_1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_multigammaln_mvlgamma_p_1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_multigammaln_mvlgamma_p_1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_multigammaln_mvlgamma_p_1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_multigammaln_mvlgamma_p_1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_multigammaln_mvlgamma_p_3_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_multigammaln_mvlgamma_p_3_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_multigammaln_mvlgamma_p_3_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_multigammaln_mvlgamma_p_3_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_multigammaln_mvlgamma_p_3_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_multigammaln_mvlgamma_p_3_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_multigammaln_mvlgamma_p_3_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_multigammaln_mvlgamma_p_3_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_multigammaln_mvlgamma_p_3_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_multigammaln_mvlgamma_p_5_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_multigammaln_mvlgamma_p_5_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_multigammaln_mvlgamma_p_5_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_multigammaln_mvlgamma_p_5_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_multigammaln_mvlgamma_p_5_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_multigammaln_mvlgamma_p_5_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_multigammaln_mvlgamma_p_5_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_multigammaln_mvlgamma_p_5_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_multigammaln_mvlgamma_p_5_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_ndtr_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_ndtr_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_ndtr_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_ndtr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_ndtr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_ndtr_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_ndtr_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_ndtr_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_ndtr_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_ndtr_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_ndtri_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_ndtri_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_ndtri_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_ndtri_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_ndtri_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_ndtri_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_ndtri_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_ndtri_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_spherical_bessel_j0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_spherical_bessel_j0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_spherical_bessel_j0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_spherical_bessel_j0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_spherical_bessel_j0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_spherical_bessel_j0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_spherical_bessel_j0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_special_spherical_bessel_j0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sqrt_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sqrt_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sqrt_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sqrt_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sqrt_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sqrt_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sqrt_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sqrt_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sqrt_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sqrt_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sqrt_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sqrt_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_sqrt_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_square_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_square_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_square_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_square_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_square_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_square_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_square_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_square_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_square_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_square_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_square_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_square_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_tan_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_tan_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_tan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_tan_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_tan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_tan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_tan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_tan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_tan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_tan_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_tan_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_tan_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_tan_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_tanh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_tanh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_tanh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_tanh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_tanh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_tanh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_tanh_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_tanh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_tanh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_tanh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_tanh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_tanh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_tanh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_trunc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_trunc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_trunc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_trunc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_trunc_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_trunc_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_trunc_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_trunc_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal__refs_trunc_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_abs_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_abs_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_abs_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_abs_cuda_complex32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_abs_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_abs_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_abs_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_abs_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_abs_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_abs_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_abs_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_abs_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_abs_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_acos_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_acos_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_acos_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_acos_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_acos_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_acos_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_acos_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_acos_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_acos_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_acos_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_acos_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_acos_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_acos_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_acosh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_acosh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_acosh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_acosh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_acosh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_acosh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_acosh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_acosh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_acosh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_acosh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_acosh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_acosh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_acosh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_angle_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_angle_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_angle_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_angle_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_angle_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_angle_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_angle_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_angle_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_angle_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_angle_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_angle_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_asin_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_asin_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_asin_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_asin_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_asin_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_asin_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_asin_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_asin_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_asin_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_asin_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_asin_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_asin_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_asin_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_asinh_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_asinh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_asinh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_asinh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_asinh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_asinh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_asinh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_asinh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_asinh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_asinh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_asinh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_asinh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_asinh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_atan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_atan_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_atan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_atan_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_atan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_atan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_atan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_atan_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_atan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_atan_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_atan_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_atan_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_atan_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_atanh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_atanh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_atanh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_atanh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_atanh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_atanh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_atanh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_atanh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_atanh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_atanh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_atanh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_atanh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_atanh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_bitwise_not_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_bitwise_not_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_bitwise_not_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_bitwise_not_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_bitwise_not_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_bitwise_not_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_ceil_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_ceil_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_ceil_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_ceil_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_ceil_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_ceil_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_ceil_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_ceil_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_ceil_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_conj_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_conj_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_conj_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_conj_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_conj_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_conj_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_conj_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_conj_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_conj_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_conj_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_conj_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_conj_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_conj_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_conj_physical_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_conj_physical_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_conj_physical_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_conj_physical_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_conj_physical_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_conj_physical_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_conj_physical_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_conj_physical_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_conj_physical_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_conj_physical_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_conj_physical_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_conj_physical_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_conj_physical_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_cos_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_cos_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_cos_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_cos_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_cos_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_cos_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_cos_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_cos_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_cos_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_cos_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_cos_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_cos_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_cos_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_cosh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_cosh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_cosh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_cosh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_cosh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_cosh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_cosh_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_cosh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_cosh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_cosh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_cosh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_cosh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_cosh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_deg2rad_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_deg2rad_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_deg2rad_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_deg2rad_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_deg2rad_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_deg2rad_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_deg2rad_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_deg2rad_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_deg2rad_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_deg2rad_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_digamma_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_digamma_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_digamma_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_digamma_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_digamma_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_digamma_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_digamma_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_digamma_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_digamma_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_digamma_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_erf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_erf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_erf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_erf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_erf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_erf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_erf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_erf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_erf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_erf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_erfc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_erfc_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_erfc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_erfc_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_erfc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_erfc_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_erfc_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_erfc_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_erfc_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_erfc_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_erfinv_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_erfinv_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_erfinv_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_erfinv_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_erfinv_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_erfinv_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_erfinv_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_erfinv_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_erfinv_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_erfinv_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_exp2_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_exp2_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_exp2_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_exp2_cuda_complex64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_exp2_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_exp2_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_exp2_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_exp2_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_exp2_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_exp2_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_exp2_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_exp2_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_exp_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_exp_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_exp_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_exp_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_exp_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_exp_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_exp_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_exp_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_exp_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_exp_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_exp_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_exp_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_exp_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_expm1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_expm1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_expm1_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_expm1_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_expm1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_expm1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_expm1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_expm1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_expm1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_expm1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_expm1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_expm1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_fill_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_fill_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_fill_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_fill_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_fill_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_fill_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_fill_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_fill_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_fill_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_fill_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_fill_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_fill_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_fill_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_floor_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_floor_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_floor_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_floor_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_floor_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_floor_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_floor_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_floor_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_floor_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_frac_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_frac_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_frac_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_frac_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_frexp_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_frexp_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_frexp_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_frexp_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_i0_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_i0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_i0_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_i0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_i0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_i0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_i0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_i0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_i0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_i0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_imag_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_imag_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_imag_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isfinite_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isfinite_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isfinite_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isfinite_cuda_complex32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isfinite_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isfinite_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isfinite_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isfinite_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isfinite_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isfinite_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isfinite_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isfinite_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isfinite_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isinf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isinf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isinf_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isinf_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isinf_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isinf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isinf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isinf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isinf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isinf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isinf_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isinf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isinf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isnan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isnan_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isnan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isnan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isnan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isnan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isnan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isnan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isnan_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isnan_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isnan_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isnan_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isneginf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isneginf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isneginf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isneginf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isneginf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isneginf_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isneginf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isneginf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isneginf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isneginf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isposinf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isposinf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isposinf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isposinf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isposinf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isposinf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isposinf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isposinf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isposinf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isposinf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isreal_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isreal_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isreal_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isreal_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isreal_cuda_complex64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isreal_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isreal_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isreal_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isreal_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isreal_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isreal_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isreal_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_isreal_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_jiterator_unary_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_jiterator_unary_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_jiterator_unary_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_jiterator_unary_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_jiterator_unary_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_jiterator_unary_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_jiterator_unary_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_jiterator_unary_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_jiterator_unary_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_jiterator_unary_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_jiterator_unary_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_jiterator_unary_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_lgamma_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_lgamma_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_lgamma_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_lgamma_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_lgamma_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_lgamma_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_lgamma_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_lgamma_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_lgamma_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_lgamma_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_log10_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_log10_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_log10_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_log10_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_log10_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_log10_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_log10_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_log10_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_log10_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_log10_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_log10_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_log10_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_log1p_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_log1p_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_log1p_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_log1p_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_log1p_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_log1p_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_log1p_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_log1p_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_log1p_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_log1p_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_log1p_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_log1p_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_log2_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_log2_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_log2_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_log2_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_log2_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_log2_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_log2_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_log2_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_log2_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_log2_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_log2_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_log2_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_log_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_log_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_log_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_log_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_log_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_log_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_log_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_log_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_log_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_log_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_log_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_log_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_log_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_logical_not_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_logical_not_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_logical_not_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_logical_not_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_logical_not_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_logical_not_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_logical_not_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_logical_not_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_logical_not_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_logical_not_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_logical_not_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_logical_not_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_logit_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_logit_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_logit_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_logit_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_logit_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_logit_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_logit_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_logit_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_logit_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_logit_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_mvlgamma_mvlgamma_p_1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_mvlgamma_mvlgamma_p_1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_mvlgamma_mvlgamma_p_1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_mvlgamma_mvlgamma_p_1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_mvlgamma_mvlgamma_p_1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_mvlgamma_mvlgamma_p_1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_mvlgamma_mvlgamma_p_1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_mvlgamma_mvlgamma_p_1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_mvlgamma_mvlgamma_p_3_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_mvlgamma_mvlgamma_p_3_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_mvlgamma_mvlgamma_p_3_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_mvlgamma_mvlgamma_p_3_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_mvlgamma_mvlgamma_p_3_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_mvlgamma_mvlgamma_p_3_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_mvlgamma_mvlgamma_p_3_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_mvlgamma_mvlgamma_p_3_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_mvlgamma_mvlgamma_p_5_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_mvlgamma_mvlgamma_p_5_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_mvlgamma_mvlgamma_p_5_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_mvlgamma_mvlgamma_p_5_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_mvlgamma_mvlgamma_p_5_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_mvlgamma_mvlgamma_p_5_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_mvlgamma_mvlgamma_p_5_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_mvlgamma_mvlgamma_p_5_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nan_to_num_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nan_to_num_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nan_to_num_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nan_to_num_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nan_to_num_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nan_to_num_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nan_to_num_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nan_to_num_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nan_to_num_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nan_to_num_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_neg_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_neg_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_neg_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_neg_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_neg_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_neg_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_neg_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_neg_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_neg_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_neg_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_neg_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_neg_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_celu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_celu_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_celu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_celu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_elu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_elu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_elu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_elu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_hardsigmoid_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_hardsigmoid_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_hardsigmoid_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_hardsigmoid_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_logsigmoid_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_logsigmoid_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_logsigmoid_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_logsigmoid_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_mish_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_mish_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_mish_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_mish_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_prelu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_prelu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_prelu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_prelu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_relu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_relu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_relu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_relu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_relu_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_relu_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_relu_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_relu_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_relu_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_selu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_selu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_selu_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_selu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_silu_complex_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_silu_complex_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_silu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_silu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_silu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_silu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_softplus_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_softplus_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_softplus_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_softplus_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_softsign_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_softsign_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_softsign_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_softsign_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_softsign_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_softsign_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_softsign_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_softsign_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_softsign_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_softsign_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_softsign_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_softsign_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_tanhshrink_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_tanhshrink_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_tanhshrink_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_tanhshrink_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_tanhshrink_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_tanhshrink_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_tanhshrink_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_tanhshrink_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_tanhshrink_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_tanhshrink_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_tanhshrink_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_threshold_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_threshold_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_threshold_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_threshold_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_threshold_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_threshold_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_threshold_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_threshold_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_nn_functional_threshold_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_polygamma_polygamma_n_0_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_polygamma_polygamma_n_0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_polygamma_polygamma_n_0_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_polygamma_polygamma_n_0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_polygamma_polygamma_n_0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_polygamma_polygamma_n_0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_polygamma_polygamma_n_0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_polygamma_polygamma_n_0_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_polygamma_polygamma_n_0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_polygamma_polygamma_n_0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_polygamma_polygamma_n_1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_polygamma_polygamma_n_1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_polygamma_polygamma_n_1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_polygamma_polygamma_n_1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_polygamma_polygamma_n_1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_polygamma_polygamma_n_1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_polygamma_polygamma_n_1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_polygamma_polygamma_n_1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_polygamma_polygamma_n_1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_polygamma_polygamma_n_1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_polygamma_polygamma_n_2_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_polygamma_polygamma_n_2_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_polygamma_polygamma_n_2_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_polygamma_polygamma_n_2_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_polygamma_polygamma_n_2_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_polygamma_polygamma_n_2_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_polygamma_polygamma_n_2_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_polygamma_polygamma_n_2_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_polygamma_polygamma_n_2_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_polygamma_polygamma_n_2_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_polygamma_polygamma_n_3_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_polygamma_polygamma_n_3_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_polygamma_polygamma_n_3_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_polygamma_polygamma_n_3_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_polygamma_polygamma_n_3_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_polygamma_polygamma_n_3_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_polygamma_polygamma_n_3_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_polygamma_polygamma_n_3_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_polygamma_polygamma_n_3_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_polygamma_polygamma_n_3_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_polygamma_polygamma_n_4_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_polygamma_polygamma_n_4_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_polygamma_polygamma_n_4_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_polygamma_polygamma_n_4_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_polygamma_polygamma_n_4_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_polygamma_polygamma_n_4_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_polygamma_polygamma_n_4_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_polygamma_polygamma_n_4_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_polygamma_polygamma_n_4_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_polygamma_polygamma_n_4_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_positive_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_positive_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_positive_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_positive_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_positive_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_positive_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_positive_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_positive_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_positive_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_positive_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_positive_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_positive_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_rad2deg_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_rad2deg_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_rad2deg_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_rad2deg_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_rad2deg_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_rad2deg_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_rad2deg_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_rad2deg_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_rad2deg_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_rad2deg_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_real_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_real_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_real_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_real_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_real_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_real_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_real_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_real_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_real_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_real_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_real_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_real_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_real_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_reciprocal_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_reciprocal_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_reciprocal_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_reciprocal_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_reciprocal_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_reciprocal_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_reciprocal_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_reciprocal_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_reciprocal_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_reciprocal_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_reciprocal_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_reciprocal_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_round_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_round_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_round_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_round_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_round_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_round_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_round_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_round_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_round_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_round_decimals_0_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_round_decimals_0_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_round_decimals_0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_round_decimals_0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_round_decimals_3_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_round_decimals_3_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_round_decimals_3_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_round_decimals_3_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_round_decimals_neg_3_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_round_decimals_neg_3_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_round_decimals_neg_3_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_round_decimals_neg_3_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_rsqrt_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_rsqrt_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_rsqrt_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_rsqrt_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_rsqrt_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_rsqrt_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_rsqrt_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_rsqrt_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_rsqrt_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_rsqrt_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_rsqrt_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_rsqrt_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_rsqrt_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sgn_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sgn_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sgn_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sgn_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sgn_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sgn_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sgn_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sgn_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sgn_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sgn_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sgn_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sgn_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sgn_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sigmoid_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sigmoid_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sigmoid_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sigmoid_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sigmoid_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sigmoid_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sigmoid_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sigmoid_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sigmoid_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sigmoid_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sigmoid_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sigmoid_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sigmoid_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sign_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sign_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sign_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sign_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sign_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sign_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sign_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sign_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sign_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sign_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_signbit_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_signbit_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_signbit_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_signbit_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_signbit_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_signbit_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_signbit_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_signbit_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_signbit_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_signbit_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sin_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sin_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sin_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sin_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sin_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sin_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sin_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sin_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sin_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sin_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sin_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sin_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sin_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sinc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sinc_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sinc_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sinc_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sinc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sinc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sinc_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sinc_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sinc_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sinc_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sinc_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sinc_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sinh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sinh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sinh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sinh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sinh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sinh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sinh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sinh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sinh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sinh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sinh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sinh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sinh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_airy_ai_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_airy_ai_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_airy_ai_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_airy_ai_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_airy_ai_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_airy_ai_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_airy_ai_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_airy_ai_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_bessel_j0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_bessel_j0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_bessel_j0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_bessel_j0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_bessel_j0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_bessel_j0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_bessel_j0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_bessel_j0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_bessel_j1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_bessel_j1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_bessel_j1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_bessel_j1_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_bessel_j1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_bessel_j1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_bessel_j1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_bessel_j1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_bessel_y0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_bessel_y0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_bessel_y0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_bessel_y0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_bessel_y0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_bessel_y0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_bessel_y0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_bessel_y0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_bessel_y1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_bessel_y1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_bessel_y1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_bessel_y1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_bessel_y1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_bessel_y1_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_bessel_y1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_bessel_y1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_entr_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_entr_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_entr_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_entr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_entr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_entr_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_entr_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_entr_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_entr_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_entr_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_erfcx_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_erfcx_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_erfcx_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_erfcx_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_erfcx_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_erfcx_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_erfcx_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_erfcx_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_i0e_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_i0e_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_i0e_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_i0e_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_i0e_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_i0e_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_i0e_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_i0e_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_i0e_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_i0e_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_i1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_i1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_i1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_i1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_i1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_i1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_i1_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_i1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_i1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_i1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_i1e_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_i1e_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_i1e_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_i1e_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_i1e_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_i1e_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_i1e_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_i1e_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_i1e_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_i1e_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_log_ndtr_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_log_ndtr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_log_ndtr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_log_ndtr_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_log_ndtr_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_log_ndtr_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_log_ndtr_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_log_ndtr_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_modified_bessel_i0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_modified_bessel_i0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_modified_bessel_i0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_modified_bessel_i0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_modified_bessel_i0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_modified_bessel_i0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_modified_bessel_i0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_modified_bessel_i0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_modified_bessel_i1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_modified_bessel_i1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_modified_bessel_i1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_modified_bessel_i1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_modified_bessel_i1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_modified_bessel_i1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_modified_bessel_i1_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_modified_bessel_i1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_modified_bessel_k0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_modified_bessel_k0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_modified_bessel_k0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_modified_bessel_k0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_modified_bessel_k0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_modified_bessel_k0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_modified_bessel_k0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_modified_bessel_k0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_modified_bessel_k1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_modified_bessel_k1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_modified_bessel_k1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_modified_bessel_k1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_modified_bessel_k1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_modified_bessel_k1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_modified_bessel_k1_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_modified_bessel_k1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_ndtr_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_ndtr_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_ndtr_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_ndtr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_ndtr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_ndtr_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_ndtr_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_ndtr_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_ndtr_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_ndtr_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_ndtri_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_ndtri_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_ndtri_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_ndtri_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_ndtri_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_ndtri_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_ndtri_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_ndtri_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_polygamma_special_polygamma_n_0_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_polygamma_special_polygamma_n_0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_polygamma_special_polygamma_n_0_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_polygamma_special_polygamma_n_0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_polygamma_special_polygamma_n_0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_polygamma_special_polygamma_n_0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_polygamma_special_polygamma_n_0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_polygamma_special_polygamma_n_0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_polygamma_special_polygamma_n_0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_scaled_modified_bessel_k0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_scaled_modified_bessel_k0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_scaled_modified_bessel_k0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_scaled_modified_bessel_k0_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_scaled_modified_bessel_k0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_scaled_modified_bessel_k0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_scaled_modified_bessel_k0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_scaled_modified_bessel_k0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_scaled_modified_bessel_k1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_scaled_modified_bessel_k1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_scaled_modified_bessel_k1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_scaled_modified_bessel_k1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_scaled_modified_bessel_k1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_scaled_modified_bessel_k1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_scaled_modified_bessel_k1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_scaled_modified_bessel_k1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_spherical_bessel_j0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_spherical_bessel_j0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_spherical_bessel_j0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_spherical_bessel_j0_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_spherical_bessel_j0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_spherical_bessel_j0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_spherical_bessel_j0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_special_spherical_bessel_j0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sqrt_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sqrt_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sqrt_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sqrt_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sqrt_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sqrt_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sqrt_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sqrt_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sqrt_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sqrt_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sqrt_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sqrt_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_sqrt_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_square_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_square_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_square_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_square_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_square_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_square_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_square_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_square_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_square_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_square_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_square_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_square_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_tan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_tan_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_tan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_tan_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_tan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_tan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_tan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_tan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_tan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_tan_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_tan_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_tan_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_tan_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_tanh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_tanh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_tanh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_tanh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_tanh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_tanh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_tanh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_tanh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_tanh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_tanh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_tanh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_tanh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_tanh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_trunc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_trunc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_trunc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_trunc_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_trunc_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_trunc_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_trunc_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_trunc_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_normal_trunc_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_abs_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_abs_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_abs_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_abs_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_abs_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_abs_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_abs_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_abs_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_abs_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_abs_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_abs_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_abs_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_abs_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_acos_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_acos_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_acos_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_acos_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_acos_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_acos_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_acos_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_acos_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_acos_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_acos_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_acos_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_acos_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_acos_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_acosh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_acosh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_acosh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_acosh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_acosh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_acosh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_acosh_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_acosh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_acosh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_acosh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_acosh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_acosh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_acosh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_asin_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_asin_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_asin_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_asin_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_asin_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_asin_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_asin_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_asin_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_asin_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_asin_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_asin_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_asin_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_asin_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_asinh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_asinh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_asinh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_asinh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_asinh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_asinh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_asinh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_asinh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_asinh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_asinh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_asinh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_asinh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_asinh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_atan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_atan_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_atan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_atan_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_atan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_atan_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_atan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_atan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_atan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_atan_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_atan_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_atan_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_atan_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_atanh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_atanh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_atanh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_atanh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_atanh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_atanh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_atanh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_atanh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_atanh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_atanh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_atanh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_atanh_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_atanh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_bitwise_not_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_bitwise_not_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_bitwise_not_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_bitwise_not_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_bitwise_not_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_bitwise_not_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_ceil_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_ceil_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_ceil_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_ceil_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_ceil_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_ceil_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_ceil_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_ceil_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_ceil_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_conj_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_conj_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_conj_cuda_complex128, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_conj_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_conj_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_conj_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_conj_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_conj_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_conj_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_conj_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_conj_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_conj_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_conj_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_conj_physical_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_conj_physical_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_conj_physical_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_conj_physical_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_conj_physical_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_conj_physical_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_conj_physical_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_conj_physical_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_conj_physical_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_conj_physical_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_conj_physical_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_conj_physical_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_conj_physical_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_cos_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_cos_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_cos_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_cos_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_cos_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_cos_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_cos_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_cos_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_cos_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_cos_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_cos_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_cos_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_cos_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_cosh_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_cosh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_cosh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_cosh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_cosh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_cosh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_cosh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_cosh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_cosh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_cosh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_cosh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_cosh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_cosh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_deg2rad_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_deg2rad_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_deg2rad_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_deg2rad_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_deg2rad_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_deg2rad_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_deg2rad_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_deg2rad_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_deg2rad_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_deg2rad_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_digamma_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_digamma_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_digamma_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_digamma_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_digamma_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_digamma_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_digamma_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_digamma_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_digamma_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_digamma_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_erf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_erf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_erf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_erf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_erf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_erf_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_erf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_erf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_erf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_erf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_erfc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_erfc_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_erfc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_erfc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_erfc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_erfc_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_erfc_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_erfc_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_erfc_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_erfc_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_erfinv_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_erfinv_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_erfinv_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_erfinv_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_erfinv_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_erfinv_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_erfinv_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_erfinv_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_erfinv_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_erfinv_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_exp2_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_exp2_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_exp2_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_exp2_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_exp2_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_exp2_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_exp2_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_exp2_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_exp2_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_exp2_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_exp2_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_exp2_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_exp_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_exp_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_exp_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_exp_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_exp_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_exp_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_exp_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_exp_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_exp_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_exp_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_exp_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_exp_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_exp_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_expm1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_expm1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_expm1_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_expm1_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_expm1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_expm1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_expm1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_expm1_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_expm1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_expm1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_expm1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_expm1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_fill_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_fill_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_fill_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_fill_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_fill_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_fill_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_fill_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_fill_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_fill_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_fill_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_fill_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_fill_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_fill_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_floor_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_floor_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_floor_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_floor_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_floor_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_floor_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_floor_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_floor_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_floor_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_frac_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_frac_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_frac_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_frac_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_frexp_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_frexp_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_frexp_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_frexp_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_i0_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_i0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_i0_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_i0_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_i0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_i0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_i0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_i0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_i0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_i0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_imag_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_imag_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_imag_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isfinite_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isfinite_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isfinite_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isfinite_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isfinite_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isfinite_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isfinite_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isfinite_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isfinite_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isfinite_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isfinite_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isfinite_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isfinite_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isinf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isinf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isinf_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isinf_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isinf_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isinf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isinf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isinf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isinf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isinf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isinf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isinf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isinf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isnan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isnan_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isnan_cuda_complex128, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isnan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isnan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isnan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isnan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isnan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isnan_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isnan_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isnan_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isnan_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isneginf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isneginf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isneginf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isneginf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isneginf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isneginf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isneginf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isneginf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isneginf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isneginf_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isposinf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isposinf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isposinf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isposinf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isposinf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isposinf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isposinf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isposinf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isposinf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isposinf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isreal_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isreal_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isreal_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isreal_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isreal_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isreal_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isreal_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isreal_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isreal_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isreal_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isreal_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isreal_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_isreal_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_lgamma_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_lgamma_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_lgamma_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_lgamma_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_lgamma_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_lgamma_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_lgamma_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_lgamma_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_lgamma_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_lgamma_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_log10_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_log10_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_log10_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_log10_cuda_complex64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_log10_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_log10_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_log10_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_log10_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_log10_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_log10_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_log10_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_log10_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_log1p_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_log1p_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_log1p_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_log1p_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_log1p_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_log1p_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_log1p_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_log1p_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_log1p_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_log1p_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_log1p_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_log1p_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_log2_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_log2_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_log2_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_log2_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_log2_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_log2_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_log2_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_log2_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_log2_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_log2_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_log2_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_log2_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_log_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_log_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_log_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_log_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_log_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_log_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_log_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_log_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_log_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_log_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_log_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_log_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_log_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_logical_not_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_logical_not_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_logical_not_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_logical_not_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_logical_not_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_logical_not_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_logical_not_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_logical_not_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_logical_not_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_logical_not_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_logical_not_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_logical_not_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_nan_to_num_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_nan_to_num_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_nan_to_num_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_nan_to_num_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_nan_to_num_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_nan_to_num_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_nan_to_num_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_nan_to_num_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_nan_to_num_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_nan_to_num_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_neg_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_neg_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_neg_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_neg_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_neg_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_neg_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_neg_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_neg_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_neg_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_neg_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_neg_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_neg_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_nn_functional_celu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_nn_functional_celu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_nn_functional_celu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_nn_functional_celu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_nn_functional_elu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_nn_functional_elu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_nn_functional_elu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_nn_functional_elu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_nn_functional_mish_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_nn_functional_mish_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_nn_functional_mish_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_nn_functional_mish_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_nn_functional_prelu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_nn_functional_prelu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_nn_functional_prelu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_nn_functional_prelu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_nn_functional_relu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_nn_functional_relu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_nn_functional_relu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_nn_functional_relu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_nn_functional_relu_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_nn_functional_relu_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_nn_functional_relu_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_nn_functional_relu_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_nn_functional_relu_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_nn_functional_selu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_nn_functional_selu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_nn_functional_selu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_nn_functional_selu_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_nn_functional_softplus_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_nn_functional_softplus_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_nn_functional_softplus_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_nn_functional_softplus_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_nn_functional_tanhshrink_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_nn_functional_tanhshrink_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_nn_functional_tanhshrink_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_nn_functional_tanhshrink_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_nn_functional_tanhshrink_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_nn_functional_tanhshrink_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_nn_functional_tanhshrink_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_nn_functional_tanhshrink_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_nn_functional_tanhshrink_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_nn_functional_tanhshrink_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_nn_functional_tanhshrink_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_nn_functional_threshold_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_nn_functional_threshold_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_nn_functional_threshold_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_nn_functional_threshold_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_nn_functional_threshold_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_nn_functional_threshold_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_nn_functional_threshold_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_nn_functional_threshold_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_nn_functional_threshold_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_positive_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_positive_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_positive_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_positive_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_positive_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_positive_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_positive_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_positive_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_positive_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_positive_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_positive_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_positive_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_rad2deg_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_rad2deg_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_rad2deg_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_rad2deg_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_rad2deg_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_rad2deg_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_rad2deg_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_rad2deg_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_rad2deg_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_rad2deg_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_real_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_real_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_real_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_real_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_real_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_real_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_real_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_real_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_real_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_real_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_real_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_real_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_real_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_reciprocal_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_reciprocal_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_reciprocal_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_reciprocal_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_reciprocal_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_reciprocal_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_reciprocal_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_reciprocal_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_reciprocal_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_reciprocal_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_reciprocal_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_reciprocal_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_round_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_round_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_round_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_round_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_round_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_round_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_round_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_round_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_round_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_rsqrt_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_rsqrt_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_rsqrt_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_rsqrt_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_rsqrt_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_rsqrt_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_rsqrt_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_rsqrt_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_rsqrt_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_rsqrt_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_rsqrt_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_rsqrt_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_rsqrt_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sgn_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sgn_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sgn_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sgn_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sgn_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sgn_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sgn_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sgn_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sgn_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sgn_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sgn_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sgn_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sgn_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sigmoid_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sigmoid_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sigmoid_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sigmoid_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sigmoid_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sigmoid_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sigmoid_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sigmoid_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sigmoid_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sigmoid_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sigmoid_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sigmoid_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sigmoid_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sign_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sign_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sign_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sign_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sign_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sign_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sign_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sign_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sign_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sign_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_signbit_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_signbit_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_signbit_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_signbit_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_signbit_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_signbit_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_signbit_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_signbit_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_signbit_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_signbit_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sin_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sin_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sin_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sin_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sin_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sin_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sin_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sin_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sin_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sin_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sin_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sin_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sin_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sinc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sinc_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sinc_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sinc_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sinc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sinc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sinc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sinc_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sinc_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sinc_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sinc_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sinc_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sinh_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sinh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sinh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sinh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sinh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sinh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sinh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sinh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sinh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sinh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sinh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sinh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sinh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_bessel_j0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_bessel_j0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_bessel_j0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_bessel_j0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_bessel_j0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_bessel_j0_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_bessel_j0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_bessel_j0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_bessel_j1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_bessel_j1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_bessel_j1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_bessel_j1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_bessel_j1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_bessel_j1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_bessel_j1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_bessel_j1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_entr_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_entr_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_entr_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_entr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_entr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_entr_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_entr_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_entr_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_entr_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_entr_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_erfcx_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_erfcx_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_erfcx_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_erfcx_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_erfcx_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_erfcx_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_erfcx_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_erfcx_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_i0e_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_i0e_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_i0e_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_i0e_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_i0e_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_i0e_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_i0e_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_i0e_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_i0e_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_i0e_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_i1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_i1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_i1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_i1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_i1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_i1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_i1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_i1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_i1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_i1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_i1e_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_i1e_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_i1e_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_i1e_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_i1e_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_i1e_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_i1e_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_i1e_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_i1e_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_i1e_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_log_ndtr_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_log_ndtr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_log_ndtr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_log_ndtr_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_log_ndtr_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_log_ndtr_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_log_ndtr_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_log_ndtr_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_logit_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_logit_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_logit_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_logit_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_logit_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_logit_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_logit_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_logit_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_logit_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_logit_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_multigammaln_mvlgamma_p_1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_multigammaln_mvlgamma_p_1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_multigammaln_mvlgamma_p_1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_multigammaln_mvlgamma_p_1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_multigammaln_mvlgamma_p_1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_multigammaln_mvlgamma_p_1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_multigammaln_mvlgamma_p_1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_multigammaln_mvlgamma_p_1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_multigammaln_mvlgamma_p_1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_multigammaln_mvlgamma_p_3_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_multigammaln_mvlgamma_p_3_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_multigammaln_mvlgamma_p_3_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_multigammaln_mvlgamma_p_3_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_multigammaln_mvlgamma_p_3_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_multigammaln_mvlgamma_p_3_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_multigammaln_mvlgamma_p_3_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_multigammaln_mvlgamma_p_3_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_multigammaln_mvlgamma_p_3_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_multigammaln_mvlgamma_p_5_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_multigammaln_mvlgamma_p_5_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_multigammaln_mvlgamma_p_5_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_multigammaln_mvlgamma_p_5_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_multigammaln_mvlgamma_p_5_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_multigammaln_mvlgamma_p_5_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_multigammaln_mvlgamma_p_5_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_multigammaln_mvlgamma_p_5_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_multigammaln_mvlgamma_p_5_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_ndtr_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_ndtr_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_ndtr_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_ndtr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_ndtr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_ndtr_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_ndtr_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_ndtr_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_ndtr_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_ndtr_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_ndtri_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_ndtri_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_ndtri_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_ndtri_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_ndtri_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_ndtri_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_ndtri_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_ndtri_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_spherical_bessel_j0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_spherical_bessel_j0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_spherical_bessel_j0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_spherical_bessel_j0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_spherical_bessel_j0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_spherical_bessel_j0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_spherical_bessel_j0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_special_spherical_bessel_j0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sqrt_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sqrt_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sqrt_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sqrt_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sqrt_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sqrt_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sqrt_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sqrt_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sqrt_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sqrt_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sqrt_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sqrt_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_sqrt_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_square_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_square_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_square_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_square_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_square_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_square_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_square_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_square_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_square_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_square_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_square_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_square_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_tan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_tan_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_tan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_tan_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_tan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_tan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_tan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_tan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_tan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_tan_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_tan_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_tan_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_tan_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_tanh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_tanh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_tanh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_tanh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_tanh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_tanh_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_tanh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_tanh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_tanh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_tanh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_tanh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_tanh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_tanh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_trunc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_trunc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_trunc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_trunc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_trunc_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_trunc_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_trunc_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_trunc_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small__refs_trunc_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_abs_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_abs_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_abs_cuda_complex128, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_abs_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_abs_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_abs_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_abs_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_abs_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_abs_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_abs_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_abs_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_abs_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_abs_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_acos_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_acos_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_acos_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_acos_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_acos_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_acos_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_acos_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_acos_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_acos_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_acos_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_acos_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_acos_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_acos_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_acosh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_acosh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_acosh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_acosh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_acosh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_acosh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_acosh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_acosh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_acosh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_acosh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_acosh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_acosh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_acosh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_angle_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_angle_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_angle_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_angle_cuda_complex64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_angle_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_angle_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_angle_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_angle_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_angle_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_angle_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_angle_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_asin_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_asin_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_asin_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_asin_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_asin_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_asin_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_asin_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_asin_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_asin_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_asin_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_asin_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_asin_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_asin_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_asinh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_asinh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_asinh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_asinh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_asinh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_asinh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_asinh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_asinh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_asinh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_asinh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_asinh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_asinh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_asinh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_atan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_atan_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_atan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_atan_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_atan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_atan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_atan_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_atan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_atan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_atan_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_atan_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_atan_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_atan_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_atanh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_atanh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_atanh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_atanh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_atanh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_atanh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_atanh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_atanh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_atanh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_atanh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_atanh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_atanh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_atanh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_bitwise_not_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_bitwise_not_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_bitwise_not_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_bitwise_not_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_bitwise_not_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_bitwise_not_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_ceil_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_ceil_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_ceil_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_ceil_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_ceil_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_ceil_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_ceil_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_ceil_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_ceil_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_conj_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_conj_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_conj_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_conj_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_conj_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_conj_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_conj_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_conj_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_conj_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_conj_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_conj_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_conj_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_conj_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_conj_physical_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_conj_physical_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_conj_physical_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_conj_physical_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_conj_physical_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_conj_physical_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_conj_physical_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_conj_physical_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_conj_physical_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_conj_physical_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_conj_physical_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_conj_physical_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_conj_physical_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_cos_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_cos_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_cos_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_cos_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_cos_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_cos_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_cos_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_cos_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_cos_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_cos_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_cos_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_cos_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_cos_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_cosh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_cosh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_cosh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_cosh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_cosh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_cosh_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_cosh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_cosh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_cosh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_cosh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_cosh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_cosh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_cosh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_deg2rad_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_deg2rad_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_deg2rad_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_deg2rad_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_deg2rad_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_deg2rad_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_deg2rad_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_deg2rad_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_deg2rad_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_deg2rad_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_digamma_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_digamma_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_digamma_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_digamma_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_digamma_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_digamma_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_digamma_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_digamma_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_digamma_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_digamma_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_erf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_erf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_erf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_erf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_erf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_erf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_erf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_erf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_erf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_erf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_erfc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_erfc_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_erfc_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_erfc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_erfc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_erfc_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_erfc_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_erfc_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_erfc_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_erfc_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_erfinv_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_erfinv_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_erfinv_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_erfinv_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_erfinv_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_erfinv_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_erfinv_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_erfinv_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_erfinv_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_erfinv_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_exp2_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_exp2_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_exp2_cuda_complex128, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_exp2_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_exp2_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_exp2_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_exp2_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_exp2_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_exp2_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_exp2_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_exp2_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_exp2_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_exp_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_exp_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_exp_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_exp_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_exp_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_exp_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_exp_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_exp_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_exp_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_exp_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_exp_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_exp_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_exp_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_expm1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_expm1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_expm1_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_expm1_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_expm1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_expm1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_expm1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_expm1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_expm1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_expm1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_expm1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_expm1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_fill_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_fill_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_fill_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_fill_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_fill_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_fill_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_fill_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_fill_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_fill_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_fill_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_fill_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_fill_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_fill_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_floor_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_floor_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_floor_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_floor_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_floor_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_floor_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_floor_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_floor_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_floor_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_frac_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_frac_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_frac_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_frac_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_frexp_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_frexp_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_frexp_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_frexp_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_i0_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_i0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_i0_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_i0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_i0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_i0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_i0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_i0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_i0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_i0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_imag_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_imag_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_imag_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isfinite_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isfinite_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isfinite_cuda_complex128, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isfinite_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isfinite_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isfinite_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isfinite_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isfinite_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isfinite_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isfinite_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isfinite_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isfinite_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isfinite_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isinf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isinf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isinf_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isinf_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isinf_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isinf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isinf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isinf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isinf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isinf_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isinf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isinf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isinf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isnan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isnan_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isnan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isnan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isnan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isnan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isnan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isnan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isnan_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isnan_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isnan_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isnan_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isneginf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isneginf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isneginf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isneginf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isneginf_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isneginf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isneginf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isneginf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isneginf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isneginf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isposinf_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isposinf_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isposinf_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isposinf_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isposinf_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isposinf_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isposinf_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isposinf_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isposinf_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isposinf_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isreal_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isreal_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isreal_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isreal_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isreal_cuda_complex64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isreal_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isreal_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isreal_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isreal_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isreal_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isreal_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isreal_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_isreal_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_jiterator_unary_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_jiterator_unary_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_jiterator_unary_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_jiterator_unary_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_jiterator_unary_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_jiterator_unary_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_jiterator_unary_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_jiterator_unary_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_jiterator_unary_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_jiterator_unary_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_jiterator_unary_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_jiterator_unary_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_lgamma_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_lgamma_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_lgamma_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_lgamma_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_lgamma_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_lgamma_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_lgamma_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_lgamma_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_lgamma_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_lgamma_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_log10_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_log10_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_log10_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_log10_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_log10_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_log10_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_log10_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_log10_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_log10_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_log10_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_log10_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_log10_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_log1p_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_log1p_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_log1p_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_log1p_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_log1p_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_log1p_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_log1p_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_log1p_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_log1p_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_log1p_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_log1p_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_log1p_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_log2_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_log2_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_log2_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_log2_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_log2_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_log2_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_log2_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_log2_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_log2_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_log2_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_log2_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_log2_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_log_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_log_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_log_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_log_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_log_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_log_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_log_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_log_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_log_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_log_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_log_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_log_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_log_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_logical_not_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_logical_not_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_logical_not_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_logical_not_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_logical_not_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_logical_not_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_logical_not_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_logical_not_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_logical_not_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_logical_not_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_logical_not_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_logical_not_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_logit_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_logit_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_logit_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_logit_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_logit_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_logit_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_logit_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_logit_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_logit_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_logit_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_mvlgamma_mvlgamma_p_1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_mvlgamma_mvlgamma_p_1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_mvlgamma_mvlgamma_p_1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_mvlgamma_mvlgamma_p_1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_mvlgamma_mvlgamma_p_1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_mvlgamma_mvlgamma_p_1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_mvlgamma_mvlgamma_p_1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_mvlgamma_mvlgamma_p_1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_mvlgamma_mvlgamma_p_3_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_mvlgamma_mvlgamma_p_3_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_mvlgamma_mvlgamma_p_3_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_mvlgamma_mvlgamma_p_3_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_mvlgamma_mvlgamma_p_3_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_mvlgamma_mvlgamma_p_3_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_mvlgamma_mvlgamma_p_3_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_mvlgamma_mvlgamma_p_3_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_mvlgamma_mvlgamma_p_5_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_mvlgamma_mvlgamma_p_5_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_mvlgamma_mvlgamma_p_5_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_mvlgamma_mvlgamma_p_5_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_mvlgamma_mvlgamma_p_5_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_mvlgamma_mvlgamma_p_5_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_mvlgamma_mvlgamma_p_5_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_mvlgamma_mvlgamma_p_5_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nan_to_num_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nan_to_num_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nan_to_num_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nan_to_num_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nan_to_num_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nan_to_num_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nan_to_num_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nan_to_num_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nan_to_num_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nan_to_num_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_neg_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_neg_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_neg_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_neg_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_neg_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_neg_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_neg_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_neg_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_neg_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_neg_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_neg_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_neg_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_celu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_celu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_celu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_celu_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_elu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_elu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_elu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_elu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_hardsigmoid_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_hardsigmoid_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_hardsigmoid_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_hardsigmoid_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_logsigmoid_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_logsigmoid_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_logsigmoid_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_logsigmoid_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_mish_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_mish_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_mish_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_mish_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_prelu_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_prelu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_prelu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_prelu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_relu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_relu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_relu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_relu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_relu_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_relu_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_relu_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_relu_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_relu_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_selu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_selu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_selu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_selu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_silu_complex_cuda_complex128, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_silu_complex_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_silu_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_silu_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_silu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_silu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_softplus_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_softplus_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_softplus_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_softplus_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_softsign_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_softsign_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_softsign_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_softsign_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_softsign_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_softsign_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_softsign_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_softsign_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_softsign_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_softsign_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_softsign_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_softsign_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_tanhshrink_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_tanhshrink_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_tanhshrink_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_tanhshrink_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_tanhshrink_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_tanhshrink_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_tanhshrink_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_tanhshrink_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_tanhshrink_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_tanhshrink_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_tanhshrink_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_threshold_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_threshold_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_threshold_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_threshold_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_threshold_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_threshold_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_threshold_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_threshold_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_nn_functional_threshold_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_polygamma_polygamma_n_0_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_polygamma_polygamma_n_0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_polygamma_polygamma_n_0_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_polygamma_polygamma_n_0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_polygamma_polygamma_n_0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_polygamma_polygamma_n_0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_polygamma_polygamma_n_0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_polygamma_polygamma_n_0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_polygamma_polygamma_n_0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_polygamma_polygamma_n_0_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_polygamma_polygamma_n_1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_polygamma_polygamma_n_1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_polygamma_polygamma_n_1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_polygamma_polygamma_n_1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_polygamma_polygamma_n_1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_polygamma_polygamma_n_1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_polygamma_polygamma_n_1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_polygamma_polygamma_n_1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_polygamma_polygamma_n_1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_polygamma_polygamma_n_1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_polygamma_polygamma_n_2_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_polygamma_polygamma_n_2_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_polygamma_polygamma_n_2_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_polygamma_polygamma_n_2_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_polygamma_polygamma_n_2_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_polygamma_polygamma_n_2_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_polygamma_polygamma_n_2_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_polygamma_polygamma_n_2_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_polygamma_polygamma_n_2_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_polygamma_polygamma_n_2_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_polygamma_polygamma_n_3_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_polygamma_polygamma_n_3_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_polygamma_polygamma_n_3_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_polygamma_polygamma_n_3_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_polygamma_polygamma_n_3_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_polygamma_polygamma_n_3_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_polygamma_polygamma_n_3_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_polygamma_polygamma_n_3_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_polygamma_polygamma_n_3_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_polygamma_polygamma_n_3_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_polygamma_polygamma_n_4_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_polygamma_polygamma_n_4_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_polygamma_polygamma_n_4_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_polygamma_polygamma_n_4_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_polygamma_polygamma_n_4_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_polygamma_polygamma_n_4_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_polygamma_polygamma_n_4_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_polygamma_polygamma_n_4_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_polygamma_polygamma_n_4_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_polygamma_polygamma_n_4_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_positive_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_positive_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_positive_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_positive_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_positive_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_positive_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_positive_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_positive_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_positive_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_positive_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_positive_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_positive_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_rad2deg_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_rad2deg_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_rad2deg_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_rad2deg_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_rad2deg_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_rad2deg_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_rad2deg_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_rad2deg_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_rad2deg_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_rad2deg_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_real_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_real_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_real_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_real_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_real_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_real_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_real_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_real_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_real_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_real_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_real_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_real_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_real_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_reciprocal_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_reciprocal_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_reciprocal_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_reciprocal_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_reciprocal_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_reciprocal_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_reciprocal_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_reciprocal_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_reciprocal_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_reciprocal_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_reciprocal_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_reciprocal_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_round_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_round_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_round_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_round_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_round_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_round_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_round_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_round_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_round_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_round_decimals_0_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_round_decimals_0_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_round_decimals_0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_round_decimals_0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_round_decimals_3_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_round_decimals_3_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_round_decimals_3_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_round_decimals_3_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_round_decimals_neg_3_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_round_decimals_neg_3_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_round_decimals_neg_3_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_round_decimals_neg_3_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_rsqrt_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_rsqrt_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_rsqrt_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_rsqrt_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_rsqrt_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_rsqrt_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_rsqrt_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_rsqrt_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_rsqrt_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_rsqrt_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_rsqrt_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_rsqrt_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_rsqrt_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sgn_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sgn_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sgn_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sgn_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sgn_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sgn_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sgn_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sgn_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sgn_cuda_int16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sgn_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sgn_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sgn_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sgn_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sigmoid_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sigmoid_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sigmoid_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sigmoid_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sigmoid_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sigmoid_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sigmoid_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sigmoid_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sigmoid_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sigmoid_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sigmoid_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sigmoid_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sigmoid_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sign_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sign_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sign_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sign_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sign_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sign_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sign_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sign_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sign_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sign_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_signbit_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_signbit_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_signbit_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_signbit_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_signbit_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_signbit_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_signbit_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_signbit_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_signbit_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_signbit_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sin_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sin_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sin_cuda_complex128, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sin_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sin_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sin_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sin_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sin_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sin_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sin_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sin_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sin_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sin_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sinc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sinc_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sinc_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sinc_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sinc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sinc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sinc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sinc_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sinc_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sinc_cuda_int64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sinc_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sinc_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sinh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sinh_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sinh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sinh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sinh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sinh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sinh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sinh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sinh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sinh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sinh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sinh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sinh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_airy_ai_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_airy_ai_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_airy_ai_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_airy_ai_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_airy_ai_cuda_int32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_airy_ai_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_airy_ai_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_airy_ai_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_bessel_j0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_bessel_j0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_bessel_j0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_bessel_j0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_bessel_j0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_bessel_j0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_bessel_j0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_bessel_j0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_bessel_j1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_bessel_j1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_bessel_j1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_bessel_j1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_bessel_j1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_bessel_j1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_bessel_j1_cuda_int8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_bessel_j1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_bessel_y0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_bessel_y0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_bessel_y0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_bessel_y0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_bessel_y0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_bessel_y0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_bessel_y0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_bessel_y0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_bessel_y1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_bessel_y1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_bessel_y1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_bessel_y1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_bessel_y1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_bessel_y1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_bessel_y1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_bessel_y1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_entr_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_entr_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_entr_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_entr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_entr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_entr_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_entr_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_entr_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_entr_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_entr_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_erfcx_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_erfcx_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_erfcx_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_erfcx_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_erfcx_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_erfcx_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_erfcx_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_erfcx_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_i0e_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_i0e_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_i0e_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_i0e_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_i0e_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_i0e_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_i0e_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_i0e_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_i0e_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_i0e_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_i1_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_i1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_i1_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_i1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_i1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_i1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_i1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_i1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_i1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_i1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_i1e_cuda_bfloat16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_i1e_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_i1e_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_i1e_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_i1e_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_i1e_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_i1e_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_i1e_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_i1e_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_i1e_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_log_ndtr_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_log_ndtr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_log_ndtr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_log_ndtr_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_log_ndtr_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_log_ndtr_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_log_ndtr_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_log_ndtr_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_modified_bessel_i0_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_modified_bessel_i0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_modified_bessel_i0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_modified_bessel_i0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_modified_bessel_i0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_modified_bessel_i0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_modified_bessel_i0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_modified_bessel_i0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_modified_bessel_i1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_modified_bessel_i1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_modified_bessel_i1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_modified_bessel_i1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_modified_bessel_i1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_modified_bessel_i1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_modified_bessel_i1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_modified_bessel_i1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_modified_bessel_k0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_modified_bessel_k0_cuda_float32, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_modified_bessel_k0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_modified_bessel_k0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_modified_bessel_k0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_modified_bessel_k0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_modified_bessel_k0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_modified_bessel_k0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_modified_bessel_k1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_modified_bessel_k1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_modified_bessel_k1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_modified_bessel_k1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_modified_bessel_k1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_modified_bessel_k1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_modified_bessel_k1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_modified_bessel_k1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_ndtr_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_ndtr_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_ndtr_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_ndtr_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_ndtr_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_ndtr_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_ndtr_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_ndtr_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_ndtr_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_ndtr_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_ndtri_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_ndtri_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_ndtri_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_ndtri_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_ndtri_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_ndtri_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_ndtri_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_ndtri_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_polygamma_special_polygamma_n_0_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_polygamma_special_polygamma_n_0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_polygamma_special_polygamma_n_0_cuda_float16, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_polygamma_special_polygamma_n_0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_polygamma_special_polygamma_n_0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_polygamma_special_polygamma_n_0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_polygamma_special_polygamma_n_0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_polygamma_special_polygamma_n_0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_polygamma_special_polygamma_n_0_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_scaled_modified_bessel_k0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_scaled_modified_bessel_k0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_scaled_modified_bessel_k0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_scaled_modified_bessel_k0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_scaled_modified_bessel_k0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_scaled_modified_bessel_k0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_scaled_modified_bessel_k0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_scaled_modified_bessel_k0_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_scaled_modified_bessel_k1_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_scaled_modified_bessel_k1_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_scaled_modified_bessel_k1_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_scaled_modified_bessel_k1_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_scaled_modified_bessel_k1_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_scaled_modified_bessel_k1_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_scaled_modified_bessel_k1_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_scaled_modified_bessel_k1_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_spherical_bessel_j0_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_spherical_bessel_j0_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_spherical_bessel_j0_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_spherical_bessel_j0_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_spherical_bessel_j0_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_spherical_bessel_j0_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_spherical_bessel_j0_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_special_spherical_bessel_j0_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sqrt_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sqrt_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sqrt_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sqrt_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sqrt_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sqrt_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sqrt_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sqrt_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sqrt_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sqrt_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sqrt_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sqrt_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_sqrt_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_square_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_square_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_square_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_square_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_square_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_square_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_square_cuda_float64, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_square_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_square_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_square_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_square_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_square_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_tan_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_tan_cuda_bool, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_tan_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_tan_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_tan_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_tan_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_tan_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_tan_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_tan_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_tan_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_tan_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_tan_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_tan_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_tanh_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_tanh_cuda_bool, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_tanh_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_tanh_cuda_complex32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_tanh_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_tanh_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_tanh_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_tanh_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_tanh_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_tanh_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_tanh_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_tanh_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_tanh_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_trunc_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_trunc_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_trunc_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_trunc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_trunc_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_trunc_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_trunc_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_trunc_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_reference_numerics_small_trunc_cuda_uint8, 
test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_silu_complex_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_silu_complex_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_silu_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_silu_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_sinc_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_special_i0_i1_vs_scipy_cuda_bfloat16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_special_i0_i1_vs_scipy_cuda_float16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_special_i0_i1_vs_scipy_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_special_i0_i1_vs_scipy_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_special_log_ndtr_vs_scipy_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_special_log_ndtr_vs_scipy_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_special_ndtr_vs_scipy_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_special_ndtr_vs_scipy_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_threshold_cuda_complex128, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_threshold_cuda_complex64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_threshold_cuda_float32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_threshold_cuda_float64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_threshold_cuda_int16, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_threshold_cuda_int32, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_threshold_cuda_int64, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_threshold_cuda_int8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_threshold_cuda_uint8, test/test_unary_ufuncs.py::TestUnaryUfuncsCUDA::test_unary_out_op_mem_overlap_cuda_float64 2025-10-10T02:20:36.2558146Z 2025-10-10T02:20:38.4402912Z Running optim/test_optim 1/1 ... 
[2025-10-10 02:20:38.439793] 2025-10-10T02:20:38.4403366Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:20:38.4406351Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'optim/test_optim.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:20:38.440245] 2025-10-10T02:20:41.5064972Z 2025-10-10T02:20:41.5065931Z functorch/test_ops 2/2 was successful, full logs can be found in artifacts with path test/test-reports/functorch.test_ops_2.2_7be5e9b7b2602bbb_.log 2025-10-10T02:20:41.6852089Z Running 5117 items in this shard: test/functorch/test_ops.py::TestOperatorsCUDA::test_extremal_numerics_binary_cross_entropy_cuda, test/functorch/test_ops.py::TestOperatorsCUDA::test_extremal_numerics_log_softmax_cuda, test/functorch/test_ops.py::TestOperatorsCUDA::test_extremal_numerics_nll_loss_cuda, test/functorch/test_ops.py::TestOperatorsCUDA::test_extremal_numerics_softmax_cuda, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_ForwardHasDefaultArgsAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_H_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_MulGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_NumpyMulAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_NumpySortAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_SelectAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_SelectGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_SortGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_T_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_ZeroGradientsGenVmapAutogradFunction_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_grad___getitem___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad___getitem___functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad___rmatmul___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad___rmul___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad___rpow___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad__chunk_cat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad__native_batch_norm_legit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad__segment_reduce_lengths_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad__segment_reduce_offsets_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad__unsafe_masked_index_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad__upsample_bilinear2d_aa_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_abs_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_acos_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_acosh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_addcmul_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_addmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_addmv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_alias_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_allclose_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_amin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_argmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_argmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_argsort_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_argwhere_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_as_strided_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_asin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_asinh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_atan2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_atan_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_atanh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_bfloat16_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_bmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_bool_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_bool_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_broadcast_shapes_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_broadcast_tensors_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_bucketize_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_byte_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_byte_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_cat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_cauchy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_cdouble_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_cfloat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_chalf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_char_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_chunk_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_clamp_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_clamp_max_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_clone_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_column_stack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_combinations_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_complex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_conj_physical_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_copysign_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_cos_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_count_nonzero_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_cov_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_cummax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_cumsum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_cumulative_trapezoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_diag_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_diagflat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_diagonal_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_diff_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_digamma_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_dist_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_div_floor_rounding_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_double_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_double_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_dstack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_empty_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_empty_permuted_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_empty_strided_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_erf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_erfc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_exp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_expand_as_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_expand_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_expand_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_expm1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_eye_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_fft_fft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_fft_fft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_fft_fftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_fft_fftshift_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_fft_hfftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_fft_ifft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_fft_ifftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_fft_ifftshift_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_fft_ihfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_fft_irfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_fft_rfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_fill_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_flatten_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_flip_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_float_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_fmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_fmod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_frexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_gather_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_ge_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_geqrf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_gt_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_half_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_half_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_hsplit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_hypot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_igamma_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_igammac_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_index_fill_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_index_put_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_index_reduce_amin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_index_reduce_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_inner_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_int_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_isclose_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_isfinite_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_isin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_isneginf_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_isposinf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_item_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_jiterator_2inputs_2outputs_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_jiterator_binary_return_by_ref_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_kthvalue_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_le_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_linalg_cross_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_linalg_eig_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_linalg_eigvals_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_linalg_eigvalsh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_linalg_inv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_linalg_ldl_factor_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_linalg_lstsq_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_linalg_lu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_linalg_lu_factor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_linalg_lu_factor_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_linalg_matrix_rank_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_linalg_pinv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_linalg_pinv_singular_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_linalg_solve_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_linalg_solve_triangular_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_linalg_tensorsolve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_linalg_vander_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_linalg_vecdot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_linalg_vector_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_linspace_tensor_overload_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_log1p_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_log_normal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_log_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_log_softmax_with_dtype_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_logaddexp2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_logaddexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_logdet_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_logical_and_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_logical_not_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_logical_or_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_logical_xor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_logspace_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_logspace_tensor_overload_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_logsumexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_long_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_lu_unpack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_masked_amin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_masked_argmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_masked_cumprod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_masked_mean_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_masked_median_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_masked_normalize_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_masked_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_masked_select_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_masked_softmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_matrix_exp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_max_binary_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_max_pool2d_with_indices_backward_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_min_binary_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_min_reduction_no_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_minimum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_mm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_movedim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_mul_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_multinomial_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_mv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_mvlgamma_mvlgamma_p_1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_mvlgamma_mvlgamma_p_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_mvlgamma_mvlgamma_p_5_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nan_to_num_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nanmean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nanmedian_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nanquantile_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_narrow_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_narrow_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_native_layer_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_ne_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_new_empty_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_new_full_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_new_zeros_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_adaptive_max_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_alpha_dropout_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_avg_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_avg_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_bilinear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_celu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_channel_shuffle_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_conv1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_conv2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_conv2d_stride_depthwise_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_conv2d_stride_padding_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_conv2d_stride_with_bias_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_conv2d_strided_padding_dilation_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_conv3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_conv_transpose1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_cosine_embedding_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_cross_entropy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_fractional_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_fractional_max_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_gelu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_grid_sample_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_hardshrink_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_hardsigmoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_huber_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_instance_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_interpolate_bicubic_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_interpolate_bilinear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_interpolate_nearest_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_interpolate_trilinear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_leaky_relu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_linear_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_logsigmoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_max_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_max_unpool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_max_unpool2d_grad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_max_unpool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_max_unpool3d_grad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_mish_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_mse_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_multilabel_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_nll_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_normalize_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_pad_circular_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_pad_constant_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_pad_reflect_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_pad_replicate_negative_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_pdist_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_poisson_nll_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_prelu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_silu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_softmin_with_dtype_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_softplus_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_tanhshrink_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_threshold_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_unfold_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_upsample_bilinear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_upsample_nearest_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nonzero_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_norm_fro_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_norm_inf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_normal_in_place_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_ormqr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_outer_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_pca_lowrank_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_permute_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_pinverse_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_put_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_quantile_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_rad2deg_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_randint_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_randn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_real_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_remainder_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_repeat_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_reshape_as_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_reshape_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_resolve_neg_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_round_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_round_decimals_0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_round_decimals_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_rsub_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_scalar_tensor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_scatter_reduce_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_scatter_reduce_amin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_sgn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_sign_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_signal_windows_cosine_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_signal_windows_gaussian_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_signal_windows_general_cosine_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_signal_windows_hamming_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_signal_windows_hann_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_signal_windows_kaiser_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_signbit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_slice_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_sort_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_sparse_mm_reduce_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_sparse_sampled_addmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_special_airy_ai_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_special_bessel_j0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_special_bessel_y0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_special_bessel_y1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_special_chebyshev_polynomial_t_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_special_chebyshev_polynomial_v_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_special_chebyshev_polynomial_w_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_special_entr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_special_hermite_polynomial_h_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_special_hermite_polynomial_he_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_special_i0e_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_special_i1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_special_legendre_polynomial_p_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_special_modified_bessel_i1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_special_scaled_modified_bessel_k0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_special_shifted_chebyshev_polynomial_u_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_special_shifted_chebyshev_polynomial_w_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_special_xlog1py_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_special_zeta_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_split_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_split_list_args_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_split_with_sizes_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_split_with_sizes_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_squeeze_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_stack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_std_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_std_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_stft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_sub_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_sum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_sum_to_size_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_svd_lowrank_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_tanh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_tensordot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_tile_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_to_sparse_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_torch_ops_aten__efficient_attention_forward_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_trace_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_transpose_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_triangular_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_tril_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_unbind_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_unbind_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_unflatten_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_unfold_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_unique_consecutive_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_unique_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_unsafe_split_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_unsqueeze_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_var_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_vdot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_view_as_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_vsplit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_where_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_xlogy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_zero__cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_zeros_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_ForwardHasDefaultArgsAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_H_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_MulGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_NumpyCubeNotComposableAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_NumpySortAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_ScaleGradGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_SelectAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_SortGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_T_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_ZeroGradientsGenVmapAutogradFunction_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp___getitem___functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp___radd___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp___rdiv___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp___rpow___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp__chunk_cat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp__softmax_backward_data_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp__unsafe_masked_index_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp__unsafe_masked_index_put_accumulate_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp__upsample_bilinear2d_aa_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_acos_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_acosh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_addcdiv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_addr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_alias_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_all_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_any_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_argmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_argmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_argsort_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_as_strided_partial_views_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_asin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_asinh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_atan_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_atanh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_atleast_2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_bfloat16_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_bfloat16_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_block_diag_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_broadcast_to_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_byte_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_cat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_cauchy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_cdist_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_ceil_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_cfloat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_cholesky_inverse_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_cholesky_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_column_stack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_combinations_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_conj_physical_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_cov_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_cross_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_cummax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_cummin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_cumprod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_cumulative_trapezoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_diag_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_diagonal_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_diff_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_digamma_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_div_floor_rounding_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_div_trunc_rounding_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_dsplit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_dstack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_einsum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_empty_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_empty_strided_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_equal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_exp2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_expand_as_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_expand_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_expm1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_exponential_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_eye_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_fft_fftshift_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_fft_hfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_fft_ifft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_fft_ifftshift_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_fft_ihfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_fft_ihfftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_fft_irfft2_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_fft_irfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_fft_rfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_fft_rfftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_fill_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_flatten_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_flip_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_fliplr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_floor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_floor_divide_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_fmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_fmod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_frac_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_frexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_full_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_ge_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_gradient_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_grid_sampler_2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_gt_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_histc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_hsplit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_hstack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_index_add_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_index_fill_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_index_put_functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_index_reduce_amax_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_index_reduce_amin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_index_reduce_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_inner_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_int_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_isclose_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_isfinite_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_isnan_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_isneginf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_isposinf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_jiterator_2inputs_2outputs_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_lerp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_lgamma_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_linalg_cholesky_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_linalg_cross_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_linalg_diagonal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_linalg_eig_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_linalg_eigvals_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_linalg_eigvalsh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_linalg_householder_product_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_linalg_inv_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_linalg_lstsq_grad_oriented_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_linalg_lu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_linalg_lu_factor_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_linalg_lu_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_linalg_matrix_rank_hermitian_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_linalg_norm_subgradients_at_zero_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_linalg_pinv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_linalg_qr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_linalg_solve_triangular_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_linalg_svdvals_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_linalg_tensorsolve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_linalg_vector_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_linspace_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_log_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_log_normal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_log_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_log_softmax_with_dtype_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_logaddexp2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_logical_and_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_logspace_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_lu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_mH_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_mT_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_masked_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_masked_cumsum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_masked_fill_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_masked_fill_functorch_Scalar_only_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_masked_logsumexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_masked_normalize_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_masked_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_masked_std_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_masked_var_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_matmul_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_matrix_exp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_median_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_meshgrid_list_of_tensors_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_min_reduction_no_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_min_reduction_with_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_mode_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_mul_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_mv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_mvlgamma_mvlgamma_p_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_mvlgamma_mvlgamma_p_5_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nanmean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nanmedian_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nansum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_native_batch_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_ne_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_new_empty_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_new_zeros_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nextafter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_adaptive_avg_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_adaptive_avg_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_adaptive_max_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_adaptive_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_adaptive_max_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_alpha_dropout_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_avg_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_avg_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_avg_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_batch_norm_without_cudnn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_bilinear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_binary_cross_entropy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_celu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_channel_shuffle_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_conv1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_conv2d_no_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_conv2d_stride_groups_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_conv2d_stride_padding_no_bias_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_conv2d_stride_padding_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_conv2d_stride_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_conv2d_strided_padding_dilation_no_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_conv2d_strided_padding_dilation_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_conv_transpose1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_conv_transpose2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_cosine_embedding_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_cosine_similarity_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_cross_entropy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_dropout3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_embedding_bag_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_embedding_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_embedding_functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_fractional_max_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_grid_sample_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_group_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_hardsigmoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_hinge_embedding_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_huber_loss_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_interpolate_area_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_interpolate_bicubic_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_interpolate_bilinear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_interpolate_nearest-exact_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_interpolate_nearest_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_l1_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_linear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_margin_ranking_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_max_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_max_unpool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_max_unpool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_max_unpool3d_grad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_mish_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_mse_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_pad_constant_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_pad_reflect_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_pad_replicate_negative_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_pairwise_distance_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_pixel_shuffle_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_prelu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_relu6_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_rms_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_silu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_soft_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_softplus_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_softshrink_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_softsign_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_tanhshrink_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_threshold_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_upsample_bilinear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_upsample_nearest_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nonzero_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_norm_inf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_norm_nuc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_normal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_normal_in_place_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_ops_aten_index_put_functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_ormqr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_outer_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_permute_copy_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_permute_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_pinverse_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_polygamma_polygamma_n_2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_pow_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_quantile_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_rad2deg_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_randint_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_real_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_repeat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_reshape_as_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_reshape_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_roll_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_rot90_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_round_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_round_decimals_0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_rsqrt_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_scalar_tensor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_scatter_reduce_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_scatter_reduce_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_scatter_reduce_sum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_searchsorted_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_select_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_sgn_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_short_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_short_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_sigmoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_sign_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_signal_windows_blackman_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_signal_windows_cosine_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_signal_windows_gaussian_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_signal_windows_general_cosine_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_signal_windows_hamming_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_signal_windows_hann_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_signal_windows_kaiser_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_signbit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_sin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_sinc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_slice_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_softmax_with_dtype_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_sort_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_special_bessel_j0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_special_bessel_j1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_special_bessel_y1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_special_chebyshev_polynomial_v_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_special_chebyshev_polynomial_w_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_special_laguerre_polynomial_l_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_special_legendre_polynomial_p_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_special_modified_bessel_k0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_special_modified_bessel_k1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_special_ndtr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_special_ndtri_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_special_polygamma_special_polygamma_n_0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_special_scaled_modified_bessel_k0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_special_scaled_modified_bessel_k1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_special_shifted_chebyshev_polynomial_v_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_special_xlog1py_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_split_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_split_with_sizes_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_sqrt_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_square_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_squeeze_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_squeeze_multiple_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_stack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_std_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_std_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_std_mean_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_std_unbiased_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_sum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_svd_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_t_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_t_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_take_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_tan_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_to_sparse_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_torch_ops_aten__safe_softmax_default_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_trapezoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_tril_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_unflatten_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_uniform_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_unique_consecutive_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_unique_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_unsafe_chunk_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_unsafe_split_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_unsqueeze_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_var_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_var_mean_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_var_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_view_as_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_view_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_view_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_vstack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_zeros_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpjvpvmap_CubeGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpjvpvmap_NumpyCubeNotComposableAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpjvpvmap_NumpyExpMarkDirtyAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpjvpvmap_NumpySortAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpjvpvmap_NumpyTakeAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpjvpvmap_SelectAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpjvpvmap_SortGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_CubeGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_NumpyCubeAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_NumpySortAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_NumpyTakeAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_ScaleGradGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_SelectGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_SortGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_ZeroGradientsGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp___getitem___functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp__native_batch_norm_legit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp__unsafe_masked_index_put_accumulate_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_acosh_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_add_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_addbmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_addcdiv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_addcmul_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_addmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_addmm_decomposed_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_addmv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_addr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_amin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_aminmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_argmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_argsort_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_as_strided_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_as_strided_partial_views_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_atan2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_atanh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_atleast_1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_atleast_3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_baddbmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_bernoulli_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_bool_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_broadcast_shapes_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_broadcast_tensors_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_bucketize_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_byte_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_cartesian_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_cat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_cauchy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_cdist_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_ceil_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_char_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_char_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_cholesky_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_cholesky_inverse_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_cholesky_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_clamp_min_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_column_stack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_conj_physical_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_constant_pad_nd_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_contiguous_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_corrcoef_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_cos_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_cosh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_count_nonzero_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_cov_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_cross_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_cummin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_cumsum_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_deg2rad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_diag_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_diag_embed_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_diagflat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_diagonal_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_diff_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_dot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_double_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_double_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_dsplit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_dstack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_einsum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_empty_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_empty_permuted_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_erf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_erfc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_erfinv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_expand_as_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_expand_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_expm1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_fft_fftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_fft_fftshift_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_fft_hfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_fft_ifft_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_fft_ihfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_fft_irfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_fft_rfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_fft_rfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_fft_rfftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_float_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_float_power_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_floor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_fmod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_full_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_gather_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_ge_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_geqrf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_grid_sampler_2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_gt_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_half_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_heaviside_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_histc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_hstack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_hypot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_igamma_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_igammac_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_index_add_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_index_copy_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_index_put_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_index_put_functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_index_reduce_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_index_reduce_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_isin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_isnan_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_isposinf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_isreal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_jiterator_2inputs_2outputs_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_jiterator_binary_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_jiterator_binary_return_by_ref_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_kron_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_le_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_lerp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_linalg_cross_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_linalg_det_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_linalg_diagonal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_linalg_eig_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_linalg_eigh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_linalg_inv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_linalg_ldl_factor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_linalg_lstsq_grad_oriented_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_linalg_lu_factor_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_linalg_lu_factor_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_linalg_matrix_rank_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_linalg_matrix_rank_hermitian_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_linalg_pinv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_linalg_pinv_singular_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_linalg_qr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_linalg_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_linalg_svd_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_linalg_svdvals_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_linalg_tensorinv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_log10_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_log1p_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_log_softmax_with_dtype_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_logaddexp2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_logical_and_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_logical_not_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_logspace_tensor_overload_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_long_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_long_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_lt_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_lu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_mT_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_masked_amax_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_masked_argmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_masked_cumprod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_masked_cumsum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_masked_fill_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_masked_fill_functorch_Scalar_only_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_masked_logaddexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_masked_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_masked_normalize_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_masked_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_masked_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_masked_sum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_matrix_exp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_max_binary_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_max_pool2d_with_indices_backward_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_max_reduction_with_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_maximum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_median_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_meshgrid_variadic_tensors_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_min_reduction_with_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_msort_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_mul_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_multinomial_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_mv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nanmedian_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_narrow_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_narrow_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_native_dropout_backward_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_native_layer_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_ne_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_new_empty_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_new_ones_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_new_zeros_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nextafter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_adaptive_avg_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_adaptive_max_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_adaptive_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_adaptive_max_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_alpha_dropout_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_batch_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_celu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_conv1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_conv2d_stride_groups_with_bias_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_conv2d_stride_padding_no_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_conv2d_stride_padding_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_conv3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_conv_transpose1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_conv_transpose2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_conv_transpose3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_cosine_similarity_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_cross_entropy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_ctc_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_dropout2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_elu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_embedding_bag_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_embedding_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_embedding_functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_fractional_max_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_gaussian_nll_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_gelu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_glu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_hardsigmoid_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_hardswish_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_hardtanh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_huber_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_interpolate_bicubic_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_interpolate_bilinear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_interpolate_linear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_interpolate_nearest-exact_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_interpolate_nearest_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_interpolate_trilinear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_kl_div_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_l1_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_layer_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_leaky_relu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_linear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_logsigmoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_margin_ranking_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_max_unpool1d_grad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_max_unpool2d_grad_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_max_unpool3d_grad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_mish_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_mse_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_mse_loss_functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_multi_head_attention_forward_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_multilabel_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_nll_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_normalize_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_pad_circular_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_pixel_unshuffle_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_poisson_nll_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_prelu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_scaled_dot_product_attention_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_selu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_smooth_l1_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_soft_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_softmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_softmin_with_dtype_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_softplus_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_tanhshrink_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_triplet_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_upsample_nearest_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nonzero_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_norm_fro_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_ones_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_polar_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_polygamma_polygamma_n_0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_polygamma_polygamma_n_2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_polygamma_polygamma_n_4_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_positive_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_pow_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_rad2deg_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_rand_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_randint_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_randn_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_ravel_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_real_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_remainder_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_repeat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_repeat_interleave_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_resize__cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_roll_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_round_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_round_decimals_0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_round_decimals_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_round_decimals_neg_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_rsqrt_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_scalar_tensor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_scatter_add_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_scatter_reduce_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_select_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_sgn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_sigmoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_sign_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_signal_windows_cosine_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_signal_windows_exponential_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_signal_windows_hamming_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_signal_windows_nuttall_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_signbit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_sinc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_slice_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_softmax_with_dtype_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_sort_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_sparse_sampled_addmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_special_airy_ai_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_special_chebyshev_polynomial_t_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_special_chebyshev_polynomial_u_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_special_chebyshev_polynomial_w_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_special_hermite_polynomial_h_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_special_hermite_polynomial_he_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_special_i1e_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_special_laguerre_polynomial_l_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_special_legendre_polynomial_p_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_special_modified_bessel_i0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_special_modified_bessel_i1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_special_modified_bessel_k0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_special_modified_bessel_k1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_special_ndtr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_special_polygamma_special_polygamma_n_0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_special_scaled_modified_bessel_k0_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_special_scaled_modified_bessel_k1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_special_shifted_chebyshev_polynomial_t_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_special_spherical_bessel_j0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_special_xlog1py_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_special_zeta_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_split_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_split_list_args_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_split_with_sizes_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_sqrt_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_square_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_squeeze_multiple_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_stack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_std_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_std_mean_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_std_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_stft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_svd_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_t_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_take_along_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_tan_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_tanh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_tensordot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_tile_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_topk_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_transpose_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_trapezoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_trapz_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_tril_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_true_divide_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_trunc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_unflatten_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_unfold_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_uniform_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_unique_consecutive_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_unique_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_unsafe_chunk_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_var_mean_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_var_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_view_as_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_view_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_view_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_vstack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_zero__cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_zeros_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_zeros_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjpvmap_CubeGenVmapAutogradFunction_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjpvmap_ForwardHasDefaultArgsAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjpvmap_NumpyCubeAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjpvmap_NumpyCubeNotComposableAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjpvmap_NumpyExpMarkDirtyAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjpvmap_NumpyMulAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjpvmap_ScaleGradGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjpvmap_SelectAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjpvmap_SelectGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvmap_CubeGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvmap_MulGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvmap_NumpyCubeNotComposableAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvmap_NumpyMulAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvmap_ScaleGradGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvmap_SelectGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvmapvmap_MulGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvmapvmap_NumpyTakeAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvmapvmap_SelectAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvmapvmap_SelectGenVmapAutogradFunction_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvmapvmap_SortGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_ordered_bool_raises_argmax_cuda_bool, test/functorch/test_ops.py::TestOperatorsCUDA::test_ordered_bool_raises_argmin_cuda_bool, test/functorch/test_ops.py::TestOperatorsCUDA::test_ordered_bool_raises_ceil_cuda_bool, test/functorch/test_ops.py::TestOperatorsCUDA::test_ordered_complex_raises_amax_cuda_complex128, test/functorch/test_ops.py::TestOperatorsCUDA::test_ordered_complex_raises_amax_cuda_complex32, test/functorch/test_ops.py::TestOperatorsCUDA::test_ordered_complex_raises_amin_cuda_complex128, test/functorch/test_ops.py::TestOperatorsCUDA::test_ordered_complex_raises_amin_cuda_complex32, test/functorch/test_ops.py::TestOperatorsCUDA::test_ordered_complex_raises_amin_cuda_complex64, test/functorch/test_ops.py::TestOperatorsCUDA::test_ordered_complex_raises_argmin_cuda_complex128, test/functorch/test_ops.py::TestOperatorsCUDA::test_ordered_complex_raises_argmin_cuda_complex32, test/functorch/test_ops.py::TestOperatorsCUDA::test_ordered_complex_raises_argmin_cuda_complex64, test/functorch/test_ops.py::TestOperatorsCUDA::test_ordered_complex_raises_ceil_cuda_complex128, test/functorch/test_ops.py::TestOperatorsCUDA::test_ordered_complex_raises_ceil_cuda_complex64, test/functorch/test_ops.py::TestOperatorsCUDA::test_ordered_complex_raises_clamp_cuda_complex128, test/functorch/test_ops.py::TestOperatorsCUDA::test_ordered_complex_raises_clamp_cuda_complex32, test/functorch/test_ops.py::TestOperatorsCUDA::test_ordered_complex_raises_clamp_cuda_complex64, test/functorch/test_ops.py::TestOperatorsCUDA::test_ordered_complex_raises_floor_cuda_complex32, test/functorch/test_ops.py::TestOperatorsCUDA::test_ordered_complex_raises_ge_cuda_complex32, test/functorch/test_ops.py::TestOperatorsCUDA::test_ordered_complex_raises_gt_cuda_complex32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_ordered_complex_raises_le_cuda_complex128, test/functorch/test_ops.py::TestOperatorsCUDA::test_ordered_complex_raises_le_cuda_complex64, test/functorch/test_ops.py::TestOperatorsCUDA::test_ordered_complex_raises_lt_cuda_complex32, test/functorch/test_ops.py::TestOperatorsCUDA::test_ordered_complex_raises_maximum_cuda_complex32, test/functorch/test_ops.py::TestOperatorsCUDA::test_ordered_complex_raises_maximum_cuda_complex64, test/functorch/test_ops.py::TestOperatorsCUDA::test_ordered_complex_raises_minimum_cuda_complex128, test/functorch/test_ops.py::TestOperatorsCUDA::test_ordered_complex_raises_sort_cuda_complex128, test/functorch/test_ops.py::TestOperatorsCUDA::test_ordered_complex_raises_topk_cuda_complex128, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_T_grad_op_jvp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_T_grad_op_vjp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_broadcast_to_grad_op_vjp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_conj_grad_op_jvp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_conj_grad_op_vjp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_contiguous_grad_op_jvp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_diagonal_grad_op_jvp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_diagonal_grad_op_vjp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_expand_grad_op_jvp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_list_return_dsplit_grad_op_vjp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_list_return_hsplit_grad_op_jvp_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_list_return_hsplit_grad_op_vjp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_list_return_split_list_args_grad_op_jvp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_list_return_split_list_args_grad_op_vjp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_list_return_unbind_grad_op_vjp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_list_return_vsplit_grad_op_vjp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_mT_grad_op_vjp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_permute_grad_op_jvp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_real_grad_op_jvp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_real_grad_op_vjp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_reshape_grad_op_jvp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_reshape_grad_op_vjp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_resolve_conj_grad_op_jvp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_resolve_conj_grad_op_vjp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_resolve_neg_grad_op_jvp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_select_grad_op_jvp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_select_grad_op_vjp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_squeeze_grad_op_jvp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_squeeze_multiple_grad_op_vjp_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_transpose_grad_op_vjp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_unflatten_grad_op_jvp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_unsqueeze_grad_op_jvp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_view_as_complex_grad_op_jvp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_view_as_complex_grad_op_vjp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_view_as_grad_op_jvp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_view_as_grad_op_vjp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_view_grad_op_jvp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_ForwardHasDefaultArgsAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_NumpyMulAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_ScaleGradGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_SelectGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_SortGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_T_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_ZeroGradientsGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp___rmatmul___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp___rmod___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp___rpow___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp___rsub___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp__native_batch_norm_legit_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp__unsafe_masked_index_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp__unsafe_masked_index_put_accumulate_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp__upsample_bilinear2d_aa_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_abs_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_acos_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_acosh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_alias_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_all_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_allclose_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_amin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_aminmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_angle_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_any_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_arange_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_argsort_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_as_strided_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_as_strided_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_as_strided_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_asin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_asinh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_atleast_1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_atleast_2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_atleast_3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_baddbmm_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_bernoulli_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_block_diag_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_bool_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_bool_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_broadcast_tensors_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_byte_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_cartesian_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_chalf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_char_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_cholesky_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_cholesky_inverse_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_cholesky_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_conj_physical_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_constant_pad_nd_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_contiguous_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_copysign_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_corrcoef_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_cos_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_cosh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_cummax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_cumsum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_diag_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_diagflat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_diagonal_copy_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_diagonal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_diagonal_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_div_floor_rounding_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_div_no_rounding_mode_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_double_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_double_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_dsplit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_einsum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_empty_strided_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_erf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_erfc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_exp2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_expand_as_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_expand_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_fft_fft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_fft_fftshift_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_fft_hfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_fft_hfftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_fft_ifft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_fft_ihfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_fft_ihfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_fft_ihfftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_fft_irfftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_fft_rfft2_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_fft_rfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_fliplr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_flipud_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_fmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_fmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_fmod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_frac_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_full_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_full_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_gather_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_ge_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_geqrf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_grid_sampler_3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_gt_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_half_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_hash_tensor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_hypot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_index_add_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_index_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_index_reduce_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_index_reduce_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_isnan_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_isneginf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_isreal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_jiterator_4inputs_with_extra_args_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_jiterator_binary_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_kthvalue_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_lerp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_lgamma_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_linalg_cholesky_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_linalg_cond_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_linalg_cross_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_linalg_eig_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_linalg_eigh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_linalg_eigvals_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_linalg_householder_product_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_linalg_inv_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_linalg_lu_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_linalg_matrix_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_linalg_matrix_rank_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_linalg_matrix_rank_hermitian_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_linalg_multi_dot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_linalg_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_linalg_norm_subgradients_at_zero_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_linalg_pinv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_linalg_qr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_linalg_solve_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_linalg_svd_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_linalg_vander_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_linspace_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_log10_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_log2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_log_normal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_log_softmax_with_dtype_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_logaddexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_logdet_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_logical_not_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_logit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_logspace_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_logspace_tensor_overload_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_long_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_mT_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_masked_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_masked_amin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_masked_fill_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_masked_fill_functorch_Scalar_only_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_masked_log_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_masked_logaddexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_masked_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_masked_median_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_masked_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_masked_prod_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_masked_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_masked_select_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_masked_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_masked_var_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_max_binary_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_maximum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_median_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_min_reduction_no_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_minimum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_mm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_msort_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_mul_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_mvlgamma_mvlgamma_p_1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_mvlgamma_mvlgamma_p_5_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nan_to_num_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nansum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_narrow_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_narrow_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_native_dropout_backward_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_native_layer_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_new_empty_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_new_empty_strided_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_new_ones_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nextafter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_adaptive_avg_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_adaptive_avg_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_adaptive_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_alpha_dropout_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_avg_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_avg_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_avg_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_batch_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_bilinear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_binary_cross_entropy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_celu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_conv1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_conv2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_conv2d_stride_groups_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_conv2d_strided_padding_dilation_no_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_conv2d_strided_padding_dilation_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_conv3d_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_cosine_similarity_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_cross_entropy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_ctc_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_dropout2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_dropout3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_dropout_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_embedding_bag_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_embedding_functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_fractional_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_gaussian_nll_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_gelu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_glu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_grid_sample_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_group_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_hardsigmoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_hardswish_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_hardtanh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_hinge_embedding_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_interpolate_bicubic_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_interpolate_bilinear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_interpolate_linear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_interpolate_nearest_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_interpolate_trilinear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_linear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_local_response_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_margin_ranking_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_max_unpool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_max_unpool2d_grad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_mish_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_multi_head_attention_forward_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_multi_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_multilabel_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_nll_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_normalize_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_pad_circular_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_pad_constant_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_pad_reflect_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_pad_replicate_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_pad_replicate_negative_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_poisson_nll_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_prelu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_relu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_rms_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_rrelu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_smooth_l1_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_soft_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_softmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_softshrink_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_softsign_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_upsample_bilinear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_upsample_nearest_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nonzero_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nonzero_static_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_norm_fro_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_norm_nuc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_ops_aten__new_zeros_with_same_feature_meta_functorchonly_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_pca_lowrank_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_polygamma_polygamma_n_1_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_polygamma_polygamma_n_4_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_pow_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_put_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_qr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_quantile_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_rad2deg_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_randint_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_randn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_randn_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_reciprocal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_renorm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_repeat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_reshape_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_resize__cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_resize_as__cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_resolve_conj_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_roll_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_round_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_round_decimals_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_rsub_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_scatter_reduce_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_scatter_reduce_amin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_scatter_reduce_prod_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_scatter_reduce_sum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_select_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_select_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_sgn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_short_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_short_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_signal_windows_cosine_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_signal_windows_hann_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_signal_windows_kaiser_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_signbit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_sin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_sinc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_slice_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_softmax_with_dtype_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_sort_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_sparse_sampled_addmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_special_airy_ai_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_special_bessel_y0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_special_chebyshev_polynomial_t_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_special_chebyshev_polynomial_u_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_special_hermite_polynomial_h_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_special_hermite_polynomial_he_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_special_i0e_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_special_laguerre_polynomial_l_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_special_legendre_polynomial_p_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_special_modified_bessel_i0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_special_ndtri_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_special_shifted_chebyshev_polynomial_u_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_special_shifted_chebyshev_polynomial_w_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_special_zeta_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_split_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_sqrt_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_squeeze_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_stack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_std_mean_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_stft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_sum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_sum_to_size_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_t_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_take_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_tanh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_tensordot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_to_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_to_sparse_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_topk_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_torch_ops_aten__safe_softmax_default_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_trace_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_transpose_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_transpose_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_trapezoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_triangular_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_tril_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_triu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_trunc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_unbind_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_unbind_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_uniform_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_unique_consecutive_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_unique_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_unsafe_chunk_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_var_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_var_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_var_mean_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_var_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_view_as_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_view_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_view_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_where_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_zero__cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_MulGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_NumpyCubeAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_NumpyCubeNotComposableAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_NumpyExpMarkDirtyAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_NumpyMulAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_NumpySortAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_ScaleGradGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_SelectAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_SelectGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_SortGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_ZeroGradientsGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp___getitem___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp___getitem___functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp___rdiv___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp___rmatmul___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp___rmul___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp__batch_norm_with_update_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp__unsafe_masked_index_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp__upsample_bilinear2d_aa_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_addcmul_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_addmm_decomposed_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_addr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_alias_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_amin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_angle_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_any_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_argmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_argwhere_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_as_strided_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_asin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_atan2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_atan_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_atleast_3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_bernoulli_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_bfloat16_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_block_diag_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_broadcast_tensors_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_broadcast_to_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_bucketize_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_byte_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_ceil_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_cfloat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_chalf_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_char_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_cholesky_inverse_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_clamp_min_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_clone_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_combinations_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_conj_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_constant_pad_nd_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_corrcoef_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_cos_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_cosh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_cov_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_cumprod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_cumsum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_cumulative_trapezoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_deg2rad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_diag_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_diag_embed_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_diagonal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_diagonal_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_dist_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_div_floor_rounding_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_double_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_dsplit_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_dstack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_einsum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_empty_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_empty_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_eq_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_equal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_erfc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_exp2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_exp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_expand_as_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_expand_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_expm1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_eye_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_fft_fft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_fft_fft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_fft_hfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_fft_hfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_fft_ifft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_fft_ifft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_fft_ifftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_fft_ifftshift_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_fft_ihfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_fft_ihfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_fft_irfft2_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_fft_irfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_fft_irfftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_fft_rfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_fill_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_flatten_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_flip_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_flipud_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_floor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_floor_divide_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_frac_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_full_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_gather_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_geometric_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_geqrf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_gradient_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_grid_sampler_3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_half_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_hash_tensor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_heaviside_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_hsplit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_hstack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_hypot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_igamma_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_index_copy_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_index_put_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_index_put_functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_index_reduce_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_index_reduce_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_inner_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_int_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_isfinite_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_isin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_isnan_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_isneginf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_item_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_jiterator_4inputs_with_extra_args_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_kron_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_kthvalue_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_le_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_linalg_cholesky_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_linalg_cholesky_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_linalg_eigh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_linalg_eigvals_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_linalg_eigvalsh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_linalg_inv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_linalg_ldl_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_linalg_lstsq_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_linalg_lstsq_grad_oriented_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_linalg_lu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_linalg_lu_factor_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_linalg_lu_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_linalg_matrix_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_linalg_matrix_rank_hermitian_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_linalg_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_linalg_pinv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_linalg_pinv_hermitian_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_linalg_qr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_linalg_slogdet_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_linalg_tensorinv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_linalg_vector_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_linspace_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_linspace_tensor_overload_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_log1p_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_log2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_log_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_log_normal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_log_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_log_softmax_with_dtype_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_logaddexp2_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_logaddexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_logical_not_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_logical_or_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_logit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_logspace_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_logspace_tensor_overload_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_logsumexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_long_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_long_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_lu_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_lu_unpack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_mH_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_mT_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_masked_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_masked_fill_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_masked_median_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_masked_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_masked_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_masked_softmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_masked_sum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_matmul_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_matrix_exp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_max_binary_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_max_pool2d_with_indices_backward_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_max_reduction_no_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_meshgrid_list_of_tensors_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_min_binary_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_minimum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_mm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_mode_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_movedim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_msort_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_mul_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_multinomial_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_mv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_mvlgamma_mvlgamma_p_1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nanmean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nansum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_narrow_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_native_layer_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_new_empty_strided_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_new_zeros_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nextafter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_adaptive_avg_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_adaptive_max_pool1d_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_adaptive_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_adaptive_max_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_avg_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_avg_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_batch_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_binary_cross_entropy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_celu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_conv2d_no_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_conv2d_stride_groups_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_conv2d_stride_no_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_conv2d_stride_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_conv2d_strided_padding_dilation_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_conv_transpose3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_cosine_embedding_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_dropout2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_dropout3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_dropout_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_elu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_embedding_bag_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_fractional_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_gaussian_nll_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_glu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_grid_sample_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_hardshrink_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_hardsigmoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_hardswish_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_hardtanh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_hinge_embedding_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_instance_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_interpolate_area_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_interpolate_bicubic_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_interpolate_bilinear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_interpolate_nearest-exact_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_layer_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_leaky_relu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_max_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_max_unpool1d_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_max_unpool2d_grad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_max_unpool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_max_unpool3d_grad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_mse_loss_functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_multi_head_attention_forward_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_multi_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_multilabel_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_nll_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_pad_replicate_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_pad_replicate_negative_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_pdist_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_pixel_shuffle_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_pixel_unshuffle_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_relu6_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_relu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_silu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_softplus_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_softshrink_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_softsign_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_threshold_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_unfold_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nonzero_static_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_norm_fro_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_norm_inf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_normal_number_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_ones_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_ops_aten_index_put_functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_ormqr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_outer_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_polygamma_polygamma_n_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_pow_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_rad2deg_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_randint_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_randn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_repeat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_reshape_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_resolve_conj_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_resolve_neg_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_roll_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_rot90_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_round_decimals_0_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_round_decimals_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_rsqrt_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_rsub_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_scalar_tensor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_scatter_reduce_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_scatter_reduce_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_searchsorted_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_select_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_select_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_signal_windows_cosine_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_signal_windows_hann_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_signal_windows_kaiser_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_signbit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_sinc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_sinh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_special_airy_ai_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_special_bessel_j0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_special_bessel_j1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_special_bessel_y0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_special_bessel_y1_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_special_chebyshev_polynomial_u_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_special_chebyshev_polynomial_v_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_special_hermite_polynomial_h_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_special_hermite_polynomial_he_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_special_i0e_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_special_i1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_special_i1e_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_special_legendre_polynomial_p_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_special_ndtri_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_special_polygamma_special_polygamma_n_0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_special_scaled_modified_bessel_k1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_special_shifted_chebyshev_polynomial_v_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_special_shifted_chebyshev_polynomial_w_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_special_spherical_bessel_j0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_special_xlog1py_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_special_zeta_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_split_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_square_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_squeeze_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_squeeze_multiple_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_stack_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_std_mean_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_std_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_sub_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_sum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_t_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_take_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_tan_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_tensor_split_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_to_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_torch_ops_aten__efficient_attention_forward_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_torch_ops_aten__safe_softmax_default_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_trace_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_transpose_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_trapezoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_trapz_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_triangular_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_triu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_unflatten_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_unfold_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_unique_consecutive_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_unique_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_unsafe_chunk_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_unsafe_split_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_unsqueeze_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_var_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_vdot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_view_as_complex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_view_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_vsplit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_vstack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_where_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_xlogy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_zero__cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_zeros_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_zeros_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjpvmap_CubeGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjpvmap_MulGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjpvmap_NumpyTakeAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjpvmap_SelectAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjpvmap_SelectGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjpvmap_SortGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_ForwardHasDefaultArgsAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_H_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_NumpyExpMarkDirtyAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_NumpySortAutogradFunction_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_SelectGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap___radd___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap___rdiv___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap___rmatmul___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap___rpow___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap__upsample_bilinear2d_aa_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_addbmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_addcdiv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_addr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_all_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_amin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_any_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_argmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_argsort_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_as_strided_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_as_strided_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_atan2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_atan_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_atanh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_atleast_3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_baddbmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_bfloat16_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_block_diag_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_bool_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_broadcast_tensors_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_byte_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_cartesian_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_cauchy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_cdist_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_cfloat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_chalf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_char_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_char_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_cholesky_inverse_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_cholesky_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_clamp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_clamp_max_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_combinations_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_conj_physical_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_corrcoef_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_cov_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_cross_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_cummax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_cummin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_cumprod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_cumsum_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_cumulative_trapezoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_diag_embed_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_diagonal_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_diagonal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_diagonal_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_digamma_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_dist_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_div_floor_rounding_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_div_no_rounding_mode_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_div_trunc_rounding_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_dot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_dsplit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_dstack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_einsum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_empty_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_eq_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_erfc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_erfinv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_exp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_expm1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_fft_fft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_fft_fft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_fft_fftn_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_fft_hfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_fft_hfftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_fft_ifft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_fft_ihfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_fft_irfftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_fft_rfftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_fill_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_flip_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_float_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_floor_divide_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_fmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_fmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_fmod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_frac_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_frexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_full_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_full_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_geometric_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_geqrf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_gradient_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_grid_sampler_2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_hash_tensor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_hstack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_hypot_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_i0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_igamma_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_index_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_index_put_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_index_put_functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_index_reduce_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_index_reduce_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_index_reduce_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_int_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_isin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_isinf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_isneginf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_isreal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_jiterator_2inputs_2outputs_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_jiterator_unary_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_kthvalue_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_ldexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_lgamma_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_linalg_cross_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_linalg_diagonal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_linalg_eigh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_linalg_eigvals_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_linalg_inv_ex_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_linalg_ldl_factor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_linalg_ldl_factor_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_linalg_lstsq_grad_oriented_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_linalg_lu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_linalg_lu_factor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_linalg_matrix_rank_hermitian_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_linalg_multi_dot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_linalg_pinv_hermitian_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_linalg_qr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_linalg_solve_triangular_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_linalg_svdvals_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_linalg_tensorinv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_linalg_tensorsolve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_linalg_vecdot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_linalg_vector_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_log10_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_log_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_log_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_log_softmax_with_dtype_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_logaddexp2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_logical_not_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_logical_xor_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_logit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_logspace_tensor_overload_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_long_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_lt_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_lu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_mT_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_masked_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_masked_argmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_masked_cumprod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_masked_logsumexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_masked_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_masked_median_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_masked_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_masked_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_masked_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_masked_select_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_masked_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_masked_softmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_masked_std_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_matmul_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_matrix_exp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_median_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_min_binary_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_minimum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_mm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_msort_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_mul_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_mvlgamma_mvlgamma_p_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nanmean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nanquantile_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_narrow_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_neg_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_new_empty_strided_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_new_full_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_new_ones_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nextafter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_adaptive_avg_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_adaptive_avg_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_adaptive_max_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_avg_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_batch_norm_without_cudnn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_bilinear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_binary_cross_entropy_with_logits_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_celu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_channel_shuffle_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_conv2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_conv2d_stride_depthwise_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_conv2d_stride_groups_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_conv2d_stride_padding_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_conv2d_stride_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_conv2d_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_conv_transpose1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_dropout3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_dropout_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_elu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_embedding_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_embedding_functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_fractional_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_glu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_grid_sample_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_group_norm_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_hardshrink_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_hardtanh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_hinge_embedding_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_interpolate_area_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_interpolate_bilinear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_interpolate_linear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_l1_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_layer_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_linear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_local_response_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_logsigmoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_margin_ranking_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_max_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_max_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_max_unpool1d_grad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_max_unpool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_mish_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_mse_loss_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_multi_head_attention_forward_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_multi_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_nll_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_pad_circular_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_pad_constant_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_pad_replicate_negative_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_pairwise_distance_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_poisson_nll_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_relu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_rms_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_rrelu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_silu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_softmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_softmin_with_dtype_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_softshrink_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_tanhshrink_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_threshold_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_triplet_margin_loss_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_upsample_nearest_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nonzero_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_norm_inf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_ones_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_ops_aten__new_zeros_with_same_feature_meta_functorchonly_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_permute_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_polar_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_polygamma_polygamma_n_2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_polygamma_polygamma_n_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_polygamma_polygamma_n_4_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_positive_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_put_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_qr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_quantile_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_ravel_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_real_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_reciprocal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_repeat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_reshape_as_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_reshape_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_resize__cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_resize_as__cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_resolve_neg_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_roll_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_rot90_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_round_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_round_decimals_neg_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_rsub_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_scatter_add_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_scatter_reduce_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_scatter_reduce_amin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_scatter_reduce_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_scatter_reduce_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_select_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_short_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_short_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_sigmoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_signal_windows_blackman_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_signal_windows_cosine_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_signal_windows_exponential_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_sin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_sinc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_sinh_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_slice_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_sparse_mm_reduce_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_special_bessel_j0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_special_bessel_y0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_special_bessel_y1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_special_chebyshev_polynomial_t_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_special_chebyshev_polynomial_u_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_special_chebyshev_polynomial_v_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_special_chebyshev_polynomial_w_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_special_entr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_special_erfcx_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_special_hermite_polynomial_he_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_special_i0e_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_special_i1e_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_special_laguerre_polynomial_l_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_special_modified_bessel_i0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_special_modified_bessel_i1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_special_modified_bessel_k0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_special_scaled_modified_bessel_k0_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_special_spherical_bessel_j0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_special_zeta_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_split_with_sizes_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_split_with_sizes_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_square_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_squeeze_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_stack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_std_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_sum_to_size_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_t_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_tanh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_tensordot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_tile_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_to_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_to_sparse_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_topk_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_torch_ops_aten__efficient_attention_forward_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_trace_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_transpose_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_trapezoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_trapz_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_triangular_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_tril_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_triu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_true_divide_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_unbind_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_unfold_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_unsafe_split_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_unsqueeze_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_var_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_var_mean_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_view_as_complex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_view_as_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_view_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_view_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_where_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_xlogy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_zeros_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmapvmap_ForwardHasDefaultArgsAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmapvmap_MulGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmapvmap_NumpyCubeAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmapvmap_NumpyCubeNotComposableAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmapvmap_NumpyExpMarkDirtyAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmapvmap_NumpyMulAutogradFunction_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmapvmap_NumpySortAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmapvmap_NumpyTakeAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmapvmap_ScaleGradGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmapvmap_SelectAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmapvmap_SortGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_CubeGenVmapAutogradFunction_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_ForwardHasDefaultArgsAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_ForwardHasDefaultArgsAutogradFunction_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_NumpyCubeAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_NumpyCubeAutogradFunction_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_NumpyCubeNotComposableAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_NumpyExpMarkDirtyAutogradFunction_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_NumpySortAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_NumpyTakeAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_NumpyTakeAutogradFunction_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_SelectGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_SortGenVmapAutogradFunction_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_T_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_T_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_ZeroGradientsGenVmapAutogradFunction_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad___radd___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad___rdiv___cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad___rmod___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad___rmod___cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad___rpow___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad___rpow___cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad__batch_norm_with_update_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad__chunk_cat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad__native_batch_norm_legit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad__native_batch_norm_legit_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad__segment_reduce_lengths_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad__segment_reduce_offsets_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad__segment_reduce_offsets_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad__softmax_backward_data_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad__unsafe_masked_index_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad__unsafe_masked_index_put_accumulate_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad__unsafe_masked_index_put_accumulate_cuda_float64, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_abs_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_abs_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_acos_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_acosh_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_add_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_addbmm_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_addcdiv_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_addcmul_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_addmm_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_addmm_decomposed_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_addmv_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_addr_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_alias_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_all_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_allclose_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_amax_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_amin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_amin_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_aminmax_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_any_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_argmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_argmin_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_argsort_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_as_strided_copy_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_as_strided_partial_views_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_asin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_asinh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_atan2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_atan_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_atleast_1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_atleast_1d_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_atleast_2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_atleast_2d_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_atleast_3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_atleast_3d_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_baddbmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_bernoulli_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_bfloat16_functorch_no_channels_last_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_block_diag_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_block_diag_cuda_float64, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_bmm_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_bool_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_bool_functorch_no_channels_last_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_broadcast_shapes_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_broadcast_tensors_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_broadcast_tensors_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_broadcast_to_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_bucketize_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_byte_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_cdist_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_ceil_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_ceil_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_cfloat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_chalf_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_char_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_cholesky_solve_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_clamp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_clamp_max_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_clone_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_column_stack_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_combinations_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_complex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_conj_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_conj_physical_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_constant_pad_nd_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_contiguous_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_cos_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_cross_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_cross_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_cummax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_cummax_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_cumprod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_cumsum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_cumulative_trapezoid_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_deg2rad_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_diag_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_diag_embed_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_diagflat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_diagonal_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_diagonal_scatter_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_diff_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_digamma_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_div_floor_rounding_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_div_no_rounding_mode_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_div_trunc_rounding_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_dot_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_double_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_double_functorch_no_channels_last_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_dsplit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_dsplit_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_dstack_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_einsum_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_empty_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_empty_permuted_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_eq_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_equal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_erf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_erfc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_erfinv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_exp2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_exp2_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_exp_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_exp_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_expand_copy_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_expand_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_expand_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_expm1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_exponential_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_exponential_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_eye_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_eye_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_fft_fft2_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_fft_fft_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_fft_fftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_fft_fftn_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_fft_hfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_fft_hfftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_fft_hfftn_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_fft_ifft2_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_fft_ifft_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_fft_ifftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_fft_ifftn_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_fft_ihfft_cuda_float64, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_fft_ihfftn_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_fft_irfft_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_fft_irfftn_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_fft_rfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_fft_rfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_flatten_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_flatten_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_fliplr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_float_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_float_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_float_functorch_no_channels_last_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_float_power_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_floor_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_floor_divide_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_fmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_fmax_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_fmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_fmin_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_fmod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_fmod_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_frexp_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_full_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_gather_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_gather_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_geometric_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_geometric_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_geqrf_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_gradient_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_gradient_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_grid_sampler_2d_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_gt_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_half_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_half_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_heaviside_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_histc_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_hsplit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_hsplit_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_hstack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_hypot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_i0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_i0_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_igammac_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_index_add_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_index_copy_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_index_put_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_index_reduce_amin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_index_reduce_mean_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_index_select_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_inner_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_int_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_int_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_int_functorch_no_channels_last_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_isclose_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_isin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_isin_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_isinf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_isinf_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_isnan_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_isnan_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_isneginf_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_isposinf_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_isreal_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_isreal_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_jiterator_2inputs_2outputs_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_jiterator_2inputs_2outputs_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_jiterator_4inputs_with_extra_args_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_jiterator_binary_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_jiterator_binary_return_by_ref_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_jiterator_unary_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_jiterator_unary_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_kron_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_kthvalue_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_ldexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_ldexp_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_le_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_lerp_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_lgamma_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_cholesky_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_cholesky_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_cholesky_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_cond_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_det_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_diagonal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_diagonal_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_eig_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_eig_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_eigh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_eigvals_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_eigvalsh_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_householder_product_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_householder_product_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_inv_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_ldl_factor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_ldl_factor_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_ldl_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_ldl_solve_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_lstsq_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_lstsq_grad_oriented_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_lu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_lu_factor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_lu_solve_cuda_float64, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_matrix_power_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_multi_dot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_multi_dot_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_pinv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_pinv_hermitian_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_solve_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_svd_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_svdvals_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_tensorinv_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_tensorsolve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_tensorsolve_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_vecdot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_vecdot_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_vector_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_log10_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_log10_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_log1p_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_log2_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_log_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_log_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_log_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_log_softmax_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_logaddexp2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_logaddexp2_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_logaddexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_logdet_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_logical_not_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_logical_or_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_logical_xor_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_logspace_tensor_overload_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_logsumexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_long_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_long_functorch_no_channels_last_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_lt_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_lt_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_lu_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_mH_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_mH_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_mT_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_masked_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_masked_amax_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_masked_amin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_masked_argmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_masked_argmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_masked_cumprod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_masked_cumprod_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_masked_cumsum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_masked_cumsum_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_masked_fill_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_masked_fill_functorch_Scalar_only_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_masked_log_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_masked_logaddexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_masked_logaddexp_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_masked_logsumexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_masked_mean_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_masked_median_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_masked_norm_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_masked_normalize_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_masked_normalize_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_masked_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_masked_select_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_masked_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_masked_std_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_masked_sum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_masked_sum_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_matmul_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_matmul_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_max_binary_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_max_binary_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_max_pool2d_with_indices_backward_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_max_reduction_with_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_max_reduction_with_dim_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_maximum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_median_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_median_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_meshgrid_list_of_tensors_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_meshgrid_variadic_tensors_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_min_binary_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_min_reduction_no_dim_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_minimum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_minimum_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_mode_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_msort_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_msort_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_mul_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_multinomial_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_multinomial_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_mv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_mvlgamma_mvlgamma_p_1_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_mvlgamma_mvlgamma_p_3_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nan_to_num_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nan_to_num_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nanmean_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nanmedian_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nanmedian_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nansum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nansum_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_narrow_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_narrow_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_native_batch_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_native_batch_norm_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_native_layer_norm_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_ne_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_new_full_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_new_ones_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_new_zeros_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nextafter_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_adaptive_avg_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_adaptive_avg_pool1d_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_adaptive_avg_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_adaptive_avg_pool2d_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_adaptive_avg_pool3d_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_adaptive_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_adaptive_max_pool2d_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_alpha_dropout_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_avg_pool1d_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_avg_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_avg_pool2d_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_avg_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_avg_pool3d_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_batch_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_batch_norm_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_batch_norm_without_cudnn_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_bilinear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_celu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_channel_shuffle_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_conv2d_stride_depthwise_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_conv2d_stride_depthwise_with_bias_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_conv2d_stride_groups_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_conv2d_stride_no_bias_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_conv2d_stride_padding_no_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_conv2d_stride_padding_no_bias_cuda_float64, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_conv2d_stride_padding_with_bias_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_conv2d_stride_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_conv2d_stride_with_bias_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_conv2d_strided_padding_dilation_no_bias_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_conv2d_strided_padding_dilation_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_conv2d_strided_padding_dilation_with_bias_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_conv2d_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_conv3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_conv3d_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_conv_transpose1d_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_conv_transpose2d_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_conv_transpose3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_cosine_embedding_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_cosine_similarity_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_cosine_similarity_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_cross_entropy_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_cross_entropy_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_ctc_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_ctc_loss_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_dropout3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_dropout3d_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_dropout_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_elu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_elu_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_embedding_bag_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_embedding_functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_feature_alpha_dropout_without_train_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_fractional_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_fractional_max_pool2d_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_fractional_max_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_gelu_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_gelu_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_grid_sample_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_grid_sample_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_group_norm_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_hardshrink_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_hardshrink_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_hardswish_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_hardtanh_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_hinge_embedding_loss_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_instance_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_interpolate_bicubic_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_interpolate_bilinear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_interpolate_nearest-exact_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_interpolate_nearest-exact_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_interpolate_nearest_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_interpolate_nearest_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_interpolate_trilinear_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_kl_div_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_l1_loss_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_layer_norm_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_linear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_local_response_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_max_pool3d_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_max_unpool1d_grad_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_max_unpool2d_grad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_mish_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_mse_loss_functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_multi_head_attention_forward_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_multi_margin_loss_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_multilabel_margin_loss_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_multilabel_soft_margin_loss_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_nll_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_normalize_cuda_float64, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_pad_replicate_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_pad_replicate_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_pairwise_distance_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_pdist_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_pixel_shuffle_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_pixel_shuffle_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_pixel_unshuffle_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_pixel_unshuffle_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_prelu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_rrelu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_selu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_soft_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_soft_margin_loss_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_softmin_with_dtype_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_softplus_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_softsign_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_threshold_cuda_float64, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_triplet_margin_with_distance_loss_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_unfold_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_unfold_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_upsample_nearest_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nonzero_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nonzero_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nonzero_static_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_norm_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_norm_inf_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_normal_in_place_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_normal_in_place_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_ones_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_ones_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_ops_aten__new_zeros_with_same_feature_meta_functorchonly_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_ormqr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_ormqr_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_outer_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_pca_lowrank_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_pca_lowrank_cuda_float64, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_permute_copy_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_permute_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_pinverse_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_pinverse_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_polar_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_polar_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_polygamma_polygamma_n_0_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_polygamma_polygamma_n_1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_polygamma_polygamma_n_2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_polygamma_polygamma_n_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_polygamma_polygamma_n_4_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_positive_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_positive_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_prod_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_put_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_put_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_qr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_quantile_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_rad2deg_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_rad2deg_cuda_float64, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_rand_like_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_randint_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_randn_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_randn_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_real_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_reciprocal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_remainder_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_remainder_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_renorm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_repeat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_repeat_interleave_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_reshape_as_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_reshape_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_resize_as__cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_resize_as__cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_resolve_neg_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_roll_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_roll_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_rot90_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_round_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_round_decimals_0_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_round_decimals_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_round_decimals_neg_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_rsqrt_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_rsub_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_rsub_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_scalar_tensor_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_scatter_reduce_amax_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_scatter_reduce_amin_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_scatter_reduce_mean_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_scatter_reduce_sum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_scatter_reduce_sum_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_select_scatter_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_short_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_short_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_short_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_short_functorch_no_channels_last_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_sigmoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_sign_cuda_float64, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_signal_windows_blackman_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_signal_windows_cosine_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_signal_windows_gaussian_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_signal_windows_general_cosine_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_signal_windows_general_hamming_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_signal_windows_general_hamming_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_signal_windows_hamming_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_signal_windows_hamming_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_signal_windows_hann_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_signal_windows_hann_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_signal_windows_kaiser_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_signal_windows_nuttall_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_signal_windows_nuttall_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_signbit_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_sin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_sin_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_sinc_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_sinh_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_slice_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_slice_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_slice_scatter_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_softmax_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_softmax_with_dtype_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_softmax_with_dtype_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_sort_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_sparse_mm_reduce_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_sparse_sampled_addmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_sparse_sampled_addmm_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_airy_ai_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_airy_ai_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_bessel_j0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_bessel_j0_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_bessel_j1_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_bessel_y0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_bessel_y1_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_chebyshev_polynomial_u_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_chebyshev_polynomial_v_cuda_float64, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_entr_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_i0e_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_i1_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_laguerre_polynomial_l_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_laguerre_polynomial_l_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_legendre_polynomial_p_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_legendre_polynomial_p_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_log_ndtr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_modified_bessel_i0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_modified_bessel_i1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_modified_bessel_i1_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_modified_bessel_k0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_modified_bessel_k0_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_modified_bessel_k1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_modified_bessel_k1_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_ndtr_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_ndtri_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_ndtri_cuda_float64, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_polygamma_special_polygamma_n_0_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_scaled_modified_bessel_k0_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_shifted_chebyshev_polynomial_u_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_shifted_chebyshev_polynomial_u_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_shifted_chebyshev_polynomial_v_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_shifted_chebyshev_polynomial_w_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_shifted_chebyshev_polynomial_w_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_spherical_bessel_j0_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_xlog1py_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_zeta_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_zeta_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_split_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_split_list_args_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_split_list_args_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_split_with_sizes_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_split_with_sizes_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_square_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_squeeze_copy_cuda_float64, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_squeeze_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_stack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_std_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_std_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_std_mean_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_std_mean_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_stft_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_sub_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_sum_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_t_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_t_copy_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_take_along_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_take_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_tan_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_tan_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_tanh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_tanh_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_tensordot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_tensordot_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_tile_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_tile_cuda_float64, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_to_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_topk_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_torch_ops_aten__efficient_attention_forward_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_torch_ops_aten__safe_softmax_default_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_trace_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_transpose_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_trapezoid_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_trapz_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_triangular_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_triangular_solve_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_tril_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_tril_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_triu_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_true_divide_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_trunc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_trunc_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_unbind_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_unbind_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_unbind_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_unflatten_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_unfold_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_unfold_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_uniform_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_unique_consecutive_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_unique_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_unsafe_chunk_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_unsafe_split_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_var_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_var_mean_unbiased_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_var_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_vdot_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_view_as_complex_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_view_as_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_view_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_view_copy_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_view_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_view_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_vsplit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_vsplit_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_vstack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_vstack_cuda_float64, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_where_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_xlogy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_zero__cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_zeros_like_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_ForwardHasDefaultArgsAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_MulGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_NumpyCubeAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_NumpyCubeNotComposableAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_NumpyExpMarkDirtyAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_NumpySortAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_SelectAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_SelectGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_ZeroGradientsGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall___getitem___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall___getitem___functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall___rdiv___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall___rmatmul___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall___rmod___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall___rpow___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall___rsub___cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall__segment_reduce_lengths_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall__unsafe_masked_index_put_accumulate_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_abs_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_acosh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_addcdiv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_addcmul_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_addmv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_addr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_all_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_allclose_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_angle_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_any_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_arange_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_argmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_argsort_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_asin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_asinh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_atan_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_atleast_1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_atleast_2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_atleast_3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_bernoulli_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_bfloat16_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_bmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_broadcast_to_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_cartesian_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_cauchy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_ceil_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_cfloat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_chalf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_char_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_clamp_min_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_clone_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_column_stack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_complex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_constant_pad_nd_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_copysign_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_corrcoef_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_cos_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_cosh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_cross_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_cummax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_cummin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_cumsum_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_cumulative_trapezoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_diag_embed_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_diagflat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_diagonal_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_dist_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_div_no_rounding_mode_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_div_trunc_rounding_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_double_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_double_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_dstack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_empty_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_empty_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_empty_permuted_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_empty_strided_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_equal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_erfc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_exp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_expand_as_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_expand_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_fft_fft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_fft_hfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_fft_hfft_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_fft_ifft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_fft_ifftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_fft_ifftshift_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_fft_ihfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_fft_ihfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_fft_rfftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_fill_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_flatten_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_flip_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_float_power_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_floor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_fmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_fmod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_frac_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_full_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_gather_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_geometric_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_geqrf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_grid_sampler_2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_half_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_CubeGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_MulGenVmapAutogradFunction_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_NumpyCubeNotComposableAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_NumpyMulAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_NumpyTakeAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_SelectAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_SelectGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_SortGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_T_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule___rmatmul___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule___rmod___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule__chunk_cat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule__segment_reduce_lengths_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule__softmax_backward_data_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule__unsafe_masked_index_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule__unsafe_masked_index_put_accumulate_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_acosh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_addcdiv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_addcmul_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_addmm_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_addmv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_addr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_alias_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_allclose_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_amin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_aminmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_arange_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_argmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_argmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_argsort_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_as_strided_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_as_strided_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_asin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_atan_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_baddbmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_bernoulli_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_bfloat16_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_block_diag_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_bmm_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_bool_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_bool_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_broadcast_to_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_bucketize_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_byte_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_cartesian_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_cat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_cdouble_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_ceil_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_cholesky_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_clamp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_clamp_max_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_column_stack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_complex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_conj_physical_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_contiguous_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_copysign_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_corrcoef_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_cosh_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_cov_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_cross_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_cummax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_cummin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_cumsum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_deg2rad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_diagflat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_diagonal_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_div_no_rounding_mode_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_div_trunc_rounding_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_double_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_dsplit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_einsum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_empty_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_empty_permuted_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_eq_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_equal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_expand_as_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_expm1_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_exponential_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_eye_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_fft_fft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_fft_hfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_fft_hfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_fft_hfftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_fft_ifft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_fft_ifftshift_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_fft_ihfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_fft_ihfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_fft_irfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_fft_rfftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_fill_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_flip_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_flipud_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_float_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_float_power_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_floor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_floor_divide_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_fmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_fmod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_frac_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_frexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_full_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_full_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_gather_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_geqrf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_gradient_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_grid_sampler_2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_half_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_hash_tensor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_igamma_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_igammac_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_index_add_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_index_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_index_fill_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_index_put_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_index_put_functorch_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_index_reduce_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_index_reduce_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_inner_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_int_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_isclose_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_jiterator_2inputs_2outputs_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_jiterator_binary_return_by_ref_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_kthvalue_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_le_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_lerp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_linalg_det_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_linalg_diagonal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_linalg_eig_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_linalg_eigvalsh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_linalg_householder_product_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_linalg_ldl_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_linalg_lstsq_grad_oriented_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_linalg_lu_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_linalg_lu_factor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_linalg_lu_factor_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_linalg_matrix_rank_hermitian_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_linalg_multi_dot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_linalg_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_linalg_norm_subgradients_at_zero_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_linalg_pinv_hermitian_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_linalg_pinv_singular_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_linalg_slogdet_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_linalg_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_linalg_solve_triangular_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_linalg_svdvals_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_linalg_tensorinv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_linalg_tensorsolve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_linalg_vander_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_linspace_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_log1p_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_log2_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_log_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_log_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_log_softmax_with_dtype_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_logcumsumexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_logdet_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_logical_not_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_logical_xor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_logit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_logsumexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_lt_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_lu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_lu_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_mT_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_masked_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_masked_argmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_masked_cumprod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_masked_cumsum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_masked_fill_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_masked_fill_functorch_Scalar_only_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_masked_logsumexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_masked_median_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_masked_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_masked_select_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_masked_softmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_masked_std_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_masked_var_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_matmul_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_max_binary_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_max_pool2d_with_indices_backward_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_max_reduction_with_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_meshgrid_variadic_tensors_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_min_binary_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_min_reduction_no_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_minimum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_mul_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_multinomial_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_mv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_mvlgamma_mvlgamma_p_1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_mvlgamma_mvlgamma_p_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_mvlgamma_mvlgamma_p_5_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nan_to_num_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nanmean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nansum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_narrow_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_narrow_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_native_layer_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_new_empty_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_adaptive_avg_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_adaptive_avg_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_alpha_dropout_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_avg_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_batch_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_bilinear_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_binary_cross_entropy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_conv2d_stride_groups_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_conv2d_stride_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_conv2d_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_conv3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_cross_entropy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_ctc_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_dropout2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_dropout3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_elu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_embedding_bag_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_embedding_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_fractional_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_gaussian_nll_loss_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_glu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_grid_sample_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_hardsigmoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_hardswish_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_hardtanh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_hinge_embedding_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_instance_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_interpolate_bicubic_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_interpolate_bilinear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_interpolate_linear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_interpolate_nearest-exact_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_interpolate_nearest_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_interpolate_trilinear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_kl_div_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_linear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_logsigmoid_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_margin_ranking_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_max_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_max_unpool1d_grad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_max_unpool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_max_unpool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_mse_loss_functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_multi_head_attention_forward_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_multilabel_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_nll_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_pad_circular_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_pad_replicate_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_pdist_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_pixel_shuffle_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_relu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_rms_norm_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_scaled_dot_product_attention_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_selu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_silu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_smooth_l1_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_softmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_softmin_with_dtype_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_upsample_bilinear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_upsample_nearest_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nonzero_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_norm_inf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_norm_nuc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_normal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_ones_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_ops_aten_index_put_functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_pca_lowrank_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_permute_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_pinverse_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_polygamma_polygamma_n_4_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_put_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_qr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_rad2deg_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_rand_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_ravel_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_real_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_reciprocal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_remainder_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_repeat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_resize_as__cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_resolve_conj_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_resolve_neg_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_round_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_rsub_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_scalar_tensor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_scatter_add_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_scatter_reduce_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_scatter_reduce_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_scatter_reduce_sum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_searchsorted_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_sgn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_signal_windows_bartlett_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_signal_windows_blackman_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_signal_windows_cosine_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_signal_windows_exponential_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_signal_windows_gaussian_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_signal_windows_general_cosine_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_signal_windows_hamming_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_signal_windows_hann_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_signal_windows_kaiser_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_signal_windows_nuttall_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_signbit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_slice_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_softmax_with_dtype_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_sparse_mm_reduce_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_special_bessel_j0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_special_bessel_j1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_special_bessel_y1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_special_chebyshev_polynomial_t_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_special_chebyshev_polynomial_u_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_special_chebyshev_polynomial_v_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_special_chebyshev_polynomial_w_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_special_entr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_special_i1e_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_special_laguerre_polynomial_l_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_special_log_ndtr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_special_modified_bessel_i0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_special_polygamma_special_polygamma_n_0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_special_scaled_modified_bessel_k0_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_special_scaled_modified_bessel_k1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_special_shifted_chebyshev_polynomial_t_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_special_shifted_chebyshev_polynomial_v_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_special_shifted_chebyshev_polynomial_w_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_split_list_args_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_split_with_sizes_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_split_with_sizes_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_squeeze_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_squeeze_multiple_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_stack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_std_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_std_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_stft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_sum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_svd_lowrank_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_take_along_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_tanh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_torch_ops_aten__efficient_attention_forward_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_torch_ops_aten__safe_softmax_default_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_trace_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_transpose_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_triangular_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_tril_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_unbind_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_unfold_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_uniform_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_unique_consecutive_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_unique_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_unsafe_chunk_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_unsqueeze_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_unsqueeze_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_var_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_var_mean_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_var_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_vdot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_view_as_complex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_view_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_vsplit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_vstack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_where_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_hsplit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_hstack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_hypot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_i0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_igamma_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_index_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_index_reduce_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_index_reduce_amin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_index_reduce_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_isfinite_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_isinf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_isnan_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_isneginf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_isposinf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_jiterator_2inputs_2outputs_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_jiterator_binary_return_by_ref_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_kron_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_kthvalue_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_lerp_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_lgamma_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_linalg_cross_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_linalg_eig_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_linalg_eigvalsh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_linalg_ldl_factor_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_linalg_lstsq_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_linalg_lstsq_grad_oriented_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_linalg_matrix_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_linalg_matrix_rank_hermitian_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_linalg_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_linalg_norm_subgradients_at_zero_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_linalg_pinv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_linalg_pinv_hermitian_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_linalg_qr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_linalg_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_linalg_solve_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_linalg_solve_triangular_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_linalg_svd_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_linalg_svdvals_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_linalg_vander_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_linspace_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_linspace_tensor_overload_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_log1p_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_log_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_log_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_log_softmax_with_dtype_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_logaddexp2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_logaddexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_logcumsumexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_logical_not_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_logical_or_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_logical_xor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_logit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_logspace_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_logspace_tensor_overload_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_logsumexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_masked_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_masked_argmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_masked_fill_functorch_Scalar_only_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_masked_logsumexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_masked_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_masked_median_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_masked_normalize_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_masked_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_masked_select_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_masked_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_masked_softmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_masked_std_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_masked_sum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_max_reduction_no_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_maximum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_median_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_meshgrid_list_of_tensors_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_min_binary_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_mm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_msort_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_mvlgamma_mvlgamma_p_1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_mvlgamma_mvlgamma_p_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nan_to_num_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nansum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_native_batch_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_native_layer_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_new_empty_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_new_empty_strided_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_new_full_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_new_zeros_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nextafter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_adaptive_max_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_adaptive_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_adaptive_max_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_avg_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_avg_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_batch_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_batch_norm_without_cudnn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_binary_cross_entropy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_channel_shuffle_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_conv2d_stride_groups_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_conv2d_stride_padding_no_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_conv2d_stride_padding_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_conv3d_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_conv_transpose2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_conv_transpose3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_cosine_similarity_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_ctc_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_dropout2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_dropout_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_elu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_embedding_bag_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_embedding_functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_fractional_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_gaussian_nll_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_glu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_grid_sample_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_group_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_hardtanh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_instance_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_interpolate_bicubic_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_interpolate_nearest_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_interpolate_trilinear_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_kl_div_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_l1_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_leaky_relu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_linear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_local_response_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_margin_ranking_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_max_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_max_unpool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_max_unpool1d_grad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_max_unpool3d_grad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_mish_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_mse_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_multi_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_nll_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_normalize_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_pad_constant_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_pad_reflect_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_pixel_shuffle_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_pixel_unshuffle_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_prelu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_relu6_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_rms_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_scaled_dot_product_attention_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_soft_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_softmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_softshrink_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_tanhshrink_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_threshold_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_triplet_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_unfold_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_upsample_nearest_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nonzero_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nonzero_static_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_normal_in_place_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_ops_aten__new_zeros_with_same_feature_meta_functorchonly_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_ops_aten_index_put_functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_outer_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_permute_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_polygamma_polygamma_n_1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_polygamma_polygamma_n_4_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_pow_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_rand_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_randint_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_randint_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_randn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_randn_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_ravel_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_renorm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_reshape_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_resize__cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_resolve_conj_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_resolve_neg_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_round_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_round_decimals_0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_rsqrt_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_scatter_reduce_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_select_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_select_scatter_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_sigmoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_sign_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_signal_windows_bartlett_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_signal_windows_cosine_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_signal_windows_gaussian_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_signal_windows_nuttall_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_slice_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_slice_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_sort_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_sparse_mm_reduce_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_special_airy_ai_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_special_bessel_y1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_special_entr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_special_hermite_polynomial_he_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_special_i1e_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_special_modified_bessel_k0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_special_modified_bessel_k1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_special_ndtri_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_special_scaled_modified_bessel_k0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_special_scaled_modified_bessel_k1_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_special_shifted_chebyshev_polynomial_w_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_special_xlog1py_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_special_zeta_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_split_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_square_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_squeeze_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_squeeze_multiple_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_std_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_std_mean_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_std_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_stft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_svd_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_svd_lowrank_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_take_along_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_tan_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_to_sparse_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_torch_ops_aten__efficient_attention_forward_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_transpose_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_trapezoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_triangular_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_unbind_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_unbind_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_unfold_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_uniform_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_unique_consecutive_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_unique_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_unsqueeze_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_var_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_var_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_vdot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_view_as_complex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_view_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_vsplit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_CubeGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_H_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_NumpyCubeAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_NumpyCubeNotComposableAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_NumpyExpMarkDirtyAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_NumpySortAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_NumpyTakeAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_ScaleGradGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_SelectGenVmapAutogradFunction_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_ZeroGradientsGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp___getitem___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp___radd___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp___rdiv___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp___rmul___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp___rpow___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp___rsub___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp__segment_reduce_offsets_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_abs_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_acosh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_addbmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_addcmul_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_addmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_addmm_decomposed_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_all_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_amin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_argmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_argsort_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_as_strided_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_asin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_atan2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_atleast_2d_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_bernoulli_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_bfloat16_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_bool_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_broadcast_shapes_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_broadcast_tensors_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_bucketize_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_byte_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_cartesian_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_cauchy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_cdouble_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_cfloat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_char_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_char_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_cholesky_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_chunk_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_clamp_max_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_column_stack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_combinations_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_complex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_conj_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_constant_pad_nd_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_contiguous_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_copysign_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_corrcoef_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_cov_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_cummax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_cummin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_cumsum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_deg2rad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_diag_embed_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_diagflat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_diagonal_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_diagonal_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_digamma_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_div_no_rounding_mode_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_div_trunc_rounding_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_dot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_double_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_dsplit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_einsum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_empty_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_empty_permuted_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_eq_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_erf_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_erfc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_erfinv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_expand_as_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_expand_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_expm1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_exponential_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_eye_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_fft_fft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_fft_hfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_fft_hfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_fft_hfftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_fft_ifft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_fft_ifft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_fft_ifftshift_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_fft_ihfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_fft_irfftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_fft_rfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_fill_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_flip_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_flipud_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_float_power_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_floor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_fmax_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_frac_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_ge_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_geometric_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_geqrf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_gradient_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_grid_sampler_2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_gt_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_half_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_half_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_heaviside_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_hsplit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_hypot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_i0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_index_reduce_amin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_index_reduce_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_inner_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_int_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_isclose_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_isfinite_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_isinf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_isnan_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_isneginf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_isreal_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_item_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_jiterator_2inputs_2outputs_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_ldexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_linalg_cholesky_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_linalg_cholesky_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_linalg_cond_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_linalg_diagonal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_linalg_eigvals_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_linalg_householder_product_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_linalg_ldl_factor_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_linalg_ldl_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_linalg_lstsq_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_linalg_lu_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_linalg_matrix_rank_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_linalg_solve_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_linalg_solve_triangular_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_linalg_svd_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_linalg_tensorsolve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_linalg_vander_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_linalg_vector_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_linspace_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_log2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_log_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_log_normal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_log_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_logcumsumexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_logdet_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_logical_not_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_logical_xor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_logit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_logspace_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_logspace_tensor_overload_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_logsumexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_long_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_lu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_lu_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_lu_unpack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_masked_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_masked_amin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_masked_argmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_masked_fill_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_masked_fill_functorch_Scalar_only_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_masked_log_softmax_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_masked_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_masked_median_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_masked_normalize_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_masked_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_masked_select_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_masked_sum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_masked_var_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_matmul_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_matrix_exp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_max_binary_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_max_pool2d_with_indices_backward_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_max_reduction_no_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_median_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_meshgrid_variadic_tensors_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_min_binary_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_minimum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_mm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_movedim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_msort_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_mv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_mvlgamma_mvlgamma_p_3_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nan_to_num_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nanquantile_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nansum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_narrow_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_narrow_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_native_batch_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_native_dropout_backward_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_ne_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_neg_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_new_empty_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_new_empty_strided_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_new_full_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_new_zeros_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nextafter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_adaptive_avg_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_adaptive_max_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_adaptive_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_adaptive_max_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_batch_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_bilinear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_binary_cross_entropy_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_celu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_channel_shuffle_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_conv1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_conv2d_no_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_conv2d_stride_depthwise_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_conv2d_stride_padding_no_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_conv2d_strided_padding_dilation_no_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_conv2d_strided_padding_dilation_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_conv3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_conv_transpose1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_conv_transpose2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_cosine_similarity_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_cross_entropy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_ctc_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_dropout2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_dropout_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_elu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_embedding_functorch_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_fractional_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_fractional_max_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_gelu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_glu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_hardswish_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_hinge_embedding_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_huber_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_instance_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_interpolate_area_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_interpolate_nearest-exact_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_interpolate_nearest_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_interpolate_trilinear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_l1_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_layer_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_leaky_relu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_linear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_logsigmoid_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_max_unpool1d_grad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_max_unpool2d_grad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_mish_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_mse_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_multi_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_multilabel_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_nll_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_pad_circular_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_pad_reflect_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_pad_replicate_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_pdist_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_pixel_shuffle_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_pixel_unshuffle_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_relu6_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_relu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_rms_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_selu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_silu_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_soft_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_softmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_softmin_with_dtype_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_softplus_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_softsign_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_tanhshrink_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_norm_inf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_norm_nuc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_normal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_normal_number_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_ones_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_ones_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_ops_aten_index_put_functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_outer_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_permute_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_permute_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_polygamma_polygamma_n_0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_polygamma_polygamma_n_1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_polygamma_polygamma_n_2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_polygamma_polygamma_n_3_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_polygamma_polygamma_n_4_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_pow_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_put_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_rad2deg_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_rand_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_randint_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_randint_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_randn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_randn_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_real_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_reciprocal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_repeat_interleave_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_reshape_as_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_resolve_conj_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_round_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_round_decimals_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_rsqrt_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_scalar_tensor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_scatter_add_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_scatter_reduce_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_scatter_reduce_sum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_select_scatter_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_sgn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_short_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_sigmoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_sign_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_signal_windows_exponential_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_signal_windows_gaussian_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_signal_windows_nuttall_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_sin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_sinc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_sinh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_slice_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_slice_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_softmax_with_dtype_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_special_airy_ai_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_special_bessel_j1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_special_bessel_y0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_special_chebyshev_polynomial_t_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_special_entr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_special_hermite_polynomial_h_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_special_hermite_polynomial_he_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_special_i1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_special_i1e_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_special_log_ndtr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_special_modified_bessel_i1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_special_scaled_modified_bessel_k1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_special_shifted_chebyshev_polynomial_v_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_special_shifted_chebyshev_polynomial_w_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_special_zeta_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_split_with_sizes_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_sqrt_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_squeeze_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_squeeze_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_squeeze_multiple_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_std_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_std_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_std_mean_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_sum_to_size_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_svd_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_take_along_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_to_sparse_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_topk_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_torch_ops_aten__efficient_attention_forward_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_transpose_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_transpose_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_trapz_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_triangular_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_trunc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_unflatten_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_unfold_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_uniform_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_unsafe_chunk_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_unsafe_split_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_var_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_var_mean_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_vdot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_view_as_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_view_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_vsplit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_where_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_zero__cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_zeros_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvmap_ForwardHasDefaultArgsAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvmap_NumpyCubeAutogradFunction_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvmap_NumpyCubeNotComposableAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvmap_NumpyMulAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvmap_NumpySortAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvmap_NumpyTakeAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvmap_SortGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvmap_ZeroGradientsGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_NumpyCubeNotComposableAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_NumpyMulAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_NumpyTakeAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_SelectAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_SortGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_T_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp___getitem___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp___rdiv___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp___rmod___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp___rmul___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp___rpow___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp___rsub___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp__batch_norm_with_update_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp__softmax_backward_data_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp__unsafe_masked_index_put_accumulate_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp__upsample_bilinear2d_aa_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_abs_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_acosh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_add_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_addcdiv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_addmm_decomposed_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_addmv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_addr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_allclose_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_aminmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_any_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_argmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_argsort_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_as_strided_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_as_strided_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_as_strided_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_atan_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_atleast_3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_baddbmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_bmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_bool_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_broadcast_tensors_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_bucketize_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_cat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_cauchy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_cdist_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_ceil_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_chalf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_char_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_cholesky_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_cholesky_inverse_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_clamp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_clone_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_combinations_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_complex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_conj_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_conj_physical_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_copysign_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_corrcoef_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_cos_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_cov_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_cummax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_cummin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_cumsum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_cumulative_trapezoid_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_deg2rad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_diag_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_diag_embed_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_diagonal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_diagonal_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_diff_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_digamma_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_dist_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_div_floor_rounding_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_div_no_rounding_mode_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_div_trunc_rounding_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_double_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_dsplit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_dstack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_empty_permuted_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_empty_strided_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_eq_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_equal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_erfc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_expand_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_expand_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_expm1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_fft_fft2_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_fft_fftshift_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_fft_hfftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_fft_ifft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_fft_ihfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_fft_ihfftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_fft_irfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_fft_rfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_fft_rfftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_flatten_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_fliplr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_flipud_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_float_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_float_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_float_power_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_fmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_fmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_frac_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_full_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_gather_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_geometric_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_geqrf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_grid_sampler_2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_half_functorch_no_channels_last_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_ForwardHasDefaultArgsAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_H_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_MulGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_NumpyCubeAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_NumpyExpMarkDirtyAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_NumpySortAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_SelectAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_SelectGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_T_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule___radd___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule___rdiv___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule___rmod___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule___rmul___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule___rpow___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule___rsub___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule__native_batch_norm_legit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule__unsafe_masked_index_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_abs_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_acosh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_add_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_addbmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_addcmul_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_addmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_addmv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_alias_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_all_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_aminmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_angle_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_arange_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_argmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_argsort_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_as_strided_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_as_strided_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_asinh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_atleast_3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_baddbmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_bernoulli_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_bfloat16_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_bfloat16_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_bool_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_broadcast_tensors_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_byte_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_cdouble_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_cfloat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_chalf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_char_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_cholesky_inverse_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_cholesky_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_clamp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_clone_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_column_stack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_complex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_conj_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_constant_pad_nd_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_contiguous_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_copysign_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_corrcoef_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_count_nonzero_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_cov_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_cross_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_cummax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_cumprod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_cumsum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_deg2rad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_diag_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_diag_embed_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_diagflat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_digamma_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_dist_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_div_trunc_rounding_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_dot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_double_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_dstack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_einsum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_empty_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_empty_permuted_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_empty_strided_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_eq_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_equal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_expand_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_expm1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_exponential_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_fft_fft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_fft_fft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_fft_fftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_fft_hfftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_fft_ifftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_fft_ifftshift_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_fft_ihfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_fft_ihfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_fft_irfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_fft_irfftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_fft_rfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_fft_rfftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_flip_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_fliplr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_float_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_float_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_float_power_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_floor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_floor_divide_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_fmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_fmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_fmod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_frac_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_frexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_full_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_geometric_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_gradient_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_grid_sampler_2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_half_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_half_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_histc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_hsplit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_hstack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_igammac_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_index_add_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_index_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_index_reduce_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_index_reduce_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_index_select_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_inner_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_int_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_isnan_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_isposinf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_item_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_jiterator_binary_return_by_ref_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_le_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_lgamma_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_linalg_cholesky_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_linalg_cross_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_linalg_eig_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_linalg_eigh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_linalg_eigvals_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_linalg_householder_product_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_linalg_inv_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_linalg_inv_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_linalg_ldl_factor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_linalg_ldl_factor_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_linalg_lstsq_grad_oriented_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_linalg_lu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_linalg_lu_factor_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_linalg_lu_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_linalg_matrix_rank_hermitian_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_linalg_multi_dot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_linalg_pinv_singular_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_linalg_slogdet_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_linalg_svd_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_linalg_tensorinv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_linalg_tensorsolve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_linalg_vector_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_linspace_tensor_overload_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_log1p_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_log2_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_log_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_log_normal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_logaddexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_logcumsumexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_logical_not_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_logical_xor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_logsumexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_long_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_long_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_lu_unpack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_masked_argmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_masked_cumprod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_masked_cumsum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_masked_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_masked_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_masked_select_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_masked_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_masked_sum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_matmul_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_matrix_exp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_max_reduction_no_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_max_reduction_with_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_maximum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_median_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_min_binary_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_min_reduction_no_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_min_reduction_with_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_msort_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_multinomial_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_mvlgamma_mvlgamma_p_1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_mvlgamma_mvlgamma_p_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_mvlgamma_mvlgamma_p_5_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nan_to_num_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nanmean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nanmedian_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nanquantile_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_narrow_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_native_batch_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_ne_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_new_empty_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_new_empty_strided_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_new_full_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_new_ones_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_new_zeros_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nextafter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_adaptive_avg_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_adaptive_avg_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_adaptive_avg_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_adaptive_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_adaptive_max_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_alpha_dropout_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_avg_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_batch_norm_without_cudnn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_binary_cross_entropy_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_celu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_channel_shuffle_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_conv2d_no_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_conv2d_stride_depthwise_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_conv2d_stride_groups_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_conv2d_stride_padding_no_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_conv3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_conv_transpose1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_conv_transpose3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_cross_entropy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_dropout2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_dropout_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_elu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_fractional_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_gelu_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_glu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_group_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_interpolate_area_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_interpolate_bilinear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_interpolate_nearest-exact_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_interpolate_nearest_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_l1_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_local_response_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_margin_ranking_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_max_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_max_unpool1d_grad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_max_unpool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_max_unpool2d_grad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_multilabel_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_nll_loss_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_pad_replicate_negative_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_poisson_nll_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_relu6_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_relu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_scaled_dot_product_attention_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_silu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_smooth_l1_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_soft_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_softmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_softmin_with_dtype_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_softshrink_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_softsign_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_threshold_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_triplet_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_unfold_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_upsample_bilinear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_upsample_nearest_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_norm_nuc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_normal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_normal_number_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_ones_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_ones_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_outer_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_pca_lowrank_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_permute_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_polar_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_polygamma_polygamma_n_2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_polygamma_polygamma_n_4_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_positive_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_pow_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_qr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_rand_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_randint_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_randn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_randn_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_ravel_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_real_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_renorm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_repeat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_reshape_as_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_reshape_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_resize_as__cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_resolve_conj_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_resolve_neg_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_roll_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_round_decimals_neg_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_rsqrt_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_scalar_tensor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_scatter_add_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_scatter_reduce_amin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_searchsorted_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_select_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_short_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_sigmoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_signal_windows_bartlett_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_signal_windows_blackman_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_signal_windows_cosine_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_signal_windows_general_hamming_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_signal_windows_hann_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_sinc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_slice_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_slice_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_sort_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_sparse_mm_reduce_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_special_airy_ai_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_special_bessel_j0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_special_bessel_y1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_special_chebyshev_polynomial_t_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_special_erfcx_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_special_hermite_polynomial_h_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_special_hermite_polynomial_he_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_special_i0e_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_special_i1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_special_i1e_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_special_log_ndtr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_special_modified_bessel_k1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_special_ndtr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_special_polygamma_special_polygamma_n_0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_special_scaled_modified_bessel_k0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_special_shifted_chebyshev_polynomial_t_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_special_xlog1py_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_split_with_sizes_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_sqrt_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_square_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_squeeze_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_squeeze_multiple_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_std_mean_unbiased_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_std_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_stft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_sub_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_sum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_svd_lowrank_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_t_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_take_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_tensordot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_tile_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_to_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_to_sparse_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_torch_ops_aten__safe_softmax_default_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_trace_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_transpose_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_trapz_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_true_divide_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_unbind_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_unbind_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_unflatten_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_unfold_copy_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_unfold_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_unique_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_unsafe_split_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_unsqueeze_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_var_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_var_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_vdot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_view_as_complex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_view_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_view_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_vsplit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_where_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_xlogy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_zero__cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_zeros_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_i0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_index_fill_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_index_put_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_index_put_functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_index_reduce_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_index_reduce_mean_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_isclose_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_isfinite_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_isin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_isinf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_isnan_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_isposinf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_jiterator_2inputs_2outputs_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_jiterator_binary_return_by_ref_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_kthvalue_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_lgamma_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_linalg_cholesky_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_linalg_cond_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_linalg_diagonal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_linalg_eig_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_linalg_eigh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_linalg_inv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_linalg_inv_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_linalg_ldl_factor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_linalg_ldl_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_linalg_lstsq_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_linalg_lu_factor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_linalg_lu_solve_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_linalg_matrix_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_linalg_matrix_power_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_linalg_matrix_rank_hermitian_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_linalg_norm_subgradients_at_zero_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_linalg_pinv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_linalg_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_linalg_solve_triangular_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_linalg_svd_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_linalg_svdvals_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_linalg_tensorinv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_linalg_tensorsolve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_linalg_vander_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_linalg_vecdot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_log1p_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_log2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_log_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_log_softmax_with_dtype_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_logaddexp2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_logdet_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_logical_and_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_logical_or_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_logit_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_logspace_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_logspace_tensor_overload_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_logsumexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_long_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_lu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_lu_unpack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_masked_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_masked_argmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_masked_cumsum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_masked_fill_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_masked_fill_functorch_Scalar_only_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_masked_logsumexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_masked_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_masked_normalize_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_masked_select_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_masked_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_max_binary_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_max_pool2d_with_indices_backward_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_max_reduction_no_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_maximum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_minimum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_mm_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_mvlgamma_mvlgamma_p_1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_mvlgamma_mvlgamma_p_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nanmean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nanmedian_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nanquantile_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_narrow_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_native_batch_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_native_dropout_backward_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_native_layer_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_neg_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_new_ones_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nextafter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_adaptive_avg_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_adaptive_avg_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_adaptive_avg_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_avg_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_avg_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_avg_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_batch_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_batch_norm_without_cudnn_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_celu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_conv1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_conv2d_stride_depthwise_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_conv2d_stride_groups_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_conv2d_stride_padding_no_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_conv2d_stride_padding_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_conv2d_stride_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_conv2d_strided_padding_dilation_no_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_conv2d_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_conv3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_conv_transpose1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_conv_transpose3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_cosine_embedding_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_ctc_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_dropout2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_dropout_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_embedding_bag_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_fractional_max_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_gelu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_glu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_group_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_hardshrink_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_huber_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_interpolate_bilinear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_interpolate_linear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_kl_div_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_l1_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_linear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_logsigmoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_max_unpool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_max_unpool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_max_unpool3d_grad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_mish_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_mse_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_mse_loss_functorch_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_multi_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_multilabel_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_normalize_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_pad_circular_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_pairwise_distance_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_pdist_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_poisson_nll_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_prelu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_relu6_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_relu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_selu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_silu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_softmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_tanhshrink_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_threshold_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_triplet_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_unfold_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_upsample_nearest_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nonzero_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_norm_inf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_norm_nuc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_normal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_normal_in_place_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_ones_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_ones_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_ops_aten_index_put_functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_ormqr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_outer_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_polar_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_polygamma_polygamma_n_0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_polygamma_polygamma_n_1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_polygamma_polygamma_n_2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_polygamma_polygamma_n_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_polygamma_polygamma_n_4_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_pow_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_rad2deg_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_randint_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_real_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_renorm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_resize__cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_resolve_neg_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_roll_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_round_decimals_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_round_decimals_neg_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_scalar_tensor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_scatter_reduce_amin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_scatter_reduce_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_scatter_reduce_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_searchsorted_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_select_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_select_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_sgn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_sigmoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_signal_windows_cosine_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_signal_windows_exponential_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_signal_windows_general_cosine_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_signal_windows_kaiser_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_signal_windows_nuttall_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_slice_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_sparse_mm_reduce_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_special_bessel_y0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_special_bessel_y1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_special_hermite_polynomial_h_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_special_laguerre_polynomial_l_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_special_modified_bessel_k1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_special_ndtr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_special_polygamma_special_polygamma_n_0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_special_scaled_modified_bessel_k1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_special_xlog1py_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_special_zeta_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_split_with_sizes_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_split_with_sizes_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_square_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_stack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_std_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_stft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_sum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_sum_to_size_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_svd_lowrank_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_t_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_take_along_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_tan_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_tanh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_tensordot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_tile_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_to_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_topk_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_trace_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_transpose_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_triangular_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_triu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_unbind_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_unbind_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_unfold_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_unsafe_chunk_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_unsafe_split_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_unsqueeze_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_unsqueeze_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_view_as_complex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_view_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_view_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_vsplit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_vstack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_xlogy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_ForwardHasDefaultArgsAutogradFunction_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_H_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_NumpyExpMarkDirtyAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_NumpyMulAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_NumpySortAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_NumpyTakeAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_ScaleGradGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_SortGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_T_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp___getitem___functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp___rmatmul___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp___rmod___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp___rmul___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp___rpow___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp__chunk_cat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp__native_batch_norm_legit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp__segment_reduce_offsets_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp__softmax_backward_data_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp__upsample_bilinear2d_aa_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_abs_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_acos_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_addmm_decomposed_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_addr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_alias_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_allclose_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_aminmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_arange_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_argmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_argwhere_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_as_strided_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_as_strided_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_asinh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_atan2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_atan_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_atanh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_atleast_1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_baddbmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_bernoulli_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_bfloat16_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_bfloat16_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_broadcast_tensors_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_bucketize_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_byte_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_cartesian_prod_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_cauchy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_cdouble_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_cfloat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_char_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_cholesky_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_cholesky_inverse_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_cholesky_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_chunk_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_clamp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_clamp_max_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_clone_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_column_stack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_conj_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_conj_physical_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_constant_pad_nd_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_copysign_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_cosh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_count_nonzero_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_cumsum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_diag_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_diagonal_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_dist_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_double_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_dsplit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_dstack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_empty_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_eq_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_erfinv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_expand_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_expm1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_exponential_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_fft_fft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_fft_fftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_fft_fftshift_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_fft_hfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_fft_hfftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_fft_ifftshift_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_fft_irfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_fft_irfftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_fft_rfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_fft_rfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_fill_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_flatten_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_fliplr_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_flipud_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_float_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_floor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_fmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_fmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_frac_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_frexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_full_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_full_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_geometric_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_geqrf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_gradient_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_grid_sampler_2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_gt_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_half_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_igamma_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_index_add_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_index_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_index_put_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_index_put_functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_index_reduce_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_int_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_int_functorch_no_channels_last_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_isclose_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_isin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_isinf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_isnan_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_isneginf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_isposinf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_isreal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_jiterator_2inputs_2outputs_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_jiterator_unary_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_kron_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_kthvalue_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_ldexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_lerp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_lgamma_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_linalg_eigvals_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_linalg_householder_product_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_linalg_ldl_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_linalg_lstsq_grad_oriented_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_linalg_lu_factor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_linalg_lu_factor_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_linalg_matrix_rank_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_linalg_multi_dot_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_linalg_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_linalg_pinv_hermitian_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_linalg_pinv_singular_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_linalg_qr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_linalg_slogdet_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_linalg_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_linalg_solve_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_linalg_svd_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_linalg_tensorinv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_linalg_tensorsolve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_linalg_vander_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_linalg_vecdot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_linalg_vector_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_linspace_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_linspace_tensor_overload_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_log1p_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_log_normal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_logaddexp2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_logaddexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_logical_and_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_logical_not_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_logical_or_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_logical_xor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_logit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_logspace_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_logspace_tensor_overload_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_long_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_lt_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_lu_unpack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_mT_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_masked_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_masked_amin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_masked_argmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_masked_argmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_masked_cumprod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_masked_cumsum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_masked_fill_functorch_Scalar_only_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_masked_log_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_masked_logaddexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_masked_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_masked_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_masked_softmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_masked_std_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_masked_sum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_max_binary_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_max_pool2d_with_indices_backward_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_maximum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_median_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_meshgrid_list_of_tensors_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_meshgrid_variadic_tensors_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_min_reduction_with_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_mode_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_msort_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_mul_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_multinomial_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_mv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_mvlgamma_mvlgamma_p_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nanmedian_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nansum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_narrow_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_native_batch_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_native_dropout_backward_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_native_layer_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_new_empty_strided_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_new_ones_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_adaptive_avg_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_adaptive_avg_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_adaptive_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_alpha_dropout_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_avg_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_avg_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_avg_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_batch_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_batch_norm_without_cudnn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_celu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_conv2d_stride_depthwise_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_conv2d_stride_groups_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_conv2d_stride_no_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_conv2d_stride_padding_no_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_conv2d_stride_padding_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_conv2d_stride_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_conv2d_strided_padding_dilation_no_bias_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_conv2d_strided_padding_dilation_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_conv3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_conv_transpose1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_conv_transpose2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_conv_transpose3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_cosine_embedding_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_cross_entropy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_ctc_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_elu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_embedding_bag_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_embedding_functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_fractional_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_fractional_max_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_glu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_group_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_hardsigmoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_huber_loss_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_instance_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_interpolate_area_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_interpolate_bicubic_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_interpolate_linear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_interpolate_nearest-exact_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_interpolate_nearest_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_interpolate_trilinear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_kl_div_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_leaky_relu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_linear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_local_response_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_max_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_max_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_max_unpool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_max_unpool2d_grad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_max_unpool3d_grad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_mish_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_mse_loss_functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_multi_head_attention_forward_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_multi_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_multilabel_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_nll_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_normalize_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_pad_constant_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_pad_reflect_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_pad_replicate_negative_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_pairwise_distance_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_pixel_shuffle_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_prelu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_relu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_soft_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_softmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_softmin_with_dtype_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_softplus_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_softshrink_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_softsign_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_tanhshrink_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_threshold_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_upsample_nearest_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_norm_fro_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_norm_inf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_norm_nuc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_normal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_ones_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_ones_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_ops_aten__new_zeros_with_same_feature_meta_functorchonly_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_outer_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_pca_lowrank_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_permute_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_polar_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_polygamma_polygamma_n_0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_quantile_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_rand_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_randint_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_randn_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_real_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_renorm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_repeat_interleave_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_reshape_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_resize_as__cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_resolve_conj_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_rot90_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_round_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_round_decimals_0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_rsub_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_scatter_add_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_scatter_reduce_amin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_scatter_reduce_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_scatter_reduce_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_scatter_reduce_sum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_searchsorted_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_sgn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_short_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_short_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_sigmoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_signal_windows_blackman_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_signal_windows_cosine_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_signal_windows_general_cosine_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_signal_windows_general_hamming_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_signal_windows_hamming_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_signal_windows_kaiser_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_signal_windows_nuttall_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_sinc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_sinh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_slice_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_sort_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_special_airy_ai_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_special_bessel_y1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_special_chebyshev_polynomial_w_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_special_hermite_polynomial_h_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_special_hermite_polynomial_he_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_special_i1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_special_laguerre_polynomial_l_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_special_legendre_polynomial_p_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_special_log_ndtr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_special_modified_bessel_i0_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_special_modified_bessel_k0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_special_polygamma_special_polygamma_n_0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_special_scaled_modified_bessel_k0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_special_scaled_modified_bessel_k1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_special_shifted_chebyshev_polynomial_t_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_special_shifted_chebyshev_polynomial_v_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_split_with_sizes_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_split_with_sizes_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_squeeze_multiple_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_stack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_std_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_std_mean_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_std_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_stft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_sub_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_sum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_svd_lowrank_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_t_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_take_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_tanh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_tensor_split_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_tensordot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_topk_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_torch_ops_aten__safe_softmax_default_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_transpose_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_tril_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_trunc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_unbind_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_unbind_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_unfold_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_unfold_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_uniform_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_unique_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_unsafe_split_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_var_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_view_as_complex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_view_as_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_view_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_vsplit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_vstack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_where_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_zero__cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_zeros_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_zeros_like_cuda_float32, 
test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvmap_CubeGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvmap_NumpyExpMarkDirtyAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvmap_NumpyMulAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvmap_NumpySortAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvmap_NumpyTakeAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvmap_ScaleGradGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvmap_SelectGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvmap_SortGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvmapjvp_linalg_solve_cuda 2025-10-10T02:20:41.8543565Z 2025-10-10T02:20:41.9119792Z 2025-10-10T02:20:41.9120733Z optim/test_optim 1/1 was successful, full logs can be found in artifacts with path test/test-reports/optim.test_optim_1.1_54604825053500b8_.log 2025-10-10T02:20:41.9122786Z 2025-10-10T02:20:45.4124651Z Running test_sparse_csr 1/2 ... [2025-10-10 02:20:45.411913] 2025-10-10T02:20:45.4125080Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:20:45.4126355Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_sparse_csr.py', '-m', 'not serial', '--shard-id=1', '--num-shards=2', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:20:45.412296] 2025-10-10T02:20:45.8364021Z Running test_serialization 1/1 ... 
[2025-10-10 02:20:45.835738] 2025-10-10T02:20:45.8364484Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:20:45.8365476Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_serialization.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:20:45.836134] 2025-10-10T02:20:50.4113682Z 2025-10-10T02:20:50.4115190Z test_serialization 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_serialization_1.1_cc7d12474b1a1e9e_.log 2025-10-10T02:20:50.4227384Z Running 203 items in this shard: test/test_serialization.py::TestOldSerialization::test_debug_set_in_ci, test/test_serialization.py::TestOldSerialization::test_load_error_msg, test/test_serialization.py::TestOldSerialization::test_load_nonexistent_device, test/test_serialization.py::TestOldSerialization::test_load_python2_unicode_module, test/test_serialization.py::TestOldSerialization::test_load_unicode_error_msg, test/test_serialization.py::TestOldSerialization::test_pickle_module, test/test_serialization.py::TestOldSerialization::test_safe_load_basic_types, test/test_serialization.py::TestOldSerialization::test_save_different_dtype_error, test/test_serialization.py::TestOldSerialization::test_save_different_dtype_unallocated, test/test_serialization.py::TestOldSerialization::test_serialization, test/test_serialization.py::TestOldSerialization::test_serialization_backwards_compat, test/test_serialization.py::TestOldSerialization::test_serialization_backwards_compat_safe, test/test_serialization.py::TestOldSerialization::test_serialization_container, test/test_serialization.py::TestOldSerialization::test_serialization_container_filelike, test/test_serialization.py::TestOldSerialization::test_serialization_dill, test/test_serialization.py::TestOldSerialization::test_serialization_dill_version_not_supported, 
test/test_serialization.py::TestOldSerialization::test_serialization_fake_zip, test/test_serialization.py::TestOldSerialization::test_serialization_filelike, test/test_serialization.py::TestOldSerialization::test_serialization_filelike_api_requirements, test/test_serialization.py::TestOldSerialization::test_serialization_filelike_exceptions, test/test_serialization.py::TestOldSerialization::test_serialization_filelike_missing_attrs, test/test_serialization.py::TestOldSerialization::test_serialization_filelike_stress, test/test_serialization.py::TestOldSerialization::test_serialization_filelike_uses_readinto, test/test_serialization.py::TestOldSerialization::test_serialization_gzip, test/test_serialization.py::TestOldSerialization::test_serialization_map_location, test/test_serialization.py::TestOldSerialization::test_serialization_offset, test/test_serialization.py::TestOldSerialization::test_serialization_offset_filelike_weights_only_False, test/test_serialization.py::TestOldSerialization::test_serialization_offset_filelike_weights_only_True, test/test_serialization.py::TestOldSerialization::test_serialization_offset_gzip, test/test_serialization.py::TestOldSerialization::test_serialization_safe, test/test_serialization.py::TestOldSerialization::test_serialization_save_warnings, test/test_serialization.py::TestOldSerialization::test_serialization_sparse, test/test_serialization.py::TestOldSerialization::test_serialization_sparse_bsc_invalid, test/test_serialization.py::TestOldSerialization::test_serialization_sparse_bsr_invalid, test/test_serialization.py::TestOldSerialization::test_serialization_sparse_csc_invalid, test/test_serialization.py::TestOldSerialization::test_serialization_sparse_csr_invalid, test/test_serialization.py::TestOldSerialization::test_serialization_sparse_invalid, test/test_serialization.py::TestOldSerialization::test_serialization_sparse_invalid_legacy_ctor, test/test_serialization.py::TestOldSerialization::test_serialization_sparse_safe, 
test/test_serialization.py::TestOldSerialization::test_serialization_storage_slice, test/test_serialization.py::TestOldSerialization::test_serialization_zipfile_utils, test/test_serialization.py::TestOldSerialization::test_serialize_device, test/test_serialization.py::TestOldSerialization::test_skip_data_load, test/test_serialization.py::TestSerialization::test_crc32_options_compute_crc32_False_filename_False, test/test_serialization.py::TestSerialization::test_crc32_options_compute_crc32_False_filename_True, test/test_serialization.py::TestSerialization::test_crc32_options_compute_crc32_True_filename_False, test/test_serialization.py::TestSerialization::test_crc32_options_compute_crc32_True_filename_True, test/test_serialization.py::TestSerialization::test_debug_set_in_ci, test/test_serialization.py::TestSerialization::test_filewriter_metadata_writing_filename_False, test/test_serialization.py::TestSerialization::test_filewriter_metadata_writing_filename_True, test/test_serialization.py::TestSerialization::test_get_unsafe_globals_in_checkpoint, test/test_serialization.py::TestSerialization::test_has_format_version, test/test_serialization.py::TestSerialization::test_load_error_msg, test/test_serialization.py::TestSerialization::test_load_njt_weights_only_should_import_False, test/test_serialization.py::TestSerialization::test_load_njt_weights_only_should_import_True, test/test_serialization.py::TestSerialization::test_load_nonexistent_device, test/test_serialization.py::TestSerialization::test_load_python2_unicode_module, test/test_serialization.py::TestSerialization::test_load_unicode_error_msg, test/test_serialization.py::TestSerialization::test_lr_scheduler_serialization, test/test_serialization.py::TestSerialization::test_meta_serialization_weights_only_False, test/test_serialization.py::TestSerialization::test_meta_serialization_weights_only_True, test/test_serialization.py::TestSerialization::test_mmap_load_offset_calculation_path_type0, 
test/test_serialization.py::TestSerialization::test_mmap_load_offset_calculation_path_type1, test/test_serialization.py::TestSerialization::test_pathlike_serialization_weights_only_False, test/test_serialization.py::TestSerialization::test_pathlike_serialization_weights_only_True, test/test_serialization.py::TestSerialization::test_pickle_module, test/test_serialization.py::TestSerialization::test_safe_load_basic_types, test/test_serialization.py::TestSerialization::test_save_different_dtype_error, test/test_serialization.py::TestSerialization::test_save_different_dtype_unallocated, test/test_serialization.py::TestSerialization::test_save_load_preserves_dtype_bfloat16_weights_only_False, test/test_serialization.py::TestSerialization::test_save_load_preserves_dtype_bfloat16_weights_only_True, test/test_serialization.py::TestSerialization::test_save_load_preserves_dtype_bool_weights_only_False, test/test_serialization.py::TestSerialization::test_save_load_preserves_dtype_bool_weights_only_True, test/test_serialization.py::TestSerialization::test_save_load_preserves_dtype_complex128_weights_only_False, test/test_serialization.py::TestSerialization::test_save_load_preserves_dtype_complex128_weights_only_True, test/test_serialization.py::TestSerialization::test_save_load_preserves_dtype_complex64_weights_only_False, test/test_serialization.py::TestSerialization::test_save_load_preserves_dtype_complex64_weights_only_True, test/test_serialization.py::TestSerialization::test_save_load_preserves_dtype_float16_weights_only_False, test/test_serialization.py::TestSerialization::test_save_load_preserves_dtype_float16_weights_only_True, test/test_serialization.py::TestSerialization::test_save_load_preserves_dtype_float32_weights_only_False, test/test_serialization.py::TestSerialization::test_save_load_preserves_dtype_float32_weights_only_True, test/test_serialization.py::TestSerialization::test_save_load_preserves_dtype_float64_weights_only_False, 
test/test_serialization.py::TestSerialization::test_save_load_preserves_dtype_float64_weights_only_True, test/test_serialization.py::TestSerialization::test_save_load_preserves_dtype_int16_weights_only_False, test/test_serialization.py::TestSerialization::test_save_load_preserves_dtype_int16_weights_only_True, test/test_serialization.py::TestSerialization::test_save_load_preserves_dtype_int32_weights_only_False, test/test_serialization.py::TestSerialization::test_save_load_preserves_dtype_int32_weights_only_True, test/test_serialization.py::TestSerialization::test_save_load_preserves_dtype_int64_weights_only_False, test/test_serialization.py::TestSerialization::test_save_load_preserves_dtype_int64_weights_only_True, test/test_serialization.py::TestSerialization::test_save_load_preserves_dtype_int8_weights_only_False, test/test_serialization.py::TestSerialization::test_save_load_preserves_dtype_int8_weights_only_True, test/test_serialization.py::TestSerialization::test_save_load_preserves_dtype_uint8_weights_only_False, test/test_serialization.py::TestSerialization::test_save_load_preserves_dtype_uint8_weights_only_True, test/test_serialization.py::TestSerialization::test_serialization, test/test_serialization.py::TestSerialization::test_serialization_backwards_compat, test/test_serialization.py::TestSerialization::test_serialization_backwards_compat_safe, test/test_serialization.py::TestSerialization::test_serialization_byte_literal_byte_literals0_weights_only_False, test/test_serialization.py::TestSerialization::test_serialization_byte_literal_byte_literals0_weights_only_True, test/test_serialization.py::TestSerialization::test_serialization_byte_literal_byte_literals1_weights_only_False, test/test_serialization.py::TestSerialization::test_serialization_byte_literal_byte_literals1_weights_only_True, test/test_serialization.py::TestSerialization::test_serialization_byteorder_mark, test/test_serialization.py::TestSerialization::test_serialization_dill, 
test/test_serialization.py::TestSerialization::test_serialization_dill_version_not_supported, test/test_serialization.py::TestSerialization::test_serialization_dtype_complex32_weights_only_False, test/test_serialization.py::TestSerialization::test_serialization_dtype_complex32_weights_only_True, test/test_serialization.py::TestSerialization::test_serialization_dtype_float8_e4m3fn_weights_only_False, test/test_serialization.py::TestSerialization::test_serialization_dtype_float8_e4m3fn_weights_only_True, test/test_serialization.py::TestSerialization::test_serialization_dtype_float8_e5m2_weights_only_False, test/test_serialization.py::TestSerialization::test_serialization_dtype_float8_e5m2_weights_only_True, test/test_serialization.py::TestSerialization::test_serialization_dtype_uint16_weights_only_False, test/test_serialization.py::TestSerialization::test_serialization_dtype_uint16_weights_only_True, test/test_serialization.py::TestSerialization::test_serialization_dtype_uint32_weights_only_False, test/test_serialization.py::TestSerialization::test_serialization_dtype_uint32_weights_only_True, test/test_serialization.py::TestSerialization::test_serialization_dtype_uint64_weights_only_False, test/test_serialization.py::TestSerialization::test_serialization_dtype_uint64_weights_only_True, test/test_serialization.py::TestSerialization::test_serialization_efficient_zerotensor_weights_only_False, test/test_serialization.py::TestSerialization::test_serialization_efficient_zerotensor_weights_only_True, test/test_serialization.py::TestSerialization::test_serialization_fake_zip, test/test_serialization.py::TestSerialization::test_serialization_filelike, test/test_serialization.py::TestSerialization::test_serialization_filelike_api_requirements, test/test_serialization.py::TestSerialization::test_serialization_filelike_exceptions, test/test_serialization.py::TestSerialization::test_serialization_filelike_missing_attrs, 
test/test_serialization.py::TestSerialization::test_serialization_filelike_stress, test/test_serialization.py::TestSerialization::test_serialization_filelike_uses_readinto, test/test_serialization.py::TestSerialization::test_serialization_gzip, test/test_serialization.py::TestSerialization::test_serialization_load_bom_data, test/test_serialization.py::TestSerialization::test_serialization_load_bom_data_bfloat16, test/test_serialization.py::TestSerialization::test_serialization_load_bom_data_bool, test/test_serialization.py::TestSerialization::test_serialization_load_bom_data_cdouble, test/test_serialization.py::TestSerialization::test_serialization_load_bom_data_cfloat, test/test_serialization.py::TestSerialization::test_serialization_load_bom_data_double, test/test_serialization.py::TestSerialization::test_serialization_load_bom_data_float, test/test_serialization.py::TestSerialization::test_serialization_load_bom_data_half, test/test_serialization.py::TestSerialization::test_serialization_load_bom_data_int, test/test_serialization.py::TestSerialization::test_serialization_load_bom_data_int16, test/test_serialization.py::TestSerialization::test_serialization_load_bom_data_int8, test/test_serialization.py::TestSerialization::test_serialization_load_bom_data_long, test/test_serialization.py::TestSerialization::test_serialization_load_bom_data_uint8, test/test_serialization.py::TestSerialization::test_serialization_map_location, test/test_serialization.py::TestSerialization::test_serialization_math_bits_weights_only_False, test/test_serialization.py::TestSerialization::test_serialization_math_bits_weights_only_True, test/test_serialization.py::TestSerialization::test_serialization_mmap_loading, test/test_serialization.py::TestSerialization::test_serialization_mmap_loading_ctx, test/test_serialization.py::TestSerialization::test_serialization_mmap_loading_options_path_type0_weights_only_False, 
test/test_serialization.py::TestSerialization::test_serialization_mmap_loading_options_path_type0_weights_only_True, test/test_serialization.py::TestSerialization::test_serialization_mmap_loading_options_path_type1_weights_only_False, test/test_serialization.py::TestSerialization::test_serialization_mmap_loading_options_path_type1_weights_only_True, test/test_serialization.py::TestSerialization::test_serialization_mmap_loading_with_map_location, test/test_serialization.py::TestSerialization::test_serialization_nested_class, test/test_serialization.py::TestSerialization::test_serialization_offset_gzip, test/test_serialization.py::TestSerialization::test_serialization_python_attr_weights_only_False, test/test_serialization.py::TestSerialization::test_serialization_python_attr_weights_only_True, test/test_serialization.py::TestSerialization::test_serialization_safe, test/test_serialization.py::TestSerialization::test_serialization_save_warnings, test/test_serialization.py::TestSerialization::test_serialization_sparse, test/test_serialization.py::TestSerialization::test_serialization_sparse_bsc_invalid, test/test_serialization.py::TestSerialization::test_serialization_sparse_bsr_invalid, test/test_serialization.py::TestSerialization::test_serialization_sparse_csc_invalid, test/test_serialization.py::TestSerialization::test_serialization_sparse_csr_invalid, test/test_serialization.py::TestSerialization::test_serialization_sparse_invalid, test/test_serialization.py::TestSerialization::test_serialization_sparse_invalid_legacy_ctor, test/test_serialization.py::TestSerialization::test_serialization_sparse_safe, test/test_serialization.py::TestSerialization::test_serialization_storage_slice, test/test_serialization.py::TestSerialization::test_serialization_uintx_intx, test/test_serialization.py::TestSerialization::test_serialization_warning_s390x, test/test_serialization.py::TestSerialization::test_serialization_with_header, 
test/test_serialization.py::TestSerialization::test_serialization_zipfile_actually_jit, test/test_serialization.py::TestSerialization::test_serialization_zipfile_utils, test/test_serialization.py::TestSerialization::test_serialization_zipfile_weights_only_False, test/test_serialization.py::TestSerialization::test_serialization_zipfile_weights_only_True, test/test_serialization.py::TestSerialization::test_serialize_device, test/test_serialization.py::TestSerialization::test_skip_data_load, test/test_serialization.py::TestSerialization::test_skip_data_serialization_error_cases, test/test_serialization.py::TestSerialization::test_skip_data_serialization_materialize_fake_False, test/test_serialization.py::TestSerialization::test_skip_data_serialization_materialize_fake_True, test/test_serialization.py::TestSerialization::test_skip_data_serialization_preserves_views_materialize_fake_False, test/test_serialization.py::TestSerialization::test_skip_data_serialization_preserves_views_materialize_fake_True, test/test_serialization.py::TestSerialization::test_storage_alignment, test/test_serialization.py::TestSerialization::test_use_pinned_memory_for_d2h, test/test_serialization.py::TestSerialization::test_weights_only_assert, test/test_serialization.py::TestSerialization::test_weights_only_blocked_func_error_msg, test/test_serialization.py::TestSerialization::test_weights_only_env_variables_force_weights_only_False, test/test_serialization.py::TestSerialization::test_weights_only_env_variables_force_weights_only_True, test/test_serialization.py::TestSerialization::test_weights_only_error_unsafe_global_False, test/test_serialization.py::TestSerialization::test_weights_only_error_unsafe_global_True, test/test_serialization.py::TestSerialization::test_weights_only_safe_globals_blocklist, test/test_serialization.py::TestSerialization::test_weights_only_safe_globals_build, test/test_serialization.py::TestSerialization::test_weights_only_safe_globals_build_with_slots_slots_all, 
test/test_serialization.py::TestSerialization::test_weights_only_safe_globals_build_with_slots_slots_some, test/test_serialization.py::TestSerialization::test_weights_only_safe_globals_newobj, test/test_serialization.py::TestSerialization::test_weights_only_with_zoneinfo_unpickle_registration_success, test/test_serialization.py::TestSubclassSerialization::test_cloned_deepcopy_requires_grad_False, test/test_serialization.py::TestSubclassSerialization::test_cloned_deepcopy_requires_grad_True, test/test_serialization.py::TestSubclassSerialization::test_empty_class_serialization, test/test_serialization.py::TestSubclassSerialization::test_safe_globals_context_manager_weights_only, test/test_serialization.py::TestSubclassSerialization::test_safe_globals_for_weights_only, test/test_serialization.py::TestSubclassSerialization::test_sets_are_loadable_with_weights_only, test/test_serialization.py::TestSubclassSerialization::test_tensor_subclass_deepcopy, test/test_serialization.py::TestSubclassSerialization::test_tensor_subclass_getstate_overwrite, test/test_serialization.py::TestSubclassSerialization::test_tensor_subclass_map_location, test/test_serialization.py::TestSubclassSerialization::test_tensor_subclass_wrapper_serialization, test/test_serialization.py::TestBothSerializationCUDA::test_serialization_new_format_old_format_compat_weights_only_False_cuda, test/test_serialization.py::TestBothSerializationCUDA::test_serialization_new_format_old_format_compat_weights_only_True_cuda 2025-10-10T02:20:50.4334712Z 2025-10-10T02:20:54.2885337Z Running torch_np/numpy_tests/lib/test_twodim_base 1/1 ... 
[2025-10-10 02:20:54.287901] 2025-10-10T02:20:54.2886674Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:20:54.2889346Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'torch_np/numpy_tests/lib/test_twodim_base.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:20:54.288316] 2025-10-10T02:20:58.3622624Z 2025-10-10T02:20:58.3623549Z torch_np/numpy_tests/lib/test_twodim_base 1/1 was successful, full logs can be found in artifacts with path test/test-reports/torch_np.numpy_tests.lib.test_twodim_base_1.1_f2b6b88ee147953d_.log 2025-10-10T02:20:58.3634053Z Running 34 items in this shard: test/torch_np/numpy_tests/lib/test_twodim_base.py::TestEye::test_2d, test/torch_np/numpy_tests/lib/test_twodim_base.py::TestEye::test_basic, test/torch_np/numpy_tests/lib/test_twodim_base.py::TestEye::test_bool, test/torch_np/numpy_tests/lib/test_twodim_base.py::TestEye::test_diag, test/torch_np/numpy_tests/lib/test_twodim_base.py::TestEye::test_diag2d, test/torch_np/numpy_tests/lib/test_twodim_base.py::TestEye::test_eye_bounds, test/torch_np/numpy_tests/lib/test_twodim_base.py::TestEye::test_order, test/torch_np/numpy_tests/lib/test_twodim_base.py::TestDiag::test_diag_bounds, test/torch_np/numpy_tests/lib/test_twodim_base.py::TestDiag::test_failure, test/torch_np/numpy_tests/lib/test_twodim_base.py::TestDiag::test_fortran_order, test/torch_np/numpy_tests/lib/test_twodim_base.py::TestDiag::test_matrix, test/torch_np/numpy_tests/lib/test_twodim_base.py::TestDiag::test_vector, test/torch_np/numpy_tests/lib/test_twodim_base.py::TestFliplr::test_basic, test/torch_np/numpy_tests/lib/test_twodim_base.py::TestFlipud::test_basic, test/torch_np/numpy_tests/lib/test_twodim_base.py::TestHistogram2d::test_all_outliers, test/torch_np/numpy_tests/lib/test_twodim_base.py::TestHistogram2d::test_asym, 
test/torch_np/numpy_tests/lib/test_twodim_base.py::TestHistogram2d::test_bad_length_x_len_10_y_len_11, test/torch_np/numpy_tests/lib/test_twodim_base.py::TestHistogram2d::test_bad_length_x_len_20_y_len_19, test/torch_np/numpy_tests/lib/test_twodim_base.py::TestHistogram2d::test_binparameter_combination, test/torch_np/numpy_tests/lib/test_twodim_base.py::TestHistogram2d::test_density, test/torch_np/numpy_tests/lib/test_twodim_base.py::TestHistogram2d::test_empty, test/torch_np/numpy_tests/lib/test_twodim_base.py::TestHistogram2d::test_simple, test/torch_np/numpy_tests/lib/test_twodim_base.py::TestTri::test_dtype, test/torch_np/numpy_tests/lib/test_twodim_base.py::TestTri::test_mask_indices, test/torch_np/numpy_tests/lib/test_twodim_base.py::TestTri::test_tril_indices, test/torch_np/numpy_tests/lib/test_twodim_base.py::TestTri::test_tril_triu_dtype, test/torch_np/numpy_tests/lib/test_twodim_base.py::TestTri::test_tril_triu_ndim2, test/torch_np/numpy_tests/lib/test_twodim_base.py::TestTri::test_tril_triu_ndim3, test/torch_np/numpy_tests/lib/test_twodim_base.py::TestTri::test_tril_triu_with_inf, test/torch_np/numpy_tests/lib/test_twodim_base.py::TestTriuIndices::test_triu_indices, test/torch_np/numpy_tests/lib/test_twodim_base.py::TestTrilIndicesFrom::test_exceptions, test/torch_np/numpy_tests/lib/test_twodim_base.py::TestTriuIndicesFrom::test_exceptions, test/torch_np/numpy_tests/lib/test_twodim_base.py::TestVander::test_basic, test/torch_np/numpy_tests/lib/test_twodim_base.py::TestVander::test_dtypes 2025-10-10T02:20:58.3643455Z 2025-10-10T02:21:02.2439993Z Running test_function_schema 1/1 ... 
[2025-10-10 02:21:02.243369] 2025-10-10T02:21:02.2440436Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:21:02.2441749Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_function_schema.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:21:02.243816] 2025-10-10T02:21:06.3168079Z 2025-10-10T02:21:06.3169294Z test_function_schema 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_function_schema_1.1_f798fa934797b399_.log 2025-10-10T02:21:06.3180241Z Running 15 items in this shard: test/test_function_schema.py::TestFunctionSchema::test_backward_compatible_arguments, test/test_function_schema.py::TestFunctionSchema::test_backward_compatible_outputs, test/test_function_schema.py::TestFunctionSchema::test_backward_compatible_structure, test/test_function_schema.py::TestFunctionSchema::test_backward_compatible_with_smart_serialization, test/test_function_schema.py::TestFunctionSchema::test_forward_compatible_arguments_real_use_case, test/test_function_schema.py::TestFunctionSchema::test_forward_compatible_arguments_with_out, test/test_function_schema.py::TestFunctionSchema::test_forward_compatible_arguments_without_out, test/test_function_schema.py::TestFunctionSchema::test_hash_schema, test/test_function_schema.py::TestFunctionSchema::test_out_schema, test/test_function_schema.py::TestFunctionSchema::test_schema_error, test/test_function_schema.py::TestFunctionSchema::test_serialize_and_deserialize, test/test_function_schema.py::TestFunctionSchema::test_string_optional_parameter_default_value, test/test_function_schema.py::TestFunctionSchema::test_sym_int_argument_properly_parsed, test/test_function_schema.py::TestFunctionSchema::test_tensor_list_alias_annotation_properly_parsed, 
test/test_function_schema.py::TestFunctionSchema::test_tensor_option_arguments_properly_parsed 2025-10-10T02:21:06.3188623Z 2025-10-10T02:21:10.1936968Z Running functorch/test_vmap 1/1 ... [2025-10-10 02:21:10.193093] 2025-10-10T02:21:10.1937495Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:21:10.1938613Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'functorch/test_vmap.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:21:10.193501] 2025-10-10T02:21:22.7349017Z 2025-10-10T02:21:22.7350119Z functorch/test_vmap 1/1 was successful, full logs can be found in artifacts with path test/test-reports/functorch.test_vmap_1.1_63b66feb7af2a617_.log 2025-10-10T02:21:22.8239501Z Running 2140 items in this shard: test/functorch/test_vmap.py::TestVmapAPI::test_accepts_nested_inputs, test/functorch/test_vmap.py::TestVmapAPI::test_backward_unsupported_interaction, test/functorch/test_vmap.py::TestVmapAPI::test_batch_rule_does_not_need_to_handle_no_batched_input, test/functorch/test_vmap.py::TestVmapAPI::test_batched_gradient_basic, test/functorch/test_vmap.py::TestVmapAPI::test_checkpoint, test/functorch/test_vmap.py::TestVmapAPI::test_constant_function, test/functorch/test_vmap.py::TestVmapAPI::test_data_attribute, test/functorch/test_vmap.py::TestVmapAPI::test_data_dependent_control_flow_throws, test/functorch/test_vmap.py::TestVmapAPI::test_decomposition_under_python_dispatcher, test/functorch/test_vmap.py::TestVmapAPI::test_different_map_dim_size_raises, test/functorch/test_vmap.py::TestVmapAPI::test_fallback_does_not_warn_by_default, test/functorch/test_vmap.py::TestVmapAPI::test_fallback_masked_fill, test/functorch/test_vmap.py::TestVmapAPI::test_fallback_multiple_returns, test/functorch/test_vmap.py::TestVmapAPI::test_fallback_warning, 
test/functorch/test_vmap.py::TestVmapAPI::test_fallback_warns_when_warnings_are_enabled, test/functorch/test_vmap.py::TestVmapAPI::test_fallback_with_undefined_grad, test/functorch/test_vmap.py::TestVmapAPI::test_fallback_zero_dim, test/functorch/test_vmap.py::TestVmapAPI::test_func_with_no_inputs, test/functorch/test_vmap.py::TestVmapAPI::test_func_with_no_tensors, test/functorch/test_vmap.py::TestVmapAPI::test_functools_partial, test/functorch/test_vmap.py::TestVmapAPI::test_grad_unsupported_interaction, test/functorch/test_vmap.py::TestVmapAPI::test_in_dim_not_in_tensor_err_msg, test/functorch/test_vmap.py::TestVmapAPI::test_in_dims_wrong_type_err_msg, test/functorch/test_vmap.py::TestVmapAPI::test_inplace_fallback_nary_different_levels, test/functorch/test_vmap.py::TestVmapAPI::test_inplace_fallback_nary_same_levels, test/functorch/test_vmap.py::TestVmapAPI::test_inplace_fallback_unary, test/functorch/test_vmap.py::TestVmapAPI::test_integer_in_dim_but_not_tensor_input_err_msg, test/functorch/test_vmap.py::TestVmapAPI::test_item_throws, test/functorch/test_vmap.py::TestVmapAPI::test_multiple_inputs, test/functorch/test_vmap.py::TestVmapAPI::test_multiple_out_dims, test/functorch/test_vmap.py::TestVmapAPI::test_multiple_outputs, test/functorch/test_vmap.py::TestVmapAPI::test_multiple_outputs2, test/functorch/test_vmap.py::TestVmapAPI::test_nested_negative_in_dims, test/functorch/test_vmap.py::TestVmapAPI::test_nested_non_default_in_dims, test/functorch/test_vmap.py::TestVmapAPI::test_nested_out_dims, test/functorch/test_vmap.py::TestVmapAPI::test_nested_with_diag_embed, test/functorch/test_vmap.py::TestVmapAPI::test_nested_with_different_map_dim, test/functorch/test_vmap.py::TestVmapAPI::test_nested_with_same_map_dim, test/functorch/test_vmap.py::TestVmapAPI::test_nn_module, test/functorch/test_vmap.py::TestVmapAPI::test_non_default_in_dims_out_dims, test/functorch/test_vmap.py::TestVmapAPI::test_non_tensor_output_raises, 
test/functorch/test_vmap.py::TestVmapAPI::test_non_zero_in_dims, test/functorch/test_vmap.py::TestVmapAPI::test_none_in_dims, test/functorch/test_vmap.py::TestVmapAPI::test_nonzero_out_dims, test/functorch/test_vmap.py::TestVmapAPI::test_noop_in_inner_vmap, test/functorch/test_vmap.py::TestVmapAPI::test_not_enough_in_dims_err_msg, test/functorch/test_vmap.py::TestVmapAPI::test_out_dim_out_of_bounds_err_msg, test/functorch/test_vmap.py::TestVmapAPI::test_out_dims_and_num_outputs_mismatch_err_msg, test/functorch/test_vmap.py::TestVmapAPI::test_out_dims_edge_case, test/functorch/test_vmap.py::TestVmapAPI::test_out_dims_must_be_int_or_collection_of_int_err_msg, test/functorch/test_vmap.py::TestVmapAPI::test_out_dims_none, test/functorch/test_vmap.py::TestVmapAPI::test_out_dims_none_tuple, test/functorch/test_vmap.py::TestVmapAPI::test_out_dims_normal_tensor, test/functorch/test_vmap.py::TestVmapAPI::test_pytree_odict_returns, test/functorch/test_vmap.py::TestVmapAPI::test_pytree_returns, test/functorch/test_vmap.py::TestVmapAPI::test_pytree_returns_broadcast_nested, test/functorch/test_vmap.py::TestVmapAPI::test_pytree_returns_broadcast_simple, test/functorch/test_vmap.py::TestVmapAPI::test_pytree_returns_outdims, test/functorch/test_vmap.py::TestVmapAPI::test_reshape_dim_into, test/functorch/test_vmap.py::TestVmapAPI::test_reshape_dim_outof, test/functorch/test_vmap.py::TestVmapAPI::test_restore_vmap_no_vmapped_inputs, test/functorch/test_vmap.py::TestVmapAPI::test_restore_vmap_pytree_input_output, test/functorch/test_vmap.py::TestVmapAPI::test_restore_vmap_unexpanded_outputs, test/functorch/test_vmap.py::TestVmapAPI::test_single_input, test/functorch/test_vmap.py::TestVmapAPI::test_unsupported_op_err_msg, test/functorch/test_vmap.py::TestVmapAPI::test_vmap_autocast_cpu, test/functorch/test_vmap.py::TestVmapAPI::test_vmap_autocast_cuda, test/functorch/test_vmap.py::TestVmapOperators::test_T_numpy, 
test/functorch/test_vmap.py::TestVmapOperators::test_adaptive_avg_pool2d, test/functorch/test_vmap.py::TestVmapOperators::test_argmax_dim, test/functorch/test_vmap.py::TestVmapOperators::test_arithmetic_add, test/functorch/test_vmap.py::TestVmapOperators::test_arithmetic_add_dunder, test/functorch/test_vmap.py::TestVmapOperators::test_arithmetic_div, test/functorch/test_vmap.py::TestVmapOperators::test_arithmetic_div_dunder, test/functorch/test_vmap.py::TestVmapOperators::test_arithmetic_mul, test/functorch/test_vmap.py::TestVmapOperators::test_arithmetic_mul_dunder, test/functorch/test_vmap.py::TestVmapOperators::test_arithmetic_pow, test/functorch/test_vmap.py::TestVmapOperators::test_arithmetic_pow_dunder, test/functorch/test_vmap.py::TestVmapOperators::test_arithmetic_sub, test/functorch/test_vmap.py::TestVmapOperators::test_arithmetic_sub_dunder, test/functorch/test_vmap.py::TestVmapOperators::test_as_strided, test/functorch/test_vmap.py::TestVmapOperators::test_bmm, test/functorch/test_vmap.py::TestVmapOperators::test_cat, test/functorch/test_vmap.py::TestVmapOperators::test_chunk, test/functorch/test_vmap.py::TestVmapOperators::test_chunk_vmap_in_dim_0_out_dim_0_randomness_error, test/functorch/test_vmap.py::TestVmapOperators::test_chunk_vmap_in_dim_0_out_dim_0_randomness_same, test/functorch/test_vmap.py::TestVmapOperators::test_chunk_vmap_in_dim_0_out_dim_1_randomness_error, test/functorch/test_vmap.py::TestVmapOperators::test_chunk_vmap_in_dim_0_out_dim_1_randomness_same, test/functorch/test_vmap.py::TestVmapOperators::test_chunk_vmap_in_dim_0_out_dim_2_randomness_error, test/functorch/test_vmap.py::TestVmapOperators::test_chunk_vmap_in_dim_0_out_dim_2_randomness_same, test/functorch/test_vmap.py::TestVmapOperators::test_chunk_vmap_in_dim_1_out_dim_0_randomness_error, test/functorch/test_vmap.py::TestVmapOperators::test_chunk_vmap_in_dim_1_out_dim_0_randomness_same, 
test/functorch/test_vmap.py::TestVmapOperators::test_chunk_vmap_in_dim_1_out_dim_1_randomness_error, test/functorch/test_vmap.py::TestVmapOperators::test_chunk_vmap_in_dim_1_out_dim_1_randomness_same, test/functorch/test_vmap.py::TestVmapOperators::test_chunk_vmap_in_dim_1_out_dim_2_randomness_error, test/functorch/test_vmap.py::TestVmapOperators::test_chunk_vmap_in_dim_1_out_dim_2_randomness_same, test/functorch/test_vmap.py::TestVmapOperators::test_chunk_vmap_in_dim_2_out_dim_0_randomness_error, test/functorch/test_vmap.py::TestVmapOperators::test_chunk_vmap_in_dim_2_out_dim_0_randomness_same, test/functorch/test_vmap.py::TestVmapOperators::test_chunk_vmap_in_dim_2_out_dim_1_randomness_error, test/functorch/test_vmap.py::TestVmapOperators::test_chunk_vmap_in_dim_2_out_dim_1_randomness_same, test/functorch/test_vmap.py::TestVmapOperators::test_chunk_vmap_in_dim_2_out_dim_2_randomness_error, test/functorch/test_vmap.py::TestVmapOperators::test_chunk_vmap_in_dim_2_out_dim_2_randomness_same, test/functorch/test_vmap.py::TestVmapOperators::test_clamp, test/functorch/test_vmap.py::TestVmapOperators::test_clamp_inplace_variant_clamp_max_, test/functorch/test_vmap.py::TestVmapOperators::test_clamp_inplace_variant_clamp_min_, test/functorch/test_vmap.py::TestVmapOperators::test_clamp_variant_clamp_max, test/functorch/test_vmap.py::TestVmapOperators::test_clamp_variant_clamp_min, test/functorch/test_vmap.py::TestVmapOperators::test_clone, test/functorch/test_vmap.py::TestVmapOperators::test_comparison_ops, test/functorch/test_vmap.py::TestVmapOperators::test_conj, test/functorch/test_vmap.py::TestVmapOperators::test_conj_bit, test/functorch/test_vmap.py::TestVmapOperators::test_contiguous, test/functorch/test_vmap.py::TestVmapOperators::test_conv2d, test/functorch/test_vmap.py::TestVmapOperators::test_copy_, test/functorch/test_vmap.py::TestVmapOperators::test_cross_batch_size_three, test/functorch/test_vmap.py::TestVmapOperators::test_diagonal, 
test/functorch/test_vmap.py::TestVmapOperators::test_dot, test/functorch/test_vmap.py::TestVmapOperators::test_expand_as, test/functorch/test_vmap.py::TestVmapOperators::test_fill_and_zero_inplace, test/functorch/test_vmap.py::TestVmapOperators::test_imag, test/functorch/test_vmap.py::TestVmapOperators::test_is_complex, test/functorch/test_vmap.py::TestVmapOperators::test_is_contiguous, test/functorch/test_vmap.py::TestVmapOperators::test_is_floating_point, test/functorch/test_vmap.py::TestVmapOperators::test_mean, test/functorch/test_vmap.py::TestVmapOperators::test_mean_dim, test/functorch/test_vmap.py::TestVmapOperators::test_mm, test/functorch/test_vmap.py::TestVmapOperators::test_mode_key, test/functorch/test_vmap.py::TestVmapOperators::test_movedim, test/functorch/test_vmap.py::TestVmapOperators::test_mv, test/functorch/test_vmap.py::TestVmapOperators::test_narrow, test/functorch/test_vmap.py::TestVmapOperators::test_new_empty, test/functorch/test_vmap.py::TestVmapOperators::test_new_empty_strided, test/functorch/test_vmap.py::TestVmapOperators::test_new_zeros, test/functorch/test_vmap.py::TestVmapOperators::test_nll_loss, test/functorch/test_vmap.py::TestVmapOperators::test_one_hot, test/functorch/test_vmap.py::TestVmapOperators::test_real, test/functorch/test_vmap.py::TestVmapOperators::test_repeat, test/functorch/test_vmap.py::TestVmapOperators::test_reshape, test/functorch/test_vmap.py::TestVmapOperators::test_reshape_as, test/functorch/test_vmap.py::TestVmapOperators::test_result_type, test/functorch/test_vmap.py::TestVmapOperators::test_roll_no_dims, test/functorch/test_vmap.py::TestVmapOperators::test_select, test/functorch/test_vmap.py::TestVmapOperators::test_silu_backward, test/functorch/test_vmap.py::TestVmapOperators::test_slice, test/functorch/test_vmap.py::TestVmapOperators::test_slogdet, test/functorch/test_vmap.py::TestVmapOperators::test_split, test/functorch/test_vmap.py::TestVmapOperators::test_squeeze, 
test/functorch/test_vmap.py::TestVmapOperators::test_stack, test/functorch/test_vmap.py::TestVmapOperators::test_stride, test/functorch/test_vmap.py::TestVmapOperators::test_sum, test/functorch/test_vmap.py::TestVmapOperators::test_sum_dim, test/functorch/test_vmap.py::TestVmapOperators::test_t, test/functorch/test_vmap.py::TestVmapOperators::test_tensor_split, test/functorch/test_vmap.py::TestVmapOperators::test_to, test/functorch/test_vmap.py::TestVmapOperators::test_trace, test/functorch/test_vmap.py::TestVmapOperators::test_transpose, test/functorch/test_vmap.py::TestVmapOperators::test_unary_pointwise_abs, test/functorch/test_vmap.py::TestVmapOperators::test_unary_pointwise_acos, test/functorch/test_vmap.py::TestVmapOperators::test_unary_pointwise_asin, test/functorch/test_vmap.py::TestVmapOperators::test_unary_pointwise_atan, test/functorch/test_vmap.py::TestVmapOperators::test_unary_pointwise_ceil, test/functorch/test_vmap.py::TestVmapOperators::test_unary_pointwise_cos, test/functorch/test_vmap.py::TestVmapOperators::test_unary_pointwise_cosh, test/functorch/test_vmap.py::TestVmapOperators::test_unary_pointwise_digamma, test/functorch/test_vmap.py::TestVmapOperators::test_unary_pointwise_exp, test/functorch/test_vmap.py::TestVmapOperators::test_unary_pointwise_expm1, test/functorch/test_vmap.py::TestVmapOperators::test_unary_pointwise_floor, test/functorch/test_vmap.py::TestVmapOperators::test_unary_pointwise_frac, test/functorch/test_vmap.py::TestVmapOperators::test_unary_pointwise_lgamma, test/functorch/test_vmap.py::TestVmapOperators::test_unary_pointwise_log, test/functorch/test_vmap.py::TestVmapOperators::test_unary_pointwise_log10, test/functorch/test_vmap.py::TestVmapOperators::test_unary_pointwise_log1p, test/functorch/test_vmap.py::TestVmapOperators::test_unary_pointwise_log2, test/functorch/test_vmap.py::TestVmapOperators::test_unary_pointwise_neg, test/functorch/test_vmap.py::TestVmapOperators::test_unary_pointwise_reciprocal, 
test/functorch/test_vmap.py::TestVmapOperators::test_unary_pointwise_relu, test/functorch/test_vmap.py::TestVmapOperators::test_unary_pointwise_round, test/functorch/test_vmap.py::TestVmapOperators::test_unary_pointwise_rsqrt, test/functorch/test_vmap.py::TestVmapOperators::test_unary_pointwise_sigmoid, test/functorch/test_vmap.py::TestVmapOperators::test_unary_pointwise_sign, test/functorch/test_vmap.py::TestVmapOperators::test_unary_pointwise_sin, test/functorch/test_vmap.py::TestVmapOperators::test_unary_pointwise_sinh, test/functorch/test_vmap.py::TestVmapOperators::test_unary_pointwise_sqrt, test/functorch/test_vmap.py::TestVmapOperators::test_unary_pointwise_tan, test/functorch/test_vmap.py::TestVmapOperators::test_unary_pointwise_tanh, test/functorch/test_vmap.py::TestVmapOperators::test_unary_pointwise_trunc, test/functorch/test_vmap.py::TestVmapOperators::test_unbind, test/functorch/test_vmap.py::TestVmapOperators::test_unfold, test/functorch/test_vmap.py::TestVmapOperators::test_unsafe_view, test/functorch/test_vmap.py::TestVmapOperators::test_unsqueeze, test/functorch/test_vmap.py::TestVmapOperators::test_view, test/functorch/test_vmap.py::TestVmapOperators::test_view_as, test/functorch/test_vmap.py::TestVmapOperators::test_view_as_complex, test/functorch/test_vmap.py::TestVmapOperators::test_view_as_real, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_composition_in_dim_0_out_dim_0_randomness_error, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_composition_in_dim_0_out_dim_0_randomness_same, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_composition_in_dim_0_out_dim_1_randomness_error, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_composition_in_dim_0_out_dim_1_randomness_same, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_composition_in_dim_1_out_dim_0_randomness_error, 
test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_composition_in_dim_1_out_dim_0_randomness_same, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_composition_in_dim_1_out_dim_1_randomness_error, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_composition_in_dim_1_out_dim_1_randomness_same, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_error_in_dim_0_out_dim_0_randomness_error, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_error_in_dim_0_out_dim_0_randomness_same, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_error_in_dim_0_out_dim_1_randomness_error, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_error_in_dim_0_out_dim_1_randomness_same, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_error_in_dim_1_out_dim_0_randomness_error, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_error_in_dim_1_out_dim_0_randomness_same, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_error_in_dim_1_out_dim_1_randomness_error, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_error_in_dim_1_out_dim_1_randomness_same, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_in_dim_0_out_dim_0_randomness_error, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_in_dim_0_out_dim_0_randomness_same, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_in_dim_0_out_dim_1_randomness_error, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_in_dim_0_out_dim_1_randomness_same, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_in_dim_0_out_dim_2_randomness_error, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_in_dim_0_out_dim_2_randomness_same, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_in_dim_1_out_dim_0_randomness_error, 
test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_in_dim_1_out_dim_0_randomness_same, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_in_dim_1_out_dim_1_randomness_error, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_in_dim_1_out_dim_1_randomness_same, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_in_dim_1_out_dim_2_randomness_error, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_in_dim_1_out_dim_2_randomness_same, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_in_dim_2_out_dim_0_randomness_error, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_in_dim_2_out_dim_0_randomness_same, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_in_dim_2_out_dim_1_randomness_error, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_in_dim_2_out_dim_1_randomness_same, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_in_dim_2_out_dim_2_randomness_error, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_chunksize_in_dim_2_out_dim_2_randomness_same, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_fallback_check, test/functorch/test_vmap.py::TestVmapOperators::test_vmap_fallback_check_ok, test/functorch/test_vmap.py::TestVmapOperators::test_weird_matmul_case, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_0d_tensor_index_put_inplace_False_cuda, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_0d_tensor_index_put_inplace_True_cuda, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_advanced_indexing_cuda, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_batch_norm_training_False_track_running_stats_False_affine_False_cuda, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_batch_norm_training_False_track_running_stats_False_affine_True_cuda, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_batch_norm_training_False_track_running_stats_True_affine_False_cuda, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_batch_norm_training_False_track_running_stats_True_affine_True_cuda, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_batch_norm_training_True_track_running_stats_False_affine_False_cuda, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_batch_norm_training_True_track_running_stats_False_affine_True_cuda, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_batch_norm_training_True_track_running_stats_True_affine_False_cuda, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_batch_norm_training_True_track_running_stats_True_affine_True_cuda, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_conv_double_backward_cuda, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_fill__Tensor_cuda, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_flatten_cuda, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_foo_like_cuda, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_group_norm_cuda, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_index_fill_cuda, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_index_put_cuda, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_inplace_on_view_cuda, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_isinf_cuda, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_isnan_cuda, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_linalg_eigh_cuda, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_linalg_svd_cuda, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_namedtuple_returns_cuda, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_nested_advanced_indexing_cuda, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_CubeGenVmapAutogradFunction_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_ForwardHasDefaultArgsAutogradFunction_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_H_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_MulGenVmapAutogradFunction_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_NumpyCatCustomOp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_NumpyCubeAutogradFunction_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_NumpyCubeCustomOp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_NumpyCubeNotComposableAutogradFunction_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_NumpyExpMarkDirtyAutogradFunction_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_NumpyMulAutogradFunction_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_NumpyMulCustomOp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_NumpyMulScalarCustomOp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_NumpyNMSCustomOp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_NumpyNonzeroCustomOp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_NumpySortAutogradFunction_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_NumpySortCustomOp_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_NumpySplitCopyCustomOp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_NumpySplitCopyWithIntCustomOp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_NumpyTakeAutogradFunction_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_NumpyTakeCustomOp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_NumpyViewCopyCustomOp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_ScaleGradGenVmapAutogradFunction_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_SelectAutogradFunction_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_SelectGenVmapAutogradFunction_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_SortGenVmapAutogradFunction_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_T_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_ZeroGradientsGenVmapAutogradFunction_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule___getitem___cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule___getitem___functorch_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule___radd___cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule___rand___cuda_int64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule___rdiv___cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule___rmatmul___cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule___rmod___cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule___rmul___cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule___ror___cuda_int64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule___rpow___cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule___rsub___cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule___rxor___cuda_int64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule__batch_norm_with_update_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule__chunk_cat_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule__native_batch_norm_legit_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule__segment_reduce_lengths_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule__segment_reduce_offsets_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule__softmax_backward_data_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule__unsafe_masked_index_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule__unsafe_masked_index_put_accumulate_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule__upsample_bilinear2d_aa_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_abs_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_acos_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_acosh_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_add_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_addbmm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_addcdiv_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_addcmul_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_addmm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_addmm_decomposed_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_addmv_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_addr_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_alias_copy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_all_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_allclose_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_amax_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_amin_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_aminmax_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_angle_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_any_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_arange_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_argmax_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_argmin_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_argsort_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_argwhere_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_as_strided_copy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_as_strided_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_as_strided_partial_views_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_as_strided_scatter_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_asin_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_asinh_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_atan2_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_atan_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_atanh_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_atleast_1d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_atleast_2d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_atleast_3d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_baddbmm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_bernoulli_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_bfloat16_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_bfloat16_functorch_no_channels_last_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_bincount_cuda_int64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_bitwise_and_cuda_int64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_bitwise_left_shift_cuda_int64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_bitwise_not_cuda_int64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_bitwise_or_cuda_int64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_bitwise_right_shift_cuda_int64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_bitwise_xor_cuda_int64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_block_diag_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_bmm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_bool_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_bool_functorch_no_channels_last_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_broadcast_shapes_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_broadcast_tensors_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_broadcast_to_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_bucketize_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_byte_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_byte_functorch_no_channels_last_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_cartesian_prod_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_cat_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_cauchy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_cdist_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_cdouble_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_ceil_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_cfloat_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_chalf_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_char_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_char_functorch_no_channels_last_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_cholesky_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_cholesky_inverse_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_cholesky_solve_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_chunk_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_clamp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_clamp_max_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_clamp_min_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_clone_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_column_stack_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_combinations_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_complex_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_conj_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_conj_physical_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_constant_pad_nd_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_contiguous_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_copysign_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_corrcoef_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_cos_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_cosh_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_count_nonzero_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_cov_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_cross_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_cummax_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_cummin_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_cumprod_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_cumsum_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_cumulative_trapezoid_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_deg2rad_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_diag_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_diag_embed_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_diagflat_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_diagonal_copy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_diagonal_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_diagonal_scatter_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_diff_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_digamma_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_dist_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_div_floor_rounding_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_div_no_rounding_mode_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_div_trunc_rounding_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_dot_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_double_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_double_functorch_no_channels_last_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_dsplit_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_dstack_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_einsum_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_empty_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_empty_like_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_empty_permuted_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_empty_strided_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_eq_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_equal_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_erf_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_erfc_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_erfinv_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_exp2_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_exp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_expand_as_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_expand_copy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_expand_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_expm1_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_exponential_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_eye_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_fft_fft2_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_fft_fft_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_fft_fftn_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_fft_fftshift_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_fft_hfft2_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_fft_hfft_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_fft_hfftn_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_fft_ifft2_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_fft_ifft_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_fft_ifftn_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_fft_ifftshift_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_fft_ihfft2_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_fft_ihfft_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_fft_ihfftn_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_fft_irfft2_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_fft_irfft_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_fft_irfftn_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_fft_rfft2_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_fft_rfft_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_fft_rfftn_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_fill_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_flatten_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_flip_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_fliplr_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_flipud_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_float_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_float_functorch_no_channels_last_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_float_power_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_floor_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_floor_divide_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_fmax_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_fmin_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_fmod_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_frac_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_frexp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_full_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_full_like_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_gather_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_gcd_cuda_int64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_ge_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_geometric_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_geqrf_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_gradient_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_grid_sampler_2d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_grid_sampler_3d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_gt_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_half_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_half_functorch_no_channels_last_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_hash_tensor_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_heaviside_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_histc_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_hsplit_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_hstack_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_hypot_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_i0_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_igamma_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_igammac_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_imag_cuda_complex64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_index_add_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_index_copy_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_index_fill_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_index_put_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_index_put_functorch_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_index_reduce_amax_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_index_reduce_amin_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_index_reduce_mean_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_index_reduce_prod_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_index_select_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_inner_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_int_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_int_functorch_no_channels_last_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_isclose_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_isfinite_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_isin_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_isinf_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_isnan_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_isneginf_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_isposinf_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_isreal_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_istft_cuda_complex64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_item_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_jiterator_2inputs_2outputs_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_jiterator_4inputs_with_extra_args_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_jiterator_binary_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_jiterator_binary_return_by_ref_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_jiterator_unary_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_kron_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_kthvalue_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_lcm_cuda_int64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_ldexp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_le_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_lerp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_lgamma_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_cholesky_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_cholesky_ex_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_cond_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_cross_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_det_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_diagonal_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_eig_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_eigh_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_eigvals_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_eigvalsh_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_householder_product_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_inv_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_inv_ex_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_ldl_factor_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_ldl_factor_ex_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_ldl_solve_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_lstsq_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_lstsq_grad_oriented_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_lu_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_lu_factor_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_lu_factor_ex_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_lu_solve_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_matrix_norm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_matrix_power_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_matrix_rank_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_matrix_rank_hermitian_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_multi_dot_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_norm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_norm_subgradients_at_zero_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_pinv_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_pinv_hermitian_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_pinv_singular_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_qr_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_slogdet_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_solve_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_solve_ex_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_solve_triangular_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_svd_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_svdvals_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_tensorinv_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_tensorsolve_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_vander_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_vecdot_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linalg_vector_norm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linspace_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_linspace_tensor_overload_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_log10_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_log1p_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_log2_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_log_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_log_normal_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_log_softmax_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_log_softmax_with_dtype_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_logaddexp2_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_logaddexp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_logcumsumexp_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_logdet_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_logical_and_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_logical_not_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_logical_or_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_logical_xor_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_logit_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_logspace_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_logspace_tensor_overload_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_logsumexp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_long_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_long_functorch_no_channels_last_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_lt_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_lu_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_lu_solve_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_lu_unpack_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_mH_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_mT_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_masked_amax_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_masked_amin_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_masked_argmax_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_masked_argmin_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_masked_cumprod_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_masked_cumsum_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_masked_fill_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_masked_fill_functorch_Scalar_only_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_masked_log_softmax_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_masked_logaddexp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_masked_logsumexp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_masked_mean_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_masked_median_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_masked_norm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_masked_normalize_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_masked_prod_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_masked_scatter_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_masked_select_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_masked_softmax_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_masked_softmin_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_masked_std_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_masked_sum_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_masked_var_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_matmul_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_matrix_exp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_max_binary_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_max_pool2d_with_indices_backward_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_max_reduction_no_dim_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_max_reduction_with_dim_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_maximum_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_mean_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_median_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_meshgrid_list_of_tensors_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_meshgrid_variadic_tensors_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_min_binary_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_min_reduction_no_dim_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_min_reduction_with_dim_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_minimum_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_mm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_mode_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_movedim_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_msort_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_mul_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_multinomial_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_mv_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_mvlgamma_mvlgamma_p_1_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_mvlgamma_mvlgamma_p_3_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_mvlgamma_mvlgamma_p_5_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nan_to_num_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nanmean_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nanmedian_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nanquantile_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nansum_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_narrow_copy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_narrow_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_native_batch_norm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_native_dropout_backward_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_native_layer_norm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_ne_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_neg_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_new_empty_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_new_empty_strided_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_new_full_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_new_ones_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_new_zeros_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nextafter_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_adaptive_avg_pool1d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_adaptive_avg_pool2d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_adaptive_avg_pool3d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_adaptive_max_pool1d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_adaptive_max_pool2d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_adaptive_max_pool3d_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_alpha_dropout_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_avg_pool1d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_avg_pool2d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_avg_pool3d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_batch_norm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_batch_norm_without_cudnn_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_bilinear_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_binary_cross_entropy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_celu_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_channel_shuffle_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_conv1d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_conv2d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_conv2d_no_bias_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_conv2d_stride_depthwise_with_bias_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_conv2d_stride_groups_with_bias_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_conv2d_stride_no_bias_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_conv2d_stride_padding_no_bias_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_conv2d_stride_padding_with_bias_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_conv2d_stride_with_bias_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_conv2d_strided_padding_dilation_no_bias_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_conv2d_strided_padding_dilation_with_bias_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_conv2d_with_bias_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_conv3d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_conv_transpose1d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_conv_transpose2d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_conv_transpose3d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_cosine_embedding_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_cosine_similarity_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_cross_entropy_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_ctc_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_dropout2d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_dropout3d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_dropout_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_elu_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_embedding_bag_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_embedding_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_embedding_functorch_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_fractional_max_pool2d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_fractional_max_pool3d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_gaussian_nll_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_gelu_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_glu_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_grid_sample_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_group_norm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_hardshrink_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_hardsigmoid_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_hardswish_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_hardtanh_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_hinge_embedding_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_huber_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_instance_norm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_interpolate_area_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_interpolate_bicubic_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_interpolate_bilinear_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_interpolate_linear_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_interpolate_nearest-exact_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_interpolate_nearest_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_interpolate_trilinear_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_kl_div_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_l1_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_layer_norm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_leaky_relu_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_linear_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_local_response_norm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_logsigmoid_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_margin_ranking_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_max_pool1d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_max_pool2d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_max_pool3d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_max_unpool1d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_max_unpool1d_grad_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_max_unpool2d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_max_unpool2d_grad_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_max_unpool3d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_max_unpool3d_grad_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_mish_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_mse_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_mse_loss_functorch_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_multi_head_attention_forward_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_multi_margin_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_multilabel_margin_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_nll_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_normalize_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_one_hot_cuda_int64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_pad_circular_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_pad_constant_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_pad_reflect_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_pad_replicate_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_pad_replicate_negative_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_pairwise_distance_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_pdist_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_pixel_shuffle_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_pixel_unshuffle_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_poisson_nll_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_prelu_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_relu6_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_relu_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_rms_norm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_rrelu_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_scaled_dot_product_attention_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_selu_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_silu_complex_cuda_complex64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_silu_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_smooth_l1_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_soft_margin_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_softmin_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_softmin_with_dtype_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_softplus_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_softshrink_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_softsign_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_tanhshrink_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_threshold_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_triplet_margin_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_unfold_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_upsample_bilinear_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nn_functional_upsample_nearest_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nonzero_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_nonzero_static_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_norm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_norm_fro_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_norm_inf_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_norm_nuc_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_normal_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_normal_in_place_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_normal_number_mean_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_ones_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_ones_like_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_ops_aten__new_zeros_with_same_feature_meta_functorchonly_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_ops_aten_index_put_functorch_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_ormqr_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_outer_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_pca_lowrank_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_permute_copy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_permute_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_pinverse_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_polar_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_polygamma_polygamma_n_0_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_polygamma_polygamma_n_1_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_polygamma_polygamma_n_2_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_polygamma_polygamma_n_3_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_polygamma_polygamma_n_4_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_positive_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_pow_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_prod_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_put_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_qr_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_quantile_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_rad2deg_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_rand_like_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_randint_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_randint_like_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_randn_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_randn_like_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_ravel_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_real_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_reciprocal_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_remainder_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_renorm_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_repeat_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_repeat_interleave_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_reshape_as_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_reshape_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_resize__cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_resize_as__cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_resolve_conj_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_resolve_neg_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_roll_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_rot90_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_round_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_round_decimals_0_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_round_decimals_3_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_round_decimals_neg_3_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_rsqrt_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_rsub_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_scalar_tensor_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_scatter_add_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_scatter_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_scatter_reduce_amax_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_scatter_reduce_amin_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_scatter_reduce_mean_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_scatter_reduce_prod_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_scatter_reduce_sum_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_searchsorted_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_select_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_select_scatter_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_sgn_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_short_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_short_functorch_no_channels_last_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_sigmoid_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_sign_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_signal_windows_bartlett_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_signal_windows_blackman_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_signal_windows_cosine_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_signal_windows_exponential_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_signal_windows_gaussian_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_signal_windows_general_cosine_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_signal_windows_general_hamming_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_signal_windows_hamming_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_signal_windows_hann_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_signal_windows_kaiser_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_signal_windows_nuttall_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_signbit_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_sin_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_sinc_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_sinh_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_slice_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_slice_scatter_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_softmax_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_softmax_with_dtype_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_sort_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_sparse_mm_reduce_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_sparse_sampled_addmm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_airy_ai_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_bessel_j0_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_bessel_j1_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_bessel_y0_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_bessel_y1_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_chebyshev_polynomial_t_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_chebyshev_polynomial_u_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_chebyshev_polynomial_v_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_chebyshev_polynomial_w_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_entr_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_erfcx_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_hermite_polynomial_h_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_hermite_polynomial_he_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_i0e_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_i1_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_i1e_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_laguerre_polynomial_l_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_legendre_polynomial_p_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_log_ndtr_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_modified_bessel_i0_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_modified_bessel_i1_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_modified_bessel_k0_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_modified_bessel_k1_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_ndtr_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_ndtri_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_polygamma_special_polygamma_n_0_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_scaled_modified_bessel_k0_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_scaled_modified_bessel_k1_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_shifted_chebyshev_polynomial_t_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_shifted_chebyshev_polynomial_u_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_shifted_chebyshev_polynomial_v_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_shifted_chebyshev_polynomial_w_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_spherical_bessel_j0_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_xlog1py_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_special_zeta_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_split_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_split_list_args_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_split_with_sizes_copy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_split_with_sizes_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_sqrt_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_square_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_squeeze_copy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_squeeze_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_squeeze_multiple_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_stack_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_std_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_std_mean_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_std_mean_unbiased_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_std_unbiased_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_stft_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_sub_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_sum_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_sum_to_size_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_svd_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_svd_lowrank_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_t_copy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_t_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_take_along_dim_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_take_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_tan_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_tanh_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_tensor_split_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_tensordot_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_tile_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_to_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_to_sparse_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_topk_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_torch__scaled_mm_cuda_float8_e4m3fn, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_torch_ops_aten__efficient_attention_forward_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_torch_ops_aten__flash_attention_forward_cuda_float16, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_torch_ops_aten__safe_softmax_default_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_trace_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_transpose_copy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_transpose_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_trapezoid_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_trapz_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_triangular_solve_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_tril_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_tril_indices_cuda_int64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_triu_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_triu_indices_cuda_int64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_true_divide_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_trunc_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_unbind_copy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_unbind_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_unflatten_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_unfold_copy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_unfold_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_uniform_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_unique_consecutive_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_unique_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_unravel_index_cuda_int64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_unsafe_chunk_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_unsafe_split_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_unsqueeze_copy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_unsqueeze_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_var_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_var_mean_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_var_mean_unbiased_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_var_unbiased_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_vdot_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_view_as_complex_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_view_as_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_view_as_real_cuda_complex64, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_view_copy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_view_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_vsplit_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_vstack_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_where_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_xlogy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_zero__cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_zeros_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_op_has_batch_rule_zeros_like_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_searchsorted_bucketize_cuda, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_slogdet_cuda, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_sum_scalar_cuda, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_torch_return_types_returns_cuda, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_escaped_error_cuda, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_CubeGenVmapAutogradFunction_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_ForwardHasDefaultArgsAutogradFunction_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_H_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_MulGenVmapAutogradFunction_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_NumpyCatCustomOp_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_NumpyCubeAutogradFunction_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_NumpyCubeCustomOp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_NumpyCubeNotComposableAutogradFunction_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_NumpyExpMarkDirtyAutogradFunction_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_NumpyMulAutogradFunction_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_NumpyMulCustomOp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_NumpyMulScalarCustomOp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_NumpyNMSCustomOp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_NumpyNonzeroCustomOp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_NumpySortAutogradFunction_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_NumpySortCustomOp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_NumpySplitCopyCustomOp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_NumpySplitCopyWithIntCustomOp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_NumpyTakeAutogradFunction_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_NumpyTakeCustomOp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_NumpyViewCopyCustomOp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_ScaleGradGenVmapAutogradFunction_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_SelectAutogradFunction_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_SelectGenVmapAutogradFunction_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_SortGenVmapAutogradFunction_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_T_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_ZeroGradientsGenVmapAutogradFunction_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive___getitem___cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive___getitem___functorch_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive___radd___cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive___rand___cuda_int64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive___rdiv___cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive___rmatmul___cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive___rmod___cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive___rmul___cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive___ror___cuda_int64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive___rpow___cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive___rsub___cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive___rxor___cuda_int64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive__batch_norm_with_update_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive__chunk_cat_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive__native_batch_norm_legit_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive__segment_reduce_lengths_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive__segment_reduce_offsets_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive__softmax_backward_data_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive__unsafe_masked_index_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive__unsafe_masked_index_put_accumulate_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive__upsample_bilinear2d_aa_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_abs_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_acos_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_acosh_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_add_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_addbmm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_addcdiv_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_addcmul_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_addmm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_addmm_decomposed_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_addmv_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_addr_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_alias_copy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_all_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_allclose_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_amax_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_amin_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_aminmax_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_angle_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_any_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_arange_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_argmax_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_argmin_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_argsort_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_argwhere_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_as_strided_copy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_as_strided_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_as_strided_partial_views_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_as_strided_scatter_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_asin_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_asinh_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_atan2_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_atan_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_atanh_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_atleast_1d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_atleast_2d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_atleast_3d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_baddbmm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_bernoulli_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_bfloat16_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_bfloat16_functorch_no_channels_last_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_bincount_cuda_int64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_bitwise_and_cuda_int64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_bitwise_left_shift_cuda_int64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_bitwise_not_cuda_int64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_bitwise_or_cuda_int64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_bitwise_right_shift_cuda_int64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_bitwise_xor_cuda_int64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_block_diag_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_bmm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_bool_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_bool_functorch_no_channels_last_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_broadcast_shapes_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_broadcast_tensors_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_broadcast_to_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_bucketize_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_byte_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_byte_functorch_no_channels_last_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_cartesian_prod_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_cat_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_cauchy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_cdist_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_cdouble_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_ceil_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_cfloat_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_chalf_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_char_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_char_functorch_no_channels_last_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_cholesky_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_cholesky_inverse_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_cholesky_solve_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_chunk_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_clamp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_clamp_max_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_clamp_min_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_clone_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_column_stack_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_combinations_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_complex_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_conj_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_conj_physical_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_constant_pad_nd_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_contiguous_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_copysign_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_corrcoef_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_cos_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_cosh_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_count_nonzero_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_cov_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_cross_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_cummax_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_cummin_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_cumprod_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_cumsum_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_cumulative_trapezoid_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_deg2rad_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_diag_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_diag_embed_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_diagflat_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_diagonal_copy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_diagonal_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_diagonal_scatter_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_diff_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_digamma_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_dist_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_div_floor_rounding_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_div_no_rounding_mode_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_div_trunc_rounding_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_dot_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_double_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_double_functorch_no_channels_last_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_dsplit_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_dstack_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_einsum_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_empty_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_empty_like_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_empty_permuted_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_empty_strided_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_eq_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_equal_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_erf_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_erfc_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_erfinv_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_exp2_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_exp_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_expand_as_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_expand_copy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_expand_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_expm1_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_exponential_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_eye_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_fft_fft2_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_fft_fft_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_fft_fftn_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_fft_fftshift_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_fft_hfft2_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_fft_hfft_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_fft_hfftn_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_fft_ifft2_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_fft_ifft_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_fft_ifftn_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_fft_ifftshift_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_fft_ihfft2_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_fft_ihfft_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_fft_ihfftn_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_fft_irfft2_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_fft_irfft_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_fft_irfftn_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_fft_rfft2_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_fft_rfft_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_fft_rfftn_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_fill_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_flatten_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_flip_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_fliplr_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_flipud_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_float_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_float_functorch_no_channels_last_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_float_power_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_floor_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_floor_divide_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_fmax_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_fmin_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_fmod_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_frac_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_frexp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_full_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_full_like_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_gather_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_gcd_cuda_int64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_ge_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_geometric_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_geqrf_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_gradient_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_grid_sampler_2d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_grid_sampler_3d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_gt_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_half_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_half_functorch_no_channels_last_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_hash_tensor_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_heaviside_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_histc_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_hsplit_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_hstack_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_hypot_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_i0_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_igamma_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_igammac_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_imag_cuda_complex64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_index_add_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_index_copy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_index_fill_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_index_put_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_index_put_functorch_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_index_reduce_amax_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_index_reduce_amin_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_index_reduce_mean_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_index_reduce_prod_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_index_select_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_inner_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_int_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_int_functorch_no_channels_last_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_isclose_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_isfinite_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_isin_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_isinf_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_isnan_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_isneginf_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_isposinf_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_isreal_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_istft_cuda_complex64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_item_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_jiterator_2inputs_2outputs_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_jiterator_4inputs_with_extra_args_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_jiterator_binary_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_jiterator_binary_return_by_ref_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_jiterator_unary_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_kron_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_kthvalue_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_lcm_cuda_int64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_ldexp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_le_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_lerp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_lgamma_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_cholesky_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_cholesky_ex_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_cond_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_cross_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_det_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_diagonal_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_eig_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_eigh_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_eigvals_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_eigvalsh_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_householder_product_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_inv_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_inv_ex_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_ldl_factor_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_ldl_factor_ex_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_ldl_solve_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_lstsq_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_lstsq_grad_oriented_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_lu_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_lu_factor_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_lu_factor_ex_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_lu_solve_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_matrix_norm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_matrix_power_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_matrix_rank_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_matrix_rank_hermitian_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_multi_dot_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_norm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_norm_subgradients_at_zero_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_pinv_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_pinv_hermitian_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_pinv_singular_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_qr_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_slogdet_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_solve_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_solve_ex_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_solve_triangular_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_svd_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_svdvals_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_tensorinv_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_tensorsolve_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_vander_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_vecdot_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linalg_vector_norm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linspace_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_linspace_tensor_overload_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_log10_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_log1p_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_log2_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_log_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_log_normal_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_log_softmax_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_log_softmax_with_dtype_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_logaddexp2_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_logaddexp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_logcumsumexp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_logdet_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_logical_and_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_logical_not_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_logical_or_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_logical_xor_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_logit_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_logspace_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_logspace_tensor_overload_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_logsumexp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_long_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_long_functorch_no_channels_last_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_lt_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_lu_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_lu_solve_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_lu_unpack_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_mH_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_mT_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_masked_amax_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_masked_amin_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_masked_argmax_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_masked_argmin_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_masked_cumprod_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_masked_cumsum_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_masked_fill_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_masked_fill_functorch_Scalar_only_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_masked_log_softmax_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_masked_logaddexp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_masked_logsumexp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_masked_mean_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_masked_median_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_masked_norm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_masked_normalize_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_masked_prod_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_masked_scatter_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_masked_select_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_masked_softmax_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_masked_softmin_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_masked_std_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_masked_sum_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_masked_var_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_matmul_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_matrix_exp_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_max_binary_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_max_pool2d_with_indices_backward_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_max_reduction_no_dim_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_max_reduction_with_dim_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_maximum_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_mean_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_median_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_meshgrid_list_of_tensors_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_meshgrid_variadic_tensors_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_min_binary_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_min_reduction_no_dim_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_min_reduction_with_dim_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_minimum_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_mm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_mode_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_movedim_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_msort_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_mul_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_multinomial_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_mv_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_mvlgamma_mvlgamma_p_1_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_mvlgamma_mvlgamma_p_3_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_mvlgamma_mvlgamma_p_5_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nan_to_num_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nanmean_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nanmedian_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nanquantile_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nansum_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_narrow_copy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_narrow_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_native_batch_norm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_native_dropout_backward_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_native_layer_norm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_ne_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_neg_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_new_empty_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_new_empty_strided_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_new_full_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_new_ones_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_new_zeros_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nextafter_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_adaptive_avg_pool1d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_adaptive_avg_pool2d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_adaptive_avg_pool3d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_adaptive_max_pool1d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_adaptive_max_pool2d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_adaptive_max_pool3d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_alpha_dropout_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_avg_pool1d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_avg_pool2d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_avg_pool3d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_batch_norm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_batch_norm_without_cudnn_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_bilinear_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_binary_cross_entropy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_celu_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_channel_shuffle_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_conv1d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_conv2d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_conv2d_no_bias_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_conv2d_stride_depthwise_with_bias_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_conv2d_stride_groups_with_bias_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_conv2d_stride_no_bias_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_conv2d_stride_padding_no_bias_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_conv2d_stride_padding_with_bias_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_conv2d_stride_with_bias_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_conv2d_strided_padding_dilation_no_bias_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_conv2d_strided_padding_dilation_with_bias_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_conv2d_with_bias_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_conv3d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_conv_transpose1d_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_conv_transpose2d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_conv_transpose3d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_cosine_embedding_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_cosine_similarity_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_cross_entropy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_ctc_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_dropout2d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_dropout3d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_dropout_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_elu_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_embedding_bag_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_embedding_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_embedding_functorch_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_fractional_max_pool2d_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_fractional_max_pool3d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_gaussian_nll_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_gelu_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_glu_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_grid_sample_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_group_norm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_hardshrink_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_hardsigmoid_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_hardswish_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_hardtanh_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_hinge_embedding_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_huber_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_instance_norm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_interpolate_area_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_interpolate_bicubic_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_interpolate_bilinear_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_interpolate_linear_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_interpolate_nearest-exact_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_interpolate_nearest_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_interpolate_trilinear_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_kl_div_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_l1_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_layer_norm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_leaky_relu_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_linear_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_local_response_norm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_logsigmoid_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_margin_ranking_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_max_pool1d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_max_pool2d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_max_pool3d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_max_unpool1d_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_max_unpool1d_grad_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_max_unpool2d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_max_unpool2d_grad_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_max_unpool3d_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_max_unpool3d_grad_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_mish_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_mse_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_mse_loss_functorch_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_multi_head_attention_forward_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_multi_margin_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_multilabel_margin_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_nll_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_normalize_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_one_hot_cuda_int64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_pad_circular_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_pad_constant_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_pad_reflect_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_pad_replicate_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_pad_replicate_negative_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_pairwise_distance_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_pdist_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_pixel_shuffle_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_pixel_unshuffle_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_poisson_nll_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_prelu_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_relu6_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_relu_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_rms_norm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_rrelu_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_scaled_dot_product_attention_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_selu_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_silu_complex_cuda_complex64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_silu_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_smooth_l1_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_soft_margin_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_softmin_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_softmin_with_dtype_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_softplus_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_softshrink_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_softsign_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_tanhshrink_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_threshold_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_triplet_margin_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_unfold_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_upsample_bilinear_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nn_functional_upsample_nearest_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nonzero_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_nonzero_static_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_norm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_norm_fro_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_norm_inf_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_norm_nuc_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_normal_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_normal_in_place_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_normal_number_mean_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_ones_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_ones_like_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_ops_aten__new_zeros_with_same_feature_meta_functorchonly_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_ops_aten_index_put_functorch_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_ormqr_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_outer_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_pca_lowrank_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_permute_copy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_permute_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_pinverse_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_polar_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_polygamma_polygamma_n_0_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_polygamma_polygamma_n_1_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_polygamma_polygamma_n_2_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_polygamma_polygamma_n_3_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_polygamma_polygamma_n_4_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_positive_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_pow_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_prod_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_put_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_qr_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_quantile_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_rad2deg_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_rand_like_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_randint_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_randint_like_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_randn_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_randn_like_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_ravel_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_real_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_reciprocal_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_remainder_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_renorm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_repeat_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_repeat_interleave_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_reshape_as_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_reshape_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_resize__cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_resize_as__cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_resolve_conj_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_resolve_neg_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_roll_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_rot90_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_round_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_round_decimals_0_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_round_decimals_3_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_round_decimals_neg_3_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_rsqrt_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_rsub_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_scalar_tensor_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_scatter_add_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_scatter_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_scatter_reduce_amax_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_scatter_reduce_amin_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_scatter_reduce_mean_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_scatter_reduce_prod_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_scatter_reduce_sum_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_searchsorted_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_select_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_select_scatter_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_sgn_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_short_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_short_functorch_no_channels_last_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_sigmoid_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_sign_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_signal_windows_bartlett_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_signal_windows_blackman_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_signal_windows_cosine_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_signal_windows_exponential_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_signal_windows_gaussian_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_signal_windows_general_cosine_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_signal_windows_general_hamming_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_signal_windows_hamming_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_signal_windows_hann_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_signal_windows_kaiser_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_signal_windows_nuttall_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_signbit_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_sin_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_sinc_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_sinh_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_slice_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_slice_scatter_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_softmax_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_softmax_with_dtype_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_sort_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_sparse_mm_reduce_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_sparse_sampled_addmm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_airy_ai_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_bessel_j0_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_bessel_j1_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_bessel_y0_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_bessel_y1_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_chebyshev_polynomial_t_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_chebyshev_polynomial_u_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_chebyshev_polynomial_v_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_chebyshev_polynomial_w_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_entr_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_erfcx_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_hermite_polynomial_h_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_hermite_polynomial_he_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_i0e_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_i1_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_i1e_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_laguerre_polynomial_l_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_legendre_polynomial_p_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_log_ndtr_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_modified_bessel_i0_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_modified_bessel_i1_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_modified_bessel_k0_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_modified_bessel_k1_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_ndtr_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_ndtri_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_polygamma_special_polygamma_n_0_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_scaled_modified_bessel_k0_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_scaled_modified_bessel_k1_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_shifted_chebyshev_polynomial_t_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_shifted_chebyshev_polynomial_u_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_shifted_chebyshev_polynomial_v_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_shifted_chebyshev_polynomial_w_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_spherical_bessel_j0_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_xlog1py_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_special_zeta_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_split_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_split_list_args_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_split_with_sizes_copy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_split_with_sizes_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_sqrt_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_square_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_squeeze_copy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_squeeze_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_squeeze_multiple_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_stack_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_std_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_std_mean_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_std_mean_unbiased_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_std_unbiased_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_stft_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_sub_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_sum_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_sum_to_size_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_svd_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_svd_lowrank_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_t_copy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_t_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_take_along_dim_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_take_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_tan_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_tanh_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_tensor_split_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_tensordot_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_tile_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_to_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_to_sparse_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_topk_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_torch__scaled_mm_cuda_float8_e4m3fn, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_torch_ops_aten__efficient_attention_forward_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_torch_ops_aten__flash_attention_forward_cuda_float16, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_torch_ops_aten__safe_softmax_default_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_trace_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_transpose_copy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_transpose_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_trapezoid_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_trapz_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_triangular_solve_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_tril_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_tril_indices_cuda_int64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_triu_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_triu_indices_cuda_int64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_true_divide_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_trunc_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_unbind_copy_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_unbind_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_unflatten_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_unfold_copy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_unfold_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_uniform_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_unique_consecutive_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_unique_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_unravel_index_cuda_int64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_unsafe_chunk_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_unsafe_split_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_unsqueeze_copy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_unsqueeze_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_var_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_var_mean_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_var_mean_unbiased_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_var_unbiased_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_vdot_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_view_as_complex_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_view_as_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_view_as_real_cuda_complex64, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_view_copy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_view_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_vsplit_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_vstack_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_where_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_xlogy_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_zero__cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_zeros_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_exhaustive_zeros_like_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_cholesky_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_cholesky_ex_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_cond_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_cross_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_det_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_diagonal_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_eig_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_eigh_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_eigvals_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_eigvalsh_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_householder_product_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_inv_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_inv_ex_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_ldl_factor_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_ldl_factor_ex_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_ldl_solve_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_lstsq_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_lstsq_grad_oriented_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_lu_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_lu_factor_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_lu_factor_ex_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_lu_solve_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_matrix_norm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_matrix_power_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_matrix_rank_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_matrix_rank_hermitian_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_multi_dot_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_norm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_norm_subgradients_at_zero_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_pinv_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_pinv_hermitian_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_pinv_singular_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_qr_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_slogdet_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_solve_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_solve_ex_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_solve_triangular_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_svd_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_svdvals_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_tensorinv_cuda_float32, 
test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_tensorsolve_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_vander_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_vecdot_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_linalg_failure_1D_input_linalg_vector_norm_cuda_float32, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_multi_dot_failure_1D_input_cuda, test/functorch/test_vmap.py::TestVmapOperatorsOpInfoCUDA::test_vmap_with_anomaly_detection_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_add_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_binary_cross_entropy_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_diagonal_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_div_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_expand_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_index_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_inplace_manyview_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_inplace_view_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_lgamma_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_log1p_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_log_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_log_softmax_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_logsumexp_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_max_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_median_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_min_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_mul_cuda, 
test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_permute_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_randomness_backend0_randomness_different_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_randomness_backend0_randomness_error_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_randomness_backend0_randomness_same_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_randomness_backend1_randomness_different_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_randomness_backend1_randomness_error_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_randomness_backend1_randomness_same_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_randomness_backend2_randomness_different_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_randomness_backend2_randomness_error_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_randomness_backend2_randomness_same_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_randomness_backend3_randomness_different_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_randomness_backend3_randomness_error_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_randomness_backend3_randomness_same_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_reshape_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_sdpa_backend0_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_sdpa_backend1_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_sdpa_backend2_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_sdpa_backend3_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_select_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_sigmoid_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_slice_cuda, 
test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_stack_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_sub_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_threshold_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_trace_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_unrelated_output_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_unrelated_output_multiple_grad_cuda, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_vmap_fallback_check, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_vmap_fallback_check_ok, test/functorch/test_vmap.py::TestVmapBatchedGradientCUDA::test_where_cuda, test/functorch/test_vmap.py::TestTransformFailureCUDA::test_fails_with_autograd_function_transform_grad_and_value_cuda, test/functorch/test_vmap.py::TestTransformFailureCUDA::test_fails_with_autograd_function_transform_grad_cuda, test/functorch/test_vmap.py::TestTransformFailureCUDA::test_fails_with_autograd_function_transform_jacfwd_cuda, test/functorch/test_vmap.py::TestTransformFailureCUDA::test_fails_with_autograd_function_transform_jacrev_cuda, test/functorch/test_vmap.py::TestTransformFailureCUDA::test_fails_with_autograd_function_transform_jvp_cuda, test/functorch/test_vmap.py::TestTransformFailureCUDA::test_fails_with_autograd_function_transform_vjp_cuda, test/functorch/test_vmap.py::TestTransformFailureCUDA::test_fails_with_autograd_function_transform_vmap_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_alpha_dropout_randomness_different_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_alpha_dropout_randomness_different_batched_input_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_alpha_dropout_randomness_different_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_alpha_dropout_randomness_error_batched_input_first_cuda, 
test/functorch/test_vmap.py::TestRandomnessCUDA::test_alpha_dropout_randomness_error_batched_input_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_alpha_dropout_randomness_error_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_alpha_dropout_randomness_same_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_alpha_dropout_randomness_same_batched_input_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_alpha_dropout_randomness_same_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_False_randomness_different_batched_input_first_batched_probability_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_False_randomness_different_batched_input_first_batched_probability_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_False_randomness_different_batched_input_first_batched_probability_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_False_randomness_different_batched_input_last_batched_probability_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_False_randomness_different_batched_input_last_batched_probability_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_False_randomness_different_batched_input_last_batched_probability_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_False_randomness_different_batched_input_none_batched_probability_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_False_randomness_different_batched_input_none_batched_probability_last_cuda, 
test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_False_randomness_different_batched_input_none_batched_probability_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_False_randomness_error_batched_input_first_batched_probability_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_False_randomness_error_batched_input_first_batched_probability_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_False_randomness_error_batched_input_first_batched_probability_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_False_randomness_error_batched_input_last_batched_probability_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_False_randomness_error_batched_input_last_batched_probability_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_False_randomness_error_batched_input_last_batched_probability_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_False_randomness_error_batched_input_none_batched_probability_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_False_randomness_error_batched_input_none_batched_probability_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_False_randomness_error_batched_input_none_batched_probability_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_False_randomness_same_batched_input_first_batched_probability_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_False_randomness_same_batched_input_first_batched_probability_last_cuda, 
test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_False_randomness_same_batched_input_first_batched_probability_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_False_randomness_same_batched_input_last_batched_probability_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_False_randomness_same_batched_input_last_batched_probability_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_False_randomness_same_batched_input_last_batched_probability_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_False_randomness_same_batched_input_none_batched_probability_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_False_randomness_same_batched_input_none_batched_probability_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_False_randomness_same_batched_input_none_batched_probability_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_True_randomness_different_batched_input_first_batched_probability_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_True_randomness_different_batched_input_first_batched_probability_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_True_randomness_different_batched_input_first_batched_probability_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_True_randomness_different_batched_input_last_batched_probability_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_True_randomness_different_batched_input_last_batched_probability_last_cuda, 
test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_True_randomness_different_batched_input_last_batched_probability_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_True_randomness_different_batched_input_none_batched_probability_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_True_randomness_different_batched_input_none_batched_probability_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_True_randomness_different_batched_input_none_batched_probability_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_True_randomness_error_batched_input_first_batched_probability_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_True_randomness_error_batched_input_first_batched_probability_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_True_randomness_error_batched_input_first_batched_probability_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_True_randomness_error_batched_input_last_batched_probability_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_True_randomness_error_batched_input_last_batched_probability_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_True_randomness_error_batched_input_last_batched_probability_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_True_randomness_error_batched_input_none_batched_probability_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_True_randomness_error_batched_input_none_batched_probability_last_cuda, 
test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_True_randomness_error_batched_input_none_batched_probability_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_True_randomness_same_batched_input_first_batched_probability_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_True_randomness_same_batched_input_first_batched_probability_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_True_randomness_same_batched_input_first_batched_probability_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_True_randomness_same_batched_input_last_batched_probability_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_True_randomness_same_batched_input_last_batched_probability_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_True_randomness_same_batched_input_last_batched_probability_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_True_randomness_same_batched_input_none_batched_probability_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_True_randomness_same_batched_input_none_batched_probability_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_bernoulli_in_place_use_generator_True_randomness_same_batched_input_none_batched_probability_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_chunk_vmap_in_dim_0_out_dim_0_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_chunk_vmap_in_dim_0_out_dim_1_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_chunk_vmap_in_dim_0_out_dim_2_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_chunk_vmap_in_dim_1_out_dim_0_cuda, 
test/functorch/test_vmap.py::TestRandomnessCUDA::test_chunk_vmap_in_dim_1_out_dim_1_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_chunk_vmap_in_dim_1_out_dim_2_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_chunk_vmap_in_dim_2_out_dim_0_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_chunk_vmap_in_dim_2_out_dim_1_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_chunk_vmap_in_dim_2_out_dim_2_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_dropout_randomness_different_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_dropout_randomness_different_batched_input_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_dropout_randomness_different_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_dropout_randomness_error_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_dropout_randomness_error_batched_input_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_dropout_randomness_error_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_dropout_randomness_same_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_dropout_randomness_same_batched_input_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_dropout_randomness_same_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_dropout_unbatched_randomness_different_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_dropout_unbatched_randomness_error_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_dropout_unbatched_randomness_same_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_factory_ops_randomness_different_use_generator_False_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_factory_ops_randomness_different_use_generator_True_cuda, 
test/functorch/test_vmap.py::TestRandomnessCUDA::test_factory_ops_randomness_error_use_generator_False_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_factory_ops_randomness_error_use_generator_True_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_factory_ops_randomness_same_use_generator_False_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_factory_ops_randomness_same_use_generator_True_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_feature_alpha_dropout_randomness_different_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_feature_alpha_dropout_randomness_different_batched_input_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_feature_alpha_dropout_randomness_different_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_feature_alpha_dropout_randomness_error_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_feature_alpha_dropout_randomness_error_batched_input_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_feature_alpha_dropout_randomness_error_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_feature_alpha_dropout_randomness_same_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_feature_alpha_dropout_randomness_same_batched_input_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_feature_alpha_dropout_randomness_same_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_feature_dropout_randomness_different_batched_input_first_dim_2_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_feature_dropout_randomness_different_batched_input_first_dim_3_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_feature_dropout_randomness_different_batched_input_last_dim_2_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_feature_dropout_randomness_different_batched_input_last_dim_3_cuda, 
test/functorch/test_vmap.py::TestRandomnessCUDA::test_feature_dropout_randomness_different_batched_input_none_dim_2_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_feature_dropout_randomness_different_batched_input_none_dim_3_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_feature_dropout_randomness_error_batched_input_first_dim_2_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_feature_dropout_randomness_error_batched_input_first_dim_3_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_feature_dropout_randomness_error_batched_input_last_dim_2_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_feature_dropout_randomness_error_batched_input_last_dim_3_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_feature_dropout_randomness_error_batched_input_none_dim_2_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_feature_dropout_randomness_error_batched_input_none_dim_3_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_feature_dropout_randomness_same_batched_input_first_dim_2_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_feature_dropout_randomness_same_batched_input_first_dim_3_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_feature_dropout_randomness_same_batched_input_last_dim_2_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_feature_dropout_randomness_same_batched_input_last_dim_3_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_feature_dropout_randomness_same_batched_input_none_dim_2_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_feature_dropout_randomness_same_batched_input_none_dim_3_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_jacfwd_with_random_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_like_functions_randomness_different_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_like_functions_randomness_different_batched_input_last_cuda, 
test/functorch/test_vmap.py::TestRandomnessCUDA::test_like_functions_randomness_different_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_like_functions_randomness_error_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_like_functions_randomness_error_batched_input_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_like_functions_randomness_error_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_like_functions_randomness_same_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_like_functions_randomness_same_batched_input_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_like_functions_randomness_same_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_False_randomness_different_batched_call_False_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_False_randomness_different_batched_call_False_batched_input_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_False_randomness_different_batched_call_False_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_False_randomness_different_batched_call_True_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_False_randomness_different_batched_call_True_batched_input_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_False_randomness_different_batched_call_True_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_False_randomness_error_batched_call_False_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_False_randomness_error_batched_call_False_batched_input_last_cuda, 
test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_False_randomness_error_batched_call_False_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_False_randomness_error_batched_call_True_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_False_randomness_error_batched_call_True_batched_input_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_False_randomness_error_batched_call_True_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_False_randomness_same_batched_call_False_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_False_randomness_same_batched_call_False_batched_input_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_False_randomness_same_batched_call_False_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_False_randomness_same_batched_call_True_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_False_randomness_same_batched_call_True_batched_input_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_False_randomness_same_batched_call_True_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_True_randomness_different_batched_call_False_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_True_randomness_different_batched_call_False_batched_input_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_True_randomness_different_batched_call_False_batched_input_none_cuda, 
test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_True_randomness_different_batched_call_True_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_True_randomness_different_batched_call_True_batched_input_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_True_randomness_different_batched_call_True_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_True_randomness_error_batched_call_False_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_True_randomness_error_batched_call_False_batched_input_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_True_randomness_error_batched_call_False_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_True_randomness_error_batched_call_True_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_True_randomness_error_batched_call_True_batched_input_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_True_randomness_error_batched_call_True_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_True_randomness_same_batched_call_False_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_True_randomness_same_batched_call_False_batched_input_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_True_randomness_same_batched_call_False_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_True_randomness_same_batched_call_True_batched_input_first_cuda, 
test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_True_randomness_same_batched_call_True_batched_input_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_multinomial_use_generator_True_randomness_same_batched_call_True_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_False_randomness_different_batched_input_first_batched_other_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_False_randomness_different_batched_input_first_batched_other_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_False_randomness_different_batched_input_first_batched_other_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_False_randomness_different_batched_input_last_batched_other_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_False_randomness_different_batched_input_last_batched_other_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_False_randomness_different_batched_input_last_batched_other_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_False_randomness_different_batched_input_none_batched_other_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_False_randomness_different_batched_input_none_batched_other_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_False_randomness_different_batched_input_none_batched_other_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_False_randomness_error_batched_input_first_batched_other_first_cuda, 
test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_False_randomness_error_batched_input_first_batched_other_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_False_randomness_error_batched_input_first_batched_other_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_False_randomness_error_batched_input_last_batched_other_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_False_randomness_error_batched_input_last_batched_other_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_False_randomness_error_batched_input_last_batched_other_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_False_randomness_error_batched_input_none_batched_other_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_False_randomness_error_batched_input_none_batched_other_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_False_randomness_error_batched_input_none_batched_other_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_False_randomness_same_batched_input_first_batched_other_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_False_randomness_same_batched_input_first_batched_other_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_False_randomness_same_batched_input_first_batched_other_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_False_randomness_same_batched_input_last_batched_other_first_cuda, 
test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_False_randomness_same_batched_input_last_batched_other_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_False_randomness_same_batched_input_last_batched_other_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_False_randomness_same_batched_input_none_batched_other_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_False_randomness_same_batched_input_none_batched_other_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_False_randomness_same_batched_input_none_batched_other_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_True_randomness_different_batched_input_first_batched_other_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_True_randomness_different_batched_input_first_batched_other_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_True_randomness_different_batched_input_first_batched_other_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_True_randomness_different_batched_input_last_batched_other_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_True_randomness_different_batched_input_last_batched_other_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_True_randomness_different_batched_input_last_batched_other_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_True_randomness_different_batched_input_none_batched_other_first_cuda, 
test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_True_randomness_different_batched_input_none_batched_other_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_True_randomness_different_batched_input_none_batched_other_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_True_randomness_error_batched_input_first_batched_other_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_True_randomness_error_batched_input_first_batched_other_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_True_randomness_error_batched_input_first_batched_other_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_True_randomness_error_batched_input_last_batched_other_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_True_randomness_error_batched_input_last_batched_other_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_True_randomness_error_batched_input_last_batched_other_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_True_randomness_error_batched_input_none_batched_other_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_True_randomness_error_batched_input_none_batched_other_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_True_randomness_error_batched_input_none_batched_other_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_True_randomness_same_batched_input_first_batched_other_first_cuda, 
test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_True_randomness_same_batched_input_first_batched_other_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_True_randomness_same_batched_input_first_batched_other_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_True_randomness_same_batched_input_last_batched_other_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_True_randomness_same_batched_input_last_batched_other_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_True_randomness_same_batched_input_last_batched_other_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_True_randomness_same_batched_input_none_batched_other_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_True_randomness_same_batched_input_none_batched_other_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_binary_out_of_place_use_generator_True_randomness_same_batched_input_none_batched_other_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_inplace_use_generator_False_randomness_different_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_inplace_use_generator_False_randomness_different_batched_input_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_inplace_use_generator_False_randomness_different_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_inplace_use_generator_False_randomness_error_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_inplace_use_generator_False_randomness_error_batched_input_last_cuda, 
test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_inplace_use_generator_False_randomness_error_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_inplace_use_generator_False_randomness_same_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_inplace_use_generator_False_randomness_same_batched_input_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_inplace_use_generator_False_randomness_same_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_inplace_use_generator_True_randomness_different_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_inplace_use_generator_True_randomness_different_batched_input_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_inplace_use_generator_True_randomness_different_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_inplace_use_generator_True_randomness_error_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_inplace_use_generator_True_randomness_error_batched_input_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_inplace_use_generator_True_randomness_error_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_inplace_use_generator_True_randomness_same_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_inplace_use_generator_True_randomness_same_batched_input_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_inplace_use_generator_True_randomness_same_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_out_of_place_use_generator_False_randomness_different_batched_input_first_cuda, 
test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_out_of_place_use_generator_False_randomness_different_batched_input_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_out_of_place_use_generator_False_randomness_different_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_out_of_place_use_generator_False_randomness_error_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_out_of_place_use_generator_False_randomness_error_batched_input_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_out_of_place_use_generator_False_randomness_error_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_out_of_place_use_generator_False_randomness_same_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_out_of_place_use_generator_False_randomness_same_batched_input_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_out_of_place_use_generator_False_randomness_same_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_out_of_place_use_generator_True_randomness_different_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_out_of_place_use_generator_True_randomness_different_batched_input_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_out_of_place_use_generator_True_randomness_different_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_out_of_place_use_generator_True_randomness_error_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_out_of_place_use_generator_True_randomness_error_batched_input_last_cuda, 
test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_out_of_place_use_generator_True_randomness_error_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_out_of_place_use_generator_True_randomness_same_batched_input_first_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_out_of_place_use_generator_True_randomness_same_batched_input_last_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_random_unary_out_of_place_use_generator_True_randomness_same_batched_input_none_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_randperm_randomness_different_use_generator_False_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_randperm_randomness_different_use_generator_True_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_randperm_randomness_error_use_generator_False_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_randperm_randomness_error_use_generator_True_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_randperm_randomness_same_use_generator_False_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_randperm_randomness_same_use_generator_True_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_unsupported_random_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_vmap_chunksize_in_dim_0_out_dim_0_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_vmap_chunksize_in_dim_0_out_dim_1_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_vmap_chunksize_in_dim_0_out_dim_2_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_vmap_chunksize_in_dim_1_out_dim_0_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_vmap_chunksize_in_dim_1_out_dim_1_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_vmap_chunksize_in_dim_1_out_dim_2_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_vmap_chunksize_in_dim_2_out_dim_0_cuda, 
test/functorch/test_vmap.py::TestRandomnessCUDA::test_vmap_chunksize_in_dim_2_out_dim_1_cuda, test/functorch/test_vmap.py::TestRandomnessCUDA::test_vmap_chunksize_in_dim_2_out_dim_2_cuda, test/functorch/test_vmap.py::TestVmapDeviceTypeCUDA::test__is_all_true_cuda, test/functorch/test_vmap.py::TestVmapDeviceTypeCUDA::test__is_any_true_cuda, test/functorch/test_vmap.py::TestVmapDeviceTypeCUDA::test_check_tensor_cuda, test/functorch/test_vmap.py::TestVmapDeviceTypeCUDA::test_vmap_fallback_check, test/functorch/test_vmap.py::TestVmapDeviceTypeCUDA::test_vmap_fallback_check_ok, test/functorch/test_vmap.py::TestVmapNestedTensorCUDA::test_cat_batching_rule_cuda, test/functorch/test_vmap.py::TestVmapNestedTensorCUDA::test_fallback_binary_cuda, test/functorch/test_vmap.py::TestVmapNestedTensorCUDA::test_fallback_binary_nt_and_batched_dense_cuda, test/functorch/test_vmap.py::TestVmapNestedTensorCUDA::test_fallback_binary_nt_and_unbatched_dense_cuda, test/functorch/test_vmap.py::TestVmapNestedTensorCUDA::test_fallback_unary_cuda, test/functorch/test_vmap.py::TestVmapNestedTensorCUDA::test_fallback_with_nt_and_batched_dense_with_nonzero_bdim_raises_cuda, test/functorch/test_vmap.py::TestVmapNestedTensorCUDA::test_multilevel_vmap_raises_cuda, test/functorch/test_vmap.py::TestVmapNestedTensorCUDA::test_nt_acts_as_dense_in_vmap_cuda, test/functorch/test_vmap.py::TestVmapNestedTensorCUDA::test_nt_with_nonzero_in_dim_raises_cuda, test/functorch/test_vmap.py::TestVmapNestedTensorCUDA::test_nt_with_nonzero_out_dim_raises_cuda, test/functorch/test_vmap.py::TestVmapNestedTensorCUDA::test_shape_call_cuda, test/functorch/test_vmap.py::TestVmapNestedTensorCUDA::test_vmap_fallback_check, test/functorch/test_vmap.py::TestVmapNestedTensorCUDA::test_vmap_fallback_check_ok 2025-10-10T02:21:22.9104965Z 2025-10-10T02:21:26.5697535Z Running torch_np/numpy_tests/lib/test_shape_base_ 1/1 ... 
[2025-10-10 02:21:26.569061] 2025-10-10T02:21:26.5698603Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:21:26.5700966Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'torch_np/numpy_tests/lib/test_shape_base_.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:21:26.569506] 2025-10-10T02:21:30.6925874Z 2025-10-10T02:21:30.6927362Z torch_np/numpy_tests/lib/test_shape_base_ 1/1 was successful, full logs can be found in artifacts with path test/test-reports/torch_np.numpy_tests.lib.test_shape_base__1.1_407426e7354b1940_.log 2025-10-10T02:21:30.6964927Z Running 73 items in this shard: test/torch_np/numpy_tests/lib/test_shape_base_.py::TestTakeAlongAxis::test_argequivalent, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestTakeAlongAxis::test_broadcast, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestTakeAlongAxis::test_empty, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestTakeAlongAxis::test_invalid, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestPutAlongAxis::test_broadcast, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestPutAlongAxis::test_replace_max, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestApplyAlongAxis::test_0d_array, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestApplyAlongAxis::test_3d, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestApplyAlongAxis::test_axis_insertion, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestApplyAlongAxis::test_axis_insertion_ma, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestApplyAlongAxis::test_empty, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestApplyAlongAxis::test_scalar_array, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestApplyAlongAxis::test_simple, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestApplyAlongAxis::test_simple101, 
test/torch_np/numpy_tests/lib/test_shape_base_.py::TestApplyAlongAxis::test_tuple_func1d, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestApplyAlongAxis::test_with_iterable_object, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestApplyOverAxes::test_simple, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestExpandDims::test_axis_out_of_range, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestExpandDims::test_axis_tuple, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestExpandDims::test_functionality, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestExpandDims::test_repeated_axis, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestArraySplit::test_index_split_high_bound, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestArraySplit::test_index_split_low_bound, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestArraySplit::test_index_split_simple, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestArraySplit::test_integer_0_split, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestArraySplit::test_integer_split, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestArraySplit::test_integer_split_2D_cols, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestArraySplit::test_integer_split_2D_default, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestArraySplit::test_integer_split_2D_rows, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestArraySplit::test_integer_split_2D_rows_greater_max_int32, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestSplit::test_equal_split, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestSplit::test_unequal_split, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestColumnStack::test_1D_arrays, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestColumnStack::test_2D_arrays, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestColumnStack::test_generator, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestColumnStack::test_non_iterable, 
test/torch_np/numpy_tests/lib/test_shape_base_.py::TestDstack::test_0D_array, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestDstack::test_1D_array, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestDstack::test_2D_array, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestDstack::test_2D_array2, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestDstack::test_generator, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestDstack::test_non_iterable, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestHsplit::test_0D_array, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestHsplit::test_1D_array, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestHsplit::test_2D_array, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestHsplit::test_non_iterable, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestVsplit::test_0D_array, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestVsplit::test_1D_array, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestVsplit::test_2D_array, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestVsplit::test_non_iterable, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestDsplit::test_0D_array, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestDsplit::test_1D_array, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestDsplit::test_2D_array, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestDsplit::test_3D_array, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestDsplit::test_non_iterable, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestSqueeze::test_basic, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestSqueeze::test_basic_2, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestSqueeze::test_squeeze_axis, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestSqueeze::test_squeeze_axis_handling, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestSqueeze::test_squeeze_contiguous, 
test/torch_np/numpy_tests/lib/test_shape_base_.py::TestSqueeze::test_squeeze_type, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestKron::test_basic, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestKron::test_kron_shape_shape_a0_shape_b0, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestKron::test_kron_shape_shape_a1_shape_b1, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestKron::test_kron_shape_shape_a2_shape_b2, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestKron::test_kron_shape_shape_a3_shape_b3, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestKron::test_kron_shape_shape_a4_shape_b4, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestKron::test_kron_shape_shape_a5_shape_b5, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestTile::test_basic, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestTile::test_empty, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestTile::test_kroncompare, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestTile::test_tile_one_repetition_on_array_gh4679, test/torch_np/numpy_tests/lib/test_shape_base_.py::TestMayShareMemory::test_basic 2025-10-10T02:21:30.7000233Z 2025-10-10T02:21:34.5402616Z Running torch_np/numpy_tests/fft/test_pocketfft 1/1 ... [2025-10-10 02:21:34.539555] 2025-10-10T02:21:34.5403203Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:21:34.5405177Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'torch_np/numpy_tests/fft/test_pocketfft.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:21:34.540010] 2025-10-10T02:21:38.7131778Z 2025-10-10T02:21:38.7132976Z torch_np/numpy_tests/fft/test_pocketfft 1/1 was successful, full logs can be found in artifacts with path test/test-reports/torch_np.numpy_tests.fft.test_pocketfft_1.1_1b99f28b2aa0788e_.log 2025-10-10T02:21:38.7159578Z Running 79 items in this shard: test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFTShift::test_fft_n, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_all_1d_norm_preserving, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_axes_op0, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_axes_op1, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_axes_op2, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_axes_op3, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_dtypes_dtype0, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_dtypes_dtype1, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_dtypes_dtype2, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_fft, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_fft2, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_fft_with_order_dtype0_order_F_fft0, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_fft_with_order_dtype0_order_F_fft1, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_fft_with_order_dtype0_order_F_fft2, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_fft_with_order_dtype0_order_F_fft3, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_fft_with_order_dtype0_order_F_fft4, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_fft_with_order_dtype0_order_F_fft5, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_fft_with_order_dtype0_order_non-contiguous_fft0, 
test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_fft_with_order_dtype0_order_non-contiguous_fft1, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_fft_with_order_dtype0_order_non-contiguous_fft2, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_fft_with_order_dtype0_order_non-contiguous_fft3, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_fft_with_order_dtype0_order_non-contiguous_fft4, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_fft_with_order_dtype0_order_non-contiguous_fft5, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_fft_with_order_dtype1_order_F_fft0, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_fft_with_order_dtype1_order_F_fft1, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_fft_with_order_dtype1_order_F_fft2, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_fft_with_order_dtype1_order_F_fft3, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_fft_with_order_dtype1_order_F_fft4, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_fft_with_order_dtype1_order_F_fft5, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_fft_with_order_dtype1_order_non-contiguous_fft0, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_fft_with_order_dtype1_order_non-contiguous_fft1, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_fft_with_order_dtype1_order_non-contiguous_fft2, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_fft_with_order_dtype1_order_non-contiguous_fft3, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_fft_with_order_dtype1_order_non-contiguous_fft4, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_fft_with_order_dtype1_order_non-contiguous_fft5, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_fft_with_order_dtype2_order_F_fft0, 
test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_fft_with_order_dtype2_order_F_fft1, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_fft_with_order_dtype2_order_F_fft2, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_fft_with_order_dtype2_order_F_fft3, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_fft_with_order_dtype2_order_F_fft4, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_fft_with_order_dtype2_order_F_fft5, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_fft_with_order_dtype2_order_non-contiguous_fft0, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_fft_with_order_dtype2_order_non-contiguous_fft1, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_fft_with_order_dtype2_order_non-contiguous_fft2, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_fft_with_order_dtype2_order_non-contiguous_fft3, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_fft_with_order_dtype2_order_non-contiguous_fft4, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_fft_with_order_dtype2_order_non-contiguous_fft5, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_fft_with_order_dtype3_order_F_fft0, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_fft_with_order_dtype3_order_F_fft1, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_fft_with_order_dtype3_order_F_fft2, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_fft_with_order_dtype3_order_F_fft3, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_fft_with_order_dtype3_order_F_fft4, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_fft_with_order_dtype3_order_F_fft5, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_fft_with_order_dtype3_order_non-contiguous_fft0, 
test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_fft_with_order_dtype3_order_non-contiguous_fft1, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_fft_with_order_dtype3_order_non-contiguous_fft2, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_fft_with_order_dtype3_order_non-contiguous_fft3, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_fft_with_order_dtype3_order_non-contiguous_fft4, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_fft_with_order_dtype3_order_non-contiguous_fft5, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_fftn, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_hfft, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_identity, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_ifft2, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_ifft_norm0, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_ifft_norm_backward, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_ifft_norm_forward, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_ifft_norm_ortho, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_ifftn, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_ihfft, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_irfft, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_irfft2, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_irfftn, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_rfft, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_rfft2, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFT1D::test_rfftn, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFTThreadSafe::test_fft, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFTThreadSafe::test_ifft, 
test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFTThreadSafe::test_irfft, test/torch_np/numpy_tests/fft/test_pocketfft.py::TestFFTThreadSafe::test_rfft 2025-10-10T02:21:38.7185508Z 2025-10-10T02:21:42.5768444Z Running test_scatter_gather_ops 1/1 ... [2025-10-10 02:21:42.576304] 2025-10-10T02:21:42.5769039Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:21:42.5772098Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_scatter_gather_ops.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:21:42.576710] 2025-10-10T02:21:46.6504625Z 2025-10-10T02:21:46.6505874Z test_scatter_gather_ops 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_scatter_gather_ops_1.1_35b67cd1046f2ac5_.log 2025-10-10T02:21:46.6533725Z Running 76 items in this shard: test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_gather_backward_with_empty_index_tensor_sparse_grad_False_cuda_float32, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_gather_backward_with_empty_index_tensor_sparse_grad_False_cuda_float64, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_gather_backward_with_empty_index_tensor_sparse_grad_True_cuda_float32, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_gather_backward_with_empty_index_tensor_sparse_grad_True_cuda_float64, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_gather_bool_cuda_bool, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_gather_cuda_complex64, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_gather_cuda_float32, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_gather_expanded_index_cuda_bfloat16, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_gather_expanded_index_cuda_float32, 
test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_gather_expanded_index_cuda_float64, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_gather_large_cuda_bfloat16, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_gather_large_cuda_int8, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_scatter__cuda_complex64, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_scatter__cuda_float16, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_scatter__cuda_float32, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_scatter__reductions_cuda_float16, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_scatter__reductions_cuda_float32, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_scatter__scalar_cuda_complex64, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_scatter__scalar_cuda_float16, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_scatter__scalar_cuda_float32, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_scatter_add__cuda_complex64, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_scatter_add__cuda_float16, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_scatter_add__cuda_float32, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_scatter_add_broadcasted_index_deterministic_cuda_float32, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_scatter_add_mult_index_base_cuda_float32, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_scatter_expanded_index_cuda_bfloat16, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_scatter_expanded_index_cuda_float16, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_scatter_expanded_index_cuda_float32, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_scatter_expanded_index_cuda_float64, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_scatter_reduce_amax_cuda_bfloat16, 
test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_scatter_reduce_amax_cuda_float16, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_scatter_reduce_amax_cuda_float32, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_scatter_reduce_amax_cuda_float64, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_scatter_reduce_amax_cuda_int16, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_scatter_reduce_amax_cuda_int32, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_scatter_reduce_amax_cuda_int64, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_scatter_reduce_amax_cuda_int8, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_scatter_reduce_amax_cuda_uint8, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_scatter_reduce_amin_cuda_bfloat16, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_scatter_reduce_amin_cuda_float16, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_scatter_reduce_amin_cuda_float32, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_scatter_reduce_amin_cuda_float64, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_scatter_reduce_amin_cuda_int16, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_scatter_reduce_amin_cuda_int32, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_scatter_reduce_amin_cuda_int64, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_scatter_reduce_amin_cuda_int8, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_scatter_reduce_amin_cuda_uint8, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_scatter_reduce_mean_cuda_bfloat16, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_scatter_reduce_mean_cuda_float16, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_scatter_reduce_mean_cuda_float32, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_scatter_reduce_mean_cuda_float64, 
test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_scatter_reduce_mean_cuda_int16, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_scatter_reduce_mean_cuda_int32, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_scatter_reduce_mean_cuda_int64, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_scatter_reduce_mean_cuda_int8, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_scatter_reduce_mean_cuda_uint8, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_scatter_reduce_prod_cuda_bfloat16, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_scatter_reduce_prod_cuda_float16, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_scatter_reduce_prod_cuda_float32, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_scatter_reduce_prod_cuda_float64, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_scatter_reduce_prod_cuda_int16, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_scatter_reduce_prod_cuda_int32, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_scatter_reduce_prod_cuda_int64, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_scatter_reduce_prod_cuda_int8, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_scatter_reduce_prod_cuda_uint8, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_scatter_reduce_sum_cuda_bfloat16, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_scatter_reduce_sum_cuda_complex128, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_scatter_reduce_sum_cuda_complex64, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_scatter_reduce_sum_cuda_float16, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_scatter_reduce_sum_cuda_float32, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_scatter_reduce_sum_cuda_float64, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_scatter_reduce_sum_cuda_int16, 
test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_scatter_reduce_sum_cuda_int32, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_scatter_reduce_sum_cuda_int64, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_scatter_reduce_sum_cuda_int8, test/test_scatter_gather_ops.py::TestScatterGatherCUDA::test_scatter_reduce_sum_cuda_uint8 2025-10-10T02:21:46.6558962Z 2025-10-10T02:21:50.5004788Z Running torch_np/test_ndarray_methods 1/1 ... [2025-10-10 02:21:50.499976] 2025-10-10T02:21:50.5005314Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:21:50.5008564Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'torch_np/test_ndarray_methods.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:21:50.500375] 2025-10-10T02:21:55.1253949Z 2025-10-10T02:21:55.1255209Z torch_np/test_ndarray_methods 1/1 was successful, full logs can be found in artifacts with path test/test-reports/torch_np.test_ndarray_methods_1.1_a3bcb645e12a6c00_.log 2025-10-10T02:21:55.1379175Z Running 342 items in this shard: test/torch_np/test_ndarray_methods.py::TestIndexing::test_indexing_simple, test/torch_np/test_ndarray_methods.py::TestIndexing::test_setitem, test/torch_np/test_ndarray_methods.py::TestReshape::test_reshape_function, test/torch_np/test_ndarray_methods.py::TestReshape::test_reshape_method, test/torch_np/test_ndarray_methods.py::TestTranspose::test_transpose_function, test/torch_np/test_ndarray_methods.py::TestTranspose::test_transpose_method, test/torch_np/test_ndarray_methods.py::TestRavel::test_ravel_function, test/torch_np/test_ndarray_methods.py::TestRavel::test_ravel_method, test/torch_np/test_ndarray_methods.py::TestNonzero::test_array_method, test/torch_np/test_ndarray_methods.py::TestNonzero::test_nonzero_onedim, test/torch_np/test_ndarray_methods.py::TestNonzero::test_nonzero_trivial, 
test/torch_np/test_ndarray_methods.py::TestNonzero::test_nonzero_twodim, test/torch_np/test_ndarray_methods.py::TestNonzero::test_sparse, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_all_method_max, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_all_method_min, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size0_axis0_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size0_axis0_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size10_axis_-1_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size10_axis_-1_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size11_axis_0_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size11_axis_0_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size12_axis_1_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size12_axis_1_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size13_axis13_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size13_axis13_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size14_axis_-2_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size14_axis_-2_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size15_axis_-1_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size15_axis_-1_method1, 
test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size16_axis_0_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size16_axis_0_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size17_axis_1_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size17_axis_1_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size18_axis18_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size18_axis18_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size19_axis_-3_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size19_axis_-3_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size1_axis_-1_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size1_axis_-1_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size20_axis_-2_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size20_axis_-2_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size21_axis_-1_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size21_axis_-1_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size22_axis_0_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size22_axis_0_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size23_axis_1_method0, 
test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size23_axis_1_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size24_axis_2_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size24_axis_2_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size25_axis25_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size25_axis25_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size26_axis_-3_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size26_axis_-3_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size27_axis_-2_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size27_axis_-2_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size28_axis_-1_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size28_axis_-1_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size29_axis_0_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size29_axis_0_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size2_axis_0_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size2_axis_0_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size30_axis_1_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size30_axis_1_method1, 
test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size31_axis_2_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size31_axis_2_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size32_axis32_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size32_axis32_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size33_axis_-4_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size33_axis_-4_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size34_axis_-3_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size34_axis_-3_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size35_axis_-2_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size35_axis_-2_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size36_axis_-1_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size36_axis_-1_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size37_axis_0_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size37_axis_0_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size38_axis_1_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size38_axis_1_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size39_axis_2_method0, 
test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size39_axis_2_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size3_axis3_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size3_axis3_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size40_axis_3_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size40_axis_3_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size41_axis41_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size41_axis41_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size42_axis_-4_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size42_axis_-4_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size43_axis_-3_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size43_axis_-3_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size44_axis_-2_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size44_axis_-2_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size45_axis_-1_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size45_axis_-1_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size46_axis_0_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size46_axis_0_method1, 
test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size47_axis_1_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size47_axis_1_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size48_axis_2_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size48_axis_2_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size49_axis_3_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size49_axis_3_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size4_axis_-2_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size4_axis_-2_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size50_axis50_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size50_axis50_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size51_axis_-4_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size51_axis_-4_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size52_axis_-3_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size52_axis_-3_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size53_axis_-2_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size53_axis_-2_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size54_axis_-1_method0, 
test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size54_axis_-1_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size55_axis_0_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size55_axis_0_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size56_axis_1_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size56_axis_1_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size57_axis_2_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size57_axis_2_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size58_axis_3_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size58_axis_3_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size59_axis59_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size59_axis59_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size5_axis_-1_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size5_axis_-1_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size60_axis_-4_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size60_axis_-4_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size61_axis_-3_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size61_axis_-3_method1, 
test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size62_axis_-2_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size62_axis_-2_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size63_axis_-1_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size63_axis_-1_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size64_axis_0_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size64_axis_0_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size65_axis_1_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size65_axis_1_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size66_axis_2_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size66_axis_2_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size67_axis_3_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size67_axis_3_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size68_axis68_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size68_axis68_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size69_axis_-1_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size69_axis_-1_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size6_axis_0_method0, 
test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size6_axis_0_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size70_axis_0_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size70_axis_0_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size71_axis71_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size71_axis71_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size72_axis_-1_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size72_axis_-1_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size73_axis_0_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size73_axis_0_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size74_axis74_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size74_axis74_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size75_axis_-1_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size75_axis_-1_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size76_axis_0_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size76_axis_0_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size77_axis77_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size77_axis77_method1, 
test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size7_axis_1_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size7_axis_1_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size8_axis8_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size8_axis8_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size9_axis_-2_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size9_axis_-2_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_vs_ndarray_arr_method_argmax_np_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_vs_ndarray_arr_method_argmin_np_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_vs_ndarray_positional_arr_method_argmax_np_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_vs_ndarray_positional_arr_method_argmin_np_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_output_shape_method_argmax, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_output_shape_method_argmin, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_ret_is_out_ndim_0_method_argmax, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_ret_is_out_ndim_0_method_argmin, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_ret_is_out_ndim_1_method_argmax, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_ret_is_out_ndim_1_method_argmin, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data0, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data1, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data10, 
test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data11, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data12, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data13, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data14, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data15, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data16, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data17, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data18, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data19, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data2, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data20, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data21, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data22, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data23, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data24, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data25, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data26, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data27, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data28, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data29, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data3, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data30, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data31, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data32, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data33, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data34, 
test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data35, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data36, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data37, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data38, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data39, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data4, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data40, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data41, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data42, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data43, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data44, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data45, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data46, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data47, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data48, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data49, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data5, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data50, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data51, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data52, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data53, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data54, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data55, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data56, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data57, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data58, 
test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data59, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data6, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data60, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data61, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data62, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data63, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data64, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data65, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data66, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data67, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data68, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data69, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data7, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data70, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data71, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data72, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data73, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data8, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data9, test/torch_np/test_ndarray_methods.py::TestArgmax::test_maximum_signed_integers, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data0, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data1, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data10, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data11, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data12, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data13, 
test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data14, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data15, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data16, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data17, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data18, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data19, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data2, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data20, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data21, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data22, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data23, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data24, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data25, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data26, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data27, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data28, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data29, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data3, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data30, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data31, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data32, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data33, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data34, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data35, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data36, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data37, 
test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data38, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data39, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data4, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data40, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data41, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data42, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data43, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data44, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data45, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data46, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data47, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data48, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data49, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data5, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data50, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data51, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data52, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data53, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data54, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data55, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data56, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data57, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data58, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data59, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data6, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data60, 
test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data61, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data62, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data63, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data64, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data65, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data66, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data67, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data68, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data69, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data7, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data70, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data71, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data72, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data73, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data8, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data9, test/torch_np/test_ndarray_methods.py::TestArgmin::test_minimum_signed_integers, test/torch_np/test_ndarray_methods.py::TestAmax::test_basic, test/torch_np/test_ndarray_methods.py::TestAmin::test_basic, test/torch_np/test_ndarray_methods.py::TestContains::test_contains, test/torch_np/test_ndarray_methods.py::TestNoExtraMethods::test_extra_methods_name_fn, test/torch_np/test_ndarray_methods.py::TestNoExtraMethods::test_extra_methods_name_ivar, test/torch_np/test_ndarray_methods.py::TestNoExtraMethods::test_extra_methods_name_method, test/torch_np/test_ndarray_methods.py::TestNoExtraMethods::test_extra_methods_name_name, test/torch_np/test_ndarray_methods.py::TestNoExtraMethods::test_extra_methods_name_plain, 
test/torch_np/test_ndarray_methods.py::TestNoExtraMethods::test_extra_methods_name_rvar, test/torch_np/test_ndarray_methods.py::TestIter::test_iter_1d, test/torch_np/test_ndarray_methods.py::TestIter::test_iter_2d 2025-10-10T02:21:55.1491421Z 2025-10-10T02:21:58.9821261Z Running test_view_ops 1/1 ... [2025-10-10 02:21:58.981630] 2025-10-10T02:21:58.9821680Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:21:58.9823981Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_view_ops.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:21:58.982024] 2025-10-10T02:22:03.5057647Z 2025-10-10T02:22:03.5058371Z test_view_ops 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_view_ops_1.1_add3b985aeb53ec3_.log 2025-10-10T02:22:03.5133051Z Running 279 items in this shard: test/test_view_ops.py::TestViewOpsCUDA::test_T_view_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_advanced_indexing_assignment_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_advanced_indexing_nonview_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_as_strided_gradients_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_as_strided_inplace_view_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_as_strided_view_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_basic_indexing_ellipses_view_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_basic_indexing_newaxis_view_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_basic_indexing_slice_view_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_chunk_view_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_conj_imag_view_cuda_complex128, test/test_view_ops.py::TestViewOpsCUDA::test_conj_imag_view_cuda_complex64, test/test_view_ops.py::TestViewOpsCUDA::test_conj_self_cuda_bfloat16, test/test_view_ops.py::TestViewOpsCUDA::test_conj_self_cuda_float16, 
test/test_view_ops.py::TestViewOpsCUDA::test_conj_self_cuda_float32, test/test_view_ops.py::TestViewOpsCUDA::test_conj_self_cuda_float64, test/test_view_ops.py::TestViewOpsCUDA::test_conj_self_cuda_int16, test/test_view_ops.py::TestViewOpsCUDA::test_conj_self_cuda_int32, test/test_view_ops.py::TestViewOpsCUDA::test_conj_self_cuda_int64, test/test_view_ops.py::TestViewOpsCUDA::test_conj_self_cuda_int8, test/test_view_ops.py::TestViewOpsCUDA::test_conj_self_cuda_uint8, test/test_view_ops.py::TestViewOpsCUDA::test_conj_view_with_shared_memory_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_contiguous_nonview_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_contiguous_self_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_diagonal_view_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_expand_as_view_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_expand_view_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_flatten_nonview_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_flatten_view_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_imag_noncomplex_cuda_bfloat16, test/test_view_ops.py::TestViewOpsCUDA::test_imag_noncomplex_cuda_float16, test/test_view_ops.py::TestViewOpsCUDA::test_imag_noncomplex_cuda_float32, test/test_view_ops.py::TestViewOpsCUDA::test_imag_noncomplex_cuda_float64, test/test_view_ops.py::TestViewOpsCUDA::test_imag_noncomplex_cuda_int16, test/test_view_ops.py::TestViewOpsCUDA::test_imag_noncomplex_cuda_int32, test/test_view_ops.py::TestViewOpsCUDA::test_imag_noncomplex_cuda_int64, test/test_view_ops.py::TestViewOpsCUDA::test_imag_noncomplex_cuda_int8, test/test_view_ops.py::TestViewOpsCUDA::test_imag_noncomplex_cuda_uint8, test/test_view_ops.py::TestViewOpsCUDA::test_movedim_view_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_narrow_view_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_permute_view_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_real_imag_view_cuda_complex128, 
test/test_view_ops.py::TestViewOpsCUDA::test_real_imag_view_cuda_complex64, test/test_view_ops.py::TestViewOpsCUDA::test_reshape_as_view_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_reshape_nonview_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_reshape_view_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_select_view_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_set_real_imag_cuda_complex128_bfloat16, test/test_view_ops.py::TestViewOpsCUDA::test_set_real_imag_cuda_complex128_bool, test/test_view_ops.py::TestViewOpsCUDA::test_set_real_imag_cuda_complex128_complex128, test/test_view_ops.py::TestViewOpsCUDA::test_set_real_imag_cuda_complex128_complex64, test/test_view_ops.py::TestViewOpsCUDA::test_set_real_imag_cuda_complex128_float16, test/test_view_ops.py::TestViewOpsCUDA::test_set_real_imag_cuda_complex128_float32, test/test_view_ops.py::TestViewOpsCUDA::test_set_real_imag_cuda_complex128_float64, test/test_view_ops.py::TestViewOpsCUDA::test_set_real_imag_cuda_complex128_int16, test/test_view_ops.py::TestViewOpsCUDA::test_set_real_imag_cuda_complex128_int32, test/test_view_ops.py::TestViewOpsCUDA::test_set_real_imag_cuda_complex128_int64, test/test_view_ops.py::TestViewOpsCUDA::test_set_real_imag_cuda_complex128_int8, test/test_view_ops.py::TestViewOpsCUDA::test_set_real_imag_cuda_complex128_uint8, test/test_view_ops.py::TestViewOpsCUDA::test_set_real_imag_cuda_complex64_bfloat16, test/test_view_ops.py::TestViewOpsCUDA::test_set_real_imag_cuda_complex64_bool, test/test_view_ops.py::TestViewOpsCUDA::test_set_real_imag_cuda_complex64_complex128, test/test_view_ops.py::TestViewOpsCUDA::test_set_real_imag_cuda_complex64_complex64, test/test_view_ops.py::TestViewOpsCUDA::test_set_real_imag_cuda_complex64_float16, test/test_view_ops.py::TestViewOpsCUDA::test_set_real_imag_cuda_complex64_float32, test/test_view_ops.py::TestViewOpsCUDA::test_set_real_imag_cuda_complex64_float64, 
test/test_view_ops.py::TestViewOpsCUDA::test_set_real_imag_cuda_complex64_int16, test/test_view_ops.py::TestViewOpsCUDA::test_set_real_imag_cuda_complex64_int32, test/test_view_ops.py::TestViewOpsCUDA::test_set_real_imag_cuda_complex64_int64, test/test_view_ops.py::TestViewOpsCUDA::test_set_real_imag_cuda_complex64_int8, test/test_view_ops.py::TestViewOpsCUDA::test_set_real_imag_cuda_complex64_uint8, test/test_view_ops.py::TestViewOpsCUDA::test_split_view_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_squeeze_inplace_view_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_squeeze_view_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_t_inplace_view_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_t_view_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_transpose_inplace_view_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_transpose_view_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_unbind_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_unbind_view_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_unfold_view_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_unsqueeze_inplace_view_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_unsqueeze_view_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_view_as_complex_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_view_as_real_cuda_complex128, test/test_view_ops.py::TestViewOpsCUDA::test_view_as_real_cuda_complex32, test/test_view_ops.py::TestViewOpsCUDA::test_view_as_real_cuda_complex64, test/test_view_ops.py::TestViewOpsCUDA::test_view_as_view_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_view_copy_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_view_copy_out_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_view_copy_output_contiguous_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_view_dtype_new_cuda_bool, test/test_view_ops.py::TestViewOpsCUDA::test_view_dtype_new_cuda_complex128, test/test_view_ops.py::TestViewOpsCUDA::test_view_dtype_new_cuda_complex64, 
test/test_view_ops.py::TestViewOpsCUDA::test_view_dtype_new_cuda_float16, test/test_view_ops.py::TestViewOpsCUDA::test_view_dtype_new_cuda_float32, test/test_view_ops.py::TestViewOpsCUDA::test_view_dtype_new_cuda_float64, test/test_view_ops.py::TestViewOpsCUDA::test_view_dtype_new_cuda_int16, test/test_view_ops.py::TestViewOpsCUDA::test_view_dtype_new_cuda_int32, test/test_view_ops.py::TestViewOpsCUDA::test_view_dtype_new_cuda_int64, test/test_view_ops.py::TestViewOpsCUDA::test_view_dtype_new_cuda_int8, test/test_view_ops.py::TestViewOpsCUDA::test_view_dtype_new_cuda_uint8, test/test_view_ops.py::TestViewOpsCUDA::test_view_dtype_upsize_errors_cuda_bfloat16, test/test_view_ops.py::TestViewOpsCUDA::test_view_dtype_upsize_errors_cuda_bool, test/test_view_ops.py::TestViewOpsCUDA::test_view_dtype_upsize_errors_cuda_complex128, test/test_view_ops.py::TestViewOpsCUDA::test_view_dtype_upsize_errors_cuda_complex64, test/test_view_ops.py::TestViewOpsCUDA::test_view_dtype_upsize_errors_cuda_float16, test/test_view_ops.py::TestViewOpsCUDA::test_view_dtype_upsize_errors_cuda_float32, test/test_view_ops.py::TestViewOpsCUDA::test_view_dtype_upsize_errors_cuda_float64, test/test_view_ops.py::TestViewOpsCUDA::test_view_dtype_upsize_errors_cuda_int16, test/test_view_ops.py::TestViewOpsCUDA::test_view_dtype_upsize_errors_cuda_int32, test/test_view_ops.py::TestViewOpsCUDA::test_view_dtype_upsize_errors_cuda_int64, test/test_view_ops.py::TestViewOpsCUDA::test_view_dtype_upsize_errors_cuda_int8, test/test_view_ops.py::TestViewOpsCUDA::test_view_dtype_upsize_errors_cuda_uint8, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_dsplit_cuda_bfloat16, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_dsplit_cuda_bool, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_dsplit_cuda_complex128, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_dsplit_cuda_complex64, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_dsplit_cuda_float16, 
test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_dsplit_cuda_float32, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_dsplit_cuda_float64, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_dsplit_cuda_int16, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_dsplit_cuda_int32, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_dsplit_cuda_int64, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_dsplit_cuda_int8, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_dsplit_cuda_uint8, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_hsplit_cuda_bfloat16, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_hsplit_cuda_bool, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_hsplit_cuda_complex128, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_hsplit_cuda_complex64, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_hsplit_cuda_float16, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_hsplit_cuda_float32, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_hsplit_cuda_float64, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_hsplit_cuda_int16, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_hsplit_cuda_int32, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_hsplit_cuda_int64, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_hsplit_cuda_int8, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_hsplit_cuda_uint8, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_split_cuda_bfloat16, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_split_cuda_bool, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_split_cuda_complex128, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_split_cuda_complex64, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_split_cuda_float16, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_split_cuda_float32, 
test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_split_cuda_float64, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_split_cuda_int16, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_split_cuda_int32, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_split_cuda_int64, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_split_cuda_int8, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_split_cuda_uint8, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_vsplit_cuda_bfloat16, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_vsplit_cuda_bool, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_vsplit_cuda_complex128, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_vsplit_cuda_complex64, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_vsplit_cuda_float16, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_vsplit_cuda_float32, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_vsplit_cuda_float64, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_vsplit_cuda_int16, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_vsplit_cuda_int32, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_vsplit_cuda_int64, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_vsplit_cuda_int8, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_vsplit_cuda_uint8, test/test_view_ops.py::TestViewOpsCUDA::test_view_view_cuda, test/test_view_ops.py::TestOldViewOpsCUDA::test_T_cuda, test/test_view_ops.py::TestOldViewOpsCUDA::test_as_strided_overflow_storage_offset_cuda, test/test_view_ops.py::TestOldViewOpsCUDA::test_atleast_cuda_complex128, test/test_view_ops.py::TestOldViewOpsCUDA::test_atleast_cuda_complex64, test/test_view_ops.py::TestOldViewOpsCUDA::test_atleast_cuda_float16, test/test_view_ops.py::TestOldViewOpsCUDA::test_atleast_cuda_float32, test/test_view_ops.py::TestOldViewOpsCUDA::test_atleast_cuda_float64, 
test/test_view_ops.py::TestOldViewOpsCUDA::test_atleast_cuda_int16, test/test_view_ops.py::TestOldViewOpsCUDA::test_atleast_cuda_int32, test/test_view_ops.py::TestOldViewOpsCUDA::test_atleast_cuda_int64, test/test_view_ops.py::TestOldViewOpsCUDA::test_atleast_cuda_int8, test/test_view_ops.py::TestOldViewOpsCUDA::test_atleast_cuda_uint8, test/test_view_ops.py::TestOldViewOpsCUDA::test_atleast_gradient_cuda, test/test_view_ops.py::TestOldViewOpsCUDA::test_big_transpose_cuda, test/test_view_ops.py::TestOldViewOpsCUDA::test_broadcast_shapes_cuda, test/test_view_ops.py::TestOldViewOpsCUDA::test_broadcast_tensors_cuda_float32, test/test_view_ops.py::TestOldViewOpsCUDA::test_broadcast_to_cuda_bool, test/test_view_ops.py::TestOldViewOpsCUDA::test_broadcast_to_cuda_complex128, test/test_view_ops.py::TestOldViewOpsCUDA::test_broadcast_to_cuda_complex64, test/test_view_ops.py::TestOldViewOpsCUDA::test_broadcast_to_cuda_float16, test/test_view_ops.py::TestOldViewOpsCUDA::test_broadcast_to_cuda_float32, test/test_view_ops.py::TestOldViewOpsCUDA::test_broadcast_to_cuda_float64, test/test_view_ops.py::TestOldViewOpsCUDA::test_broadcast_to_cuda_int16, test/test_view_ops.py::TestOldViewOpsCUDA::test_broadcast_to_cuda_int32, test/test_view_ops.py::TestOldViewOpsCUDA::test_broadcast_to_cuda_int64, test/test_view_ops.py::TestOldViewOpsCUDA::test_broadcast_to_cuda_int8, test/test_view_ops.py::TestOldViewOpsCUDA::test_broadcast_to_cuda_uint8, test/test_view_ops.py::TestOldViewOpsCUDA::test_chunk_cuda, test/test_view_ops.py::TestOldViewOpsCUDA::test_conj_neg_view_numpy_error_cuda, test/test_view_ops.py::TestOldViewOpsCUDA::test_contiguous_cuda, test/test_view_ops.py::TestOldViewOpsCUDA::test_crow_col_indices_cuda, test/test_view_ops.py::TestOldViewOpsCUDA::test_empty_reshape_cuda, test/test_view_ops.py::TestOldViewOpsCUDA::test_expand_cuda, test/test_view_ops.py::TestOldViewOpsCUDA::test_flatten_cuda, test/test_view_ops.py::TestOldViewOpsCUDA::test_memory_format_resize__cuda, 
test/test_view_ops.py::TestOldViewOpsCUDA::test_memory_format_resize_as_cuda, test/test_view_ops.py::TestOldViewOpsCUDA::test_narrow_cuda, test/test_view_ops.py::TestOldViewOpsCUDA::test_narrow_tensor_cuda, test/test_view_ops.py::TestOldViewOpsCUDA::test_python_types_cuda, test/test_view_ops.py::TestOldViewOpsCUDA::test_ravel_cuda, test/test_view_ops.py::TestOldViewOpsCUDA::test_reshape_cuda, test/test_view_ops.py::TestOldViewOpsCUDA::test_reshape_view_semantics_cuda_bfloat16, test/test_view_ops.py::TestOldViewOpsCUDA::test_reshape_view_semantics_cuda_bool, test/test_view_ops.py::TestOldViewOpsCUDA::test_reshape_view_semantics_cuda_complex128, test/test_view_ops.py::TestOldViewOpsCUDA::test_reshape_view_semantics_cuda_complex64, test/test_view_ops.py::TestOldViewOpsCUDA::test_reshape_view_semantics_cuda_float16, test/test_view_ops.py::TestOldViewOpsCUDA::test_reshape_view_semantics_cuda_float32, test/test_view_ops.py::TestOldViewOpsCUDA::test_reshape_view_semantics_cuda_float64, test/test_view_ops.py::TestOldViewOpsCUDA::test_reshape_view_semantics_cuda_int16, test/test_view_ops.py::TestOldViewOpsCUDA::test_reshape_view_semantics_cuda_int32, test/test_view_ops.py::TestOldViewOpsCUDA::test_reshape_view_semantics_cuda_int64, test/test_view_ops.py::TestOldViewOpsCUDA::test_reshape_view_semantics_cuda_int8, test/test_view_ops.py::TestOldViewOpsCUDA::test_reshape_view_semantics_cuda_uint8, test/test_view_ops.py::TestOldViewOpsCUDA::test_resize_all_dtypes_and_devices_cuda, test/test_view_ops.py::TestOldViewOpsCUDA::test_resize_as_all_dtypes_and_devices_cuda, test/test_view_ops.py::TestOldViewOpsCUDA::test_resize_as_preserves_strides_cuda, test/test_view_ops.py::TestOldViewOpsCUDA::test_resize_overflow_cuda, test/test_view_ops.py::TestOldViewOpsCUDA::test_split_cuda, test/test_view_ops.py::TestOldViewOpsCUDA::test_t_cuda, test/test_view_ops.py::TestOldViewOpsCUDA::test_tensor_split_errors_cuda, 
test/test_view_ops.py::TestOldViewOpsCUDA::test_tensor_split_indices_cuda_bool, test/test_view_ops.py::TestOldViewOpsCUDA::test_tensor_split_indices_cuda_complex128, test/test_view_ops.py::TestOldViewOpsCUDA::test_tensor_split_indices_cuda_complex64, test/test_view_ops.py::TestOldViewOpsCUDA::test_tensor_split_indices_cuda_float16, test/test_view_ops.py::TestOldViewOpsCUDA::test_tensor_split_indices_cuda_float32, test/test_view_ops.py::TestOldViewOpsCUDA::test_tensor_split_indices_cuda_float64, test/test_view_ops.py::TestOldViewOpsCUDA::test_tensor_split_indices_cuda_int16, test/test_view_ops.py::TestOldViewOpsCUDA::test_tensor_split_indices_cuda_int32, test/test_view_ops.py::TestOldViewOpsCUDA::test_tensor_split_indices_cuda_int64, test/test_view_ops.py::TestOldViewOpsCUDA::test_tensor_split_indices_cuda_int8, test/test_view_ops.py::TestOldViewOpsCUDA::test_tensor_split_indices_cuda_uint8, test/test_view_ops.py::TestOldViewOpsCUDA::test_tensor_split_sections_cuda_bool, test/test_view_ops.py::TestOldViewOpsCUDA::test_tensor_split_sections_cuda_complex128, test/test_view_ops.py::TestOldViewOpsCUDA::test_tensor_split_sections_cuda_complex64, test/test_view_ops.py::TestOldViewOpsCUDA::test_tensor_split_sections_cuda_float16, test/test_view_ops.py::TestOldViewOpsCUDA::test_tensor_split_sections_cuda_float32, test/test_view_ops.py::TestOldViewOpsCUDA::test_tensor_split_sections_cuda_float64, test/test_view_ops.py::TestOldViewOpsCUDA::test_tensor_split_sections_cuda_int16, test/test_view_ops.py::TestOldViewOpsCUDA::test_tensor_split_sections_cuda_int32, test/test_view_ops.py::TestOldViewOpsCUDA::test_tensor_split_sections_cuda_int64, test/test_view_ops.py::TestOldViewOpsCUDA::test_tensor_split_sections_cuda_int8, test/test_view_ops.py::TestOldViewOpsCUDA::test_tensor_split_sections_cuda_uint8, test/test_view_ops.py::TestOldViewOpsCUDA::test_transpose_invalid_cuda_complex128, test/test_view_ops.py::TestOldViewOpsCUDA::test_transpose_invalid_cuda_float32, 
test/test_view_ops.py::TestOldViewOpsCUDA::test_transpose_invalid_cuda_int64, test/test_view_ops.py::TestOldViewOpsCUDA::test_transpose_vs_numpy_cuda_complex128, test/test_view_ops.py::TestOldViewOpsCUDA::test_transpose_vs_numpy_cuda_float32, test/test_view_ops.py::TestOldViewOpsCUDA::test_transpose_vs_numpy_cuda_int64, test/test_view_ops.py::TestOldViewOpsCUDA::test_transposes_cuda_bfloat16, test/test_view_ops.py::TestOldViewOpsCUDA::test_transposes_cuda_bool, test/test_view_ops.py::TestOldViewOpsCUDA::test_transposes_cuda_complex128, test/test_view_ops.py::TestOldViewOpsCUDA::test_transposes_cuda_complex64, test/test_view_ops.py::TestOldViewOpsCUDA::test_transposes_cuda_float16, test/test_view_ops.py::TestOldViewOpsCUDA::test_transposes_cuda_float32, test/test_view_ops.py::TestOldViewOpsCUDA::test_transposes_cuda_float64, test/test_view_ops.py::TestOldViewOpsCUDA::test_transposes_cuda_int16, test/test_view_ops.py::TestOldViewOpsCUDA::test_transposes_cuda_int32, test/test_view_ops.py::TestOldViewOpsCUDA::test_transposes_cuda_int64, test/test_view_ops.py::TestOldViewOpsCUDA::test_transposes_cuda_int8, test/test_view_ops.py::TestOldViewOpsCUDA::test_transposes_cuda_uint8, test/test_view_ops.py::TestOldViewOpsCUDA::test_transposes_errors_cuda_bfloat16, test/test_view_ops.py::TestOldViewOpsCUDA::test_transposes_errors_cuda_bool, test/test_view_ops.py::TestOldViewOpsCUDA::test_transposes_errors_cuda_complex128, test/test_view_ops.py::TestOldViewOpsCUDA::test_transposes_errors_cuda_complex64, test/test_view_ops.py::TestOldViewOpsCUDA::test_transposes_errors_cuda_float16, test/test_view_ops.py::TestOldViewOpsCUDA::test_transposes_errors_cuda_float32, test/test_view_ops.py::TestOldViewOpsCUDA::test_transposes_errors_cuda_float64, test/test_view_ops.py::TestOldViewOpsCUDA::test_transposes_errors_cuda_int16, test/test_view_ops.py::TestOldViewOpsCUDA::test_transposes_errors_cuda_int32, test/test_view_ops.py::TestOldViewOpsCUDA::test_transposes_errors_cuda_int64, 
test/test_view_ops.py::TestOldViewOpsCUDA::test_transposes_errors_cuda_int8, test/test_view_ops.py::TestOldViewOpsCUDA::test_transposes_errors_cuda_uint8, test/test_view_ops.py::TestOldViewOpsCUDA::test_unsqueeze_cuda, test/test_view_ops.py::TestOldViewOpsCUDA::test_view_all_dtypes_and_devices_cuda, test/test_view_ops.py::TestOldViewOpsCUDA::test_view_cuda, test/test_view_ops.py::TestOldViewOpsCUDA::test_view_empty_cuda 2025-10-10T02:22:03.5202821Z 2025-10-10T02:22:07.3835631Z Running torch_np/numpy_tests/core/test_dlpack 1/1 ... [2025-10-10 02:22:07.382957] 2025-10-10T02:22:07.3836291Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:22:07.3839793Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'torch_np/numpy_tests/core/test_dlpack.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:22:07.383391] 2025-10-10T02:22:11.1053531Z 2025-10-10T02:22:11.1054896Z torch_np/numpy_tests/core/test_dlpack 1/1 was successful, full logs can be found in artifacts with path test/test-reports/torch_np.numpy_tests.core.test_dlpack_1.1_b0ab03ce3acae62e_.log 2025-10-10T02:22:11.1071539Z Running 53 items in this shard: test/torch_np/numpy_tests/core/test_dlpack.py::TestDLPack::test_dlpack_destructor_exception, test/torch_np/numpy_tests/core/test_dlpack.py::TestDLPack::test_dlpack_device, test/torch_np/numpy_tests/core/test_dlpack.py::TestDLPack::test_dtype_passthrough_dtype0, test/torch_np/numpy_tests/core/test_dlpack.py::TestDLPack::test_dtype_passthrough_dtype1, test/torch_np/numpy_tests/core/test_dlpack.py::TestDLPack::test_dtype_passthrough_dtype2, test/torch_np/numpy_tests/core/test_dlpack.py::TestDLPack::test_dtype_passthrough_dtype3, test/torch_np/numpy_tests/core/test_dlpack.py::TestDLPack::test_dtype_passthrough_dtype4, test/torch_np/numpy_tests/core/test_dlpack.py::TestDLPack::test_dtype_passthrough_dtype5, 
test/torch_np/numpy_tests/core/test_dlpack.py::TestDLPack::test_dtype_passthrough_dtype6, test/torch_np/numpy_tests/core/test_dlpack.py::TestDLPack::test_dtype_passthrough_dtype7, test/torch_np/numpy_tests/core/test_dlpack.py::TestDLPack::test_dtype_passthrough_dtype8, test/torch_np/numpy_tests/core/test_dlpack.py::TestDLPack::test_dtype_passthrough_dtype9, test/torch_np/numpy_tests/core/test_dlpack.py::TestDLPack::test_dunder_dlpack_refcount, test/torch_np/numpy_tests/core/test_dlpack.py::TestDLPack::test_dunder_dlpack_stream, test/torch_np/numpy_tests/core/test_dlpack.py::TestDLPack::test_from_dlpack_refcount, test/torch_np/numpy_tests/core/test_dlpack.py::TestDLPack::test_from_torch, test/torch_np/numpy_tests/core/test_dlpack.py::TestDLPack::test_higher_dims_ndim_0, test/torch_np/numpy_tests/core/test_dlpack.py::TestDLPack::test_higher_dims_ndim_1, test/torch_np/numpy_tests/core/test_dlpack.py::TestDLPack::test_higher_dims_ndim_10, test/torch_np/numpy_tests/core/test_dlpack.py::TestDLPack::test_higher_dims_ndim_11, test/torch_np/numpy_tests/core/test_dlpack.py::TestDLPack::test_higher_dims_ndim_12, test/torch_np/numpy_tests/core/test_dlpack.py::TestDLPack::test_higher_dims_ndim_13, test/torch_np/numpy_tests/core/test_dlpack.py::TestDLPack::test_higher_dims_ndim_14, test/torch_np/numpy_tests/core/test_dlpack.py::TestDLPack::test_higher_dims_ndim_15, test/torch_np/numpy_tests/core/test_dlpack.py::TestDLPack::test_higher_dims_ndim_16, test/torch_np/numpy_tests/core/test_dlpack.py::TestDLPack::test_higher_dims_ndim_17, test/torch_np/numpy_tests/core/test_dlpack.py::TestDLPack::test_higher_dims_ndim_18, test/torch_np/numpy_tests/core/test_dlpack.py::TestDLPack::test_higher_dims_ndim_19, test/torch_np/numpy_tests/core/test_dlpack.py::TestDLPack::test_higher_dims_ndim_2, test/torch_np/numpy_tests/core/test_dlpack.py::TestDLPack::test_higher_dims_ndim_20, test/torch_np/numpy_tests/core/test_dlpack.py::TestDLPack::test_higher_dims_ndim_21, 
test/torch_np/numpy_tests/core/test_dlpack.py::TestDLPack::test_higher_dims_ndim_22, test/torch_np/numpy_tests/core/test_dlpack.py::TestDLPack::test_higher_dims_ndim_23, test/torch_np/numpy_tests/core/test_dlpack.py::TestDLPack::test_higher_dims_ndim_24, test/torch_np/numpy_tests/core/test_dlpack.py::TestDLPack::test_higher_dims_ndim_25, test/torch_np/numpy_tests/core/test_dlpack.py::TestDLPack::test_higher_dims_ndim_26, test/torch_np/numpy_tests/core/test_dlpack.py::TestDLPack::test_higher_dims_ndim_27, test/torch_np/numpy_tests/core/test_dlpack.py::TestDLPack::test_higher_dims_ndim_28, test/torch_np/numpy_tests/core/test_dlpack.py::TestDLPack::test_higher_dims_ndim_29, test/torch_np/numpy_tests/core/test_dlpack.py::TestDLPack::test_higher_dims_ndim_3, test/torch_np/numpy_tests/core/test_dlpack.py::TestDLPack::test_higher_dims_ndim_30, test/torch_np/numpy_tests/core/test_dlpack.py::TestDLPack::test_higher_dims_ndim_31, test/torch_np/numpy_tests/core/test_dlpack.py::TestDLPack::test_higher_dims_ndim_32, test/torch_np/numpy_tests/core/test_dlpack.py::TestDLPack::test_higher_dims_ndim_4, test/torch_np/numpy_tests/core/test_dlpack.py::TestDLPack::test_higher_dims_ndim_5, test/torch_np/numpy_tests/core/test_dlpack.py::TestDLPack::test_higher_dims_ndim_6, test/torch_np/numpy_tests/core/test_dlpack.py::TestDLPack::test_higher_dims_ndim_7, test/torch_np/numpy_tests/core/test_dlpack.py::TestDLPack::test_higher_dims_ndim_8, test/torch_np/numpy_tests/core/test_dlpack.py::TestDLPack::test_higher_dims_ndim_9, test/torch_np/numpy_tests/core/test_dlpack.py::TestDLPack::test_ndim0, test/torch_np/numpy_tests/core/test_dlpack.py::TestDLPack::test_non_contiguous, test/torch_np/numpy_tests/core/test_dlpack.py::TestDLPack::test_readonly, test/torch_np/numpy_tests/core/test_dlpack.py::TestDLPack::test_to_torch 2025-10-10T02:22:11.1087391Z 2025-10-10T02:22:15.0104885Z Running torch_np/numpy_tests/core/test_getlimits 1/1 ... 
[2025-10-10 02:22:15.009852] 2025-10-10T02:22:15.0105401Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:22:15.0107320Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'torch_np/numpy_tests/core/test_getlimits.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:22:15.010252] 2025-10-10T02:22:19.0335649Z 2025-10-10T02:22:19.0336598Z torch_np/numpy_tests/core/test_getlimits 1/1 was successful, full logs can be found in artifacts with path test/test-reports/torch_np.numpy_tests.core.test_getlimits_1.1_fcd04f230de9bbf6_.log 2025-10-10T02:22:19.0343777Z Running 17 items in this shard: test/torch_np/numpy_tests/core/test_getlimits.py::TestPythonFloat::test_singleton, test/torch_np/numpy_tests/core/test_getlimits.py::TestHalf::test_singleton, test/torch_np/numpy_tests/core/test_getlimits.py::TestSingle::test_singleton, test/torch_np/numpy_tests/core/test_getlimits.py::TestDouble::test_singleton, test/torch_np/numpy_tests/core/test_getlimits.py::TestFinfo::test_basic, test/torch_np/numpy_tests/core/test_getlimits.py::TestFinfo::test_basic_missing, test/torch_np/numpy_tests/core/test_getlimits.py::TestIinfo::test_basic, test/torch_np/numpy_tests/core/test_getlimits.py::TestIinfo::test_unsigned_max_T0, test/torch_np/numpy_tests/core/test_getlimits.py::TestIinfo::test_unsigned_max_T1, test/torch_np/numpy_tests/core/test_getlimits.py::TestIinfo::test_unsigned_max_T2, test/torch_np/numpy_tests/core/test_getlimits.py::TestIinfo::test_unsigned_max_T3, test/torch_np/numpy_tests/core/test_getlimits.py::TestRepr::test_finfo_repr, test/torch_np/numpy_tests/core/test_getlimits.py::TestRepr::test_iinfo_repr, test/torch_np/numpy_tests/core/test_getlimits.py::TestMisc::test_instances, test/torch_np/numpy_tests/core/test_getlimits.py::TestMisc::test_known_types, 
test/torch_np/numpy_tests/core/test_getlimits.py::TestMisc::test_plausible_finfo, test/torch_np/numpy_tests/core/test_getlimits.py::TestMisc::test_subnormal_warning 2025-10-10T02:22:19.0348836Z 2025-10-10T02:22:22.8715987Z Running test_accelerator 1/1 ... [2025-10-10 02:22:22.871031] 2025-10-10T02:22:22.8716593Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:22:22.8719572Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_accelerator.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:22:22.871527] 2025-10-10T02:22:26.8445883Z 2025-10-10T02:22:26.8446702Z test_accelerator 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_accelerator_1.1_7503bad570e8177e_.log 2025-10-10T02:22:26.8450262Z Running 11 items in this shard: test/test_accelerator.py::TestAccelerator::test_current_accelerator, test/test_accelerator.py::TestAccelerator::test_current_stream_query, test/test_accelerator.py::TestAccelerator::test_device_context_manager, test/test_accelerator.py::TestAccelerator::test_generic_event_behavior, test/test_accelerator.py::TestAccelerator::test_generic_multi_device_behavior, test/test_accelerator.py::TestAccelerator::test_generic_stream_behavior, test/test_accelerator.py::TestAccelerator::test_memory_stats, test/test_accelerator.py::TestAccelerator::test_multi_device_context_manager, test/test_accelerator.py::TestAccelerator::test_multi_device_stream_context_manager, test/test_accelerator.py::TestAccelerator::test_pin_memory_on_non_blocking_copy, test/test_accelerator.py::TestAccelerator::test_stream_context_manager 2025-10-10T02:22:26.8453240Z 2025-10-10T02:22:30.6745172Z Running lazy/test_reuse_ir 1/1 ... 
[2025-10-10 02:22:30.673976] 2025-10-10T02:22:30.6745706Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:22:30.6746745Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'lazy/test_reuse_ir.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:22:30.674369] 2025-10-10T02:22:34.6468973Z 2025-10-10T02:22:34.6470479Z lazy/test_reuse_ir 1/1 was successful, full logs can be found in artifacts with path test/test-reports/lazy.test_reuse_ir_1.1_4dd9796a057338cd_.log 2025-10-10T02:22:34.6472836Z Running 4 items in this shard: test/lazy/test_reuse_ir.py::TestLazyReuseIr::testAdd, test/lazy/test_reuse_ir.py::TestLazyReuseIr::testAddSub, test/lazy/test_reuse_ir.py::TestLazyReuseIr::testAddSubFallback, test/lazy/test_reuse_ir.py::TestLazyReuseIr::testBatchNorm 2025-10-10T02:22:34.6474390Z 2025-10-10T02:22:38.4768898Z Running torch_np/numpy_tests/lib/test_index_tricks 1/1 ... [2025-10-10 02:22:38.476203] 2025-10-10T02:22:38.4769604Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:22:38.4771290Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'torch_np/numpy_tests/lib/test_index_tricks.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:22:38.476572] 2025-10-10T02:22:42.5003134Z 2025-10-10T02:22:42.5004441Z torch_np/numpy_tests/lib/test_index_tricks 1/1 was successful, full logs can be found in artifacts with path test/test-reports/torch_np.numpy_tests.lib.test_index_tricks_1.1_9dc3800ac10e145e_.log 2025-10-10T02:22:42.5020752Z Running 47 items in this shard: test/torch_np/numpy_tests/lib/test_index_tricks.py::TestRavelUnravelIndex::test_0d, test/torch_np/numpy_tests/lib/test_index_tricks.py::TestRavelUnravelIndex::test_basic, test/torch_np/numpy_tests/lib/test_index_tricks.py::TestRavelUnravelIndex::test_big_indices, test/torch_np/numpy_tests/lib/test_index_tricks.py::TestRavelUnravelIndex::test_clipmodes, test/torch_np/numpy_tests/lib/test_index_tricks.py::TestRavelUnravelIndex::test_dtypes, test/torch_np/numpy_tests/lib/test_index_tricks.py::TestRavelUnravelIndex::test_empty_array_ravel_mode_clip, test/torch_np/numpy_tests/lib/test_index_tricks.py::TestRavelUnravelIndex::test_empty_array_ravel_mode_raise, test/torch_np/numpy_tests/lib/test_index_tricks.py::TestRavelUnravelIndex::test_empty_array_ravel_mode_wrap, test/torch_np/numpy_tests/lib/test_index_tricks.py::TestRavelUnravelIndex::test_empty_array_unravel, test/torch_np/numpy_tests/lib/test_index_tricks.py::TestRavelUnravelIndex::test_empty_indices, test/torch_np/numpy_tests/lib/test_index_tricks.py::TestRavelUnravelIndex::test_writeability, test/torch_np/numpy_tests/lib/test_index_tricks.py::TestGrid::test_accepts_longdouble, test/torch_np/numpy_tests/lib/test_index_tricks.py::TestGrid::test_accepts_npcomplexfloating, test/torch_np/numpy_tests/lib/test_index_tricks.py::TestGrid::test_accepts_npfloating, test/torch_np/numpy_tests/lib/test_index_tricks.py::TestGrid::test_basic, test/torch_np/numpy_tests/lib/test_index_tricks.py::TestGrid::test_linspace_equivalence, test/torch_np/numpy_tests/lib/test_index_tricks.py::TestGrid::test_mgrid_size_none_handling_start0_stop_10_step0_expected0, 
test/torch_np/numpy_tests/lib/test_index_tricks.py::TestGrid::test_mgrid_size_none_handling_start_-10_stop_20_step1_expected1, test/torch_np/numpy_tests/lib/test_index_tricks.py::TestGrid::test_nd, test/torch_np/numpy_tests/lib/test_index_tricks.py::TestGrid::test_sparse, test/torch_np/numpy_tests/lib/test_index_tricks.py::TestConcatenator::test_0d, test/torch_np/numpy_tests/lib/test_index_tricks.py::TestConcatenator::test_1d, test/torch_np/numpy_tests/lib/test_index_tricks.py::TestConcatenator::test_2d, test/torch_np/numpy_tests/lib/test_index_tricks.py::TestConcatenator::test_complex_step, test/torch_np/numpy_tests/lib/test_index_tricks.py::TestConcatenator::test_mixed_type, test/torch_np/numpy_tests/lib/test_index_tricks.py::TestConcatenator::test_more_mixed_type, test/torch_np/numpy_tests/lib/test_index_tricks.py::TestNdenumerate::test_basic, test/torch_np/numpy_tests/lib/test_index_tricks.py::TestIndexExpression::test_regression_1, test/torch_np/numpy_tests/lib/test_index_tricks.py::TestIndexExpression::test_simple_1, test/torch_np/numpy_tests/lib/test_index_tricks.py::TestIx_::test_1d_only, test/torch_np/numpy_tests/lib/test_index_tricks.py::TestIx_::test_bool, test/torch_np/numpy_tests/lib/test_index_tricks.py::TestIx_::test_regression_1, test/torch_np/numpy_tests/lib/test_index_tricks.py::TestIx_::test_repeated_input, test/torch_np/numpy_tests/lib/test_index_tricks.py::TestIx_::test_shape_and_dtype, test/torch_np/numpy_tests/lib/test_index_tricks.py::TestC::test_c_, test/torch_np/numpy_tests/lib/test_index_tricks.py::TestFillDiagonal::test_basic, test/torch_np/numpy_tests/lib/test_index_tricks.py::TestFillDiagonal::test_hetero_shape_handling, test/torch_np/numpy_tests/lib/test_index_tricks.py::TestFillDiagonal::test_low_dim_handling, test/torch_np/numpy_tests/lib/test_index_tricks.py::TestFillDiagonal::test_operate_4d_array, test/torch_np/numpy_tests/lib/test_index_tricks.py::TestFillDiagonal::test_tall_matrix, 
test/torch_np/numpy_tests/lib/test_index_tricks.py::TestFillDiagonal::test_tall_matrix_wrap, test/torch_np/numpy_tests/lib/test_index_tricks.py::TestFillDiagonal::test_wide_matrix, test/torch_np/numpy_tests/lib/test_index_tricks.py::TestDiagIndices::test_diag_indices, test/torch_np/numpy_tests/lib/test_index_tricks.py::TestDiagIndicesFrom::test_diag_indices_from, test/torch_np/numpy_tests/lib/test_index_tricks.py::TestDiagIndicesFrom::test_error_shape_mismatch, test/torch_np/numpy_tests/lib/test_index_tricks.py::TestDiagIndicesFrom::test_error_small_input, test/torch_np/numpy_tests/lib/test_index_tricks.py::TestNdIndex::test_ndindex 2025-10-10T02:22:42.5035436Z 2025-10-10T02:22:46.3233476Z Running nn/test_init 1/1 ... [2025-10-10 02:22:46.322791] 2025-10-10T02:22:46.3234063Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:22:46.3235521Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'nn/test_init.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:22:46.323201] 2025-10-10T02:22:48.1078489Z 2025-10-10T02:22:48.1079742Z benchmark_utils/test_benchmark_utils 1/1 was successful, full logs can be found in artifacts with path test/test-reports/benchmark_utils.test_benchmark_utils_1.1_fd1d46f532803740_.log 2025-10-10T02:22:48.1085356Z Running 9 items in this shard: test/benchmark_utils/test_benchmark_utils.py::TestBenchmarkUtils::test_adaptive_timer, test/benchmark_utils/test_benchmark_utils.py::TestBenchmarkUtils::test_collect_callgrind, test/benchmark_utils/test_benchmark_utils.py::TestBenchmarkUtils::test_collect_cpp_callgrind, test/benchmark_utils/test_benchmark_utils.py::TestBenchmarkUtils::test_compare, test/benchmark_utils/test_benchmark_utils.py::TestBenchmarkUtils::test_cpp_timer, test/benchmark_utils/test_benchmark_utils.py::TestBenchmarkUtils::test_fuzzer, test/benchmark_utils/test_benchmark_utils.py::TestBenchmarkUtils::test_manipulate_callgrind_stats, test/benchmark_utils/test_benchmark_utils.py::TestBenchmarkUtils::test_timer, test/benchmark_utils/test_benchmark_utils.py::TestBenchmarkUtils::test_timer_tiny_fast_snippet 2025-10-10T02:22:48.1089983Z 2025-10-10T02:22:50.8479552Z 2025-10-10T02:22:50.8480725Z nn/test_init 1/1 was successful, full logs can be found in artifacts with path test/test-reports/nn.test_init_1.1_09bb5d4c6c81d927_.log 2025-10-10T02:22:50.8492139Z Running 30 items in this shard: test/nn/test_init.py::TestNNInit::test_calculate_gain_leaky_relu, test/nn/test_init.py::TestNNInit::test_calculate_gain_leaky_relu_only_accepts_numbers, test/nn/test_init.py::TestNNInit::test_calculate_gain_linear, test/nn/test_init.py::TestNNInit::test_calculate_gain_nonlinear, test/nn/test_init.py::TestNNInit::test_calculate_gain_only_accepts_valid_nonlinearities, test/nn/test_init.py::TestNNInit::test_constant, test/nn/test_init.py::TestNNInit::test_deprecation, test/nn/test_init.py::TestNNInit::test_dirac_identity, test/nn/test_init.py::TestNNInit::test_dirac_only_works_on_3_4_5d_inputs, 
test/nn/test_init.py::TestNNInit::test_dirac_properties, test/nn/test_init.py::TestNNInit::test_eye, test/nn/test_init.py::TestNNInit::test_eye_only_works_on_2d_inputs, test/nn/test_init.py::TestNNInit::test_kaiming_normal, test/nn/test_init.py::TestNNInit::test_kaiming_normal_errors_on_inputs_smaller_than_2d, test/nn/test_init.py::TestNNInit::test_kaiming_normal_warning_on_0element_tensor, test/nn/test_init.py::TestNNInit::test_kaiming_uniform, test/nn/test_init.py::TestNNInit::test_kaiming_uniform_errors_on_inputs_smaller_than_2d, test/nn/test_init.py::TestNNInit::test_kaiming_uniform_warning_on_0element_tensor, test/nn/test_init.py::TestNNInit::test_normal, test/nn/test_init.py::TestNNInit::test_ones_and_zeros, test/nn/test_init.py::TestNNInit::test_orthogonal, test/nn/test_init.py::TestNNInit::test_sparse_default_std, test/nn/test_init.py::TestNNInit::test_sparse_only_works_on_2d_inputs, test/nn/test_init.py::TestNNInit::test_trunc_normal, test/nn/test_init.py::TestNNInit::test_trunc_normal_generator, test/nn/test_init.py::TestNNInit::test_uniform, test/nn/test_init.py::TestNNInit::test_xavier_normal, test/nn/test_init.py::TestNNInit::test_xavier_normal_errors_on_inputs_smaller_than_2d, test/nn/test_init.py::TestNNInit::test_xavier_uniform, test/nn/test_init.py::TestNNInit::test_xavier_uniform_errors_on_inputs_smaller_than_2d 2025-10-10T02:22:50.8504269Z 2025-10-10T02:22:52.1073905Z Running torch_np/numpy_tests/core/test_numerictypes 1/1 ... [2025-10-10 02:22:52.106678] 2025-10-10T02:22:52.1074446Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:22:52.1075934Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'torch_np/numpy_tests/core/test_numerictypes.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:22:52.107060] 2025-10-10T02:22:54.7909138Z Running test_type_promotion 1/1 ... 
[2025-10-10 02:22:54.790275] 2025-10-10T02:22:54.7909829Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:22:54.7911465Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_type_promotion.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:22:54.790704] 2025-10-10T02:22:56.2306893Z 2025-10-10T02:22:56.2308220Z torch_np/numpy_tests/core/test_numerictypes 1/1 was successful, full logs can be found in artifacts with path test/test-reports/torch_np.numpy_tests.core.test_numerictypes_1.1_98cad5f0df449ff9_.log 2025-10-10T02:22:56.2322694Z Running 34 items in this shard: test/torch_np/numpy_tests/core/test_numerictypes.py::TestCommonType::test_scalar_loses1, test/torch_np/numpy_tests/core/test_numerictypes.py::TestCommonType::test_scalar_loses2, test/torch_np/numpy_tests/core/test_numerictypes.py::TestCommonType::test_scalar_wins, test/torch_np/numpy_tests/core/test_numerictypes.py::TestCommonType::test_scalar_wins2, test/torch_np/numpy_tests/core/test_numerictypes.py::TestCommonType::test_scalar_wins3, test/torch_np/numpy_tests/core/test_numerictypes.py::TestIsSubDType::test_both_abstract, test/torch_np/numpy_tests/core/test_numerictypes.py::TestIsSubDType::test_nondtype_nonscalartype, test/torch_np/numpy_tests/core/test_numerictypes.py::TestIsSubDType::test_same, test/torch_np/numpy_tests/core/test_numerictypes.py::TestIsSubDType::test_sibling_class, test/torch_np/numpy_tests/core/test_numerictypes.py::TestIsSubDType::test_subclass, test/torch_np/numpy_tests/core/test_numerictypes.py::TestIsSubDType::test_subclass_backwards, test/torch_np/numpy_tests/core/test_numerictypes.py::TestBitName::test_abstract, test/torch_np/numpy_tests/core/test_numerictypes.py::TestDocStrings::test_platform_dependent_aliases, 
test/torch_np/numpy_tests/core/test_numerictypes.py::TestScalarTypeNames::test_names_are_undersood_by_dtype_t0, test/torch_np/numpy_tests/core/test_numerictypes.py::TestScalarTypeNames::test_names_are_undersood_by_dtype_t1, test/torch_np/numpy_tests/core/test_numerictypes.py::TestScalarTypeNames::test_names_are_undersood_by_dtype_t2, test/torch_np/numpy_tests/core/test_numerictypes.py::TestScalarTypeNames::test_names_are_undersood_by_dtype_t3, test/torch_np/numpy_tests/core/test_numerictypes.py::TestScalarTypeNames::test_names_are_undersood_by_dtype_t4, test/torch_np/numpy_tests/core/test_numerictypes.py::TestScalarTypeNames::test_names_are_undersood_by_dtype_t5, test/torch_np/numpy_tests/core/test_numerictypes.py::TestScalarTypeNames::test_names_are_undersood_by_dtype_t6, test/torch_np/numpy_tests/core/test_numerictypes.py::TestScalarTypeNames::test_names_are_undersood_by_dtype_t7, test/torch_np/numpy_tests/core/test_numerictypes.py::TestScalarTypeNames::test_names_are_undersood_by_dtype_t8, test/torch_np/numpy_tests/core/test_numerictypes.py::TestScalarTypeNames::test_names_are_undersood_by_dtype_t9, test/torch_np/numpy_tests/core/test_numerictypes.py::TestScalarTypeNames::test_names_are_unique, test/torch_np/numpy_tests/core/test_numerictypes.py::TestScalarTypeNames::test_names_reflect_attributes_t0, test/torch_np/numpy_tests/core/test_numerictypes.py::TestScalarTypeNames::test_names_reflect_attributes_t1, test/torch_np/numpy_tests/core/test_numerictypes.py::TestScalarTypeNames::test_names_reflect_attributes_t2, test/torch_np/numpy_tests/core/test_numerictypes.py::TestScalarTypeNames::test_names_reflect_attributes_t3, test/torch_np/numpy_tests/core/test_numerictypes.py::TestScalarTypeNames::test_names_reflect_attributes_t4, test/torch_np/numpy_tests/core/test_numerictypes.py::TestScalarTypeNames::test_names_reflect_attributes_t5, test/torch_np/numpy_tests/core/test_numerictypes.py::TestScalarTypeNames::test_names_reflect_attributes_t6, 
test/torch_np/numpy_tests/core/test_numerictypes.py::TestScalarTypeNames::test_names_reflect_attributes_t7, test/torch_np/numpy_tests/core/test_numerictypes.py::TestScalarTypeNames::test_names_reflect_attributes_t8, test/torch_np/numpy_tests/core/test_numerictypes.py::TestScalarTypeNames::test_names_reflect_attributes_t9 2025-10-10T02:22:56.2336562Z 2025-10-10T02:22:59.5158817Z 2025-10-10T02:22:59.5160063Z test_type_promotion 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_type_promotion_1.1_f7173feb7ff71173_.log 2025-10-10T02:22:59.5309786Z Running 423 items in this shard: test/test_type_promotion.py::TestTypePromotionCUDA::test_add_wrapped_cuda, test/test_type_promotion.py::TestTypePromotionCUDA::test_alpha_mismatch_cuda, test/test_type_promotion.py::TestTypePromotionCUDA::test_alternate_result_cuda, test/test_type_promotion.py::TestTypePromotionCUDA::test_bfloat16_cuda, test/test_type_promotion.py::TestTypePromotionCUDA::test_booleans_cuda, test/test_type_promotion.py::TestTypePromotionCUDA::test_can_cast_cuda, test/test_type_promotion.py::TestTypePromotionCUDA::test_cat_different_dtypes_cuda, test/test_type_promotion.py::TestTypePromotionCUDA::test_cat_out_different_dtypes_cuda, test/test_type_promotion.py::TestTypePromotionCUDA::test_clamp_type_promotion_cuda_bool_bool_bool, test/test_type_promotion.py::TestTypePromotionCUDA::test_clamp_type_promotion_cuda_bool_bool_float32, test/test_type_promotion.py::TestTypePromotionCUDA::test_clamp_type_promotion_cuda_bool_bool_float64, test/test_type_promotion.py::TestTypePromotionCUDA::test_clamp_type_promotion_cuda_bool_bool_int32, test/test_type_promotion.py::TestTypePromotionCUDA::test_clamp_type_promotion_cuda_bool_float32_bool, test/test_type_promotion.py::TestTypePromotionCUDA::test_clamp_type_promotion_cuda_bool_float32_float32, test/test_type_promotion.py::TestTypePromotionCUDA::test_clamp_type_promotion_cuda_bool_float32_float64, 
test/test_type_promotion.py::TestTypePromotionCUDA::test_clamp_type_promotion_cuda_bool_float32_int32, test/test_type_promotion.py::TestTypePromotionCUDA::test_clamp_type_promotion_cuda_bool_float64_bool, test/test_type_promotion.py::TestTypePromotionCUDA::test_clamp_type_promotion_cuda_bool_float64_float32, test/test_type_promotion.py::TestTypePromotionCUDA::test_clamp_type_promotion_cuda_bool_float64_float64, test/test_type_promotion.py::TestTypePromotionCUDA::test_clamp_type_promotion_cuda_bool_float64_int32, test/test_type_promotion.py::TestTypePromotionCUDA::test_clamp_type_promotion_cuda_bool_int32_bool, test/test_type_promotion.py::TestTypePromotionCUDA::test_clamp_type_promotion_cuda_bool_int32_float32, test/test_type_promotion.py::TestTypePromotionCUDA::test_clamp_type_promotion_cuda_bool_int32_float64, test/test_type_promotion.py::TestTypePromotionCUDA::test_clamp_type_promotion_cuda_bool_int32_int32, test/test_type_promotion.py::TestTypePromotionCUDA::test_clamp_type_promotion_cuda_float32_bool_bool, test/test_type_promotion.py::TestTypePromotionCUDA::test_clamp_type_promotion_cuda_float32_bool_float32, test/test_type_promotion.py::TestTypePromotionCUDA::test_clamp_type_promotion_cuda_float32_bool_float64, test/test_type_promotion.py::TestTypePromotionCUDA::test_clamp_type_promotion_cuda_float32_bool_int32, test/test_type_promotion.py::TestTypePromotionCUDA::test_clamp_type_promotion_cuda_float32_float32_bool, test/test_type_promotion.py::TestTypePromotionCUDA::test_clamp_type_promotion_cuda_float32_float32_float32, test/test_type_promotion.py::TestTypePromotionCUDA::test_clamp_type_promotion_cuda_float32_float32_float64, test/test_type_promotion.py::TestTypePromotionCUDA::test_clamp_type_promotion_cuda_float32_float32_int32, test/test_type_promotion.py::TestTypePromotionCUDA::test_clamp_type_promotion_cuda_float32_float64_bool, test/test_type_promotion.py::TestTypePromotionCUDA::test_clamp_type_promotion_cuda_float32_float64_float32, 
test/test_type_promotion.py::TestTypePromotionCUDA::test_clamp_type_promotion_cuda_float32_float64_float64, test/test_type_promotion.py::TestTypePromotionCUDA::test_clamp_type_promotion_cuda_float32_float64_int32, test/test_type_promotion.py::TestTypePromotionCUDA::test_clamp_type_promotion_cuda_float32_int32_bool, test/test_type_promotion.py::TestTypePromotionCUDA::test_clamp_type_promotion_cuda_float32_int32_float32, test/test_type_promotion.py::TestTypePromotionCUDA::test_clamp_type_promotion_cuda_float32_int32_float64, test/test_type_promotion.py::TestTypePromotionCUDA::test_clamp_type_promotion_cuda_float32_int32_int32, test/test_type_promotion.py::TestTypePromotionCUDA::test_clamp_type_promotion_cuda_float64_bool_bool, test/test_type_promotion.py::TestTypePromotionCUDA::test_clamp_type_promotion_cuda_float64_bool_float32, test/test_type_promotion.py::TestTypePromotionCUDA::test_clamp_type_promotion_cuda_float64_bool_float64, test/test_type_promotion.py::TestTypePromotionCUDA::test_clamp_type_promotion_cuda_float64_bool_int32, test/test_type_promotion.py::TestTypePromotionCUDA::test_clamp_type_promotion_cuda_float64_float32_bool, test/test_type_promotion.py::TestTypePromotionCUDA::test_clamp_type_promotion_cuda_float64_float32_float32, test/test_type_promotion.py::TestTypePromotionCUDA::test_clamp_type_promotion_cuda_float64_float32_float64, test/test_type_promotion.py::TestTypePromotionCUDA::test_clamp_type_promotion_cuda_float64_float32_int32, test/test_type_promotion.py::TestTypePromotionCUDA::test_clamp_type_promotion_cuda_float64_float64_bool, test/test_type_promotion.py::TestTypePromotionCUDA::test_clamp_type_promotion_cuda_float64_float64_float32, test/test_type_promotion.py::TestTypePromotionCUDA::test_clamp_type_promotion_cuda_float64_float64_float64, test/test_type_promotion.py::TestTypePromotionCUDA::test_clamp_type_promotion_cuda_float64_float64_int32, 
test/test_type_promotion.py::TestTypePromotionCUDA::test_clamp_type_promotion_cuda_float64_int32_bool, test/test_type_promotion.py::TestTypePromotionCUDA::test_clamp_type_promotion_cuda_float64_int32_float32, test/test_type_promotion.py::TestTypePromotionCUDA::test_clamp_type_promotion_cuda_float64_int32_float64, test/test_type_promotion.py::TestTypePromotionCUDA::test_clamp_type_promotion_cuda_float64_int32_int32, test/test_type_promotion.py::TestTypePromotionCUDA::test_clamp_type_promotion_cuda_int32_bool_bool, test/test_type_promotion.py::TestTypePromotionCUDA::test_clamp_type_promotion_cuda_int32_bool_float32, test/test_type_promotion.py::TestTypePromotionCUDA::test_clamp_type_promotion_cuda_int32_bool_float64, test/test_type_promotion.py::TestTypePromotionCUDA::test_clamp_type_promotion_cuda_int32_bool_int32, test/test_type_promotion.py::TestTypePromotionCUDA::test_clamp_type_promotion_cuda_int32_float32_bool, test/test_type_promotion.py::TestTypePromotionCUDA::test_clamp_type_promotion_cuda_int32_float32_float32, test/test_type_promotion.py::TestTypePromotionCUDA::test_clamp_type_promotion_cuda_int32_float32_float64, test/test_type_promotion.py::TestTypePromotionCUDA::test_clamp_type_promotion_cuda_int32_float32_int32, test/test_type_promotion.py::TestTypePromotionCUDA::test_clamp_type_promotion_cuda_int32_float64_bool, test/test_type_promotion.py::TestTypePromotionCUDA::test_clamp_type_promotion_cuda_int32_float64_float32, test/test_type_promotion.py::TestTypePromotionCUDA::test_clamp_type_promotion_cuda_int32_float64_float64, test/test_type_promotion.py::TestTypePromotionCUDA::test_clamp_type_promotion_cuda_int32_float64_int32, test/test_type_promotion.py::TestTypePromotionCUDA::test_clamp_type_promotion_cuda_int32_int32_bool, test/test_type_promotion.py::TestTypePromotionCUDA::test_clamp_type_promotion_cuda_int32_int32_float32, test/test_type_promotion.py::TestTypePromotionCUDA::test_clamp_type_promotion_cuda_int32_int32_float64, 
test/test_type_promotion.py::TestTypePromotionCUDA::test_clamp_type_promotion_cuda_int32_int32_int32, test/test_type_promotion.py::TestTypePromotionCUDA::test_comparison_ops_with_type_promotion_cuda, test/test_type_promotion.py::TestTypePromotionCUDA::test_complex_assertraises_cuda, test/test_type_promotion.py::TestTypePromotionCUDA::test_complex_half_cuda, test/test_type_promotion.py::TestTypePromotionCUDA::test_complex_promotion_cuda, test/test_type_promotion.py::TestTypePromotionCUDA::test_complex_scalar_mult_tensor_promotion_cuda, test/test_type_promotion.py::TestTypePromotionCUDA::test_computation_ignores_out_cuda, test/test_type_promotion.py::TestTypePromotionCUDA::test_create_bool_tensors_cuda, test/test_type_promotion.py::TestTypePromotionCUDA::test_div_promotion_cuda_bool, test/test_type_promotion.py::TestTypePromotionCUDA::test_div_promotion_cuda_int16, test/test_type_promotion.py::TestTypePromotionCUDA::test_div_promotion_cuda_int32, test/test_type_promotion.py::TestTypePromotionCUDA::test_div_promotion_cuda_int64, test/test_type_promotion.py::TestTypePromotionCUDA::test_div_promotion_cuda_int8, test/test_type_promotion.py::TestTypePromotionCUDA::test_div_promotion_cuda_uint8, test/test_type_promotion.py::TestTypePromotionCUDA::test_div_promotion_inplace_cuda_bool, test/test_type_promotion.py::TestTypePromotionCUDA::test_div_promotion_inplace_cuda_float32, test/test_type_promotion.py::TestTypePromotionCUDA::test_div_promotion_inplace_cuda_float64, test/test_type_promotion.py::TestTypePromotionCUDA::test_div_promotion_inplace_cuda_int16, test/test_type_promotion.py::TestTypePromotionCUDA::test_div_promotion_inplace_cuda_int32, test/test_type_promotion.py::TestTypePromotionCUDA::test_div_promotion_inplace_cuda_int64, test/test_type_promotion.py::TestTypePromotionCUDA::test_div_promotion_inplace_cuda_int8, test/test_type_promotion.py::TestTypePromotionCUDA::test_div_promotion_inplace_cuda_uint8, 
test/test_type_promotion.py::TestTypePromotionCUDA::test_div_promotion_out_cuda_bool, test/test_type_promotion.py::TestTypePromotionCUDA::test_div_promotion_out_cuda_float32, test/test_type_promotion.py::TestTypePromotionCUDA::test_div_promotion_out_cuda_float64, test/test_type_promotion.py::TestTypePromotionCUDA::test_div_promotion_out_cuda_int16, test/test_type_promotion.py::TestTypePromotionCUDA::test_div_promotion_out_cuda_int32, test/test_type_promotion.py::TestTypePromotionCUDA::test_div_promotion_out_cuda_int64, test/test_type_promotion.py::TestTypePromotionCUDA::test_div_promotion_out_cuda_int8, test/test_type_promotion.py::TestTypePromotionCUDA::test_div_promotion_out_cuda_uint8, test/test_type_promotion.py::TestTypePromotionCUDA::test_float_promotion_cuda, test/test_type_promotion.py::TestTypePromotionCUDA::test_from_issue_cuda, test/test_type_promotion.py::TestTypePromotionCUDA::test_half_cuda, test/test_type_promotion.py::TestTypePromotionCUDA::test_indexing_cuda, test/test_type_promotion.py::TestTypePromotionCUDA::test_indexing_fail_cuda, test/test_type_promotion.py::TestTypePromotionCUDA::test_inplace_cuda, test/test_type_promotion.py::TestTypePromotionCUDA::test_int_promotion_cuda, test/test_type_promotion.py::TestTypePromotionCUDA::test_int_to_float_cuda, test/test_type_promotion.py::TestTypePromotionCUDA::test_integer_addcdiv_deprecated_cuda_int16, test/test_type_promotion.py::TestTypePromotionCUDA::test_integer_addcdiv_deprecated_cuda_int32, test/test_type_promotion.py::TestTypePromotionCUDA::test_integer_addcdiv_deprecated_cuda_int64, test/test_type_promotion.py::TestTypePromotionCUDA::test_integer_addcdiv_deprecated_cuda_int8, test/test_type_promotion.py::TestTypePromotionCUDA::test_integer_addcdiv_deprecated_cuda_uint8, test/test_type_promotion.py::TestTypePromotionCUDA::test_lt_with_type_promotion_cuda, test/test_type_promotion.py::TestTypePromotionCUDA::test_many_promotions_cuda, 
test/test_type_promotion.py::TestTypePromotionCUDA::test_mixed_type_backward_cuda, test/test_type_promotion.py::TestTypePromotionCUDA::test_non_promoting_ops_cuda, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_bool_bool, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_bool_complex128, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_bool_complex64, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_bool_float16, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_bool_float32, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_bool_float64, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_bool_int16, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_bool_int32, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_bool_int64, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_bool_int8, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_bool_uint8, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_complex128_bool, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_complex128_complex128, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_complex128_complex64, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_complex128_float16, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_complex128_float32, 
test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_complex128_float64, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_complex128_int16, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_complex128_int32, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_complex128_int64, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_complex128_int8, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_complex128_uint8, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_complex64_bool, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_complex64_complex128, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_complex64_complex64, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_complex64_float16, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_complex64_float32, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_complex64_float64, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_complex64_int16, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_complex64_int32, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_complex64_int64, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_complex64_int8, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_complex64_uint8, 
test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_float16_bool, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_float16_complex128, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_float16_complex64, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_float16_float16, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_float16_float32, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_float16_float64, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_float16_int16, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_float16_int32, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_float16_int64, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_float16_int8, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_float16_uint8, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_float32_bool, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_float32_complex128, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_float32_complex64, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_float32_float16, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_float32_float32, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_float32_float64, 
test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_float32_int16, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_float32_int32, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_float32_int64, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_float32_int8, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_float32_uint8, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_float64_bool, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_float64_complex128, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_float64_complex64, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_float64_float16, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_float64_float32, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_float64_float64, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_float64_int16, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_float64_int32, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_float64_int64, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_float64_int8, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_float64_uint8, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_int16_bool, 
test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_int16_complex128, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_int16_complex64, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_int16_float16, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_int16_float32, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_int16_float64, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_int16_int16, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_int16_int32, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_int16_int64, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_int16_int8, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_int16_uint8, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_int32_bool, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_int32_complex128, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_int32_complex64, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_int32_float16, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_int32_float32, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_int32_float64, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_int32_int16, 
test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_int32_int32, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_int32_int64, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_int32_int8, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_int32_uint8, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_int64_bool, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_int64_complex128, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_int64_complex64, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_int64_float16, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_int64_float32, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_int64_float64, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_int64_int16, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_int64_int32, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_int64_int64, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_int64_int8, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_int64_uint8, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_int8_bool, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_int8_complex128, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_int8_complex64, 
test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_int8_float16, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_int8_float32, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_int8_float64, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_int8_int16, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_int8_int32, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_int8_int64, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_int8_int8, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_int8_uint8, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_uint8_bool, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_uint8_complex128, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_uint8_complex64, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_uint8_float16, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_uint8_float32, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_uint8_float64, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_uint8_int16, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_uint8_int32, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_uint8_int64, test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_uint8_int8, 
test/test_type_promotion.py::TestTypePromotionCUDA::test_numpy_array_binary_ufunc_promotion_cuda_uint8_uint8, test/test_type_promotion.py::TestTypePromotionCUDA::test_promote_self_cuda, test/test_type_promotion.py::TestTypePromotionCUDA::test_promote_types_cuda, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_bfloat16_bfloat16, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_bfloat16_bool, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_bfloat16_complex128, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_bfloat16_complex64, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_bfloat16_float16, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_bfloat16_float32, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_bfloat16_float64, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_bfloat16_int16, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_bfloat16_int32, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_bfloat16_int64, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_bfloat16_int8, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_bfloat16_uint8, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_bool_bfloat16, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_bool_bool, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_bool_complex128, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_bool_complex64, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_bool_float16, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_bool_float32, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_bool_float64, 
test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_bool_int16, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_bool_int32, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_bool_int64, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_bool_int8, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_bool_uint8, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_complex128_bfloat16, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_complex128_bool, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_complex128_complex128, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_complex128_complex64, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_complex128_float16, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_complex128_float32, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_complex128_float64, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_complex128_int16, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_complex128_int32, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_complex128_int64, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_complex128_int8, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_complex128_uint8, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_complex64_bfloat16, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_complex64_bool, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_complex64_complex128, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_complex64_complex64, 
test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_complex64_float16, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_complex64_float32, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_complex64_float64, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_complex64_int16, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_complex64_int32, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_complex64_int64, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_complex64_int8, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_complex64_uint8, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_float16_bfloat16, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_float16_bool, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_float16_complex128, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_float16_complex64, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_float16_float16, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_float16_float32, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_float16_float64, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_float16_int16, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_float16_int32, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_float16_int64, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_float16_int8, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_float16_uint8, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_float32_bfloat16, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_float32_bool, 
test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_float32_complex128, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_float32_complex64, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_float32_float16, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_float32_float32, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_float32_float64, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_float32_int16, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_float32_int32, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_float32_int64, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_float32_int8, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_float32_uint8, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_float64_bfloat16, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_float64_bool, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_float64_complex128, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_float64_complex64, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_float64_float16, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_float64_float32, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_float64_float64, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_float64_int16, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_float64_int32, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_float64_int64, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_float64_int8, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_float64_uint8, 
test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_int16_bfloat16, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_int16_bool, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_int16_complex128, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_int16_complex64, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_int16_float16, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_int16_float32, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_int16_float64, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_int16_int16, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_int16_int32, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_int16_int64, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_int16_int8, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_int16_uint8, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_int32_bfloat16, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_int32_bool, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_int32_complex128, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_int32_complex64, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_int32_float16, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_int32_float32, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_int32_float64, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_int32_int16, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_int32_int32, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_int32_int64, 
test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_int32_int8, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_int32_uint8, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_int64_bfloat16, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_int64_bool, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_int64_complex128, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_int64_complex64, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_int64_float16, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_int64_float32, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_int64_float64, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_int64_int16, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_int64_int32, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_int64_int64, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_int64_int8, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_int64_uint8, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_int8_bfloat16, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_int8_bool, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_int8_complex128, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_int8_complex64, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_int8_float16, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_int8_float32, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_int8_float64, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_int8_int16, 
test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_int8_int32, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_int8_int64, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_int8_int8, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_int8_uint8, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_uint8_bfloat16, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_uint8_bool, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_uint8_complex128, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_uint8_complex64, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_uint8_float16, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_uint8_float32, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_uint8_float64, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_uint8_int16, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_uint8_int32, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_uint8_int64, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_uint8_int8, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_cuda_uint8_uint8, test/test_type_promotion.py::TestTypePromotionCUDA::test_result_type_tensor_vs_scalar_cuda, test/test_type_promotion.py::TestTypePromotionCUDA::test_sparse_add_cuda, test/test_type_promotion.py::TestTypePromotionCUDA::test_sparse_div_cuda, test/test_type_promotion.py::TestTypePromotionCUDA::test_sparse_div_promotion_cuda_bool, test/test_type_promotion.py::TestTypePromotionCUDA::test_sparse_div_promotion_cuda_int16, test/test_type_promotion.py::TestTypePromotionCUDA::test_sparse_div_promotion_cuda_int32, 
test/test_type_promotion.py::TestTypePromotionCUDA::test_sparse_div_promotion_cuda_int64, test/test_type_promotion.py::TestTypePromotionCUDA::test_sparse_div_promotion_cuda_uint8, test/test_type_promotion.py::TestTypePromotionCUDA::test_sparse_mul_cuda, test/test_type_promotion.py::TestTypePromotionCUDA::test_sparse_sub_cuda, test/test_type_promotion.py::TestTypePromotionCUDA::test_ternary_out_promotion_cuda, test/test_type_promotion.py::TestTypePromotionCUDA::test_transpose_cuda, test/test_type_promotion.py::TestTypePromotionCUDA::test_unary_op_out_casting_cuda_complex128_complex128, test/test_type_promotion.py::TestTypePromotionCUDA::test_unary_op_out_casting_cuda_complex128_complex64, test/test_type_promotion.py::TestTypePromotionCUDA::test_unary_op_out_casting_cuda_complex128_float32, test/test_type_promotion.py::TestTypePromotionCUDA::test_unary_op_out_casting_cuda_complex128_float64, test/test_type_promotion.py::TestTypePromotionCUDA::test_unary_op_out_casting_cuda_complex128_int64, test/test_type_promotion.py::TestTypePromotionCUDA::test_unary_op_out_casting_cuda_complex64_complex128, test/test_type_promotion.py::TestTypePromotionCUDA::test_unary_op_out_casting_cuda_complex64_complex64, test/test_type_promotion.py::TestTypePromotionCUDA::test_unary_op_out_casting_cuda_complex64_float32, test/test_type_promotion.py::TestTypePromotionCUDA::test_unary_op_out_casting_cuda_complex64_float64, test/test_type_promotion.py::TestTypePromotionCUDA::test_unary_op_out_casting_cuda_complex64_int64, test/test_type_promotion.py::TestTypePromotionCUDA::test_unary_op_out_casting_cuda_float32_complex128, test/test_type_promotion.py::TestTypePromotionCUDA::test_unary_op_out_casting_cuda_float32_complex64, test/test_type_promotion.py::TestTypePromotionCUDA::test_unary_op_out_casting_cuda_float32_float32, test/test_type_promotion.py::TestTypePromotionCUDA::test_unary_op_out_casting_cuda_float32_float64, 
test/test_type_promotion.py::TestTypePromotionCUDA::test_unary_op_out_casting_cuda_float32_int64, test/test_type_promotion.py::TestTypePromotionCUDA::test_unary_op_out_casting_cuda_float64_complex128, test/test_type_promotion.py::TestTypePromotionCUDA::test_unary_op_out_casting_cuda_float64_complex64, test/test_type_promotion.py::TestTypePromotionCUDA::test_unary_op_out_casting_cuda_float64_float32, test/test_type_promotion.py::TestTypePromotionCUDA::test_unary_op_out_casting_cuda_float64_float64, test/test_type_promotion.py::TestTypePromotionCUDA::test_unary_op_out_casting_cuda_float64_int64, test/test_type_promotion.py::TestTypePromotionCUDA::test_unary_op_out_casting_cuda_int64_complex128, test/test_type_promotion.py::TestTypePromotionCUDA::test_unary_op_out_casting_cuda_int64_complex64, test/test_type_promotion.py::TestTypePromotionCUDA::test_unary_op_out_casting_cuda_int64_float32, test/test_type_promotion.py::TestTypePromotionCUDA::test_unary_op_out_casting_cuda_int64_float64, test/test_type_promotion.py::TestTypePromotionCUDA::test_unary_op_out_casting_cuda_int64_int64, test/test_type_promotion.py::TestTypePromotionCUDA::test_unsigned_cuda 2025-10-10T02:22:59.5453522Z 2025-10-10T02:23:00.0899132Z Running torch_np/numpy_tests/core/test_scalar_methods 1/1 ... [2025-10-10 02:23:00.089174] 2025-10-10T02:23:00.0899848Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:23:00.0901233Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'torch_np/numpy_tests/core/test_scalar_methods.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:23:00.089561] 2025-10-10T02:23:03.4044505Z Running torch_np/numpy_tests/fft/test_helper 1/1 ... 
[2025-10-10 02:23:03.403914] 2025-10-10T02:23:03.4045018Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:23:03.4048910Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'torch_np/numpy_tests/fft/test_helper.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:23:03.404362] 2025-10-10T02:23:03.9131749Z 2025-10-10T02:23:03.9138141Z torch_np/numpy_tests/core/test_scalar_methods 1/1 was successful, full logs can be found in artifacts with path test/test-reports/torch_np.numpy_tests.core.test_scalar_methods_1.1_13b953159f4dde16_.log 2025-10-10T02:23:03.9190617Z Running 77 items in this shard: test/torch_np/numpy_tests/core/test_scalar_methods.py::TestAsIntegerRatio::test_against_known_values, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestAsIntegerRatio::test_errors_ftype0, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestAsIntegerRatio::test_errors_ftype1, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestAsIntegerRatio::test_errors_ftype2, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestAsIntegerRatio::test_roundtrip_ftype0_frac_vals0_exp_vals0, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestAsIntegerRatio::test_roundtrip_ftype1_frac_vals1_exp_vals1, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestAsIntegerRatio::test_roundtrip_ftype2_frac_vals2_exp_vals2, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestAsIntegerRatio::test_simple_fractions_ftype0, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestAsIntegerRatio::test_simple_fractions_ftype1, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestAsIntegerRatio::test_simple_fractions_ftype2, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestAsIntegerRatio::test_small_ftype0_f_-0_875_ratio1, 
test/torch_np/numpy_tests/core/test_scalar_methods.py::TestAsIntegerRatio::test_small_ftype0_f_0_0_ratio2, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestAsIntegerRatio::test_small_ftype0_f_0_875_ratio0, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestAsIntegerRatio::test_small_ftype0_f_11_5_ratio3, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestAsIntegerRatio::test_small_ftype1_f_-0_875_ratio1, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestAsIntegerRatio::test_small_ftype1_f_0_0_ratio2, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestAsIntegerRatio::test_small_ftype1_f_0_875_ratio0, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestAsIntegerRatio::test_small_ftype1_f_11_5_ratio3, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestAsIntegerRatio::test_small_ftype2_f_-0_875_ratio1, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestAsIntegerRatio::test_small_ftype2_f_0_0_ratio2, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestAsIntegerRatio::test_small_ftype2_f_0_875_ratio0, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestAsIntegerRatio::test_small_ftype2_f_11_5_ratio3, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestIsInteger::test_false_code_b, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestIsInteger::test_false_code_h, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestIsInteger::test_false_code_i, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestIsInteger::test_false_code_l, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestIsInteger::test_special_str_value_inf_code_d, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestIsInteger::test_special_str_value_inf_code_e, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestIsInteger::test_special_str_value_inf_code_f, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestIsInteger::test_special_str_value_nan_code_d, 
test/torch_np/numpy_tests/core/test_scalar_methods.py::TestIsInteger::test_special_str_value_nan_code_e, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestIsInteger::test_special_str_value_nan_code_f, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestIsInteger::test_true_code_B, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestIsInteger::test_true_code_b, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestIsInteger::test_true_code_d, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestIsInteger::test_true_code_e, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestIsInteger::test_true_code_f, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestIsInteger::test_true_code_h, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestIsInteger::test_true_code_i, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestIsInteger::test_true_code_l, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestClassGetItem::test_abc_cls0, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestClassGetItem::test_abc_cls1, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestClassGetItem::test_abc_cls2, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestClassGetItem::test_abc_cls3, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestClassGetItem::test_abc_cls4, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestClassGetItem::test_abc_cls5, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestClassGetItem::test_abc_complexfloating, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestClassGetItem::test_abc_complexfloating_subscript_tuple_arg_len_0, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestClassGetItem::test_abc_complexfloating_subscript_tuple_arg_len_1, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestClassGetItem::test_abc_complexfloating_subscript_tuple_arg_len_2, 
test/torch_np/numpy_tests/core/test_scalar_methods.py::TestClassGetItem::test_abc_complexfloating_subscript_tuple_arg_len_3, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestClassGetItem::test_abc_non_numeric_cls0, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestClassGetItem::test_concrete_code_?, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestClassGetItem::test_concrete_code_B, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestClassGetItem::test_concrete_code_D, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestClassGetItem::test_concrete_code_F, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestClassGetItem::test_concrete_code_b, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestClassGetItem::test_concrete_code_d, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestClassGetItem::test_concrete_code_e, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestClassGetItem::test_concrete_code_f, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestClassGetItem::test_concrete_code_h, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestClassGetItem::test_concrete_code_i, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestClassGetItem::test_concrete_code_l, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestClassGetItem::test_subscript_scalar, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestClassGetItem::test_subscript_tuple_arg_len_0, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestClassGetItem::test_subscript_tuple_arg_len_1, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestClassGetItem::test_subscript_tuple_arg_len_2, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestClassGetItem::test_subscript_tuple_arg_len_3, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestBitCount::test_bit_count, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestBitCount::test_small_itype0, 
test/torch_np/numpy_tests/core/test_scalar_methods.py::TestBitCount::test_small_itype1, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestBitCount::test_small_itype2, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestBitCount::test_small_itype3, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestBitCount::test_small_itype4, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestBitCount::test_small_itype5, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestBitCount::test_small_itype6, test/torch_np/numpy_tests/core/test_scalar_methods.py::TestBitCount::test_small_itype7 2025-10-10T02:23:03.9235091Z 2025-10-10T02:23:07.4779739Z 2025-10-10T02:23:07.4780863Z torch_np/numpy_tests/fft/test_helper 1/1 was successful, full logs can be found in artifacts with path test/test-reports/torch_np.numpy_tests.fft.test_helper_1.1_4d788950329a8b0e_.log 2025-10-10T02:23:07.4784867Z Running 8 items in this shard: test/torch_np/numpy_tests/fft/test_helper.py::TestFFTShift::test_axes_keyword, test/torch_np/numpy_tests/fft/test_helper.py::TestFFTShift::test_definition, test/torch_np/numpy_tests/fft/test_helper.py::TestFFTShift::test_equal_to_original, test/torch_np/numpy_tests/fft/test_helper.py::TestFFTShift::test_inverse, test/torch_np/numpy_tests/fft/test_helper.py::TestFFTShift::test_uneven_dims, test/torch_np/numpy_tests/fft/test_helper.py::TestFFTFreq::test_definition, test/torch_np/numpy_tests/fft/test_helper.py::TestRFFTFreq::test_definition, test/torch_np/numpy_tests/fft/test_helper.py::TestIRFFTN::test_not_last_axis_success 2025-10-10T02:23:07.4787524Z 2025-10-10T02:23:07.8438605Z Running torch_np/test_function_base 1/1 ... 
[2025-10-10 02:23:07.843243] 2025-10-10T02:23:07.8439231Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:23:07.8441250Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'torch_np/test_function_base.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:23:07.843685] 2025-10-10T02:23:11.3072423Z Running profiler/test_profiler_tree 1/1 ... [2025-10-10 02:23:11.306743] 2025-10-10T02:23:11.3073050Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:23:11.3075686Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'profiler/test_profiler_tree.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:23:11.307133] 2025-10-10T02:23:11.7667615Z 2025-10-10T02:23:11.7668832Z torch_np/test_function_base 1/1 was successful, full logs can be found in artifacts with path test/test-reports/torch_np.test_function_base_1.1_2939ee6ae0a89997_.log 2025-10-10T02:23:11.7669788Z Running 1 items in this shard: test/torch_np/test_function_base.py::TestAppend::test_basic 2025-10-10T02:23:11.7670162Z 2025-10-10T02:23:15.3807871Z 2025-10-10T02:23:15.3808928Z profiler/test_profiler_tree 1/1 was successful, full logs can be found in artifacts with path test/test-reports/profiler.test_profiler_tree_1.1_df669bd43050df5d_.log 2025-10-10T02:23:15.3814219Z Running 10 items in this shard: test/profiler/test_profiler_tree.py::TestProfilerTree::test_profiler_experimental_tree, test/profiler/test_profiler_tree.py::TestProfilerTree::test_profiler_experimental_tree_cuda, test/profiler/test_profiler_tree.py::TestProfilerTree::test_profiler_experimental_tree_cuda_detailed, test/profiler/test_profiler_tree.py::TestProfilerTree::test_profiler_experimental_tree_cuda_with_stream, 
test/profiler/test_profiler_tree.py::TestProfilerTree::test_profiler_experimental_tree_with_memory, test/profiler/test_profiler_tree.py::TestProfilerTree::test_profiler_experimental_tree_with_memory_and_stack, test/profiler/test_profiler_tree.py::TestProfilerTree::test_profiler_experimental_tree_with_record_function, test/profiler/test_profiler_tree.py::TestProfilerTree::test_profiler_experimental_tree_with_stack_and_modules, test/profiler/test_profiler_tree.py::TestProfilerTree::test_profiler_experimental_tree_with_stack_and_torch_dispatch, test/profiler/test_profiler_tree.py::TestProfilerTree::test_profiler_experimental_tree_with_stack_and_torch_function 2025-10-10T02:23:15.3818421Z 2025-10-10T02:23:15.6138847Z Running functorch/test_eager_transforms 1/1 ... [2025-10-10 02:23:15.613300] 2025-10-10T02:23:15.6139473Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:23:15.6143024Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'functorch/test_eager_transforms.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-10-10 02:23:15.613728] 2025-10-10T02:23:19.1977960Z Running test_sparse 1/1 ... [2025-10-10 02:23:19.197109] 2025-10-10T02:23:19.1978505Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-10-10T02:23:19.1979734Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_sparse.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-10-10 02:23:19.197514] 2025-10-10T02:23:21.6408483Z 2025-10-10T02:23:21.6409962Z functorch/test_eager_transforms 1/1 was successful, full logs can be found in artifacts with path test/test-reports/functorch.test_eager_transforms_1.1_979ad6ef0ef55061_.log 2025-10-10T02:23:21.6642351Z Running 355 items in this shard: test/functorch/test_eager_transforms.py::TestSliceArgnums::test_argnums_reorders, test/functorch/test_eager_transforms.py::TestSliceArgnums::test_duplicate_argnums, test/functorch/test_eager_transforms.py::TestSliceArgnums::test_flat_args_with_negative_int_argnum, test/functorch/test_eager_transforms.py::TestSliceArgnums::test_flat_args_with_positive_int_argnum, test/functorch/test_eager_transforms.py::TestSliceArgnums::test_flat_args_with_tuple_argnum, test/functorch/test_eager_transforms.py::TestSliceArgnums::test_invalid_argnum_type, test/functorch/test_eager_transforms.py::TestSliceArgnums::test_not_enough_argnums, test/functorch/test_eager_transforms.py::TestSliceArgnums::test_out_of_bounds_argnum_values, test/functorch/test_eager_transforms.py::TestSliceArgnums::test_pytree_args, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_buffer_tying, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_combine_state_for_ensemble_error, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_combine_state_for_ensemble_smoke, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_correctness_mnist_mechanism_functional_call, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_correctness_mnist_mechanism_make_functional, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_disable_autograd_tracking_disable_autograd_tracking_False, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_disable_autograd_tracking_disable_autograd_tracking_True, 
test/functorch/test_eager_transforms.py::TestMakeFunctional::test_make_functional_state_correctly_returned_after_forward_mechanism_functional_call, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_make_functional_state_correctly_returned_after_forward_mechanism_make_functional, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_parameter_tying, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_parameter_tying_ensemble, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_parameter_tying_grad, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_stack_module_state_error, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_stack_module_state_leaf, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_stack_module_state_mismatch_error, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_stack_module_state_smoke, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_using_detach_functional_call_detach_params_False, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_using_detach_functional_call_detach_params_True, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_with_buffers_disable_autograd_tracking_disable_autograd_tracking_False, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_with_buffers_disable_autograd_tracking_disable_autograd_tracking_True, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_advanced_indexing_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_argnums_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_composed_with_autograd_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_composite_complicated_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_composite_simple_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_composite_two_ops_cuda, 
test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_conj_bit_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_dtype_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_escaped_wrappers_are_ignored_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_escaped_wrappers_are_marked_as_dead_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_fn_with_kwargs_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_functional_init_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_functional_init_with_buffers_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_grad_aux_pytree_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_grad_aux_tensor_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_grad_of_vjp_composition_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_grad_of_vjp_of_grad_composition_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_grad_pytree_inputs_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_inplace_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_inplace_on_captures_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_inplace_on_view_base_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_inplace_on_view_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_invalid_argnums_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_is_cuda_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_manual_seed_inside_grad_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_negative_argnums_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_nesting_simple_cuda, 
test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_no_grad_inside_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_no_grad_mixed_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_no_grad_nested_complicated_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_no_grad_nested_simple_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_no_grad_outside_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_no_grad_outside_vjp_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_no_grad_outside_vjp_fn_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_no_grad_outside_vjp_only_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_no_grad_value_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_numel_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_out_of_order_argnums_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_primitive_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_print_captured_tensor_inside_transform_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_shape_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_tensor_ctor_inside_grad_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_tensor_print_grad_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_tensor_print_grad_grad_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_tensor_print_vmap_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_tensor_print_vmap_grad_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_tensor_print_vmap_vmap_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_unrelated_grad_cuda, 
test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_unrelated_hessian_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_unrelated_vjp_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_unrelated_vjp_multiple_inputs_outputs_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_view_inplace_simple_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_views_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_vjp_aux_pytree_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_vjp_aux_tensor_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_vjp_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_vjp_of_grad_composition_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_vjp_outputs_can_any_pytree_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_vjp_pytree_error_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_vjp_pytree_input_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_vjp_pytree_output_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_vjp_two_outputs_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_zero_grad_cuda, test/functorch/test_eager_transforms.py::TestVmapOfGradCUDA::test_log_softmax_cuda, test/functorch/test_eager_transforms.py::TestVmapOfGradCUDA::test_new_empty_materializes_tensor_cuda, test/functorch/test_eager_transforms.py::TestVmapOfGradCUDA::test_new_zeros_materializes_tensor_cuda, test/functorch/test_eager_transforms.py::TestVmapOfGradCUDA::test_per_sample_grads_embeddingnet_mechanism_functional_call_cuda, test/functorch/test_eager_transforms.py::TestVmapOfGradCUDA::test_per_sample_grads_embeddingnet_mechanism_make_functional_cuda, 
test/functorch/test_eager_transforms.py::TestVmapOfGradCUDA::test_per_sample_grads_inplace_view_cuda, test/functorch/test_eager_transforms.py::TestVmapOfGradCUDA::test_per_sample_grads_simple_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_against_reference_correctness_different_devices_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_against_reference_correctness_different_devices_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_against_reference_default_arg_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_against_reference_default_arg_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_against_reference_multi_input_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_against_reference_multi_input_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_against_reference_multi_input_multi_output_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_against_reference_multi_input_multi_output_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_against_reference_simple_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_against_reference_simple_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_against_reference_unrelated_outputs_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_against_reference_unrelated_outputs_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_against_reference_zero_dim_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_against_reference_zero_dim_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_argnums_defaults_to_zero_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_argnums_defaults_to_zero_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_argnums_effect_on_return_jacfwd_cuda, 
test/functorch/test_eager_transforms.py::TestJacCUDA::test_argnums_effect_on_return_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_argnums_tuple_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_argnums_tuple_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_aux_pytree_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_aux_pytree_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_aux_tensor_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_aux_tensor_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_chunk_jacrev__preallocate_and_copy_False_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_chunk_jacrev__preallocate_and_copy_True_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_chunk_jacrev_chunksize_one__preallocate_and_copy_False_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_chunk_jacrev_chunksize_one__preallocate_and_copy_True_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_chunk_jacrev_composition__preallocate_and_copy_False_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_chunk_jacrev_composition__preallocate_and_copy_True_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_complex_error_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_diff_numel_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_diff_numel_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_dimensionality_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_dimensionality_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_empty_argnums_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_empty_argnums_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_empty_output_jacfwd_cuda, 
test/functorch/test_eager_transforms.py::TestJacCUDA::test_empty_output_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_float_argnums_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_float_argnums_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_hessian_simple_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_inplace_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_inplace_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_jac_with_non_tensor_args_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_jac_with_non_tensor_args_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_multiple_args_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_multiple_args_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_multiple_inputs_outputs_pytree_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_multiple_inputs_outputs_pytree_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_multiple_inputs_outputs_pytree_multidim_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_multiple_inputs_outputs_pytree_multidim_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_multiple_inputs_pytree_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_multiple_inputs_pytree_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_multiple_outputs_multiple_argnums_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_multiple_outputs_multiple_argnums_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_multiple_outputs_pytree_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_multiple_outputs_pytree_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_multiple_outputs_single_argnums_jacfwd_cuda, 
test/functorch/test_eager_transforms.py::TestJacCUDA::test_multiple_outputs_single_argnums_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_negative_argnums_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_negative_argnums_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_nested_jac_simple_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_nested_jac_simple_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_out_of_bounds_argnums_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_out_of_bounds_argnums_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_outputs_can_any_pytree_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_outputs_can_any_pytree_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_repeated_argnums_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_repeated_argnums_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_simple_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_simple_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_simple_not_flat_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_simple_not_flat_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_take_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_take_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_unrelated_input_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_unrelated_input_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_unrelated_output_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_unrelated_output_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_vmap_on_jac_simple_jacfwd_cuda, 
test/functorch/test_eager_transforms.py::TestJacCUDA::test_vmap_on_jac_simple_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_autograd_function_disables_fwd_grad_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_aux_pytree_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_aux_tensor_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_disable_fwd_grad_inside_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_disable_fwd_grad_mixed_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_disable_fwd_grad_outside_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_inplace_on_captures_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_inputs_are_tuples_of_tensors_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_jvp_inside_autograd_function_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_jvp_new_tensor_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_multiple_inputs_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_multiple_inputs_outputs_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_multiple_outputs_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_nonempty_primals_and_tangents_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_outputs_can_any_pytree_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_primals_tangents_length_mismatch_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_pytree_inputs_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_pytree_inputs_error_cases_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_simple_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_strict_mode_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_unrelated_input_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_unrelated_output_cuda, 
test/functorch/test_eager_transforms.py::TestJvpCUDA::test_zerotensor_vmapjvp_interaction_cuda, test/functorch/test_eager_transforms.py::TestLinearizeCUDA::test_linearize_basic_cuda_float32, test/functorch/test_eager_transforms.py::TestLinearizeCUDA::test_linearize_composition_grad_cuda_float32, test/functorch/test_eager_transforms.py::TestLinearizeCUDA::test_linearize_composition_vmap_cuda_float32, test/functorch/test_eager_transforms.py::TestLinearizeCUDA::test_linearize_errors_cuda, test/functorch/test_eager_transforms.py::TestLinearizeCUDA::test_linearize_nested_input_nested_output_cuda_float32, test/functorch/test_eager_transforms.py::TestLinearizeCUDA::test_linearize_return_cuda_float32, test/functorch/test_eager_transforms.py::TestVmapJvpInplaceViewCUDA::test_all_dual_base_inplace_cuda, test/functorch/test_eager_transforms.py::TestVmapJvpInplaceViewCUDA::test_all_dual_base_view_inplace_cuda, test/functorch/test_eager_transforms.py::TestVmapJvpInplaceViewCUDA::test_all_dual_no_view_cuda, test/functorch/test_eager_transforms.py::TestVmapJvpInplaceViewCUDA::test_right_dual_base_prop_cuda, test/functorch/test_eager_transforms.py::TestVmapJvpInplaceViewCUDA::test_right_dual_view_prop_cuda, test/functorch/test_eager_transforms.py::TestHessianCUDA::test_hessian_vectorize_correctness_multi_input_cuda, test/functorch/test_eager_transforms.py::TestHessianCUDA::test_hessian_vectorize_correctness_simple_cuda, test/functorch/test_eager_transforms.py::TestHessianCUDA::test_hessian_vectorize_correctness_unrelated_outputs_cuda, test/functorch/test_eager_transforms.py::TestHessianCUDA::test_jacfwd_different_levels_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_autograd_function_no_setup_context_transform_functionalize_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_autograd_function_no_setup_context_transform_grad_and_value_cuda, 
test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_autograd_function_no_setup_context_transform_grad_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_autograd_function_no_setup_context_transform_hessian_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_autograd_function_no_setup_context_transform_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_autograd_function_no_setup_context_transform_jacrev_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_autograd_function_no_setup_context_transform_vmap_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_autograd_functional_jacfwd_inside_transform_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_autograd_functional_jacrev_inside_transform_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_autograd_functional_jvp_inside_transform_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_autograd_functional_vjp_inside_transform_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_can_use_functionalize_when_key_is_excluded_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_can_use_grad_when_key_is_excluded_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_can_use_vmap_when_key_is_excluded_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_deprecation_transforms_transform_functionalize_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_deprecation_transforms_transform_grad_and_value_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_deprecation_transforms_transform_grad_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_deprecation_transforms_transform_hessian_cuda, 
test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_deprecation_transforms_transform_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_deprecation_transforms_transform_jacrev_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_deprecation_vmap_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_grad_grad_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_grad_vjp_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_grad_vmap_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_jvp_supports_saved_tensor_hooks_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_make_fx_jacrev_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_make_fx_vjp_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_make_fx_vmap_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_no_warning_on_import_functorch_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_requires_grad_inside_transform_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_retain_grad_inside_transform_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_transforms_dont_support_saved_tensor_hooks_transform_grad_and_value_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_transforms_dont_support_saved_tensor_hooks_transform_grad_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_transforms_dont_support_saved_tensor_hooks_transform_hessian_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_transforms_dont_support_saved_tensor_hooks_transform_jacrev_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_vjp_doesnt_support_saved_tensor_hooks_cuda, 
test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_vjp_grad_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_vjp_vjp_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_vjp_vmap_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_vmap_grad_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_vmap_vjp_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_vmap_vmap_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_ensemble_regression_mechanism_functional_call_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_ensemble_regression_mechanism_make_functional_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_find_learning_rate_ensembling_AlphaDropout_mechanism_functional_call_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_find_learning_rate_ensembling_AlphaDropout_mechanism_make_functional_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_find_learning_rate_ensembling_Dropout_mechanism_functional_call_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_find_learning_rate_ensembling_Dropout_mechanism_make_functional_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_find_learning_rate_ensembling_FeatureAlphaDropout_mechanism_functional_call_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_find_learning_rate_ensembling_FeatureAlphaDropout_mechanism_make_functional_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_lennard_jones_batched_jac_jac_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_lennard_jones_batched_jac_jac_jacrev_cuda, 
test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_maml_omniglot_mechanism_functional_call_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_maml_omniglot_mechanism_make_functional_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_maml_regression_mechanism_functional_call_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_maml_regression_mechanism_make_functional_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_resnet18_per_sample_grads_mechanism_functional_call_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_resnet18_per_sample_grads_mechanism_make_functional_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_update_batch_norm_mechanism_functional_call_originally_track_running_stats_False_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_update_batch_norm_mechanism_functional_call_originally_track_running_stats_True_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_update_batch_norm_mechanism_make_functional_originally_track_running_stats_False_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_update_batch_norm_mechanism_make_functional_originally_track_running_stats_True_cuda, test/functorch/test_eager_transforms.py::TestHigherOrderOperatorInteractionCUDA::test_basic_sum_cuda, test/functorch/test_eager_transforms.py::TestHigherOrderOperatorInteractionCUDA::test_functional_call_multiple_dicts_cuda, test/functorch/test_eager_transforms.py::TestHigherOrderOperatorInteractionCUDA::test_grad_grad_sum_cuda, test/functorch/test_eager_transforms.py::TestHigherOrderOperatorInteractionCUDA::test_grad_name_wrapping_cuda, test/functorch/test_eager_transforms.py::TestHigherOrderOperatorInteractionCUDA::test_grad_sum_cuda, 
test/functorch/test_eager_transforms.py::TestHigherOrderOperatorInteractionCUDA::test_no_grad_inside_grad_cuda, test/functorch/test_eager_transforms.py::TestHigherOrderOperatorInteractionCUDA::test_no_grad_outside_grad_cuda, test/functorch/test_eager_transforms.py::TestHigherOrderOperatorInteractionCUDA::test_vmap_grad_sum_cuda, test/functorch/test_eager_transforms.py::TestHigherOrderOperatorInteractionCUDA::test_vmap_sum_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_functionalize_fake_tensors_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_functionalize_fx_multi_out_op_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_functionalize_fx_out_op_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_functionalize_fx_reapply_views_simple_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_functionalize_fx_simple_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_functionalize_fx_transpose_simple_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_functionalize_grad_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_functionalize_nonfunctional_output_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_functionalize_opt_tensor_list_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_functionalize_optional_tensorlist1_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_functionalize_optional_tensorlist2_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_inplace_view_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_linear_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_multioutput_inplace_slice_view_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_multioutput_view_cuda, 
test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_resize_program_inputs_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_simple_view_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_vmap_functionalize_jvp_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_False_save_for_jvp_save_tensors_input_mark_dirty_False_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_False_save_for_jvp_save_tensors_input_mark_dirty_True_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_False_save_for_jvp_save_tensors_neither_mark_dirty_False_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_False_save_for_jvp_save_tensors_neither_mark_dirty_True_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_False_save_for_jvp_save_tensors_output_mark_dirty_False_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_False_save_for_jvp_save_tensors_output_mark_dirty_True_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_False_save_for_vjp_save_tensors_input_mark_dirty_False_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_False_save_for_vjp_save_tensors_input_mark_dirty_True_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_False_save_for_vjp_save_tensors_neither_mark_dirty_False_cuda, 
test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_False_save_for_vjp_save_tensors_neither_mark_dirty_True_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_False_save_for_vjp_save_tensors_output_mark_dirty_False_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_False_save_for_vjp_save_tensors_output_mark_dirty_True_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_True_save_for_jvp_save_tensors_input_mark_dirty_False_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_True_save_for_jvp_save_tensors_input_mark_dirty_True_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_True_save_for_jvp_save_tensors_neither_mark_dirty_False_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_True_save_for_jvp_save_tensors_neither_mark_dirty_True_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_True_save_for_jvp_save_tensors_output_mark_dirty_False_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_True_save_for_jvp_save_tensors_output_mark_dirty_True_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_True_save_for_vjp_save_tensors_input_mark_dirty_False_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_True_save_for_vjp_save_tensors_input_mark_dirty_True_cuda, 
test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_True_save_for_vjp_save_tensors_neither_mark_dirty_False_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_True_save_for_vjp_save_tensors_neither_mark_dirty_True_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_True_save_for_vjp_save_tensors_output_mark_dirty_False_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_True_save_for_vjp_save_tensors_output_mark_dirty_True_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_grad_fn_name_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_needs_input_grads_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_once_differentiable_autograd_vjp_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_once_differentiable_grad_vjp_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_set_materialize_grads_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionVmapAPICUDA::test_has_vmap_staticmethod_and_has_generate_vmap_rule_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionVmapAPICUDA::test_in_dims_multiple_inputs_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionVmapAPICUDA::test_in_dims_single_input_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionVmapAPICUDA::test_incompatible_out_dims_error_msg_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionVmapAPICUDA::test_info_object_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionVmapAPICUDA::test_kwarg_only_tensors_cuda, 
test/functorch/test_eager_transforms.py::TestAutogradFunctionVmapAPICUDA::test_no_vmap_staticmethod_and_no_generate_vmap_rule_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionVmapAPICUDA::test_none_returns_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionVmapAPICUDA::test_should_have_two_returns_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionVmapAPICUDA::test_skips_empty_layer_cuda, test/functorch/test_eager_transforms.py::TestHelpersCUDA::test_CtxWithSavedTensors_error_if_name_collision_cuda, test/functorch/test_eager_transforms.py::TestHelpersCUDA::test_CtxWithSavedTensors_nesting_cuda, test/functorch/test_eager_transforms.py::TestHelpersCUDA::test_CtxWithSavedTensors_overrides_saved_tensors_cuda, test/functorch/test_eager_transforms.py::TestHelpersCUDA::test_CtxWithSavedTensors_passthrough_cuda, test/functorch/test_eager_transforms.py::TestHelpersCUDA::test_debug_unwrap_cuda, test/functorch/test_eager_transforms.py::TestHelpersCUDA::test_reductify_leaf_cuda, test/functorch/test_eager_transforms.py::TestCompileTransformsCUDA::test_compile_vmap_hessian_cuda, test/functorch/test_eager_transforms.py::TestCompileTransformsCUDA::test_grad_deprecated_api_cuda 2025-10-10T02:23:21.6840939Z 2025-10-10T02:23:51.2114903Z 2025-10-10T02:23:51.2115872Z test_sparse_csr 1/2 was successful, full logs can be found in artifacts with path test/test-reports/test_sparse_csr_1.2_e71d1888f8d6e313_.log 2025-10-10T02:23:51.3063517Z Running 2575 items in this shard: test/test_sparse_csr.py::TestSparseCSRSampler::test_make_crow_indices, test/test_sparse_csr.py::TestSparseCSRCUDA::test_add_SparseCSR_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_add_SparseCSR_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_all_sparse_csr_SparseCSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_all_sparse_csr_SparseCSC_cuda_complex128, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_all_sparse_csr_SparseCSC_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_all_sparse_csr_SparseCSC_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_all_sparse_csr_SparseCSC_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_all_sparse_csr_SparseCSR_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_all_sparse_csr_SparseCSR_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_all_sparse_csr_SparseCSR_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_all_sparse_csr_SparseCSR_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_all_sparse_csr_SparseCSR_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_dense_result_SparseCSR_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_dense_result_SparseCSR_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_dense_result_SparseCSR_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_0_m_0_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_0_m_0_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_0_m_0_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_0_m_1_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_0_m_1_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_0_m_1_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_0_m_1_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_0_m_1_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_0_m_25_cuda_complex128, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_0_m_25_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_10_m_0_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_10_m_1_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_10_m_25_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_10_m_25_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_10_m_25_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_1_m_0_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_1_m_0_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_1_m_1_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_1_m_1_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_1_m_25_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_0_m_0_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_0_m_0_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_0_m_1_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_0_m_25_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_0_m_25_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_10_m_0_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_10_m_0_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_10_m_0_cuda_float64, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_10_m_1_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_10_m_1_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_10_m_25_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_10_m_25_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_10_m_25_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_10_m_25_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_1_m_0_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_1_m_0_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_1_m_0_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_1_m_1_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_1_m_1_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_1_m_1_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_1_m_25_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_1_m_25_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_1_m_25_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_1_m_25_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_1_m_25_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_0_m_0_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_0_m_0_cuda_float64, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_0_m_1_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_0_m_1_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_0_m_1_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_0_m_1_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_0_m_25_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_10_m_0_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_10_m_0_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_10_m_0_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_10_m_0_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_10_m_0_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_10_m_1_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_10_m_1_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_10_m_1_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_10_m_1_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_10_m_25_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_10_m_25_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_10_m_25_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_10_m_25_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_1_m_0_cuda_bfloat16, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_1_m_0_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_1_m_0_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_1_m_0_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_1_m_1_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_1_m_1_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_1_m_25_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_1_m_25_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_1_m_25_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_1_m_25_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_1_m_25_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmv_shape_11x9_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmv_shape_3x3_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmv_shape_3x3_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmv_shape_3x3_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmv_shape_5x7_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmv_shape_5x7_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_dense_output_addmm_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_dense_output_addmv_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_dense_output_mm_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_dense_output_mv_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_abs_cuda_complex128, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_asin_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_asinh_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_atan_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_atanh_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_conj_physical_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_conj_physical_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_deg2rad_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_expm1_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_floor_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_frac_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_isinf_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_isinf_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_isnan_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_isposinf_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_log1p_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_log1p_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_neg_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_nn_functional_relu_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_round_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_sign_cuda_float64, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_sinh_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_sqrt_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_sqrt_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_tan_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_tan_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_tanh_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_tanh_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_baddbmm_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_2_int32_noncontiguous_False_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_2_int32_noncontiguous_False_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_2_int32_noncontiguous_False_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_2_int32_noncontiguous_False_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_2_int32_noncontiguous_True_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_2_int32_noncontiguous_True_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_2_int64_noncontiguous_False_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_2_int64_noncontiguous_False_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_2_int64_noncontiguous_False_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_2_int64_noncontiguous_True_cuda_complex64, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_2_int64_noncontiguous_True_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_3_int32_noncontiguous_False_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_3_int32_noncontiguous_False_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_3_int32_noncontiguous_True_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_3_int32_noncontiguous_True_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_3_int32_noncontiguous_True_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_3_int32_noncontiguous_True_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_3_int32_noncontiguous_True_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_3_int64_noncontiguous_False_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_3_int64_noncontiguous_False_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_3_int64_noncontiguous_False_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_3_int64_noncontiguous_True_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmv_block_size_2_int32_noncontiguous_False_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmv_block_size_2_int32_noncontiguous_True_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmv_block_size_2_int32_noncontiguous_True_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmv_block_size_2_int32_noncontiguous_True_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmv_block_size_2_int64_noncontiguous_False_cuda_complex64, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmv_block_size_2_int64_noncontiguous_False_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmv_block_size_2_int64_noncontiguous_True_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmv_block_size_2_int64_noncontiguous_True_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmv_block_size_2_int64_noncontiguous_True_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmv_block_size_3_int32_noncontiguous_False_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmv_block_size_3_int32_noncontiguous_False_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmv_block_size_3_int32_noncontiguous_True_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmv_block_size_3_int32_noncontiguous_True_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmv_block_size_3_int32_noncontiguous_True_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmv_block_size_3_int64_noncontiguous_False_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmv_block_size_3_int64_noncontiguous_False_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmv_block_size_3_int64_noncontiguous_True_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmv_block_size_3_int64_noncontiguous_True_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_triangular_solve_block_size_2_int32_noncontiguous_False_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_triangular_solve_block_size_2_int32_noncontiguous_True_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_triangular_solve_block_size_2_int32_noncontiguous_True_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_triangular_solve_block_size_2_int32_noncontiguous_True_cuda_float64, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_triangular_solve_block_size_2_int64_noncontiguous_True_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_triangular_solve_block_size_2_int64_noncontiguous_True_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_triangular_solve_block_size_3_int32_noncontiguous_False_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_triangular_solve_block_size_3_int32_noncontiguous_False_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_triangular_solve_block_size_3_int32_noncontiguous_True_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_triangular_solve_block_size_3_int32_noncontiguous_True_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_triangular_solve_block_size_3_int32_noncontiguous_True_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_triangular_solve_block_size_3_int64_noncontiguous_False_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_triangular_solve_block_size_3_int64_noncontiguous_False_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_triangular_solve_block_size_3_int64_noncontiguous_False_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_triangular_solve_block_size_3_int64_noncontiguous_True_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_triangular_solve_block_size_3_int64_noncontiguous_True_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_bmm_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_bmm_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_compressed_layout_conversions_coverage_SparseBSC_SparseBSR_cuda, test/test_sparse_csr.py::TestSparseCSRCUDA::test_compressed_layout_conversions_coverage_SparseBSC_SparseCSC_cuda, test/test_sparse_csr.py::TestSparseCSRCUDA::test_compressed_layout_conversions_coverage_SparseBSR_SparseBSC_cuda, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_compressed_layout_conversions_coverage_SparseBSR_SparseCSR_cuda, test/test_sparse_csr.py::TestSparseCSRCUDA::test_compressed_layout_conversions_coverage_SparseCSC_SparseBSC_cuda, test/test_sparse_csr.py::TestSparseCSRCUDA::test_compressed_layout_conversions_coverage_SparseCSC_SparseBSR_cuda, test/test_sparse_csr.py::TestSparseCSRCUDA::test_compressed_layout_conversions_coverage_SparseCSC_SparseCSR_cuda, test/test_sparse_csr.py::TestSparseCSRCUDA::test_compressed_layout_conversions_coverage_SparseCSR_SparseBSR_cuda, test/test_sparse_csr.py::TestSparseCSRCUDA::test_coo_csr_conversion_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_coo_csr_conversion_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_coo_csr_conversion_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_coo_csr_conversion_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_csr_coo_conversion_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_csr_coo_conversion_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_csr_coo_conversion_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_csr_coo_conversion_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_csr_coo_conversion_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_csr_coo_conversion_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_csr_coo_conversion_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_csr_double_to_sparse_csr_cuda, test/test_sparse_csr.py::TestSparseCSRCUDA::test_csr_is_contiguous_cuda, test/test_sparse_csr.py::TestSparseCSRCUDA::test_csr_matvec_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_csr_matvec_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_csr_nnz_cuda, test/test_sparse_csr.py::TestSparseCSRCUDA::test_csr_to_block_csr_blocksize_2_cuda_float64_int64, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_csr_to_block_csr_blocksize_4_cuda_float64_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_csr_to_block_csr_errors_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_dense_to_from_sparse_compressed_SparseBSC_Batched_Hybrid_cuda, test/test_sparse_csr.py::TestSparseCSRCUDA::test_dense_to_from_sparse_compressed_SparseBSC_Batched_NonHybrid_cuda, test/test_sparse_csr.py::TestSparseCSRCUDA::test_dense_to_from_sparse_compressed_SparseBSR_Batched_Hybrid_cuda, test/test_sparse_csr.py::TestSparseCSRCUDA::test_dense_to_from_sparse_compressed_SparseCSC_Batched_Hybrid_cuda, test/test_sparse_csr.py::TestSparseCSRCUDA::test_dense_to_from_sparse_compressed_SparseCSC_Batched_NonHybrid_cuda, test/test_sparse_csr.py::TestSparseCSRCUDA::test_dense_to_from_sparse_compressed_SparseCSC_NonBatched_Hybrid_cuda, test/test_sparse_csr.py::TestSparseCSRCUDA::test_dense_to_from_sparse_compressed_SparseCSR_Batched_Hybrid_cuda, test/test_sparse_csr.py::TestSparseCSRCUDA::test_dense_to_from_sparse_compressed_SparseCSR_Batched_NonHybrid_cuda, test/test_sparse_csr.py::TestSparseCSRCUDA::test_dense_to_from_sparse_compressed_SparseCSR_NonBatched_Hybrid_cuda, test/test_sparse_csr.py::TestSparseCSRCUDA::test_dense_to_from_sparse_compressed_SparseCSR_NonBatched_NonHybrid_cuda, test/test_sparse_csr.py::TestSparseCSRCUDA::test_direct_coo_csr_conversion_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_direct_coo_csr_conversion_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_direct_coo_csr_conversion_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_direct_coo_csr_conversion_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_direct_coo_csr_conversion_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_direct_coo_csr_conversion_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_exercise_detach_cuda_bfloat16, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_exercise_detach_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_exercise_detach_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_exercise_detach_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_exercise_detach_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_exercise_detach_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_exercise_detach_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_exercise_detach_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_linalg_solve_sparse_csr_cusolver_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_linalg_solve_sparse_csr_cusolver_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mm_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseBSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseBSC_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseBSR_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseBSR_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseBSR_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseBSR_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseBSR_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseBSR_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseBSR_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseCSC_cuda_float32, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseCSC_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseCSC_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseCSR_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseCSR_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseCSR_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseCSR_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseCSR_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseCSR_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_resize_as_sparse_compressed_SparseCSC_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_resize_as_sparse_compressed_SparseCSC_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_resize_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_resize_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_resize_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_resize_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_resize_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_resize_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_resize_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_resize_errors_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_resize_errors_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_resize_errors_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_resize_errors_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sampled_addmm_autograd_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sampled_addmm_autograd_cuda_float32, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_sampled_addmm_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sampled_addmm_errors_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sampled_addmm_errors_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sampled_addmm_zero_sized_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSC_int32_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSC_int32_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSC_int32_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSC_int32_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSC_int32_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSC_int64_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSC_int64_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSC_int64_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSC_int64_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSC_int64_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSC_int64_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSR_int32_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSR_int32_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSR_int32_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSR_int32_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSR_int64_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSR_int64_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSR_int64_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSR_int64_cuda_int64, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSR_int64_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSC_int32_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSC_int32_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSC_int32_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSC_int32_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSC_int32_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSC_int32_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSC_int32_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSC_int32_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSC_int64_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSC_int64_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSC_int64_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSC_int64_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSR_int32_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSR_int32_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSR_int64_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSR_int64_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSR_int64_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSR_int64_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_add_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_add_errors_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_add_errors_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_add_errors_cuda_float32, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_add_errors_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_addmm_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_addmm_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_addmm_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_addmm_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_addmm_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csc_to_dense_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csc_to_dense_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csc_to_dense_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csc_to_dense_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csc_to_dense_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csc_to_dense_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csc_to_dense_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_from_dense_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_from_dense_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_from_dense_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_from_dense_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_from_dense_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_from_dense_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_from_dense_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_to_dense_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_to_dense_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_to_dense_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_to_dense_cuda_int32, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_to_dense_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_to_dense_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_abs_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_abs_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_abs_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_abs_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_abs_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_abs_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_angle_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_angle_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_angle_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_angle_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_angle_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_angle_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_asin_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_asin_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_asin_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_asin_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_asin_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_asin_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_asinh_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_asinh_cuda_float16, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_asinh_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_asinh_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_asinh_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_asinh_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_atan_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_atan_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_atan_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_atan_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_atan_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_atanh_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_atanh_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_atanh_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_atanh_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_ceil_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_ceil_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_ceil_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_ceil_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_conj_physical_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_conj_physical_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_conj_physical_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_conj_physical_cuda_int16, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_conj_physical_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_conj_physical_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_deg2rad_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_deg2rad_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_deg2rad_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_deg2rad_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_deg2rad_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_deg2rad_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_erf_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_erf_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_erf_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_erf_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_erf_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_erf_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_erfinv_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_erfinv_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_erfinv_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_erfinv_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_erfinv_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_erfinv_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_expm1_cuda_bfloat16, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_expm1_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_expm1_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_expm1_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_expm1_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_expm1_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_floor_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_floor_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_floor_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_floor_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_floor_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_frac_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isinf_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isinf_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isinf_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isinf_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isinf_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isnan_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isnan_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isnan_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isnan_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isnan_cuda_int8, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isnan_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isneginf_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isneginf_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isneginf_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isneginf_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isneginf_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isposinf_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isposinf_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isposinf_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isposinf_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isposinf_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isposinf_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_log1p_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_log1p_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_log1p_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_log1p_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_log1p_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_log1p_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_neg_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_neg_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_neg_cuda_int16, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_neg_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_neg_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_neg_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_nn_functional_relu_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_nn_functional_relu_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_nn_functional_relu_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_nn_functional_relu_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_nn_functional_relu_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_nn_functional_relu_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_nn_functional_relu_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_nn_functional_relu_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_positive_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_positive_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_positive_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_positive_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_positive_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_positive_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_rad2deg_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_rad2deg_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_rad2deg_cuda_int16, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_rad2deg_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_rad2deg_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_rad2deg_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_round_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_round_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_round_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_round_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sgn_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sgn_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sgn_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sgn_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sgn_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sign_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sign_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sign_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sign_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sign_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sign_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sign_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sign_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_signbit_cuda_bfloat16, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_signbit_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_signbit_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_signbit_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_signbit_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sin_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sin_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sin_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sin_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sin_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sin_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sin_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sin_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sin_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sin_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sinh_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sinh_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sinh_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sinh_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sinh_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sinh_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sinh_cuda_float64, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sinh_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sinh_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sqrt_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sqrt_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sqrt_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sqrt_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sqrt_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_tan_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_tan_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_tan_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_tan_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_tan_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_tan_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_tan_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_tan_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_tan_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_tanh_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_tanh_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_tanh_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_tanh_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_tanh_cuda_uint8, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_trunc_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_trunc_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_trunc_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_trunc_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_trunc_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_abs_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_abs_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_abs_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_abs_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_abs_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_abs_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_abs_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_abs_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_abs_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_angle_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_angle_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_angle_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_angle_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_angle_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_angle_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_angle_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_asin_cuda_float16, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_asin_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_asin_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_asin_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_asin_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_asinh_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_asinh_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_asinh_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_asinh_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_asinh_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_asinh_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_asinh_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_asinh_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_atan_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_atan_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_atan_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_atan_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_atan_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_atan_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_atan_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_atan_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_atanh_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_atanh_cuda_complex128, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_atanh_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_atanh_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_atanh_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_atanh_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_atanh_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_atanh_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_ceil_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_ceil_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_ceil_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_ceil_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_ceil_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_ceil_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_conj_physical_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_conj_physical_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_conj_physical_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_conj_physical_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_conj_physical_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_conj_physical_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_conj_physical_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_conj_physical_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_erf_cuda_bool, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_erf_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_erf_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_erf_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_erf_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_erf_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_erfinv_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_erfinv_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_erfinv_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_expm1_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_expm1_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_expm1_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_expm1_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_expm1_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_expm1_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_expm1_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_expm1_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_expm1_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_expm1_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_floor_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_floor_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_floor_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_floor_cuda_int32, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isinf_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isinf_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isinf_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isinf_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isinf_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isnan_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isnan_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isneginf_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isneginf_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isneginf_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isneginf_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isneginf_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isneginf_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isneginf_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isposinf_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isposinf_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isposinf_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isposinf_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_log1p_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_log1p_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_log1p_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_log1p_cuda_int16, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_neg_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_neg_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_neg_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_neg_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_neg_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_nn_functional_relu_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_nn_functional_relu_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_nn_functional_relu_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_nn_functional_relu_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_nn_functional_relu_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_nn_functional_relu_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_positive_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_positive_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_positive_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_positive_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_positive_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_positive_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_positive_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_rad2deg_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_rad2deg_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_rad2deg_cuda_int32, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_rad2deg_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_rad2deg_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_round_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_round_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_round_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sgn_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sgn_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sgn_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sgn_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sgn_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sign_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sign_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sign_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sign_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sign_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_signbit_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_signbit_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_signbit_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_signbit_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_signbit_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sin_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sin_cuda_int16, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sin_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sin_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sinh_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sinh_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sinh_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sinh_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sinh_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sinh_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sinh_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sqrt_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sqrt_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sqrt_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sqrt_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sqrt_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sqrt_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_tan_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_tan_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_tan_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_tan_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_tan_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_tan_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_tanh_cuda_bfloat16, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_tanh_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_tanh_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_tanh_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_tanh_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_tanh_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_trunc_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_trunc_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_trunc_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_mm_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_mm_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_mm_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_mm_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_mm_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_mm_reduce_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_mm_reduce_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_mm_reduce_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_mm_reduce_sum_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_mm_reduce_sum_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_mm_reduce_sum_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_to_sparse_compressed_SparseBSR_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_to_sparse_compressed_SparseCSC_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_triangular_solve_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_triangular_solve_cuda_complex64, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_triangular_solve_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sum_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sum_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sum_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sum_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sum_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sum_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseBSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseBSC_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseBSC_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseBSC_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseBSC_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseBSC_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseBSC_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseBSC_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseBSR_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseBSR_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseBSR_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseBSR_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseBSR_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseCSC_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseCSC_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseCSR_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseCSR_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseCSR_cuda_complex64, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseCSR_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseCSR_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseCSR_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_abs_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_abs_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_abs_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_abs_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_angle_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_angle_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_angle_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_angle_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_angle_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_asin_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_asin_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_asin_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_asin_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_asin_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_asin_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_asin_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_asinh_cuda_bool, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_asinh_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_asinh_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_asinh_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_asinh_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_asinh_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_asinh_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_atan_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_atan_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_atan_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_atan_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_atan_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_atan_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_atanh_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_atanh_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_atanh_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_atanh_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_atanh_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_atanh_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_ceil_cuda_int16, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_ceil_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_ceil_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_ceil_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_conj_physical_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_conj_physical_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_conj_physical_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_conj_physical_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_deg2rad_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_deg2rad_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_deg2rad_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_deg2rad_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_deg2rad_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_deg2rad_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_erf_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_erf_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_erf_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_erf_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_erfinv_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_erfinv_cuda_float64, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_erfinv_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_erfinv_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_expm1_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_expm1_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_expm1_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_expm1_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_expm1_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_expm1_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_floor_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_floor_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_floor_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_frac_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_isinf_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_isinf_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_isinf_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_isinf_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_isinf_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_isinf_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_isinf_cuda_float64, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_isinf_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_isinf_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_isinf_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_isnan_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_isnan_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_isnan_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_isnan_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_isnan_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_isnan_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_isneginf_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_isneginf_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_isneginf_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_isneginf_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_isneginf_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_isneginf_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_isneginf_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_isposinf_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_isposinf_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_isposinf_cuda_float16, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_isposinf_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_isposinf_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_isposinf_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_isposinf_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_log1p_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_log1p_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_log1p_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_log1p_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_log1p_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_log1p_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_log1p_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_neg_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_neg_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_neg_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_neg_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_neg_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_nn_functional_relu_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_nn_functional_relu_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_nn_functional_relu_cuda_float32, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_nn_functional_relu_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_nn_functional_relu_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_nn_functional_relu_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_nn_functional_relu_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_nn_functional_relu_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_positive_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_positive_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_positive_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_positive_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_positive_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_positive_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_positive_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_positive_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_rad2deg_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_rad2deg_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_round_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_round_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_round_cuda_float64, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_round_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_round_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_round_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sgn_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sgn_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sgn_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sgn_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sgn_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sgn_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sgn_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sign_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sign_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sign_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sign_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sign_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sign_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sign_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_signbit_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_signbit_cuda_float16, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_signbit_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_signbit_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_signbit_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_signbit_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_signbit_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sin_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sin_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sin_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sin_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sin_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sinh_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sinh_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sinh_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sinh_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sinh_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sinh_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sinh_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sqrt_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sqrt_cuda_bool, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sqrt_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sqrt_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sqrt_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sqrt_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sqrt_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sqrt_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sqrt_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_tan_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_tan_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_tan_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_tan_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_tan_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_tan_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_tan_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_tan_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_tan_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_tanh_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_tanh_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_tanh_cuda_complex64, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_tanh_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_tanh_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_tanh_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_trunc_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_trunc_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_trunc_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_trunc_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_trunc_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_trunc_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseBSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseBSC_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseBSC_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseBSC_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseBSC_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseBSC_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseBSC_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseBSC_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseBSR_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseBSR_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseBSR_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseBSR_cuda_float16, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseBSR_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseBSR_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseCSC_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseCSC_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseCSC_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseCSC_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseCSC_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseCSC_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseCSC_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseCSC_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseCSC_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseCSR_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseCSR_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseCSR_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseCSR_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseCSR_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseCSR_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseCSR_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_abs_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_abs_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_abs_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_abs_cuda_int8, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_abs_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_angle_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_angle_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_angle_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_angle_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_angle_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_asin_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_asin_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_asin_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_asin_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_asinh_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_asinh_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_asinh_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_asinh_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_asinh_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_asinh_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_asinh_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_atan_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_atan_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_atan_cuda_complex32, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_atan_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_atan_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_atan_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_atan_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_atan_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_atanh_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_atanh_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_atanh_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_atanh_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_atanh_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_atanh_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_atanh_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_atanh_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_ceil_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_ceil_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_ceil_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_conj_physical_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_conj_physical_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_conj_physical_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_conj_physical_cuda_float32, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_conj_physical_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_conj_physical_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_conj_physical_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_deg2rad_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_deg2rad_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_deg2rad_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_deg2rad_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_erf_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_erf_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_erf_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_erf_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_erf_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_erf_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_erfinv_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_erfinv_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_erfinv_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_expm1_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_expm1_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_expm1_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_expm1_cuda_uint8, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_floor_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_floor_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_floor_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_floor_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_floor_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_frac_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_frac_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isinf_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isinf_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isinf_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isinf_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isinf_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isinf_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isinf_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isinf_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isnan_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isnan_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isnan_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isnan_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isnan_cuda_int64, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isnan_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isnan_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isneginf_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isneginf_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isneginf_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isneginf_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isneginf_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isposinf_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isposinf_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isposinf_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isposinf_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_log1p_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_log1p_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_log1p_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_log1p_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_log1p_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_log1p_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_log1p_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_log1p_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_log1p_cuda_uint8, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_amax_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_amax_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_amin_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_amin_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_amin_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_amin_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_mean_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_prod_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_prod_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_prod_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_prod_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_prod_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_prod_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_prod_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_sum_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_sum_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_sum_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_sum_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_sum_cuda_int16, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_sum_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_sum_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_mul_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_mul_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_mul_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_mul_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_mul_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_neg_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_neg_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_neg_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_neg_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_nn_functional_relu_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_nn_functional_relu_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_nn_functional_relu_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_nn_functional_relu_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_nn_functional_relu_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_nn_functional_relu_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_positive_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_positive_cuda_float16, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_positive_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_positive_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_positive_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_rad2deg_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_rad2deg_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_rad2deg_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_rad2deg_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_randn_like_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_randn_like_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_randn_like_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_round_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_round_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_round_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sgn_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sgn_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sgn_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sgn_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sgn_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sgn_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sgn_cuda_uint8, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sign_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sign_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sign_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sign_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sign_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sign_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sign_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_signbit_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_signbit_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_signbit_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_signbit_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sin_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sin_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sin_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sin_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sin_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sinh_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sinh_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sinh_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sinh_cuda_float64, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sinh_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sqrt_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sqrt_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sqrt_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sqrt_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sqrt_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sqrt_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sqrt_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sum_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sum_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sum_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sum_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sum_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sum_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sum_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sum_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sum_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_tan_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_tan_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_tan_cuda_float16, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_tan_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_tan_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_tan_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_tanh_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_tanh_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_tanh_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_tanh_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_tanh_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_tanh_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_tanh_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_tanh_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_tanh_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_to_sparse_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_to_sparse_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_to_sparse_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_to_sparse_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_to_sparse_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_trunc_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_trunc_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_trunc_cuda_int32, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_trunc_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_zeros_like_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_zeros_like_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_zeros_like_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_zeros_like_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_zeros_like_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_zeros_like_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_zeros_like_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_zeros_like_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_zeros_like_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_zeros_like_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_abs_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_abs_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_abs_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_abs_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_abs_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_abs_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_abs_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_abs_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_abs_cuda_int64, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_abs_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_abs_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_angle_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_angle_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_angle_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_angle_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_angle_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_angle_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_angle_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_asin_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_asin_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_asin_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_asin_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_asin_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_asin_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_asin_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_asinh_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_asinh_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_asinh_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_asinh_cuda_int64, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_asinh_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_atan_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_atan_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_atan_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_atan_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_atanh_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_atanh_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_atanh_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_atanh_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_atanh_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_atanh_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_atanh_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_ceil_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_ceil_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_ceil_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_ceil_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_ceil_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_ceil_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_ceil_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_conj_physical_cuda_bfloat16, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_conj_physical_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_conj_physical_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_conj_physical_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_conj_physical_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_deg2rad_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_deg2rad_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_deg2rad_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_deg2rad_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_deg2rad_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_deg2rad_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_deg2rad_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_erf_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_erf_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_erf_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_erf_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_erf_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_erf_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_erf_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_erf_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_erfinv_cuda_bool, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_erfinv_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_erfinv_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_erfinv_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_erfinv_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_expm1_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_expm1_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_expm1_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_expm1_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_expm1_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_expm1_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_expm1_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_floor_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_floor_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_floor_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_floor_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_frac_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_frac_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_isinf_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_isinf_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_isinf_cuda_complex64, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_isinf_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_isinf_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_isinf_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_isnan_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_isnan_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_isnan_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_isnan_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_isnan_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_isnan_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_isnan_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_isnan_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_isneginf_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_isneginf_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_isneginf_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_isneginf_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_isneginf_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_isneginf_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_isposinf_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_isposinf_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_isposinf_cuda_float64, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_isposinf_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_isposinf_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_isposinf_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_log1p_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_log1p_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_amax_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_amax_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_amax_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_amax_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_amin_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_amin_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_amin_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_amin_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_amin_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_mean_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_mean_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_mean_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_mean_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_prod_cuda_bfloat16, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_prod_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_prod_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_prod_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_prod_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_sum_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_sum_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_sum_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_sum_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_sum_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_sum_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_sum_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_sum_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_mul_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_mul_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_mul_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_mul_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_mul_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_mul_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_neg_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_neg_cuda_int16, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_neg_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_nn_functional_relu_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_nn_functional_relu_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_nn_functional_relu_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_nn_functional_relu_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_positive_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_positive_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_positive_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_rad2deg_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_rad2deg_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_rad2deg_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_rad2deg_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_rad2deg_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_randn_like_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_randn_like_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_randn_like_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_randn_like_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_randn_like_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_round_cuda_float64, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_round_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_round_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_round_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sgn_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sgn_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sgn_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sgn_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sgn_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sgn_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sign_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sign_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sign_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sign_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sign_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sign_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_signbit_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_signbit_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_signbit_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_signbit_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_signbit_cuda_int64, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sin_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sin_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sin_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sin_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sin_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sin_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sin_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sin_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sinh_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sinh_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sinh_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sinh_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sinh_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sqrt_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sqrt_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sqrt_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sqrt_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sqrt_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sqrt_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sqrt_cuda_int8, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sqrt_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sum_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sum_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sum_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sum_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sum_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sum_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sum_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sum_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sum_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_tan_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_tan_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_tan_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_tan_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_tanh_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_tanh_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_tanh_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_tanh_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_tanh_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_tanh_cuda_float64, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_tanh_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_tanh_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_tanh_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_to_sparse_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_to_sparse_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_to_sparse_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_to_sparse_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_to_sparse_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_to_sparse_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_to_sparse_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_trunc_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_trunc_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_trunc_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_trunc_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_trunc_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_trunc_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_trunc_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_zeros_like_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_zeros_like_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_zeros_like_cuda_float32, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_zeros_like_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_zeros_like_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_abs_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_abs_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_abs_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_abs_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_abs_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_abs_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_abs_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_abs_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_angle_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_angle_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_angle_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_angle_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_angle_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_angle_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_angle_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_asin_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_asin_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_asin_cuda_complex64, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_asin_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_asin_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_asin_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_asin_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_asinh_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_asinh_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_asinh_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_asinh_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_asinh_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_asinh_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_atan_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_atan_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_atan_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_atan_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_atan_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_atan_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_atan_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_atan_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_atanh_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_atanh_cuda_complex128, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_atanh_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_atanh_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_atanh_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_atanh_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_ceil_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_ceil_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_ceil_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_ceil_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_conj_physical_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_conj_physical_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_conj_physical_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_deg2rad_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_deg2rad_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_deg2rad_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_deg2rad_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_deg2rad_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_deg2rad_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_erf_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_erf_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_erf_cuda_float32, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_erf_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_erf_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_erf_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_erf_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_erfinv_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_erfinv_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_erfinv_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_erfinv_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_erfinv_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_erfinv_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_erfinv_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_erfinv_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_erfinv_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_expm1_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_expm1_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_expm1_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_expm1_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_expm1_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_expm1_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_expm1_cuda_int16, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_expm1_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_expm1_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_floor_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_floor_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_floor_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_frac_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_frac_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_frac_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_isinf_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_isinf_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_isinf_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_isinf_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_isinf_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_isinf_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_isinf_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_isnan_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_isnan_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_isnan_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_isnan_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_isnan_cuda_float64, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_isnan_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_isnan_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_isnan_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_isneginf_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_isneginf_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_isneginf_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_isneginf_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_isposinf_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_isposinf_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_isposinf_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_isposinf_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_isposinf_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_isposinf_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_log1p_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_log1p_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_log1p_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_log1p_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_log1p_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_log1p_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_log1p_cuda_int64, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_amax_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_amax_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_amax_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_amax_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_amin_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_amin_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_amin_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_amin_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_amin_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_mean_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_mean_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_mean_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_prod_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_prod_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_prod_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_prod_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_prod_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_prod_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_sum_cuda_bool, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_sum_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_sum_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_sum_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_sum_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_mul_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_mul_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_mul_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_mul_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_mul_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_mul_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_mul_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_mul_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_neg_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_neg_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_nn_functional_relu_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_positive_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_positive_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_positive_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_positive_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_positive_cuda_int16, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_positive_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_rad2deg_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_rad2deg_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_rad2deg_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_rad2deg_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_rad2deg_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_randn_like_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_randn_like_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_randn_like_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_round_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_round_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_round_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_round_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_round_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_round_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sgn_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sgn_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sgn_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sgn_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sgn_cuda_int32, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sgn_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sign_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sign_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sign_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sign_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sign_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sign_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_signbit_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_signbit_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_signbit_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_signbit_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_signbit_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_signbit_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_signbit_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sin_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sin_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sin_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sin_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sinh_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sinh_cuda_bool, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sinh_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sinh_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sinh_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sinh_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sinh_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sinh_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sinh_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sqrt_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sqrt_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sqrt_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sqrt_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sqrt_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sqrt_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sqrt_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sqrt_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sum_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sum_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sum_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sum_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sum_cuda_float32, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sum_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sum_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sum_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sum_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_tan_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_tan_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_tan_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_tan_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_tan_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_tan_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_tanh_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_tanh_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_tanh_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_tanh_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_tanh_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_to_sparse_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_to_sparse_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_to_sparse_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_to_sparse_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_to_sparse_cuda_float16, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_to_sparse_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_to_sparse_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_to_sparse_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_trunc_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_trunc_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_trunc_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_zeros_like_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_zeros_like_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_zeros_like_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_zeros_like_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_zeros_like_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_zeros_like_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_zeros_like_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_zeros_like_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_abs_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_abs_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_angle_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_angle_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_angle_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_angle_cuda_int64, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_asin_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_asin_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_asin_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_asin_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_asin_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_asin_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_asin_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_asinh_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_asinh_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_asinh_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_asinh_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_asinh_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_asinh_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_asinh_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_asinh_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_asinh_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_asinh_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_asinh_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_atan_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_atan_cuda_bool, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_atan_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_atan_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_atan_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_atan_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_atan_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_atanh_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_atanh_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_atanh_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_ceil_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_ceil_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_ceil_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_ceil_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_ceil_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_conj_physical_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_conj_physical_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_conj_physical_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_conj_physical_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_conj_physical_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_conj_physical_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_conj_physical_cuda_int64, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_conj_physical_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_conj_physical_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_deg2rad_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_deg2rad_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_deg2rad_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_deg2rad_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_deg2rad_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_deg2rad_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_deg2rad_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_erf_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_erf_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_erf_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_erf_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_erfinv_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_erfinv_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_erfinv_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_expm1_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_expm1_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_floor_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_floor_cuda_int16, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_floor_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_floor_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_frac_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_frac_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_isinf_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_isinf_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_isinf_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_isinf_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_isinf_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_isinf_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_isinf_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_isinf_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_isinf_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_isnan_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_isnan_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_isnan_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_isnan_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_isnan_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_isnan_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_isnan_cuda_int32, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_isnan_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_isneginf_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_isneginf_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_isposinf_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_isposinf_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_isposinf_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_isposinf_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_isposinf_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_isposinf_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_log1p_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_log1p_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_log1p_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_log1p_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_log1p_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_log1p_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_log1p_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_amax_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_amax_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_amax_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_amax_cuda_int64, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_amin_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_amin_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_amin_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_amin_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_mean_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_mean_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_mean_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_mean_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_prod_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_prod_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_prod_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_prod_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_prod_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_prod_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_prod_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_prod_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_sum_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_sum_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_sum_cuda_complex64, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_sum_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_sum_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_sum_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_mul_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_mul_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_mul_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_mul_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_mul_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_neg_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_neg_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_neg_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_nn_functional_relu_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_nn_functional_relu_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_nn_functional_relu_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_nn_functional_relu_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_nn_functional_relu_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_positive_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_positive_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_positive_cuda_int32, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_rad2deg_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_rad2deg_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_rad2deg_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_rad2deg_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_rad2deg_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_randn_like_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_randn_like_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_randn_like_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_round_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_round_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_round_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_round_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sgn_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sgn_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sgn_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sgn_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sgn_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sgn_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sgn_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sgn_cuda_int64, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sgn_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sign_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sign_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sign_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sign_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_signbit_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_signbit_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_signbit_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_signbit_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_signbit_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sin_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sin_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sin_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sin_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sin_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sin_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sin_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sin_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sin_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sinh_cuda_bfloat16, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sinh_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sinh_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sinh_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sinh_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sinh_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sinh_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sqrt_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sqrt_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sqrt_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sqrt_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sqrt_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sqrt_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sqrt_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sqrt_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sqrt_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sum_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sum_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sum_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sum_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sum_cuda_float32, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sum_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sum_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sum_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sum_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_tan_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_tan_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_tan_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_tan_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_tan_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_tan_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_tan_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_tan_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_tanh_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_tanh_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_tanh_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_tanh_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_tanh_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_to_sparse_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_to_sparse_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_to_sparse_cuda_complex128, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_to_sparse_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_to_sparse_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_to_sparse_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_trunc_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_trunc_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_trunc_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_trunc_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_trunc_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_zeros_like_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_zeros_like_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_zeros_like_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_zeros_like_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_zeros_like_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_zeros_like_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_zeros_like_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_zeros_like_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseBSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseBSC_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseBSC_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseBSC_cuda_int16, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseBSC_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseBSC_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseBSC_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseBSR_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseBSR_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseBSR_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseBSR_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseCSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseCSC_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseCSC_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseCSC_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseCSC_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseCSC_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseCSC_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseCSC_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseCSR_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseCSR_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseCSR_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseCSR_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseCSR_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseCSR_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseBSC_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseBSC_cuda_complex64, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseBSC_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseBSC_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseBSC_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseBSC_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseBSC_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseBSC_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseBSR_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseBSR_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseBSR_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseBSR_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseBSR_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseBSR_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseBSR_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseBSR_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseCSC_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseCSC_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseCSC_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseCSC_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseCSC_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseCSC_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseCSR_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseCSR_cuda_bool, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseCSR_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseCSR_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseCSR_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseCSR_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseCSR_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_SparseCSC_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_SparseCSC_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_SparseCSC_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_SparseCSR_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_SparseCSR_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_SparseCSR_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_SparseCSR_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_SparseCSR_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_SparseCSR_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_SparseCSR_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_SparseCSR_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_SparseCSR_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_errors_SparseCSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_errors_SparseCSC_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_errors_SparseCSC_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_errors_SparseCSC_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_errors_SparseCSC_cuda_int16, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_errors_SparseCSC_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_errors_SparseCSC_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_errors_SparseCSR_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_errors_SparseCSR_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_errors_SparseCSR_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_errors_SparseCSR_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_errors_SparseCSR_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseBSC_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseBSC_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseBSC_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseBSC_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseBSC_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseBSC_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseBSR_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseBSR_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseBSR_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseBSR_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseBSR_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseBSR_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseBSR_cuda_int64, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseBSR_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseBSR_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseCSC_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseCSC_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseCSC_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseCSC_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseCSR_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseCSR_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseCSR_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseCSR_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseCSR_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseCSR_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseCSR_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseCSR_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseCSR_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseBSC_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseBSC_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseBSC_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseBSR_cuda_bfloat16, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseBSR_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseBSR_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseBSR_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseBSR_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseBSR_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseBSR_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseCSC_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseCSC_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseCSC_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseCSC_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseCSC_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseCSR_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseCSR_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseCSR_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseCSR_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseCSR_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseCSR_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseBSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseBSC_cuda_bool, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseBSC_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseBSC_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseBSC_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseBSR_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseBSR_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseBSR_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseBSR_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseBSR_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseCSC_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseCSC_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseCSC_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseCSC_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseCSC_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseCSC_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseCSC_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseCSR_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseCSR_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseCSR_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseCSR_cuda_int64, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseCSR_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseBSC_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseBSC_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseBSC_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseBSC_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseBSC_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseBSC_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseBSR_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseBSR_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseBSR_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseBSR_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseBSR_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseBSR_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseCSC_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseCSC_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseCSC_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseCSC_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseCSC_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseCSC_cuda_uint8, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseCSR_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseCSR_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseCSR_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseCSR_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseCSR_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseCSR_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseCSR_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_invalid_input_SparseBSC_target_sparse_compressed_tensor_cuda, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_invalid_input_SparseBSC_target_validate_sparse_compressed_tensor_args_cuda, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_invalid_input_SparseBSR_target_validate_sparse_compressed_tensor_args_cuda, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_invalid_input_SparseCSC_target_sparse_compressed_tensor_cuda, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_invalid_input_SparseCSC_target_sparse_compressed_tensor_no_size_cuda, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_invalid_input_SparseCSC_target_validate_sparse_compressed_tensor_args_cuda, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_invalid_input_SparseCSR_target_sparse_compressed_tensor_cuda, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_invalid_input_SparseCSR_target_sparse_compressed_tensor_no_size_cuda, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_invalid_input_SparseCSR_target_validate_sparse_compressed_tensor_args_cuda, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_invalid_input_csr_large_cuda, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_layout_SparseBSC_cuda, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_layout_SparseCSC_cuda, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_pickle_SparseCSC_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_print_SparseBSC_cuda, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_print_SparseCSC_cuda, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSC_int32_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSC_int32_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSC_int32_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSC_int64_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSC_int64_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSC_int64_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSC_int64_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSC_int64_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSR_int32_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSR_int32_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSR_int32_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSR_int32_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSR_int32_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSR_int32_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSR_int32_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSR_int64_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSR_int64_cuda_bool, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSR_int64_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSR_int64_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSR_int64_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSR_int64_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSR_int64_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSC_int32_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSC_int32_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSC_int32_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSC_int32_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSC_int32_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSC_int32_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSC_int32_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSC_int32_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSC_int64_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSC_int64_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSC_int64_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSC_int64_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSC_int64_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSC_int64_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSC_int64_cuda_int16, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSR_int32_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSR_int32_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSR_int32_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSR_int32_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSR_int32_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSR_int32_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSR_int32_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSR_int32_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSR_int32_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSR_int32_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSR_int64_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSR_int64_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSR_int64_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_list_SparseBSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_list_SparseBSC_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_list_SparseBSC_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_list_SparseBSC_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_list_SparseBSC_cuda_float32, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_list_SparseBSC_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_list_SparseBSC_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_list_SparseBSC_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_list_SparseBSC_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_list_SparseBSC_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_list_SparseBSR_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_list_SparseBSR_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_list_SparseBSR_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_list_SparseBSR_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_list_SparseBSR_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_list_SparseBSR_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_list_SparseBSR_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_list_SparseCSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_list_SparseCSC_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_list_SparseCSC_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_list_SparseCSC_cuda_complex64, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_list_SparseCSC_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_list_SparseCSC_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_list_SparseCSC_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_list_SparseCSR_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_list_SparseCSR_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_list_SparseCSR_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_list_SparseCSR_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_list_SparseCSR_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_list_SparseCSR_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_list_SparseCSR_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseBSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseBSC_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseBSC_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseBSC_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseBSC_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseBSC_cuda_uint8, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseBSR_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseBSR_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseBSR_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseBSR_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseBSR_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseBSR_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseBSR_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseCSC_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseCSC_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseCSC_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseCSC_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseCSC_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseCSR_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseCSR_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseCSR_cuda_int16, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseCSR_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseCSR_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseBSC_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseBSC_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseBSC_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseBSC_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseBSR_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseBSR_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseBSR_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseBSR_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseBSR_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseCSC_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseCSC_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseCSC_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseCSC_cuda_float32, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseCSC_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseCSC_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseCSC_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseCSC_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseCSR_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseCSR_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseCSR_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseCSR_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseCSR_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseCSR_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseCSR_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseBSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseBSC_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseBSC_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseBSC_cuda_float32, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseBSC_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseBSC_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseBSC_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseBSC_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseBSR_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseBSR_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseBSR_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseBSR_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseBSR_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseBSR_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseCSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseCSC_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseCSC_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseCSC_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseCSC_cuda_float32, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseCSC_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseCSC_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseCSC_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseCSC_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseCSC_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseCSR_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseCSR_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseCSR_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseCSR_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseCSR_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseCSR_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseBSC_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseBSC_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseBSC_cuda_float16, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseBSC_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseBSC_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseBSC_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseBSR_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseBSR_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseBSR_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseBSR_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseBSR_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseCSC_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseCSC_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseCSC_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseCSC_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseCSC_cuda_int32, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseCSC_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseCSC_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseCSC_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseCSR_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseCSR_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseCSR_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseCSR_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseCSR_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseBSC_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseBSC_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseBSC_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseBSC_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseBSC_cuda_int32, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseBSC_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseBSR_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseBSR_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseBSR_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseBSR_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseBSR_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseBSR_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseBSR_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseCSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseCSC_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseCSC_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseCSC_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseCSC_cuda_int32, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseCSC_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseCSC_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseCSR_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseCSR_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseCSR_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseCSR_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseCSR_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseCSR_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseCSR_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseBSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseBSC_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseBSC_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseBSC_cuda_float16, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseBSC_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseBSC_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseBSC_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseBSC_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseBSR_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseBSR_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseBSR_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseBSR_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseBSR_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseBSR_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseCSC_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseCSC_cuda_complex128, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseCSC_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseCSC_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseCSR_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseCSR_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseCSR_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseCSR_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseCSR_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseCSR_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseCSR_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseBSC_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseBSC_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseBSC_cuda_complex64, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseBSC_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseBSC_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseBSC_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseBSC_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseBSR_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseBSR_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseBSR_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseBSR_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseBSR_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseBSR_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseBSR_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseCSC_cuda_complex128, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseCSC_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseCSC_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseCSC_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseCSC_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseCSC_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseCSR_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseCSR_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseCSR_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseCSR_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseCSR_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseCSR_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseBSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseBSC_cuda_complex64, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseBSC_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseBSC_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseBSC_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseBSR_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseBSR_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseBSR_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseBSR_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseBSR_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseCSC_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseCSC_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseCSC_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseCSC_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseCSC_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseCSC_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseCSC_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseCSC_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseCSR_cuda_complex128, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseCSR_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseCSR_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseBSC_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseBSC_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseBSC_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseBSC_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseBSC_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseBSC_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseBSC_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseBSR_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseBSR_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseBSR_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseBSR_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseBSR_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseBSR_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseBSR_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseCSC_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseCSC_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseCSC_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseCSC_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseCSC_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseCSR_cuda_complex64, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseCSR_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseCSR_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseCSR_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseCSR_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseBSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseBSC_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseBSC_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseBSC_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseBSC_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseBSC_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseBSR_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseBSR_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseBSR_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseBSR_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseBSR_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseBSR_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseBSR_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseBSR_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseCSC_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseCSC_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseCSC_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseCSC_cuda_float32, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseCSC_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseCSC_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseCSC_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseCSR_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseCSR_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseCSR_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseCSR_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseCSR_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseCSR_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_TensorAsKey_cuda, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_bsr_dense_addmm_meta_cuda, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_bsr_dense_bmm_block_size_16_int32_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_bsr_dense_bmm_block_size_16_int32_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_bsr_dense_bmm_block_size_16_int32_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_bsr_dense_bmm_block_size_32_int64_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_bsr_dense_bmm_block_size_64_int32_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_bsr_dense_bmm_block_size_64_int64_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_bsr_dense_bmm_block_size_64_int64_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_bsr_dense_bmm_error_messages_cuda_float16, 
test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_bsr_scatter_mm_blocksize_16_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_bsr_scatter_mm_blocksize_16_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_bsr_scatter_mm_blocksize_16x32_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_bsr_scatter_mm_blocksize_16x32_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_bsr_scatter_mm_blocksize_16x32_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_bsr_scatter_mm_blocksize_2_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_bsr_scatter_mm_blocksize_2x3_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_bsr_scatter_mm_blocksize_2x3_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_bsr_scatter_mm_blocksize_32_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_bsr_scatter_mm_blocksize_32_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_bsr_scatter_mm_blocksize_64_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_bsr_scatter_mm_blocksize_64_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_bsr_scatter_mm_blocksize_64_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op__int_bsr_dense_addmm_blocksize_16_out_dtype_int32_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op__int_bsr_dense_addmm_blocksize_16_out_dtype_int32_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op__int_bsr_dense_addmm_blocksize_16_out_dtype_int32_cuda_float32, 
test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op__int_bsr_dense_addmm_blocksize_16_out_dtype_int32_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op__int_bsr_dense_addmm_blocksize_16_out_dtype_unspecified_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op__int_bsr_dense_addmm_blocksize_16_out_dtype_unspecified_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op__int_bsr_dense_addmm_blocksize_16_out_dtype_unspecified_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op__int_bsr_dense_addmm_blocksize_16x32_out_dtype_int32_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op__int_bsr_dense_addmm_blocksize_16x32_out_dtype_int32_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op__int_bsr_dense_addmm_blocksize_32_out_dtype_int32_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op__int_bsr_dense_addmm_blocksize_32_out_dtype_int32_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op__int_bsr_dense_addmm_blocksize_32_out_dtype_unspecified_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op__int_bsr_dense_addmm_blocksize_32_out_dtype_unspecified_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_addmm_blocksize_16_out_dtype_int32_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_addmm_blocksize_16_out_dtype_unspecified_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_addmm_blocksize_16_out_dtype_unspecified_cuda_float16, 
test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_addmm_blocksize_16_out_dtype_unspecified_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_addmm_blocksize_16_out_dtype_unspecified_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_addmm_blocksize_16x32_out_dtype_int32_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_addmm_blocksize_16x32_out_dtype_unspecified_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_addmm_blocksize_16x32_out_dtype_unspecified_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_addmm_blocksize_16x32_out_dtype_unspecified_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_addmm_blocksize_16x32_out_dtype_unspecified_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_addmm_blocksize_32_out_dtype_int32_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_addmm_blocksize_32_out_dtype_unspecified_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_addmm_blocksize_32_out_dtype_unspecified_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_linear_blocksize_16_out_dtype_int32_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_linear_blocksize_16_out_dtype_int32_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_linear_blocksize_16_out_dtype_unspecified_cuda_float16, 
test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_linear_blocksize_16x32_out_dtype_int32_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_linear_blocksize_16x32_out_dtype_int32_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_linear_blocksize_16x32_out_dtype_unspecified_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_linear_blocksize_16x32_out_dtype_unspecified_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_linear_blocksize_16x32_out_dtype_unspecified_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_linear_blocksize_32_out_dtype_int32_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_linear_blocksize_32_out_dtype_int32_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_linear_blocksize_32_out_dtype_int32_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_linear_blocksize_32_out_dtype_unspecified_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_linear_blocksize_32_out_dtype_unspecified_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_mm_blocksize_16_out_dtype_int32_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_mm_blocksize_16_out_dtype_unspecified_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_mm_blocksize_16_out_dtype_unspecified_cuda_float16, 
test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_mm_blocksize_16_out_dtype_unspecified_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_mm_blocksize_16x32_out_dtype_int32_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_mm_blocksize_16x32_out_dtype_int32_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_mm_blocksize_16x32_out_dtype_int32_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_mm_blocksize_16x32_out_dtype_unspecified_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_mm_blocksize_16x32_out_dtype_unspecified_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_mm_blocksize_32_out_dtype_int32_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_mm_blocksize_32_out_dtype_int32_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_mm_blocksize_32_out_dtype_unspecified_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_mm_blocksize_32_out_dtype_unspecified_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_sampled_addmm_block_size_16_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_sampled_addmm_block_size_16_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_sampled_addmm_block_size_16_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_sampled_addmm_block_size_32_cuda_bfloat16, 
test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_sampled_addmm_block_size_32_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_sampled_addmm_block_size_32_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_scaled_dot_product_attention_block_size_16_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_scaled_dot_product_attention_block_size_16_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_scaled_dot_product_attention_block_size_32_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_scaled_dot_product_attention_block_size_64_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_scatter_mm_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_scatter_mm_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_tune_op__int_bsr_dense_addmm_out_dtype_unspecified_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_tune_op__int_bsr_dense_addmm_out_dtype_unspecified_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_tune_op__int_bsr_dense_addmm_out_dtype_unspecified_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_tune_op_bsr_dense_addmm_out_dtype_int32_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_tune_op_bsr_dense_addmm_out_dtype_int32_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_tune_op_bsr_dense_addmm_out_dtype_unspecified_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_tune_op_bsr_dense_addmm_out_dtype_unspecified_cuda_float32 2025-10-10T02:23:51.3984402Z 2025-10-10T02:28:02.0882356Z 
2025-10-10T02:28:02.0886025Z test_sparse 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_sparse_1.1_c4a4003efa99e723_.log 2025-10-10T02:28:02.2013492Z Running 3063 items in this shard: test/test_sparse.py::TestSparseLegacyAndDeprecation::test_legacy_warnings, test/test_sparse.py::TestSparseOneOff::test_cuda_from_cpu, test/test_sparse.py::TestSparseOneOff::test_cuda_sparse_cpu_dense_add, test/test_sparse.py::TestSparseMeta::test_add_meta_SparseBSC_float64, test/test_sparse.py::TestSparseMeta::test_add_meta_SparseBSR_float64, test/test_sparse.py::TestSparseMeta::test_add_meta_SparseCOO_float64, test/test_sparse.py::TestSparseMeta::test_add_meta_SparseCSC_float64, test/test_sparse.py::TestSparseMeta::test_add_meta_SparseCSR_float64, test/test_sparse.py::TestSparseMeta::test_fake_SparseBSC_float64, test/test_sparse.py::TestSparseMeta::test_fake_SparseBSR_float64, test/test_sparse.py::TestSparseMeta::test_fake_SparseCOO_float64, test/test_sparse.py::TestSparseMeta::test_fake_SparseCSC_float64, test/test_sparse.py::TestSparseMeta::test_fake_SparseCSR_float64, test/test_sparse.py::TestSparseMeta::test_meta_SparseBSC_float64, test/test_sparse.py::TestSparseMeta::test_meta_SparseBSR_float64, test/test_sparse.py::TestSparseMeta::test_meta_SparseCOO_float64, test/test_sparse.py::TestSparseMeta::test_meta_SparseCSC_float64, test/test_sparse.py::TestSparseMeta::test_meta_SparseCSR_float64, test/test_sparse.py::TestSparseMeta::test_print_meta_SparseBSC_float64, test/test_sparse.py::TestSparseMeta::test_print_meta_SparseBSR_float64, test/test_sparse.py::TestSparseMeta::test_print_meta_SparseCOO_float64, test/test_sparse.py::TestSparseMeta::test_print_meta_SparseCSC_float64, test/test_sparse.py::TestSparseMeta::test_print_meta_SparseCSR_float64, test/test_sparse.py::TestSparseMeta::test_sum_meta_SparseBSC_float64, test/test_sparse.py::TestSparseMeta::test_sum_meta_SparseBSR_float64, 
test/test_sparse.py::TestSparseMeta::test_sum_meta_SparseCOO_float64, test/test_sparse.py::TestSparseMeta::test_sum_meta_SparseCSC_float64, test/test_sparse.py::TestSparseMeta::test_sum_meta_SparseCSR_float64, test/test_sparse.py::TestSparseMeta::test_to_meta_SparseBSC_float64, test/test_sparse.py::TestSparseMeta::test_to_meta_SparseBSR_float64, test/test_sparse.py::TestSparseMeta::test_to_meta_SparseCOO_float64, test/test_sparse.py::TestSparseMeta::test_to_meta_SparseCSC_float64, test/test_sparse.py::TestSparseMeta::test_to_meta_SparseCSR_float64, test/test_sparse.py::TestSparseMeta::test_zeros_like_fake_SparseBSC_float64, test/test_sparse.py::TestSparseMeta::test_zeros_like_fake_SparseBSR_float64, test/test_sparse.py::TestSparseMeta::test_zeros_like_fake_SparseCOO_float64, test/test_sparse.py::TestSparseMeta::test_zeros_like_fake_SparseCSC_float64, test/test_sparse.py::TestSparseMeta::test_zeros_like_fake_SparseCSR_float64, test/test_sparse.py::TestSparseMeta::test_zeros_like_meta_SparseBSC_float64, test/test_sparse.py::TestSparseMeta::test_zeros_like_meta_SparseBSR_float64, test/test_sparse.py::TestSparseMeta::test_zeros_like_meta_SparseCOO_float64, test/test_sparse.py::TestSparseMeta::test_zeros_like_meta_SparseCSC_float64, test/test_sparse.py::TestSparseMeta::test_zeros_like_meta_SparseCSR_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_abs_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_abs_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_abs_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_abs_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_abs_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_abs_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_abs_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_abs_cuda_int8, 
test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_abs_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_asin_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_asin_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_asin_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_asin_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_asin_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_asin_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_asin_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_asin_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_asin_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_asinh_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_asinh_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_asinh_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_asinh_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_asinh_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_asinh_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_asinh_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_asinh_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_asinh_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_atan_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_atan_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_atan_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_atan_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_atan_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_atan_cuda_int32, 
test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_atan_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_atan_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_atan_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_atanh_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_atanh_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_atanh_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_atanh_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_atanh_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_atanh_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_atanh_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_atanh_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_atanh_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_ceil_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_ceil_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_ceil_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_ceil_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_ceil_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_ceil_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_ceil_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_conj_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_conj_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_conj_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_conj_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_conj_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_conj_cuda_int32, 
test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_conj_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_conj_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_conj_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_conj_physical_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_conj_physical_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_conj_physical_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_conj_physical_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_conj_physical_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_conj_physical_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_conj_physical_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_conj_physical_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_conj_physical_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_deg2rad_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_deg2rad_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_deg2rad_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_deg2rad_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_deg2rad_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_deg2rad_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_deg2rad_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_erf_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_erf_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_erf_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_erf_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_erf_cuda_int64, 
test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_erf_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_erf_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_expm1_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_expm1_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_expm1_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_expm1_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_expm1_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_expm1_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_expm1_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_expm1_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_expm1_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_floor_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_floor_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_floor_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_floor_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_floor_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_floor_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_floor_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_frac_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_frac_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_isinf_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_isinf_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_isinf_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_isinf_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_isinf_cuda_int16, 
test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_isinf_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_isinf_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_isinf_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_isinf_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_isnan_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_isnan_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_isnan_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_isnan_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_isnan_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_isnan_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_isnan_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_isnan_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_isnan_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_isneginf_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_isneginf_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_isneginf_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_isneginf_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_isneginf_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_isneginf_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_isneginf_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_isposinf_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_isposinf_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_isposinf_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_isposinf_cuda_int32, 
test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_isposinf_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_isposinf_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_isposinf_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_log1p_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_log1p_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_log1p_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_log1p_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_log1p_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_log1p_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_log1p_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_log1p_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_log1p_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_nan_to_num_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_nan_to_num_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_nan_to_num_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_nan_to_num_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_nan_to_num_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_nan_to_num_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_nan_to_num_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_neg_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_neg_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_neg_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_neg_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_neg_cuda_int16, 
test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_neg_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_neg_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_neg_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_neg_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_nn_functional_relu_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_nn_functional_relu_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_nn_functional_relu_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_nn_functional_relu_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_nn_functional_relu_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_nn_functional_relu_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_nn_functional_relu_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_positive_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_positive_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_positive_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_positive_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_positive_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_positive_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_positive_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_positive_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_positive_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_rad2deg_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_rad2deg_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_rad2deg_cuda_int16, 
test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_rad2deg_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_rad2deg_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_rad2deg_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_rad2deg_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_round_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_round_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_round_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_round_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_round_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_round_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_round_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_sgn_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_sgn_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_sgn_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_sgn_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_sgn_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_sgn_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_sgn_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_sgn_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_sgn_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_sign_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_sign_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_sign_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_sign_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_sign_cuda_int64, 
test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_sign_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_sign_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_signbit_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_signbit_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_signbit_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_signbit_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_signbit_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_signbit_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_signbit_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_sin_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_sin_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_sin_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_sin_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_sin_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_sin_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_sin_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_sin_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_sin_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_sinh_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_sinh_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_sinh_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_sinh_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_sinh_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_sinh_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_sinh_cuda_int64, 
test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_sinh_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_sinh_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_sqrt_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_sqrt_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_sqrt_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_sqrt_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_sqrt_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_sqrt_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_sqrt_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_sqrt_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_sqrt_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_tan_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_tan_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_tan_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_tan_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_tan_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_tan_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_tan_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_tan_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_tan_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_tanh_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_tanh_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_tanh_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_tanh_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_tanh_cuda_int16, 
test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_tanh_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_tanh_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_tanh_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_tanh_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_trunc_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_trunc_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_trunc_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_trunc_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_trunc_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_trunc_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_inplace_trunc_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_abs_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_abs_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_abs_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_abs_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_abs_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_abs_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_abs_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_abs_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_abs_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_asin_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_asin_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_asin_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_asin_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_asin_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_asin_cuda_int32, 
test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_asin_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_asin_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_asin_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_asinh_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_asinh_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_asinh_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_asinh_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_asinh_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_asinh_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_asinh_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_asinh_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_asinh_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_atan_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_atan_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_atan_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_atan_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_atan_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_atan_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_atan_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_atan_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_atan_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_atanh_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_atanh_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_atanh_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_atanh_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_atanh_cuda_int16, 
test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_atanh_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_atanh_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_atanh_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_atanh_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_ceil_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_ceil_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_ceil_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_ceil_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_ceil_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_ceil_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_ceil_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_conj_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_conj_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_conj_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_conj_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_conj_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_conj_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_conj_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_conj_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_conj_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_conj_physical_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_conj_physical_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_conj_physical_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_conj_physical_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_conj_physical_cuda_int16, 
test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_conj_physical_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_conj_physical_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_conj_physical_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_conj_physical_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_deg2rad_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_deg2rad_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_deg2rad_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_deg2rad_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_deg2rad_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_deg2rad_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_deg2rad_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_erf_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_erf_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_erf_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_erf_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_erf_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_erf_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_erf_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_expm1_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_expm1_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_expm1_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_expm1_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_expm1_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_expm1_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_expm1_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_expm1_cuda_int8, 
test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_expm1_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_floor_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_floor_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_floor_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_floor_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_floor_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_floor_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_floor_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_frac_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_frac_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_isinf_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_isinf_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_isinf_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_isinf_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_isinf_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_isinf_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_isinf_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_isinf_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_isinf_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_isnan_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_isnan_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_isnan_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_isnan_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_isnan_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_isnan_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_isnan_cuda_int64, 
test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_isnan_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_isnan_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_isneginf_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_isneginf_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_isneginf_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_isneginf_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_isneginf_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_isneginf_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_isneginf_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_isposinf_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_isposinf_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_isposinf_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_isposinf_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_isposinf_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_isposinf_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_isposinf_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_log1p_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_log1p_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_log1p_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_log1p_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_log1p_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_log1p_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_log1p_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_log1p_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_log1p_cuda_uint8, 
test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_nan_to_num_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_nan_to_num_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_nan_to_num_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_nan_to_num_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_nan_to_num_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_nan_to_num_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_nan_to_num_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_neg_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_neg_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_neg_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_neg_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_neg_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_neg_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_neg_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_neg_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_neg_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_nn_functional_relu_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_nn_functional_relu_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_nn_functional_relu_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_nn_functional_relu_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_nn_functional_relu_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_nn_functional_relu_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_nn_functional_relu_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_positive_cuda_complex128, 
test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_positive_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_positive_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_positive_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_positive_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_positive_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_positive_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_positive_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_positive_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_rad2deg_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_rad2deg_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_rad2deg_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_rad2deg_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_rad2deg_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_rad2deg_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_rad2deg_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_round_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_round_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_round_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_round_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_round_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_round_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_round_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_sgn_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_sgn_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_sgn_cuda_float32, 
test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_sgn_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_sgn_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_sgn_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_sgn_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_sgn_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_sgn_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_sign_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_sign_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_sign_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_sign_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_sign_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_sign_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_sign_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_signbit_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_signbit_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_signbit_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_signbit_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_signbit_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_signbit_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_signbit_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_sin_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_sin_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_sin_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_sin_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_sin_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_sin_cuda_int32, 
test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_sin_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_sin_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_sin_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_sinh_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_sinh_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_sinh_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_sinh_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_sinh_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_sinh_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_sinh_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_sinh_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_sinh_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_sqrt_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_sqrt_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_sqrt_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_sqrt_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_sqrt_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_sqrt_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_sqrt_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_sqrt_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_sqrt_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_tan_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_tan_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_tan_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_tan_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_tan_cuda_int16, 
test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_tan_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_tan_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_tan_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_tan_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_tanh_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_tanh_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_tanh_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_tanh_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_tanh_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_tanh_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_tanh_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_tanh_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_tanh_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_trunc_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_trunc_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_trunc_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_trunc_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_trunc_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_trunc_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_out_trunc_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_abs_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_abs_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_abs_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_abs_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_abs_cuda_int16, 
test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_abs_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_abs_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_abs_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_abs_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_asin_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_asin_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_asin_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_asin_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_asin_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_asin_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_asin_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_asin_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_asin_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_asinh_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_asinh_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_asinh_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_asinh_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_asinh_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_asinh_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_asinh_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_asinh_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_asinh_cuda_uint8, 
test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_atan_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_atan_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_atan_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_atan_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_atan_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_atan_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_atan_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_atan_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_atan_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_atanh_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_atanh_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_atanh_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_atanh_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_atanh_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_atanh_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_atanh_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_atanh_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_atanh_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_ceil_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_ceil_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_ceil_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_ceil_cuda_int32, 
test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_ceil_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_ceil_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_ceil_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_conj_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_conj_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_conj_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_conj_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_conj_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_conj_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_conj_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_conj_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_conj_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_conj_physical_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_conj_physical_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_conj_physical_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_conj_physical_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_conj_physical_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_conj_physical_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_conj_physical_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_conj_physical_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_conj_physical_cuda_uint8, 
test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_deg2rad_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_deg2rad_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_deg2rad_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_deg2rad_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_deg2rad_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_deg2rad_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_deg2rad_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_erf_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_erf_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_erf_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_erf_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_erf_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_erf_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_erf_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_expm1_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_expm1_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_expm1_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_expm1_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_expm1_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_expm1_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_expm1_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_expm1_cuda_int8, 
test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_expm1_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_floor_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_floor_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_floor_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_floor_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_floor_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_floor_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_floor_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_frac_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_frac_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_isinf_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_isinf_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_isinf_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_isinf_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_isinf_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_isinf_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_isinf_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_isinf_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_isinf_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_isnan_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_isnan_cuda_complex64, 
test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_isnan_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_isnan_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_isnan_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_isnan_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_isnan_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_isnan_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_isnan_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_isneginf_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_isneginf_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_isneginf_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_isneginf_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_isneginf_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_isneginf_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_isneginf_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_isposinf_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_isposinf_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_isposinf_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_isposinf_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_isposinf_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_isposinf_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_isposinf_cuda_uint8, 
test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_log1p_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_log1p_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_log1p_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_log1p_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_log1p_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_log1p_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_log1p_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_log1p_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_log1p_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_nan_to_num_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_nan_to_num_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_nan_to_num_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_nan_to_num_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_nan_to_num_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_nan_to_num_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_nan_to_num_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_neg_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_neg_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_neg_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_neg_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_neg_cuda_int16, 
test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_neg_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_neg_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_neg_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_neg_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_nn_functional_relu_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_nn_functional_relu_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_nn_functional_relu_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_nn_functional_relu_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_nn_functional_relu_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_nn_functional_relu_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_nn_functional_relu_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_positive_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_positive_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_positive_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_positive_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_positive_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_positive_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_positive_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_positive_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_positive_cuda_uint8, 
test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_rad2deg_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_rad2deg_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_rad2deg_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_rad2deg_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_rad2deg_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_rad2deg_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_rad2deg_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_round_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_round_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_round_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_round_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_round_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_round_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_round_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_sgn_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_sgn_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_sgn_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_sgn_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_sgn_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_sgn_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_sgn_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_sgn_cuda_int8, 
test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_sgn_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_sign_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_sign_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_sign_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_sign_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_sign_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_sign_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_sign_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_signbit_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_signbit_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_signbit_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_signbit_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_signbit_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_signbit_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_signbit_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_sin_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_sin_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_sin_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_sin_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_sin_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_sin_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_sin_cuda_int64, 
test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_sin_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_sin_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_sinh_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_sinh_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_sinh_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_sinh_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_sinh_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_sinh_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_sinh_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_sinh_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_sinh_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_sqrt_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_sqrt_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_sqrt_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_sqrt_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_sqrt_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_sqrt_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_sqrt_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_sqrt_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_sqrt_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_tan_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_tan_cuda_complex64, 
test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_tan_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_tan_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_tan_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_tan_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_tan_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_tan_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_tan_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_tanh_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_tanh_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_tanh_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_tanh_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_tanh_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_tanh_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_tanh_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_tanh_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_tanh_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_trunc_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_trunc_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_trunc_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_trunc_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_trunc_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_trunc_cuda_int8, 
test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_consistency_trunc_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_fn_grad_abs_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_fn_grad_abs_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_fn_grad_asin_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_fn_grad_asin_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_fn_grad_asinh_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_fn_grad_asinh_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_fn_grad_atan_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_fn_grad_atan_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_fn_grad_atanh_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_fn_grad_atanh_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_fn_grad_ceil_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_fn_grad_conj_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_fn_grad_conj_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_fn_grad_conj_physical_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_fn_grad_conj_physical_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_fn_grad_deg2rad_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_fn_grad_erf_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_fn_grad_expm1_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_fn_grad_expm1_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_fn_grad_floor_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_fn_grad_frac_cuda_float64, 
test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_fn_grad_isinf_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_fn_grad_isinf_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_fn_grad_isnan_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_fn_grad_isnan_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_fn_grad_isneginf_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_fn_grad_isposinf_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_fn_grad_log1p_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_fn_grad_log1p_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_fn_grad_nan_to_num_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_fn_grad_neg_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_fn_grad_neg_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_fn_grad_nn_functional_relu_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_fn_grad_positive_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_fn_grad_positive_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_fn_grad_rad2deg_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_fn_grad_round_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_fn_grad_sgn_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_fn_grad_sgn_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_fn_grad_sign_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_fn_grad_signbit_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_fn_grad_sin_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_fn_grad_sin_cuda_float64, 
test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_fn_grad_sinh_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_fn_grad_sinh_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_fn_grad_sqrt_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_fn_grad_sqrt_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_fn_grad_tan_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_fn_grad_tan_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_fn_grad_tanh_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_fn_grad_tanh_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_fn_grad_trunc_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_abs_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_abs_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_abs_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_abs_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_abs_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_abs_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_abs_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_abs_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_abs_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_asin_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_asin_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_asin_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_asin_cuda_float64, 
test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_asin_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_asin_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_asin_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_asin_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_asin_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_asinh_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_asinh_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_asinh_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_asinh_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_asinh_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_asinh_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_asinh_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_asinh_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_asinh_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_atan_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_atan_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_atan_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_atan_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_atan_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_atan_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_atan_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_atan_cuda_int8, 
test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_atan_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_atanh_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_atanh_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_atanh_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_atanh_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_atanh_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_atanh_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_atanh_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_atanh_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_atanh_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_ceil_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_ceil_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_ceil_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_ceil_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_ceil_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_ceil_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_ceil_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_conj_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_conj_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_conj_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_conj_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_conj_cuda_int16, 
test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_conj_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_conj_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_conj_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_conj_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_conj_physical_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_conj_physical_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_conj_physical_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_conj_physical_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_conj_physical_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_conj_physical_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_conj_physical_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_conj_physical_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_conj_physical_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_deg2rad_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_deg2rad_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_deg2rad_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_deg2rad_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_deg2rad_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_deg2rad_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_deg2rad_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_erf_cuda_float32, 
test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_erf_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_erf_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_erf_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_erf_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_erf_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_erf_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_expm1_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_expm1_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_expm1_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_expm1_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_expm1_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_expm1_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_expm1_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_expm1_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_expm1_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_floor_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_floor_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_floor_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_floor_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_floor_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_floor_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_floor_cuda_uint8, 
test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_frac_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_frac_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_isinf_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_isinf_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_isinf_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_isinf_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_isinf_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_isinf_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_isinf_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_isinf_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_isinf_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_isnan_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_isnan_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_isnan_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_isnan_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_isnan_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_isnan_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_isnan_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_isnan_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_isnan_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_isneginf_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_isneginf_cuda_float64, 
test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_isneginf_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_isneginf_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_isneginf_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_isneginf_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_isneginf_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_isposinf_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_isposinf_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_isposinf_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_isposinf_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_isposinf_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_isposinf_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_isposinf_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_log1p_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_log1p_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_log1p_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_log1p_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_log1p_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_log1p_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_log1p_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_log1p_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_log1p_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_nan_to_num_cuda_float32, 
test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_nan_to_num_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_nan_to_num_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_nan_to_num_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_nan_to_num_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_nan_to_num_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_nan_to_num_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_neg_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_neg_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_neg_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_neg_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_neg_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_neg_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_neg_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_neg_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_neg_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_nn_functional_relu_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_nn_functional_relu_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_nn_functional_relu_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_nn_functional_relu_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_nn_functional_relu_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_nn_functional_relu_cuda_int8, 
test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_nn_functional_relu_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_positive_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_positive_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_positive_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_positive_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_positive_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_positive_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_positive_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_positive_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_positive_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_rad2deg_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_rad2deg_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_rad2deg_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_rad2deg_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_rad2deg_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_rad2deg_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_rad2deg_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_round_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_round_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_round_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_round_cuda_int32, 
test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_round_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_round_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_round_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_sgn_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_sgn_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_sgn_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_sgn_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_sgn_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_sgn_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_sgn_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_sgn_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_sgn_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_sign_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_sign_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_sign_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_sign_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_sign_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_sign_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_sign_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_signbit_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_signbit_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_signbit_cuda_int16, 
test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_signbit_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_signbit_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_signbit_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_signbit_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_sin_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_sin_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_sin_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_sin_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_sin_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_sin_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_sin_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_sin_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_sin_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_sinh_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_sinh_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_sinh_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_sinh_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_sinh_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_sinh_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_sinh_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_sinh_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_sinh_cuda_uint8, 
test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_sqrt_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_sqrt_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_sqrt_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_sqrt_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_sqrt_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_sqrt_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_sqrt_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_sqrt_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_sqrt_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_tan_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_tan_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_tan_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_tan_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_tan_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_tan_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_tan_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_tan_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_tan_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_tanh_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_tanh_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_tanh_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_tanh_cuda_float64, 
test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_tanh_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_tanh_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_tanh_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_tanh_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_tanh_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_trunc_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_trunc_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_trunc_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_trunc_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_trunc_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_trunc_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zero_dims_trunc_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_abs_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_abs_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_abs_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_abs_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_abs_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_abs_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_abs_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_abs_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_abs_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_asin_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_asin_cuda_complex64, 
test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_asin_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_asin_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_asin_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_asin_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_asin_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_asin_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_asin_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_asinh_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_asinh_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_asinh_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_asinh_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_asinh_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_asinh_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_asinh_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_asinh_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_asinh_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_atan_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_atan_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_atan_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_atan_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_atan_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_atan_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_atan_cuda_int64, 
test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_atan_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_atan_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_atanh_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_atanh_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_atanh_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_atanh_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_atanh_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_atanh_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_atanh_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_atanh_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_atanh_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_ceil_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_ceil_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_ceil_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_ceil_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_ceil_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_ceil_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_ceil_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_conj_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_conj_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_conj_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_conj_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_conj_cuda_int16, 
test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_conj_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_conj_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_conj_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_conj_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_conj_physical_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_conj_physical_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_conj_physical_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_conj_physical_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_conj_physical_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_conj_physical_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_conj_physical_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_conj_physical_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_conj_physical_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_deg2rad_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_deg2rad_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_deg2rad_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_deg2rad_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_deg2rad_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_deg2rad_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_deg2rad_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_erf_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_erf_cuda_float64, 
test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_erf_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_erf_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_erf_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_erf_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_erf_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_expm1_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_expm1_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_expm1_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_expm1_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_expm1_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_expm1_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_expm1_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_expm1_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_expm1_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_floor_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_floor_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_floor_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_floor_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_floor_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_floor_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_floor_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_frac_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_frac_cuda_float64, 
test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_isinf_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_isinf_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_isinf_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_isinf_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_isinf_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_isinf_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_isinf_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_isinf_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_isinf_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_isnan_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_isnan_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_isnan_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_isnan_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_isnan_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_isnan_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_isnan_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_isnan_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_isnan_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_isneginf_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_isneginf_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_isneginf_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_isneginf_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_isneginf_cuda_int64, 
test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_isneginf_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_isneginf_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_isposinf_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_isposinf_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_isposinf_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_isposinf_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_isposinf_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_isposinf_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_isposinf_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_log1p_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_log1p_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_log1p_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_log1p_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_log1p_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_log1p_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_log1p_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_log1p_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_log1p_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_nan_to_num_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_nan_to_num_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_nan_to_num_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_nan_to_num_cuda_int32, 
test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_nan_to_num_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_nan_to_num_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_nan_to_num_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_neg_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_neg_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_neg_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_neg_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_neg_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_neg_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_neg_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_neg_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_neg_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_nn_functional_relu_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_nn_functional_relu_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_nn_functional_relu_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_nn_functional_relu_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_nn_functional_relu_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_nn_functional_relu_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_nn_functional_relu_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_positive_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_positive_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_positive_cuda_float32, 
test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_positive_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_positive_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_positive_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_positive_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_positive_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_positive_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_rad2deg_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_rad2deg_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_rad2deg_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_rad2deg_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_rad2deg_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_rad2deg_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_rad2deg_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_round_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_round_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_round_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_round_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_round_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_round_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_round_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_sgn_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_sgn_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_sgn_cuda_float32, 
test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_sgn_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_sgn_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_sgn_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_sgn_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_sgn_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_sgn_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_sign_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_sign_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_sign_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_sign_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_sign_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_sign_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_sign_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_signbit_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_signbit_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_signbit_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_signbit_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_signbit_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_signbit_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_signbit_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_sin_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_sin_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_sin_cuda_float32, 
test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_sin_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_sin_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_sin_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_sin_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_sin_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_sin_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_sinh_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_sinh_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_sinh_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_sinh_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_sinh_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_sinh_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_sinh_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_sinh_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_sinh_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_sqrt_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_sqrt_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_sqrt_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_sqrt_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_sqrt_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_sqrt_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_sqrt_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_sqrt_cuda_int8, 
test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_sqrt_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_tan_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_tan_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_tan_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_tan_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_tan_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_tan_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_tan_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_tan_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_tan_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_tanh_cuda_complex128, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_tanh_cuda_complex64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_tanh_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_tanh_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_tanh_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_tanh_cuda_int32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_tanh_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_tanh_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_tanh_cuda_uint8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_trunc_cuda_float32, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_trunc_cuda_float64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_trunc_cuda_int16, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_trunc_cuda_int32, 
test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_trunc_cuda_int64, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_trunc_cuda_int8, test/test_sparse.py::TestSparseUnaryUfuncsCUDA::test_sparse_zeros_trunc_cuda_uint8, test/test_sparse.py::TestSparseMaskedReductionsCUDA::test_future_empty_dim_masked_amax_cuda_bfloat16, test/test_sparse.py::TestSparseMaskedReductionsCUDA::test_future_empty_dim_masked_amax_cuda_float16, test/test_sparse.py::TestSparseMaskedReductionsCUDA::test_future_empty_dim_masked_amax_cuda_float32, test/test_sparse.py::TestSparseMaskedReductionsCUDA::test_future_empty_dim_masked_amax_cuda_float64, test/test_sparse.py::TestSparseMaskedReductionsCUDA::test_future_empty_dim_masked_amax_cuda_int16, test/test_sparse.py::TestSparseMaskedReductionsCUDA::test_future_empty_dim_masked_amax_cuda_int32, test/test_sparse.py::TestSparseMaskedReductionsCUDA::test_future_empty_dim_masked_amax_cuda_int64, test/test_sparse.py::TestSparseMaskedReductionsCUDA::test_future_empty_dim_masked_amax_cuda_int8, test/test_sparse.py::TestSparseMaskedReductionsCUDA::test_future_empty_dim_masked_amax_cuda_uint8, test/test_sparse.py::TestSparseMaskedReductionsCUDA::test_future_empty_dim_masked_amin_cuda_bfloat16, test/test_sparse.py::TestSparseMaskedReductionsCUDA::test_future_empty_dim_masked_amin_cuda_float16, test/test_sparse.py::TestSparseMaskedReductionsCUDA::test_future_empty_dim_masked_amin_cuda_float32, test/test_sparse.py::TestSparseMaskedReductionsCUDA::test_future_empty_dim_masked_amin_cuda_float64, test/test_sparse.py::TestSparseMaskedReductionsCUDA::test_future_empty_dim_masked_amin_cuda_int16, test/test_sparse.py::TestSparseMaskedReductionsCUDA::test_future_empty_dim_masked_amin_cuda_int32, test/test_sparse.py::TestSparseMaskedReductionsCUDA::test_future_empty_dim_masked_amin_cuda_int64, test/test_sparse.py::TestSparseMaskedReductionsCUDA::test_future_empty_dim_masked_amin_cuda_int8, 
test/test_sparse.py::TestSparseMaskedReductionsCUDA::test_future_empty_dim_masked_amin_cuda_uint8, test/test_sparse.py::TestSparseMaskedReductionsCUDA::test_future_empty_dim_masked_prod_cuda_bfloat16, test/test_sparse.py::TestSparseMaskedReductionsCUDA::test_future_empty_dim_masked_prod_cuda_bool, test/test_sparse.py::TestSparseMaskedReductionsCUDA::test_future_empty_dim_masked_prod_cuda_complex128, test/test_sparse.py::TestSparseMaskedReductionsCUDA::test_future_empty_dim_masked_prod_cuda_complex64, test/test_sparse.py::TestSparseMaskedReductionsCUDA::test_future_empty_dim_masked_prod_cuda_float16, test/test_sparse.py::TestSparseMaskedReductionsCUDA::test_future_empty_dim_masked_prod_cuda_float32, test/test_sparse.py::TestSparseMaskedReductionsCUDA::test_future_empty_dim_masked_prod_cuda_float64, test/test_sparse.py::TestSparseMaskedReductionsCUDA::test_future_empty_dim_masked_prod_cuda_int16, test/test_sparse.py::TestSparseMaskedReductionsCUDA::test_future_empty_dim_masked_prod_cuda_int32, test/test_sparse.py::TestSparseMaskedReductionsCUDA::test_future_empty_dim_masked_prod_cuda_int64, test/test_sparse.py::TestSparseMaskedReductionsCUDA::test_future_empty_dim_masked_prod_cuda_int8, test/test_sparse.py::TestSparseMaskedReductionsCUDA::test_future_empty_dim_masked_prod_cuda_uint8, test/test_sparse.py::TestSparseMaskedReductionsCUDA::test_future_empty_dim_masked_sum_cuda_bfloat16, test/test_sparse.py::TestSparseMaskedReductionsCUDA::test_future_empty_dim_masked_sum_cuda_bool, test/test_sparse.py::TestSparseMaskedReductionsCUDA::test_future_empty_dim_masked_sum_cuda_complex128, test/test_sparse.py::TestSparseMaskedReductionsCUDA::test_future_empty_dim_masked_sum_cuda_complex64, test/test_sparse.py::TestSparseMaskedReductionsCUDA::test_future_empty_dim_masked_sum_cuda_float16, test/test_sparse.py::TestSparseMaskedReductionsCUDA::test_future_empty_dim_masked_sum_cuda_float32, 
test/test_sparse.py::TestSparseMaskedReductionsCUDA::test_future_empty_dim_masked_sum_cuda_float64, test/test_sparse.py::TestSparseMaskedReductionsCUDA::test_future_empty_dim_masked_sum_cuda_int16, test/test_sparse.py::TestSparseMaskedReductionsCUDA::test_future_empty_dim_masked_sum_cuda_int32, test/test_sparse.py::TestSparseMaskedReductionsCUDA::test_future_empty_dim_masked_sum_cuda_int64, test/test_sparse.py::TestSparseMaskedReductionsCUDA::test_future_empty_dim_masked_sum_cuda_int8, test/test_sparse.py::TestSparseMaskedReductionsCUDA::test_future_empty_dim_masked_sum_cuda_uint8, test/test_sparse.py::TestSparseCUDA::test_Sparse_to_Sparse_copy__cuda_bfloat16, test/test_sparse.py::TestSparseCUDA::test_Sparse_to_Sparse_copy__cuda_complex128, test/test_sparse.py::TestSparseCUDA::test_Sparse_to_Sparse_copy__cuda_float64, test/test_sparse.py::TestSparseCUDA::test_Sparse_to_Sparse_copy_multi_gpu_cuda_complex128, test/test_sparse.py::TestSparseCUDA::test_Sparse_to_Sparse_copy_multi_gpu_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_add_dense_sparse_mismatch_cuda_complex128, test/test_sparse.py::TestSparseCUDA::test_add_dense_sparse_mismatch_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_add_noncontiguous_cuda_complex128, test/test_sparse.py::TestSparseCUDA::test_add_noncontiguous_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_add_sub_nnz_cuda_complex128, test/test_sparse.py::TestSparseCUDA::test_add_sub_nnz_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_add_zeros_cuda_complex128, test/test_sparse.py::TestSparseCUDA::test_add_zeros_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_any_cuda, test/test_sparse.py::TestSparseCUDA::test_asin_arcsin_cuda_float32, test/test_sparse.py::TestSparseCUDA::test_asin_arcsin_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_asin_arcsin_cuda_int16, test/test_sparse.py::TestSparseCUDA::test_asin_arcsin_cuda_int32, test/test_sparse.py::TestSparseCUDA::test_asin_arcsin_cuda_int64, 
test/test_sparse.py::TestSparseCUDA::test_asin_arcsin_cuda_int8, test/test_sparse.py::TestSparseCUDA::test_asin_arcsin_cuda_uint8, test/test_sparse.py::TestSparseCUDA::test_assign_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_basic_cuda_complex128, test/test_sparse.py::TestSparseCUDA::test_basic_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_basic_ops_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_bmm_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_bmm_deterministic_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_bmm_oob_cuda, test/test_sparse.py::TestSparseCUDA::test_bmm_windows_error_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_cat_cuda_complex128, test/test_sparse.py::TestSparseCUDA::test_cat_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_change_tensor_metadata_cuda_complex128, test/test_sparse.py::TestSparseCUDA::test_change_tensor_metadata_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_clone_cuda_complex128, test/test_sparse.py::TestSparseCUDA::test_clone_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_coalesce_accepts_large_tensor_cuda_float32, test/test_sparse.py::TestSparseCUDA::test_coalesce_cuda_bfloat16, test/test_sparse.py::TestSparseCUDA::test_coalesce_cuda_complex128, test/test_sparse.py::TestSparseCUDA::test_coalesce_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_coalesce_reference_cycle_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_coalesce_transpose_mm_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_contig_cuda_complex128, test/test_sparse.py::TestSparseCUDA::test_contig_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_contig_hybrid_cuda_complex128, test/test_sparse.py::TestSparseCUDA::test_contig_hybrid_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_ctor_is_coalesced_with_gradcheck_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_ctor_large_sizes_cuda_float64, 
test/test_sparse.py::TestSparseCUDA::test_ctor_size_checks_cuda_complex128, test/test_sparse.py::TestSparseCUDA::test_ctor_size_checks_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_cuda_empty_cuda, test/test_sparse.py::TestSparseCUDA::test_div_by_sparse_error_cuda, test/test_sparse.py::TestSparseCUDA::test_div_rounding_mode_cuda_float32, test/test_sparse.py::TestSparseCUDA::test_div_rounding_mode_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_dsmm_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_dtypes_cuda, test/test_sparse.py::TestSparseCUDA::test_empty_full_requires_grad_False_cuda_bfloat16, test/test_sparse.py::TestSparseCUDA::test_empty_full_requires_grad_False_cuda_bool, test/test_sparse.py::TestSparseCUDA::test_empty_full_requires_grad_False_cuda_complex128, test/test_sparse.py::TestSparseCUDA::test_empty_full_requires_grad_False_cuda_complex64, test/test_sparse.py::TestSparseCUDA::test_empty_full_requires_grad_False_cuda_float16, test/test_sparse.py::TestSparseCUDA::test_empty_full_requires_grad_False_cuda_float32, test/test_sparse.py::TestSparseCUDA::test_empty_full_requires_grad_False_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_empty_full_requires_grad_False_cuda_int16, test/test_sparse.py::TestSparseCUDA::test_empty_full_requires_grad_False_cuda_int32, test/test_sparse.py::TestSparseCUDA::test_empty_full_requires_grad_False_cuda_int64, test/test_sparse.py::TestSparseCUDA::test_empty_full_requires_grad_False_cuda_int8, test/test_sparse.py::TestSparseCUDA::test_empty_full_requires_grad_False_cuda_uint8, test/test_sparse.py::TestSparseCUDA::test_empty_full_requires_grad_True_cuda_bfloat16, test/test_sparse.py::TestSparseCUDA::test_empty_full_requires_grad_True_cuda_bool, test/test_sparse.py::TestSparseCUDA::test_empty_full_requires_grad_True_cuda_complex128, test/test_sparse.py::TestSparseCUDA::test_empty_full_requires_grad_True_cuda_complex64, 
test/test_sparse.py::TestSparseCUDA::test_empty_full_requires_grad_True_cuda_float16, test/test_sparse.py::TestSparseCUDA::test_empty_full_requires_grad_True_cuda_float32, test/test_sparse.py::TestSparseCUDA::test_empty_full_requires_grad_True_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_empty_full_requires_grad_True_cuda_int16, test/test_sparse.py::TestSparseCUDA::test_empty_full_requires_grad_True_cuda_int32, test/test_sparse.py::TestSparseCUDA::test_empty_full_requires_grad_True_cuda_int64, test/test_sparse.py::TestSparseCUDA::test_empty_full_requires_grad_True_cuda_int8, test/test_sparse.py::TestSparseCUDA::test_empty_full_requires_grad_True_cuda_uint8, test/test_sparse.py::TestSparseCUDA::test_empty_like_cuda_complex128, test/test_sparse.py::TestSparseCUDA::test_empty_like_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_factory_copy_cuda, test/test_sparse.py::TestSparseCUDA::test_factory_cuda_complex128, test/test_sparse.py::TestSparseCUDA::test_factory_cuda_complex64, test/test_sparse.py::TestSparseCUDA::test_factory_cuda_float16, test/test_sparse.py::TestSparseCUDA::test_factory_cuda_float32, test/test_sparse.py::TestSparseCUDA::test_factory_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_factory_dense_dim_cuda_complex128, test/test_sparse.py::TestSparseCUDA::test_factory_dense_dim_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_factory_device_type_inference_cuda, test/test_sparse.py::TestSparseCUDA::test_factory_empty_indices_cuda, test/test_sparse.py::TestSparseCUDA::test_factory_nnz_cuda_complex128, test/test_sparse.py::TestSparseCUDA::test_factory_nnz_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_factory_nnz_zero_cuda_complex128, test/test_sparse.py::TestSparseCUDA::test_factory_nnz_zero_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_factory_size_check_cuda_complex128, test/test_sparse.py::TestSparseCUDA::test_factory_size_check_cuda_float64, 
test/test_sparse.py::TestSparseCUDA::test_factory_type_inference_cuda_complex128, test/test_sparse.py::TestSparseCUDA::test_factory_type_inference_cuda_complex64, test/test_sparse.py::TestSparseCUDA::test_factory_type_inference_cuda_float16, test/test_sparse.py::TestSparseCUDA::test_factory_type_inference_cuda_float32, test/test_sparse.py::TestSparseCUDA::test_factory_type_inference_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_factory_type_inference_cuda_int64, test/test_sparse.py::TestSparseCUDA::test_floor_divide_by_sparse_error_cuda, test/test_sparse.py::TestSparseCUDA::test_full_broadcast_to_cuda_complex128, test/test_sparse.py::TestSparseCUDA::test_full_broadcast_to_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_hsmm_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_index_select_cuda_complex128, test/test_sparse.py::TestSparseCUDA::test_index_select_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_index_select_empty_and_non_contiguous_index_cuda_complex128, test/test_sparse.py::TestSparseCUDA::test_index_select_empty_and_non_contiguous_index_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_index_select_exhaustive_index_large_cuda_complex128, test/test_sparse.py::TestSparseCUDA::test_index_select_exhaustive_index_large_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_index_select_exhaustive_index_small_cuda_complex128, test/test_sparse.py::TestSparseCUDA::test_index_select_exhaustive_index_small_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_index_select_parallelization_cuda_complex128, test/test_sparse.py::TestSparseCUDA::test_index_select_parallelization_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_is_nonzero_cuda, test/test_sparse.py::TestSparseCUDA::test_is_sparse_cuda, test/test_sparse.py::TestSparseCUDA::test_isnan_cuda, test/test_sparse.py::TestSparseCUDA::test_legacy_new_cuda, test/test_sparse.py::TestSparseCUDA::test_legacy_new_device_cuda, 
test/test_sparse.py::TestSparseCUDA::test_log1p_cuda_float32, test/test_sparse.py::TestSparseCUDA::test_log1p_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_log1p_cuda_int16, test/test_sparse.py::TestSparseCUDA::test_log1p_cuda_int32, test/test_sparse.py::TestSparseCUDA::test_log1p_cuda_int64, test/test_sparse.py::TestSparseCUDA::test_log1p_cuda_int8, test/test_sparse.py::TestSparseCUDA::test_log1p_cuda_uint8, test/test_sparse.py::TestSparseCUDA::test_log_softmax_float_cuda_float32, test/test_sparse.py::TestSparseCUDA::test_log_softmax_zero_nnz_cuda_float32, test/test_sparse.py::TestSparseCUDA::test_log_softmax_zero_nnz_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_mm_cuda_complex128, test/test_sparse.py::TestSparseCUDA::test_mm_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_mv_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_narrow_cuda_complex128, test/test_sparse.py::TestSparseCUDA::test_narrow_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_neg_negative_cuda_complex128, test/test_sparse.py::TestSparseCUDA::test_neg_negative_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_new_cuda_complex128, test/test_sparse.py::TestSparseCUDA::test_new_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_new_device_multi_gpu_cuda, test/test_sparse.py::TestSparseCUDA::test_new_device_single_gpu_cuda, test/test_sparse.py::TestSparseCUDA::test_norm_cuda_complex128, test/test_sparse.py::TestSparseCUDA::test_norm_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_permute_masked_cuda_complex128, test/test_sparse.py::TestSparseCUDA::test_permute_masked_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_permute_sparse_cuda_complex128, test/test_sparse.py::TestSparseCUDA::test_permute_sparse_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_pickle_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_print_coalesced_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_print_uncoalesced_cuda_float64, 
test/test_sparse.py::TestSparseCUDA::test_resize_as_cuda, test/test_sparse.py::TestSparseCUDA::test_resize_cuda_complex128, test/test_sparse.py::TestSparseCUDA::test_resize_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_saddmm_cuda_complex128, test/test_sparse.py::TestSparseCUDA::test_saddmm_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_same_gpu_cuda, test/test_sparse.py::TestSparseCUDA::test_scalar_cuda_complex128, test/test_sparse.py::TestSparseCUDA::test_scalar_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_select_cuda_complex128, test/test_sparse.py::TestSparseCUDA::test_select_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_select_no_type_promotion_cuda_int16, test/test_sparse.py::TestSparseCUDA::test_select_no_type_promotion_cuda_int32, test/test_sparse.py::TestSparseCUDA::test_select_no_type_promotion_cuda_int64, test/test_sparse.py::TestSparseCUDA::test_select_no_type_promotion_cuda_int8, test/test_sparse.py::TestSparseCUDA::test_select_no_type_promotion_cuda_uint8, test/test_sparse.py::TestSparseCUDA::test_shared_cuda_complex128, test/test_sparse.py::TestSparseCUDA::test_shared_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_small_nnz_coalesced_cuda, test/test_sparse.py::TestSparseCUDA::test_softmax_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_softmax_zero_nnz_cuda_float32, test/test_sparse.py::TestSparseCUDA::test_softmax_zero_nnz_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_spadd_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_sparse_add_coalesce_cuda_complex128, test/test_sparse.py::TestSparseCUDA::test_sparse_add_coalesce_cuda_complex64, test/test_sparse.py::TestSparseCUDA::test_sparse_add_coalesce_cuda_float32, test/test_sparse.py::TestSparseCUDA::test_sparse_add_coalesce_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_sparse_add_out_bfloat16_cuda_float32, test/test_sparse.py::TestSparseCUDA::test_sparse_addmm_cuda_bfloat16, 
test/test_sparse.py::TestSparseCUDA::test_sparse_addmm_cuda_complex128, test/test_sparse.py::TestSparseCUDA::test_sparse_addmm_cuda_float16, test/test_sparse.py::TestSparseCUDA::test_sparse_addmm_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_sparse_bool_cuda_complex128, test/test_sparse.py::TestSparseCUDA::test_sparse_bool_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_sparse_broadcast_to_cuda_complex128, test/test_sparse.py::TestSparseCUDA::test_sparse_broadcast_to_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_sparse_dense_mul_cuda_bfloat16, test/test_sparse.py::TestSparseCUDA::test_sparse_dense_mul_cuda_bool, test/test_sparse.py::TestSparseCUDA::test_sparse_dense_mul_cuda_complex128, test/test_sparse.py::TestSparseCUDA::test_sparse_dense_mul_cuda_complex64, test/test_sparse.py::TestSparseCUDA::test_sparse_dense_mul_cuda_float16, test/test_sparse.py::TestSparseCUDA::test_sparse_dense_mul_cuda_float32, test/test_sparse.py::TestSparseCUDA::test_sparse_dense_mul_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_sparse_dense_mul_cuda_int16, test/test_sparse.py::TestSparseCUDA::test_sparse_dense_mul_cuda_int32, test/test_sparse.py::TestSparseCUDA::test_sparse_dense_mul_cuda_int64, test/test_sparse.py::TestSparseCUDA::test_sparse_dense_mul_cuda_int8, test/test_sparse.py::TestSparseCUDA::test_sparse_dense_mul_cuda_uint8, test/test_sparse.py::TestSparseCUDA::test_sparse_mask_backward_cuda_complex128, test/test_sparse.py::TestSparseCUDA::test_sparse_mask_backward_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_sparse_mask_cuda_complex128, test/test_sparse.py::TestSparseCUDA::test_sparse_mask_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_sparse_mask_hybrid_cuda_complex128, test/test_sparse.py::TestSparseCUDA::test_sparse_mask_hybrid_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_sparse_matmul_cuda_bfloat16, test/test_sparse.py::TestSparseCUDA::test_sparse_matmul_cuda_complex128, 
test/test_sparse.py::TestSparseCUDA::test_sparse_matmul_cuda_complex64, test/test_sparse.py::TestSparseCUDA::test_sparse_matmul_cuda_float16, test/test_sparse.py::TestSparseCUDA::test_sparse_matmul_cuda_float32, test/test_sparse.py::TestSparseCUDA::test_sparse_matmul_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_sparse_mm_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_sparse_mul_masked_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_sparse_mul_sparse_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_sparse_sparse_mul_cuda_bfloat16, test/test_sparse.py::TestSparseCUDA::test_sparse_sparse_mul_cuda_complex128, test/test_sparse.py::TestSparseCUDA::test_sparse_sparse_mul_cuda_complex64, test/test_sparse.py::TestSparseCUDA::test_sparse_sparse_mul_cuda_float16, test/test_sparse.py::TestSparseCUDA::test_sparse_sparse_mul_cuda_float32, test/test_sparse.py::TestSparseCUDA::test_sparse_sparse_mul_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_sparse_sparse_mul_cuda_int16, test/test_sparse.py::TestSparseCUDA::test_sparse_sparse_mul_cuda_int32, test/test_sparse.py::TestSparseCUDA::test_sparse_sparse_mul_cuda_int64, test/test_sparse.py::TestSparseCUDA::test_sparse_sparse_mul_cuda_int8, test/test_sparse.py::TestSparseCUDA::test_sparse_sparse_mul_cuda_uint8, test/test_sparse.py::TestSparseCUDA::test_sparse_spdiags_cuda_bool, test/test_sparse.py::TestSparseCUDA::test_sparse_spdiags_cuda_complex128, test/test_sparse.py::TestSparseCUDA::test_sparse_spdiags_cuda_complex64, test/test_sparse.py::TestSparseCUDA::test_sparse_spdiags_cuda_float32, test/test_sparse.py::TestSparseCUDA::test_sparse_spdiags_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_sparse_spdiags_cuda_int16, test/test_sparse.py::TestSparseCUDA::test_sparse_spdiags_cuda_int32, test/test_sparse.py::TestSparseCUDA::test_sparse_spdiags_cuda_int64, test/test_sparse.py::TestSparseCUDA::test_sparse_spdiags_cuda_int8, 
test/test_sparse.py::TestSparseCUDA::test_sparse_spdiags_cuda_uint8, test/test_sparse.py::TestSparseCUDA::test_sparse_sum_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_sparse_to_numpy_cuda, test/test_sparse.py::TestSparseCUDA::test_sspaddmm_cuda_complex128, test/test_sparse.py::TestSparseCUDA::test_sspaddmm_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_storage_not_null_cuda, test/test_sparse.py::TestSparseCUDA::test_sum_cuda_bool, test/test_sparse.py::TestSparseCUDA::test_sum_cuda_complex128, test/test_sparse.py::TestSparseCUDA::test_sum_cuda_complex64, test/test_sparse.py::TestSparseCUDA::test_sum_cuda_float32, test/test_sparse.py::TestSparseCUDA::test_sum_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_sum_cuda_int16, test/test_sparse.py::TestSparseCUDA::test_sum_cuda_int32, test/test_sparse.py::TestSparseCUDA::test_sum_cuda_int64, test/test_sparse.py::TestSparseCUDA::test_sum_cuda_int8, test/test_sparse.py::TestSparseCUDA::test_sum_cuda_uint8, test/test_sparse.py::TestSparseCUDA::test_t_empty_cuda_complex128, test/test_sparse.py::TestSparseCUDA::test_t_empty_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_to_dense_hybrid_masked_cuda_complex128, test/test_sparse.py::TestSparseCUDA::test_to_dense_hybrid_masked_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_to_dense_hybrid_sparse_cuda_complex128, test/test_sparse.py::TestSparseCUDA::test_to_dense_hybrid_sparse_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_to_dense_with_gradcheck_masked_cuda_bfloat16, test/test_sparse.py::TestSparseCUDA::test_to_dense_with_gradcheck_masked_cuda_complex128, test/test_sparse.py::TestSparseCUDA::test_to_dense_with_gradcheck_masked_cuda_complex64, test/test_sparse.py::TestSparseCUDA::test_to_dense_with_gradcheck_masked_cuda_float16, test/test_sparse.py::TestSparseCUDA::test_to_dense_with_gradcheck_masked_cuda_float32, test/test_sparse.py::TestSparseCUDA::test_to_dense_with_gradcheck_masked_cuda_float64, 
test/test_sparse.py::TestSparseCUDA::test_to_dense_with_gradcheck_sparse_cuda_bfloat16, test/test_sparse.py::TestSparseCUDA::test_to_dense_with_gradcheck_sparse_cuda_complex128, test/test_sparse.py::TestSparseCUDA::test_to_dense_with_gradcheck_sparse_cuda_complex64, test/test_sparse.py::TestSparseCUDA::test_to_dense_with_gradcheck_sparse_cuda_float16, test/test_sparse.py::TestSparseCUDA::test_to_dense_with_gradcheck_sparse_cuda_float32, test/test_sparse.py::TestSparseCUDA::test_to_dense_with_gradcheck_sparse_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_to_sparse_cuda_bfloat16, test/test_sparse.py::TestSparseCUDA::test_to_sparse_cuda_complex128, test/test_sparse.py::TestSparseCUDA::test_to_sparse_cuda_complex64, test/test_sparse.py::TestSparseCUDA::test_to_sparse_cuda_float16, test/test_sparse.py::TestSparseCUDA::test_to_sparse_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_to_sparse_cuda_int32, test/test_sparse.py::TestSparseCUDA::test_transpose_cuda_complex128, test/test_sparse.py::TestSparseCUDA::test_transpose_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_unsqueeze_cuda_complex128, test/test_sparse.py::TestSparseCUDA::test_unsqueeze_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_zeros_cuda_complex128, test/test_sparse.py::TestSparseCUDA::test_zeros_cuda_float64, test/test_sparse.py::TestSparseCUDA::test_zeros_like_cuda_complex128, test/test_sparse.py::TestSparseCUDA::test_zeros_like_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_as_sparse_gradcheck_SparseBSC_masked_fast_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_as_sparse_gradcheck_SparseBSC_masked_slow_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_as_sparse_gradcheck_SparseBSC_nonmasked_fast_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_as_sparse_gradcheck_SparseBSC_nonmasked_slow_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_as_sparse_gradcheck_SparseBSR_masked_fast_cuda, 
test/test_sparse.py::TestSparseAnyCUDA::test_as_sparse_gradcheck_SparseBSR_masked_slow_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_as_sparse_gradcheck_SparseBSR_nonmasked_fast_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_as_sparse_gradcheck_SparseBSR_nonmasked_slow_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_as_sparse_gradcheck_SparseCOO_masked_fast_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_as_sparse_gradcheck_SparseCOO_masked_slow_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_as_sparse_gradcheck_SparseCOO_nonmasked_fast_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_as_sparse_gradcheck_SparseCOO_nonmasked_slow_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_as_sparse_gradcheck_SparseCSC_masked_fast_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_as_sparse_gradcheck_SparseCSC_masked_slow_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_as_sparse_gradcheck_SparseCSC_nonmasked_fast_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_as_sparse_gradcheck_SparseCSC_nonmasked_slow_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_as_sparse_gradcheck_SparseCSR_masked_fast_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_as_sparse_gradcheck_SparseCSR_masked_slow_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_as_sparse_gradcheck_SparseCSR_nonmasked_fast_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_as_sparse_gradcheck_SparseCSR_nonmasked_slow_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_binary_operation_mul_SparseBSC_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_binary_operation_mul_SparseBSC_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_binary_operation_mul_SparseBSC_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_binary_operation_mul_SparseBSC_cuda_complex32, test/test_sparse.py::TestSparseAnyCUDA::test_binary_operation_mul_SparseBSC_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_binary_operation_mul_SparseBSC_cuda_float16, 
test/test_sparse.py::TestSparseAnyCUDA::test_binary_operation_mul_SparseBSC_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_binary_operation_mul_SparseBSC_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_binary_operation_mul_SparseBSC_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_binary_operation_mul_SparseBSC_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_binary_operation_mul_SparseBSC_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_binary_operation_mul_SparseBSC_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_binary_operation_mul_SparseBSC_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_binary_operation_mul_SparseBSR_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_binary_operation_mul_SparseBSR_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_binary_operation_mul_SparseBSR_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_binary_operation_mul_SparseBSR_cuda_complex32, test/test_sparse.py::TestSparseAnyCUDA::test_binary_operation_mul_SparseBSR_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_binary_operation_mul_SparseBSR_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_binary_operation_mul_SparseBSR_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_binary_operation_mul_SparseBSR_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_binary_operation_mul_SparseBSR_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_binary_operation_mul_SparseBSR_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_binary_operation_mul_SparseBSR_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_binary_operation_mul_SparseBSR_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_binary_operation_mul_SparseBSR_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_binary_operation_mul_SparseCOO_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_binary_operation_mul_SparseCOO_cuda_bool, 
test/test_sparse.py::TestSparseAnyCUDA::test_binary_operation_mul_SparseCOO_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_binary_operation_mul_SparseCOO_cuda_complex32, test/test_sparse.py::TestSparseAnyCUDA::test_binary_operation_mul_SparseCOO_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_binary_operation_mul_SparseCOO_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_binary_operation_mul_SparseCOO_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_binary_operation_mul_SparseCOO_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_binary_operation_mul_SparseCOO_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_binary_operation_mul_SparseCOO_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_binary_operation_mul_SparseCOO_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_binary_operation_mul_SparseCOO_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_binary_operation_mul_SparseCOO_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_binary_operation_mul_SparseCSC_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_binary_operation_mul_SparseCSC_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_binary_operation_mul_SparseCSC_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_binary_operation_mul_SparseCSC_cuda_complex32, test/test_sparse.py::TestSparseAnyCUDA::test_binary_operation_mul_SparseCSC_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_binary_operation_mul_SparseCSC_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_binary_operation_mul_SparseCSC_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_binary_operation_mul_SparseCSC_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_binary_operation_mul_SparseCSC_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_binary_operation_mul_SparseCSC_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_binary_operation_mul_SparseCSC_cuda_int64, 
test/test_sparse.py::TestSparseAnyCUDA::test_binary_operation_mul_SparseCSC_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_binary_operation_mul_SparseCSC_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_binary_operation_mul_SparseCSR_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_binary_operation_mul_SparseCSR_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_binary_operation_mul_SparseCSR_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_binary_operation_mul_SparseCSR_cuda_complex32, test/test_sparse.py::TestSparseAnyCUDA::test_binary_operation_mul_SparseCSR_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_binary_operation_mul_SparseCSR_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_binary_operation_mul_SparseCSR_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_binary_operation_mul_SparseCSR_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_binary_operation_mul_SparseCSR_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_binary_operation_mul_SparseCSR_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_binary_operation_mul_SparseCSR_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_binary_operation_mul_SparseCSR_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_binary_operation_mul_SparseCSR_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_check_sparse_tensor_invariants_SparseBSC_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_check_sparse_tensor_invariants_SparseBSR_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_check_sparse_tensor_invariants_SparseCOO_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_check_sparse_tensor_invariants_SparseCSC_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_check_sparse_tensor_invariants_SparseCSR_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_constructor_autograd_SparseBSC_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_constructor_autograd_SparseBSR_cuda, 
test/test_sparse.py::TestSparseAnyCUDA::test_constructor_autograd_SparseCOO_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_constructor_autograd_SparseCSC_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_constructor_autograd_SparseCSR_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_constructor_mismatched_pinned_memory_SparseBSC_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_constructor_mismatched_pinned_memory_SparseBSR_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_constructor_mismatched_pinned_memory_SparseCOO_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_constructor_mismatched_pinned_memory_SparseCSC_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_constructor_mismatched_pinned_memory_SparseCSR_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_constructor_pin_memory_SparseBSC_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_constructor_pin_memory_SparseBSR_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_constructor_pin_memory_SparseCOO_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_constructor_pin_memory_SparseCSC_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_constructor_pin_memory_SparseCSR_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_constructor_pin_memory_Strided_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_constructor_pinned_memory_SparseBSC_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_constructor_pinned_memory_SparseBSR_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_constructor_pinned_memory_SparseCOO_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_constructor_pinned_memory_SparseCSC_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_constructor_pinned_memory_SparseCSR_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_constructor_pinned_memory_Strided_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_dataloader_SparseBSC_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_dataloader_SparseBSR_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_dataloader_SparseCOO_cuda_float64, 
test/test_sparse.py::TestSparseAnyCUDA::test_dataloader_SparseCSC_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_dataloader_SparseCSR_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_generate_simple_inputs_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_gradcheck_mm_SparseBSC_masked_fast_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_gradcheck_mm_SparseBSC_masked_fast_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_gradcheck_mm_SparseBSC_masked_slow_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_gradcheck_mm_SparseBSC_masked_slow_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_gradcheck_mm_SparseBSC_sparse_fast_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_gradcheck_mm_SparseBSC_sparse_fast_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_gradcheck_mm_SparseBSC_sparse_slow_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_gradcheck_mm_SparseBSC_sparse_slow_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_gradcheck_mm_SparseBSR_masked_fast_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_gradcheck_mm_SparseBSR_masked_fast_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_gradcheck_mm_SparseBSR_masked_slow_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_gradcheck_mm_SparseBSR_masked_slow_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_gradcheck_mm_SparseBSR_sparse_fast_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_gradcheck_mm_SparseBSR_sparse_fast_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_gradcheck_mm_SparseBSR_sparse_slow_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_gradcheck_mm_SparseBSR_sparse_slow_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_gradcheck_mm_SparseCOO_masked_fast_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_gradcheck_mm_SparseCOO_masked_fast_cuda_float64, 
test/test_sparse.py::TestSparseAnyCUDA::test_gradcheck_mm_SparseCOO_masked_slow_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_gradcheck_mm_SparseCOO_masked_slow_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_gradcheck_mm_SparseCOO_sparse_fast_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_gradcheck_mm_SparseCOO_sparse_fast_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_gradcheck_mm_SparseCOO_sparse_slow_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_gradcheck_mm_SparseCOO_sparse_slow_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_gradcheck_mm_SparseCSC_masked_fast_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_gradcheck_mm_SparseCSC_masked_fast_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_gradcheck_mm_SparseCSC_masked_slow_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_gradcheck_mm_SparseCSC_masked_slow_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_gradcheck_mm_SparseCSC_sparse_fast_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_gradcheck_mm_SparseCSC_sparse_fast_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_gradcheck_mm_SparseCSC_sparse_slow_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_gradcheck_mm_SparseCSC_sparse_slow_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_gradcheck_mm_SparseCSR_masked_fast_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_gradcheck_mm_SparseCSR_masked_fast_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_gradcheck_mm_SparseCSR_masked_slow_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_gradcheck_mm_SparseCSR_masked_slow_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_gradcheck_mm_SparseCSR_sparse_fast_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_gradcheck_mm_SparseCSR_sparse_fast_cuda_float64, 
test/test_sparse.py::TestSparseAnyCUDA::test_gradcheck_mm_SparseCSR_sparse_slow_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_gradcheck_mm_SparseCSR_sparse_slow_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_gradcheck_to_dense_SparseBSC_int64_masked_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_gradcheck_to_dense_SparseBSC_int64_masked_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_gradcheck_to_dense_SparseBSC_int64_sparse_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_gradcheck_to_dense_SparseBSC_int64_sparse_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_gradcheck_to_dense_SparseBSR_int64_masked_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_gradcheck_to_dense_SparseBSR_int64_masked_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_gradcheck_to_dense_SparseBSR_int64_sparse_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_gradcheck_to_dense_SparseBSR_int64_sparse_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_gradcheck_to_dense_SparseCOO_int64_masked_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_gradcheck_to_dense_SparseCOO_int64_masked_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_gradcheck_to_dense_SparseCOO_int64_sparse_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_gradcheck_to_dense_SparseCOO_int64_sparse_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_gradcheck_to_dense_SparseCSC_int64_masked_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_gradcheck_to_dense_SparseCSC_int64_masked_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_gradcheck_to_dense_SparseCSC_int64_sparse_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_gradcheck_to_dense_SparseCSC_int64_sparse_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_gradcheck_to_dense_SparseCSR_int64_masked_cuda_complex128, 
test/test_sparse.py::TestSparseAnyCUDA::test_gradcheck_to_dense_SparseCSR_int64_masked_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_gradcheck_to_dense_SparseCSR_int64_sparse_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_gradcheck_to_dense_SparseCSR_int64_sparse_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_invalid_blocksize_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_randn_like_SparseBSC_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_randn_like_SparseBSC_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_randn_like_SparseBSC_cuda_complex32, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_randn_like_SparseBSC_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_randn_like_SparseBSC_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_randn_like_SparseBSC_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_randn_like_SparseBSC_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_randn_like_SparseBSR_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_randn_like_SparseBSR_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_randn_like_SparseBSR_cuda_complex32, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_randn_like_SparseBSR_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_randn_like_SparseBSR_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_randn_like_SparseBSR_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_randn_like_SparseBSR_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_randn_like_SparseCOO_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_randn_like_SparseCOO_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_randn_like_SparseCOO_cuda_complex32, 
test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_randn_like_SparseCOO_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_randn_like_SparseCOO_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_randn_like_SparseCOO_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_randn_like_SparseCOO_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_randn_like_SparseCSC_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_randn_like_SparseCSC_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_randn_like_SparseCSC_cuda_complex32, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_randn_like_SparseCSC_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_randn_like_SparseCSC_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_randn_like_SparseCSC_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_randn_like_SparseCSC_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_randn_like_SparseCSR_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_randn_like_SparseCSR_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_randn_like_SparseCSR_cuda_complex32, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_randn_like_SparseCSR_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_randn_like_SparseCSR_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_randn_like_SparseCSR_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_randn_like_SparseCSR_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_zeros_like_SparseBSC_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_zeros_like_SparseBSC_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_zeros_like_SparseBSC_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_zeros_like_SparseBSC_cuda_complex32, 
test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_zeros_like_SparseBSC_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_zeros_like_SparseBSC_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_zeros_like_SparseBSC_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_zeros_like_SparseBSC_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_zeros_like_SparseBSC_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_zeros_like_SparseBSC_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_zeros_like_SparseBSC_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_zeros_like_SparseBSC_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_zeros_like_SparseBSC_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_zeros_like_SparseBSR_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_zeros_like_SparseBSR_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_zeros_like_SparseBSR_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_zeros_like_SparseBSR_cuda_complex32, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_zeros_like_SparseBSR_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_zeros_like_SparseBSR_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_zeros_like_SparseBSR_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_zeros_like_SparseBSR_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_zeros_like_SparseBSR_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_zeros_like_SparseBSR_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_zeros_like_SparseBSR_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_zeros_like_SparseBSR_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_zeros_like_SparseBSR_cuda_uint8, 
test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_zeros_like_SparseCOO_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_zeros_like_SparseCOO_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_zeros_like_SparseCOO_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_zeros_like_SparseCOO_cuda_complex32, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_zeros_like_SparseCOO_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_zeros_like_SparseCOO_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_zeros_like_SparseCOO_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_zeros_like_SparseCOO_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_zeros_like_SparseCOO_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_zeros_like_SparseCOO_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_zeros_like_SparseCOO_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_zeros_like_SparseCOO_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_zeros_like_SparseCOO_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_zeros_like_SparseCSC_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_zeros_like_SparseCSC_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_zeros_like_SparseCSC_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_zeros_like_SparseCSC_cuda_complex32, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_zeros_like_SparseCSC_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_zeros_like_SparseCSC_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_zeros_like_SparseCSC_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_zeros_like_SparseCSC_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_zeros_like_SparseCSC_cuda_int16, 
test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_zeros_like_SparseCSC_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_zeros_like_SparseCSC_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_zeros_like_SparseCSC_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_zeros_like_SparseCSC_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_zeros_like_SparseCSR_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_zeros_like_SparseCSR_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_zeros_like_SparseCSR_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_zeros_like_SparseCSR_cuda_complex32, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_zeros_like_SparseCSR_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_zeros_like_SparseCSR_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_zeros_like_SparseCSR_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_zeros_like_SparseCSR_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_zeros_like_SparseCSR_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_zeros_like_SparseCSR_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_zeros_like_SparseCSR_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_zeros_like_SparseCSR_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_like_fns_zeros_like_SparseCSR_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_method_pin_memory_SparseBSC_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_method_pin_memory_SparseBSR_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_method_pin_memory_SparseCOO_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_method_pin_memory_SparseCSC_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_method_pin_memory_SparseCSR_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_method_pin_memory_Strided_cuda, 
test/test_sparse.py::TestSparseAnyCUDA::test_reductions_backward_sum_SparseBSC_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_backward_sum_SparseBSC_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_backward_sum_SparseBSC_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_backward_sum_SparseBSC_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_backward_sum_SparseBSR_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_backward_sum_SparseBSR_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_backward_sum_SparseBSR_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_backward_sum_SparseBSR_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_backward_sum_SparseCOO_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_backward_sum_SparseCOO_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_backward_sum_SparseCOO_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_backward_sum_SparseCOO_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_backward_sum_SparseCSC_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_backward_sum_SparseCSC_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_backward_sum_SparseCSC_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_backward_sum_SparseCSC_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_backward_sum_SparseCSR_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_backward_sum_SparseCSR_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_backward_sum_SparseCSR_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_backward_sum_SparseCSR_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_sum_SparseBSC_cuda_bfloat16, 
test/test_sparse.py::TestSparseAnyCUDA::test_reductions_sum_SparseBSC_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_sum_SparseBSC_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_sum_SparseBSC_cuda_complex32, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_sum_SparseBSC_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_sum_SparseBSC_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_sum_SparseBSC_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_sum_SparseBSC_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_sum_SparseBSC_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_sum_SparseBSC_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_sum_SparseBSC_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_sum_SparseBSC_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_sum_SparseBSC_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_sum_SparseBSR_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_sum_SparseBSR_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_sum_SparseBSR_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_sum_SparseBSR_cuda_complex32, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_sum_SparseBSR_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_sum_SparseBSR_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_sum_SparseBSR_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_sum_SparseBSR_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_sum_SparseBSR_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_sum_SparseBSR_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_sum_SparseBSR_cuda_int64, 
test/test_sparse.py::TestSparseAnyCUDA::test_reductions_sum_SparseBSR_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_sum_SparseBSR_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_sum_SparseCOO_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_sum_SparseCOO_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_sum_SparseCOO_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_sum_SparseCOO_cuda_complex32, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_sum_SparseCOO_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_sum_SparseCOO_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_sum_SparseCOO_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_sum_SparseCOO_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_sum_SparseCOO_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_sum_SparseCOO_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_sum_SparseCOO_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_sum_SparseCOO_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_sum_SparseCOO_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_sum_SparseCSC_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_sum_SparseCSC_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_sum_SparseCSC_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_sum_SparseCSC_cuda_complex32, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_sum_SparseCSC_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_sum_SparseCSC_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_sum_SparseCSC_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_sum_SparseCSC_cuda_float64, 
test/test_sparse.py::TestSparseAnyCUDA::test_reductions_sum_SparseCSC_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_sum_SparseCSC_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_sum_SparseCSC_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_sum_SparseCSC_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_sum_SparseCSC_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_sum_SparseCSR_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_sum_SparseCSR_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_sum_SparseCSR_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_sum_SparseCSR_cuda_complex32, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_sum_SparseCSR_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_sum_SparseCSR_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_sum_SparseCSR_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_sum_SparseCSR_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_sum_SparseCSR_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_sum_SparseCSR_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_sum_SparseCSR_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_sum_SparseCSR_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_reductions_sum_SparseCSR_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_sparse_mask_SparseBSC_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_sparse_mask_SparseBSC_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_sparse_mask_SparseBSC_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_sparse_mask_SparseBSC_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_sparse_mask_SparseBSC_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_sparse_mask_SparseBSC_cuda_float32, 
test/test_sparse.py::TestSparseAnyCUDA::test_sparse_mask_SparseBSC_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_sparse_mask_SparseBSC_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_sparse_mask_SparseBSC_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_sparse_mask_SparseBSC_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_sparse_mask_SparseBSC_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_sparse_mask_SparseBSC_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_sparse_mask_SparseBSR_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_sparse_mask_SparseBSR_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_sparse_mask_SparseBSR_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_sparse_mask_SparseBSR_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_sparse_mask_SparseBSR_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_sparse_mask_SparseBSR_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_sparse_mask_SparseBSR_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_sparse_mask_SparseBSR_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_sparse_mask_SparseBSR_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_sparse_mask_SparseBSR_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_sparse_mask_SparseBSR_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_sparse_mask_SparseBSR_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_sparse_mask_SparseCOO_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_sparse_mask_SparseCOO_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_sparse_mask_SparseCOO_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_sparse_mask_SparseCOO_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_sparse_mask_SparseCOO_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_sparse_mask_SparseCOO_cuda_float32, 
test/test_sparse.py::TestSparseAnyCUDA::test_sparse_mask_SparseCOO_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_sparse_mask_SparseCOO_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_sparse_mask_SparseCOO_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_sparse_mask_SparseCOO_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_sparse_mask_SparseCOO_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_sparse_mask_SparseCOO_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_sparse_mask_SparseCSC_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_sparse_mask_SparseCSC_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_sparse_mask_SparseCSC_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_sparse_mask_SparseCSC_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_sparse_mask_SparseCSC_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_sparse_mask_SparseCSC_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_sparse_mask_SparseCSC_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_sparse_mask_SparseCSC_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_sparse_mask_SparseCSC_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_sparse_mask_SparseCSC_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_sparse_mask_SparseCSC_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_sparse_mask_SparseCSC_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_sparse_mask_SparseCSR_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_sparse_mask_SparseCSR_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_sparse_mask_SparseCSR_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_sparse_mask_SparseCSR_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_sparse_mask_SparseCSR_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_sparse_mask_SparseCSR_cuda_float32, 
test/test_sparse.py::TestSparseAnyCUDA::test_sparse_mask_SparseCSR_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_sparse_mask_SparseCSR_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_sparse_mask_SparseCSR_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_sparse_mask_SparseCSR_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_sparse_mask_SparseCSR_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_sparse_mask_SparseCSR_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseBSC_int32_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseBSC_int32_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseBSC_int32_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseBSC_int32_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseBSC_int32_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseBSC_int32_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseBSC_int32_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseBSC_int32_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseBSC_int32_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseBSC_int32_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseBSC_int32_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseBSC_int32_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseBSC_int64_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseBSC_int64_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseBSC_int64_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseBSC_int64_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseBSC_int64_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseBSC_int64_cuda_float32, 
test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseBSC_int64_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseBSC_int64_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseBSC_int64_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseBSC_int64_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseBSC_int64_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseBSC_int64_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseBSR_int32_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseBSR_int32_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseBSR_int32_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseBSR_int32_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseBSR_int32_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseBSR_int32_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseBSR_int32_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseBSR_int32_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseBSR_int32_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseBSR_int32_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseBSR_int32_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseBSR_int32_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseBSR_int64_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseBSR_int64_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseBSR_int64_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseBSR_int64_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseBSR_int64_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseBSR_int64_cuda_float32, 
test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseBSR_int64_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseBSR_int64_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseBSR_int64_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseBSR_int64_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseBSR_int64_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseBSR_int64_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCOO_int32_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCOO_int32_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCOO_int32_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCOO_int32_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCOO_int32_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCOO_int32_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCOO_int32_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCOO_int32_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCOO_int32_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCOO_int32_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCOO_int32_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCOO_int32_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCOO_int64_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCOO_int64_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCOO_int64_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCOO_int64_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCOO_int64_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCOO_int64_cuda_float32, 
test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCOO_int64_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCOO_int64_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCOO_int64_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCOO_int64_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCOO_int64_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCOO_int64_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCSC_int32_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCSC_int32_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCSC_int32_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCSC_int32_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCSC_int32_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCSC_int32_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCSC_int32_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCSC_int32_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCSC_int32_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCSC_int32_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCSC_int32_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCSC_int32_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCSC_int64_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCSC_int64_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCSC_int64_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCSC_int64_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCSC_int64_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCSC_int64_cuda_float32, 
test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCSC_int64_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCSC_int64_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCSC_int64_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCSC_int64_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCSC_int64_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCSC_int64_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCSR_int32_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCSR_int32_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCSR_int32_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCSR_int32_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCSR_int32_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCSR_int32_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCSR_int32_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCSR_int32_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCSR_int32_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCSR_int32_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCSR_int32_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCSR_int32_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCSR_int64_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCSR_int64_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCSR_int64_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCSR_int64_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCSR_int64_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCSR_int64_cuda_float32, 
test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCSR_int64_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCSR_int64_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCSR_int64_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCSR_int64_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCSR_int64_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_dense_SparseCSR_int64_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseBSC_int32_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseBSC_int32_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseBSC_int32_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseBSC_int32_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseBSC_int32_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseBSC_int32_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseBSC_int32_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseBSC_int32_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseBSC_int32_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseBSC_int32_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseBSC_int32_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseBSC_int32_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseBSC_int64_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseBSC_int64_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseBSC_int64_cuda_complex128, 
test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseBSC_int64_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseBSC_int64_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseBSC_int64_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseBSC_int64_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseBSC_int64_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseBSC_int64_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseBSC_int64_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseBSC_int64_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseBSC_int64_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseBSR_int32_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseBSR_int32_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseBSR_int32_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseBSR_int32_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseBSR_int32_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseBSR_int32_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseBSR_int32_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseBSR_int32_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseBSR_int32_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseBSR_int32_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseBSR_int32_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseBSR_int32_cuda_uint8, 
test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseBSR_int64_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseBSR_int64_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseBSR_int64_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseBSR_int64_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseBSR_int64_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseBSR_int64_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseBSR_int64_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseBSR_int64_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseBSR_int64_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseBSR_int64_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseBSR_int64_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseBSR_int64_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCOO_int32_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCOO_int32_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCOO_int32_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCOO_int32_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCOO_int32_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCOO_int32_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCOO_int32_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCOO_int32_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCOO_int32_cuda_int32, 
test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCOO_int32_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCOO_int32_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCOO_int32_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCOO_int64_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCOO_int64_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCOO_int64_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCOO_int64_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCOO_int64_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCOO_int64_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCOO_int64_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCOO_int64_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCOO_int64_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCOO_int64_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCOO_int64_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCOO_int64_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCSC_int32_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCSC_int32_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCSC_int32_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCSC_int32_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCSC_int32_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCSC_int32_cuda_float32, 
test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCSC_int32_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCSC_int32_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCSC_int32_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCSC_int32_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCSC_int32_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCSC_int32_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCSC_int64_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCSC_int64_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCSC_int64_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCSC_int64_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCSC_int64_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCSC_int64_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCSC_int64_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCSC_int64_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCSC_int64_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCSC_int64_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCSC_int64_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCSC_int64_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCSR_int32_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCSR_int32_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCSR_int32_cuda_complex128, 
test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCSR_int32_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCSR_int32_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCSR_int32_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCSR_int32_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCSR_int32_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCSR_int32_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCSR_int32_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCSR_int32_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCSR_int32_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCSR_int64_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCSR_int64_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCSR_int64_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCSR_int64_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCSR_int64_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCSR_int64_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCSR_int64_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCSR_int64_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCSR_int64_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCSR_int64_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCSR_int64_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSC_SparseCSR_int64_cuda_uint8, 
test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseBSC_int32_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseBSC_int32_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseBSC_int32_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseBSC_int32_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseBSC_int32_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseBSC_int32_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseBSC_int32_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseBSC_int32_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseBSC_int32_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseBSC_int32_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseBSC_int32_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseBSC_int32_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseBSC_int64_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseBSC_int64_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseBSC_int64_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseBSC_int64_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseBSC_int64_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseBSC_int64_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseBSC_int64_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseBSC_int64_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseBSC_int64_cuda_int32, 
test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseBSC_int64_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseBSC_int64_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseBSC_int64_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseBSR_int32_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseBSR_int32_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseBSR_int32_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseBSR_int32_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseBSR_int32_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseBSR_int32_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseBSR_int32_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseBSR_int32_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseBSR_int32_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseBSR_int32_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseBSR_int32_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseBSR_int32_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseBSR_int64_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseBSR_int64_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseBSR_int64_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseBSR_int64_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseBSR_int64_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseBSR_int64_cuda_float32, 
test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseBSR_int64_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseBSR_int64_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseBSR_int64_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseBSR_int64_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseBSR_int64_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseBSR_int64_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCOO_int32_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCOO_int32_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCOO_int32_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCOO_int32_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCOO_int32_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCOO_int32_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCOO_int32_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCOO_int32_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCOO_int32_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCOO_int32_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCOO_int32_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCOO_int32_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCOO_int64_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCOO_int64_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCOO_int64_cuda_complex128, 
test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCOO_int64_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCOO_int64_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCOO_int64_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCOO_int64_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCOO_int64_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCOO_int64_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCOO_int64_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCOO_int64_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCOO_int64_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCSC_int32_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCSC_int32_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCSC_int32_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCSC_int32_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCSC_int32_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCSC_int32_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCSC_int32_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCSC_int32_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCSC_int32_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCSC_int32_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCSC_int32_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCSC_int32_cuda_uint8, 
test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCSC_int64_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCSC_int64_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCSC_int64_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCSC_int64_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCSC_int64_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCSC_int64_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCSC_int64_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCSC_int64_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCSC_int64_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCSC_int64_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCSC_int64_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCSC_int64_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCSR_int32_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCSR_int32_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCSR_int32_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCSR_int32_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCSR_int32_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCSR_int32_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCSR_int32_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCSR_int32_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCSR_int32_cuda_int32, 
test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCSR_int32_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCSR_int32_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCSR_int32_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCSR_int64_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCSR_int64_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCSR_int64_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCSR_int64_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCSR_int64_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCSR_int64_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCSR_int64_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCSR_int64_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCSR_int64_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCSR_int64_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCSR_int64_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseBSR_SparseCSR_int64_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseBSC_int32_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseBSC_int32_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseBSC_int32_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseBSC_int32_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseBSC_int32_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseBSC_int32_cuda_float32, 
test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseBSC_int32_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseBSC_int32_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseBSC_int32_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseBSC_int32_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseBSC_int32_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseBSC_int32_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseBSC_int64_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseBSC_int64_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseBSC_int64_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseBSC_int64_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseBSC_int64_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseBSC_int64_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseBSC_int64_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseBSC_int64_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseBSC_int64_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseBSC_int64_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseBSC_int64_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseBSC_int64_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseBSR_int32_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseBSR_int32_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseBSR_int32_cuda_complex128, 
test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseBSR_int32_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseBSR_int32_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseBSR_int32_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseBSR_int32_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseBSR_int32_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseBSR_int32_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseBSR_int32_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseBSR_int32_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseBSR_int32_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseBSR_int64_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseBSR_int64_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseBSR_int64_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseBSR_int64_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseBSR_int64_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseBSR_int64_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseBSR_int64_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseBSR_int64_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseBSR_int64_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseBSR_int64_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseBSR_int64_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseBSR_int64_cuda_uint8, 
test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCOO_int32_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCOO_int32_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCOO_int32_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCOO_int32_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCOO_int32_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCOO_int32_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCOO_int32_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCOO_int32_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCOO_int32_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCOO_int32_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCOO_int32_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCOO_int32_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCOO_int64_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCOO_int64_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCOO_int64_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCOO_int64_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCOO_int64_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCOO_int64_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCOO_int64_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCOO_int64_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCOO_int64_cuda_int32, 
test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCOO_int64_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCOO_int64_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCOO_int64_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCSC_int32_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCSC_int32_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCSC_int32_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCSC_int32_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCSC_int32_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCSC_int32_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCSC_int32_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCSC_int32_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCSC_int32_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCSC_int32_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCSC_int32_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCSC_int32_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCSC_int64_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCSC_int64_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCSC_int64_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCSC_int64_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCSC_int64_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCSC_int64_cuda_float32, 
test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCSC_int64_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCSC_int64_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCSC_int64_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCSC_int64_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCSC_int64_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCSC_int64_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCSR_int32_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCSR_int32_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCSR_int32_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCSR_int32_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCSR_int32_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCSR_int32_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCSR_int32_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCSR_int32_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCSR_int32_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCSR_int32_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCSR_int32_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCSR_int32_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCSR_int64_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCSR_int64_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCSR_int64_cuda_complex128, 
test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCSR_int64_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCSR_int64_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCSR_int64_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCSR_int64_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCSR_int64_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCSR_int64_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCSR_int64_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCSR_int64_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCOO_SparseCSR_int64_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseBSC_int32_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseBSC_int32_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseBSC_int32_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseBSC_int32_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseBSC_int32_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseBSC_int32_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseBSC_int32_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseBSC_int32_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseBSC_int32_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseBSC_int32_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseBSC_int32_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseBSC_int32_cuda_uint8, 
test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseBSC_int64_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseBSC_int64_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseBSC_int64_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseBSC_int64_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseBSC_int64_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseBSC_int64_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseBSC_int64_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseBSC_int64_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseBSC_int64_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseBSC_int64_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseBSC_int64_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseBSC_int64_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseBSR_int32_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseBSR_int32_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseBSR_int32_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseBSR_int32_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseBSR_int32_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseBSR_int32_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseBSR_int32_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseBSR_int32_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseBSR_int32_cuda_int32, 
test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseBSR_int32_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseBSR_int32_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseBSR_int32_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseBSR_int64_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseBSR_int64_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseBSR_int64_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseBSR_int64_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseBSR_int64_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseBSR_int64_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseBSR_int64_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseBSR_int64_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseBSR_int64_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseBSR_int64_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseBSR_int64_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseBSR_int64_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCOO_int32_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCOO_int32_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCOO_int32_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCOO_int32_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCOO_int32_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCOO_int32_cuda_float32, 
test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCOO_int32_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCOO_int32_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCOO_int32_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCOO_int32_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCOO_int32_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCOO_int32_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCOO_int64_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCOO_int64_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCOO_int64_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCOO_int64_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCOO_int64_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCOO_int64_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCOO_int64_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCOO_int64_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCOO_int64_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCOO_int64_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCOO_int64_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCOO_int64_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCSC_int32_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCSC_int32_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCSC_int32_cuda_complex128, 
test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCSC_int32_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCSC_int32_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCSC_int32_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCSC_int32_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCSC_int32_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCSC_int32_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCSC_int32_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCSC_int32_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCSC_int32_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCSC_int64_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCSC_int64_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCSC_int64_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCSC_int64_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCSC_int64_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCSC_int64_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCSC_int64_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCSC_int64_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCSC_int64_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCSC_int64_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCSC_int64_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCSC_int64_cuda_uint8, 
test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCSR_int32_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCSR_int32_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCSR_int32_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCSR_int32_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCSR_int32_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCSR_int32_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCSR_int32_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCSR_int32_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCSR_int32_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCSR_int32_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCSR_int32_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCSR_int32_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCSR_int64_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCSR_int64_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCSR_int64_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCSR_int64_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCSR_int64_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCSR_int64_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCSR_int64_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCSR_int64_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCSR_int64_cuda_int32, 
test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCSR_int64_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCSR_int64_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSC_SparseCSR_int64_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseBSC_int32_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseBSC_int32_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseBSC_int32_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseBSC_int32_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseBSC_int32_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseBSC_int32_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseBSC_int32_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseBSC_int32_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseBSC_int32_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseBSC_int32_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseBSC_int32_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseBSC_int32_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseBSC_int64_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseBSC_int64_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseBSC_int64_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseBSC_int64_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseBSC_int64_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseBSC_int64_cuda_float32, 
test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseBSC_int64_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseBSC_int64_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseBSC_int64_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseBSC_int64_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseBSC_int64_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseBSC_int64_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseBSR_int32_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseBSR_int32_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseBSR_int32_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseBSR_int32_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseBSR_int32_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseBSR_int32_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseBSR_int32_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseBSR_int32_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseBSR_int32_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseBSR_int32_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseBSR_int32_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseBSR_int32_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseBSR_int64_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseBSR_int64_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseBSR_int64_cuda_complex128, 
test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseBSR_int64_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseBSR_int64_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseBSR_int64_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseBSR_int64_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseBSR_int64_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseBSR_int64_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseBSR_int64_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseBSR_int64_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseBSR_int64_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCOO_int32_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCOO_int32_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCOO_int32_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCOO_int32_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCOO_int32_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCOO_int32_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCOO_int32_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCOO_int32_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCOO_int32_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCOO_int32_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCOO_int32_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCOO_int32_cuda_uint8, 
test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCOO_int64_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCOO_int64_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCOO_int64_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCOO_int64_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCOO_int64_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCOO_int64_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCOO_int64_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCOO_int64_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCOO_int64_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCOO_int64_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCOO_int64_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCOO_int64_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCSC_int32_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCSC_int32_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCSC_int32_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCSC_int32_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCSC_int32_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCSC_int32_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCSC_int32_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCSC_int32_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCSC_int32_cuda_int32, 
test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCSC_int32_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCSC_int32_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCSC_int32_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCSC_int64_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCSC_int64_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCSC_int64_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCSC_int64_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCSC_int64_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCSC_int64_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCSC_int64_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCSC_int64_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCSC_int64_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCSC_int64_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCSC_int64_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCSC_int64_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCSR_int32_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCSR_int32_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCSR_int32_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCSR_int32_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCSR_int32_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCSR_int32_cuda_float32, 
test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCSR_int32_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCSR_int32_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCSR_int32_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCSR_int32_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCSR_int32_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCSR_int32_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCSR_int64_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCSR_int64_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCSR_int64_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCSR_int64_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCSR_int64_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCSR_int64_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCSR_int64_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCSR_int64_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCSR_int64_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCSR_int64_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCSR_int64_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_SparseCSR_SparseCSR_int64_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseBSC_int32_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseBSC_int32_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseBSC_int32_cuda_complex128, 
test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseBSC_int32_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseBSC_int32_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseBSC_int32_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseBSC_int32_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseBSC_int32_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseBSC_int32_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseBSC_int32_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseBSC_int32_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseBSC_int32_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseBSC_int64_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseBSC_int64_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseBSC_int64_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseBSC_int64_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseBSC_int64_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseBSC_int64_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseBSC_int64_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseBSC_int64_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseBSC_int64_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseBSC_int64_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseBSC_int64_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseBSC_int64_cuda_uint8, 
test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseBSR_int32_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseBSR_int32_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseBSR_int32_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseBSR_int32_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseBSR_int32_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseBSR_int32_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseBSR_int32_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseBSR_int32_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseBSR_int32_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseBSR_int32_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseBSR_int32_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseBSR_int32_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseBSR_int64_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseBSR_int64_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseBSR_int64_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseBSR_int64_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseBSR_int64_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseBSR_int64_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseBSR_int64_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseBSR_int64_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseBSR_int64_cuda_int32, 
test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseBSR_int64_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseBSR_int64_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseBSR_int64_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCOO_int32_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCOO_int32_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCOO_int32_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCOO_int32_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCOO_int32_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCOO_int32_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCOO_int32_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCOO_int32_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCOO_int32_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCOO_int32_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCOO_int32_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCOO_int32_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCOO_int64_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCOO_int64_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCOO_int64_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCOO_int64_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCOO_int64_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCOO_int64_cuda_float32, 
test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCOO_int64_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCOO_int64_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCOO_int64_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCOO_int64_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCOO_int64_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCOO_int64_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCSC_int32_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCSC_int32_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCSC_int32_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCSC_int32_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCSC_int32_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCSC_int32_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCSC_int32_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCSC_int32_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCSC_int32_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCSC_int32_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCSC_int32_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCSC_int32_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCSC_int64_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCSC_int64_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCSC_int64_cuda_complex128, 
test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCSC_int64_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCSC_int64_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCSC_int64_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCSC_int64_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCSC_int64_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCSC_int64_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCSC_int64_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCSC_int64_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCSC_int64_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCSR_int32_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCSR_int32_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCSR_int32_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCSR_int32_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCSR_int32_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCSR_int32_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCSR_int32_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCSR_int32_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCSR_int32_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCSR_int32_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCSR_int32_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCSR_int32_cuda_uint8, 
test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCSR_int64_cuda_bfloat16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCSR_int64_cuda_bool, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCSR_int64_cuda_complex128, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCSR_int64_cuda_complex64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCSR_int64_cuda_float16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCSR_int64_cuda_float32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCSR_int64_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCSR_int64_cuda_int16, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCSR_int64_cuda_int32, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCSR_int64_cuda_int64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCSR_int64_cuda_int8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_Strided_SparseCSR_int64_cuda_uint8, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_identity_SparseBSC_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_identity_SparseBSR_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_identity_SparseCOO_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_identity_SparseCSC_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_identity_SparseCSR_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_to_sparse_identity_Strided_cuda_float64, test/test_sparse.py::TestSparseAnyCUDA::test_unsupported_backend_error_message_ccol_indices_SparseBSC_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_unsupported_backend_error_message_ccol_indices_SparseBSR_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_unsupported_backend_error_message_ccol_indices_SparseCOO_cuda, 
test/test_sparse.py::TestSparseAnyCUDA::test_unsupported_backend_error_message_ccol_indices_SparseCSC_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_unsupported_backend_error_message_ccol_indices_SparseCSR_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_unsupported_backend_error_message_ccol_indices_Strided_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_unsupported_backend_error_message_coalesce_SparseBSC_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_unsupported_backend_error_message_coalesce_SparseBSR_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_unsupported_backend_error_message_coalesce_SparseCOO_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_unsupported_backend_error_message_coalesce_SparseCSC_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_unsupported_backend_error_message_coalesce_SparseCSR_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_unsupported_backend_error_message_coalesce_Strided_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_unsupported_backend_error_message_col_indices_SparseBSC_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_unsupported_backend_error_message_col_indices_SparseBSR_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_unsupported_backend_error_message_col_indices_SparseCOO_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_unsupported_backend_error_message_col_indices_SparseCSC_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_unsupported_backend_error_message_col_indices_SparseCSR_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_unsupported_backend_error_message_col_indices_Strided_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_unsupported_backend_error_message_crow_indices_SparseBSC_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_unsupported_backend_error_message_crow_indices_SparseBSR_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_unsupported_backend_error_message_crow_indices_SparseCOO_cuda, 
test/test_sparse.py::TestSparseAnyCUDA::test_unsupported_backend_error_message_crow_indices_SparseCSC_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_unsupported_backend_error_message_crow_indices_SparseCSR_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_unsupported_backend_error_message_crow_indices_Strided_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_unsupported_backend_error_message_indices_SparseBSC_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_unsupported_backend_error_message_indices_SparseBSR_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_unsupported_backend_error_message_indices_SparseCOO_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_unsupported_backend_error_message_indices_SparseCSC_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_unsupported_backend_error_message_indices_SparseCSR_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_unsupported_backend_error_message_indices_Strided_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_unsupported_backend_error_message_is_coalesced_SparseBSC_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_unsupported_backend_error_message_is_coalesced_SparseBSR_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_unsupported_backend_error_message_is_coalesced_SparseCOO_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_unsupported_backend_error_message_is_coalesced_SparseCSC_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_unsupported_backend_error_message_is_coalesced_SparseCSR_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_unsupported_backend_error_message_is_coalesced_Strided_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_unsupported_backend_error_message_row_indices_SparseBSC_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_unsupported_backend_error_message_row_indices_SparseBSR_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_unsupported_backend_error_message_row_indices_SparseCOO_cuda, 
test/test_sparse.py::TestSparseAnyCUDA::test_unsupported_backend_error_message_row_indices_SparseCSC_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_unsupported_backend_error_message_row_indices_SparseCSR_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_unsupported_backend_error_message_row_indices_Strided_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_unsupported_backend_error_message_values_SparseBSC_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_unsupported_backend_error_message_values_SparseBSR_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_unsupported_backend_error_message_values_SparseCOO_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_unsupported_backend_error_message_values_SparseCSC_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_unsupported_backend_error_message_values_SparseCSR_cuda, test/test_sparse.py::TestSparseAnyCUDA::test_unsupported_backend_error_message_values_Strided_cuda 2025-10-10T02:28:02.2927180Z 2025-10-10T02:28:04.5709112Z Uploading artifacts took 2.48 seconds 2025-10-10T02:28:06.5970419Z Running test batch 'tests to run' cost 5615.8 seconds 2025-10-10T02:28:07.3224670Z 2025-10-10T02:28:07.3225069Z real 93m41.669s 2025-10-10T02:28:07.3225351Z user 208m36.124s 2025-10-10T02:28:07.3225890Z sys 44m48.375s 2025-10-10T02:28:07.3226214Z + assert_git_not_dirty 2025-10-10T02:28:07.3226569Z + [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 != *rocm* ]] 2025-10-10T02:28:07.3227004Z + [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 != *xla* ]] 2025-10-10T02:28:07.3232052Z ++ git status --porcelain 2025-10-10T02:28:07.3232364Z ++ grep -v '?? 
third_party' 2025-10-10T02:28:11.5444967Z ++ true 2025-10-10T02:28:11.5446140Z + git_status= 2025-10-10T02:28:11.5446452Z + [[ -n '' ]] 2025-10-10T02:28:11.5446756Z + test_libtorch 2 2025-10-10T02:28:11.5447201Z + local SHARD=2 2025-10-10T02:28:11.5447515Z + [[ slow != \s\l\o\w ]] 2025-10-10T02:28:11.5447868Z + test_aot_compilation 2025-10-10T02:28:11.5448246Z + echo 'Testing Ahead of Time compilation' 2025-10-10T02:28:11.5448707Z Testing Ahead of Time compilation 2025-10-10T02:28:11.5450532Z + ln -sf /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libc10.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libc10_cuda.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libc10d_cuda_test.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin 2025-10-10T02:28:11.5473112Z + ln -sf /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libtorch.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libtorch_cuda_linalg.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libtorch_global_deps.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libtorch_nvshmem.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libtorch_python.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libtorchbind_test.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin 2025-10-10T02:28:11.5489563Z + '[' -f /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin/test_mobile_nnc ']' 2025-10-10T02:28:11.5490503Z + '[' -f /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin/aot_model_compiler_test ']' 2025-10-10T02:28:11.5491189Z + test_custom_script_ops 2025-10-10T02:28:11.5491566Z + echo 'Testing custom script operators' 2025-10-10T02:28:11.5491998Z Testing custom script operators 
2025-10-10T02:28:11.5492479Z + [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 == *s390x* ]] 2025-10-10T02:28:11.5493228Z + CUSTOM_OP_BUILD=/var/lib/jenkins/workspace/build/custom_test_artifacts/custom-op-build 2025-10-10T02:28:11.5493740Z + pushd test/custom_operator 2025-10-10T02:28:11.5494040Z ~/workspace/test/custom_operator ~/workspace 2025-10-10T02:28:11.5494539Z + cp -a /var/lib/jenkins/workspace/build/custom_test_artifacts/custom-op-build build 2025-10-10T02:28:11.5696400Z + python test_custom_ops.py -v 2025-10-10T02:28:14.2255664Z Test results will be stored in test-reports/python-unittest/test_custom_ops 2025-10-10T02:28:14.2269043Z 2025-10-10T02:28:14.2269312Z Running tests... 2025-10-10T02:28:14.2269653Z ---------------------------------------------------------------------- 2025-10-10T02:28:14.4877145Z test_abstract_impl_pystub_faketensor (__main__.TestCustomOperators) ... /var/lib/jenkins/workspace/test/custom_operator/my_custom_ops.py:13: FutureWarning: `create_unbacked_symint` is deprecated, please use `new_dynamic_size` instead 2025-10-10T02:28:14.4878206Z nnz = ctx.create_unbacked_symint() 2025-10-10T02:28:14.4961781Z ok (0.269s) 2025-10-10T02:28:14.5006740Z test_abstract_impl_pystub_meta (__main__.TestCustomOperators) ... ok (0.004s) 2025-10-10T02:28:14.5026508Z test_calling_custom_op (__main__.TestCustomOperators) ... ok (0.002s) 2025-10-10T02:28:14.5493885Z test_calling_custom_op_inside_script_module (__main__.TestCustomOperators) ... ok (0.046s) 2025-10-10T02:28:14.5500280Z test_calling_custom_op_string (__main__.TestCustomOperators) ... ok (0.001s) 2025-10-10T02:28:14.5524383Z test_calling_custom_op_with_autograd (__main__.TestCustomOperators) ... /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:849: UserWarning: Using backward() with create_graph=True will create a reference cycle between the parameter and its gradient which can cause a memory leak. We recommend using autograd.grad when creating the graph to avoid this. 
If you have to use this function, make sure to reset the .grad fields of your parameters to None after use to break the cycle and avoid the leak. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/engine.cpp:1293.) 2025-10-10T02:28:14.5526901Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-10-10T02:28:14.5537665Z ok (0.004s) 2025-10-10T02:28:14.5545507Z test_calling_custom_op_with_autograd_in_nograd_mode (__main__.TestCustomOperators) ... ok (0.001s) 2025-10-10T02:28:14.5549204Z test_custom_library_is_loaded (__main__.TestCustomOperators) ... ok (0.000s) 2025-10-10T02:28:14.7202230Z test_dynamo_pystub_suggestion (__main__.TestCustomOperators) ... ok (0.165s) 2025-10-10T02:28:14.8474943Z test_op_with_incorrect_abstract_impl_pystub (__main__.TestCustomOperators) ... ok (0.001s) 2025-10-10T02:28:14.8482014Z test_op_with_no_abstract_impl_pystub (__main__.TestCustomOperators) ... ok (0.001s) 2025-10-10T02:28:14.8569005Z test_saving_and_loading_script_module_with_custom_op (__main__.TestCustomOperators) ... ok (0.009s) 2025-10-10T02:28:14.8569563Z 2025-10-10T02:28:14.8569730Z ---------------------------------------------------------------------- 2025-10-10T02:28:14.8570296Z Ran 12 tests in 0.630s 2025-10-10T02:28:14.8570538Z 2025-10-10T02:28:14.8570622Z OK 2025-10-10T02:28:14.8570741Z 2025-10-10T02:28:14.8570847Z Generating XML reports... 2025-10-10T02:28:14.8608057Z Generated XML report: test-reports/python-unittest/test_custom_ops/TEST-TestCustomOperators-20251010022814.xml 2025-10-10T02:28:15.5996634Z + python model.py --export-script-module=model.pt 2025-10-10T02:28:17.2891943Z + build/test_custom_ops ./model.pt 2025-10-10T02:28:17.6612629Z [W1010 02:28:17.640937779 engine.cpp:1293] Warning: Using backward() with create_graph=True will create a reference cycle between the parameter and its gradient which can cause a memory leak. 
We recommend using autograd.grad when creating the graph to avoid this. If you have to use this function, make sure to reset the .grad fields of your parameters to None after use to break the cycle and avoid the leak. (function operator()) 2025-10-10T02:28:17.9505584Z ok 2025-10-10T02:28:18.1830624Z + popd 2025-10-10T02:28:18.1830892Z ~/workspace 2025-10-10T02:28:18.1831545Z + assert_git_not_dirty 2025-10-10T02:28:18.1831904Z + [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 != *rocm* ]] 2025-10-10T02:28:18.1832319Z + [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 != *xla* ]] 2025-10-10T02:28:18.1838958Z ++ git status --porcelain 2025-10-10T02:28:18.1839575Z ++ grep -v '?? third_party' 2025-10-10T02:28:18.5923332Z ++ true 2025-10-10T02:28:18.5925467Z + git_status= 2025-10-10T02:28:18.5925847Z + [[ -n '' ]] 2025-10-10T02:28:18.5926168Z + test_custom_backend 2025-10-10T02:28:18.5926545Z + echo 'Testing custom backends' 2025-10-10T02:28:18.5926905Z Testing custom backends 2025-10-10T02:28:18.5927496Z + CUSTOM_BACKEND_BUILD=/var/lib/jenkins/workspace/build/custom_test_artifacts/custom-backend-build 2025-10-10T02:28:18.5928033Z + pushd test/custom_backend 2025-10-10T02:28:18.5928328Z ~/workspace/test/custom_backend ~/workspace 2025-10-10T02:28:18.5928845Z + cp -a /var/lib/jenkins/workspace/build/custom_test_artifacts/custom-backend-build build 2025-10-10T02:28:18.6132080Z + python test_custom_backend.py -v 2025-10-10T02:28:21.2738457Z Test results will be stored in test-reports/python-unittest/test_custom_backend 2025-10-10T02:28:21.2750242Z 2025-10-10T02:28:21.2750568Z Running tests... 2025-10-10T02:28:21.2750980Z ---------------------------------------------------------------------- 2025-10-10T02:28:21.2754986Z test_execute (__main__.TestCustomBackend) 2025-10-10T02:28:21.3310922Z Test execution using the custom backend. ... 
ok (0.056s) 2025-10-10T02:28:21.3314887Z test_save_load (__main__.TestCustomBackend) 2025-10-10T02:28:21.3510608Z Test that a lowered module can be executed correctly ... ok (0.020s) 2025-10-10T02:28:21.3511053Z 2025-10-10T02:28:21.3511271Z ---------------------------------------------------------------------- 2025-10-10T02:28:21.3511750Z Ran 2 tests in 0.076s 2025-10-10T02:28:21.3511918Z 2025-10-10T02:28:21.3512001Z OK 2025-10-10T02:28:21.3512118Z 2025-10-10T02:28:21.3512227Z Generating XML reports... 2025-10-10T02:28:21.3540294Z Generated XML report: test-reports/python-unittest/test_custom_backend/TEST-TestCustomBackend-20251010022821.xml 2025-10-10T02:28:21.9609620Z + python backend.py --export-module-to=model.pt 2025-10-10T02:28:23.6949189Z + build/test_custom_backend ./model.pt 2025-10-10T02:28:24.0437868Z Testing custom_backend 2025-10-10T02:28:24.1034494Z OK 2025-10-10T02:28:24.2496867Z + rm -f ./model.pt 2025-10-10T02:28:24.2530056Z + popd 2025-10-10T02:28:24.2530666Z ~/workspace 2025-10-10T02:28:24.2531253Z + assert_git_not_dirty 2025-10-10T02:28:24.2532073Z + [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 != *rocm* ]] 2025-10-10T02:28:24.2532652Z + [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 != *xla* ]] 2025-10-10T02:28:24.2537877Z ++ git status --porcelain 2025-10-10T02:28:24.2538524Z ++ grep -v '?? 
third_party' 2025-10-10T02:28:24.6636758Z ++ true 2025-10-10T02:28:24.6639390Z + git_status= 2025-10-10T02:28:24.6639720Z + [[ -n '' ]] 2025-10-10T02:28:24.6640157Z + test_torch_function_benchmark 2025-10-10T02:28:24.6640769Z + echo 'Testing __torch_function__ benchmarks' 2025-10-10T02:28:24.6641275Z Testing __torch_function__ benchmarks 2025-10-10T02:28:24.6641597Z + pushd benchmarks/overrides_benchmark 2025-10-10T02:28:24.6641976Z ~/workspace/benchmarks/overrides_benchmark ~/workspace 2025-10-10T02:28:24.6642352Z + python bench.py -n 1 -m 2 2025-10-10T02:28:26.0151991Z Type tensor had a minimum time of 0.00667572021484375 us and a standard deviation of 0.3899426374118775 us. 2025-10-10T02:28:26.0152963Z Type SubTensor had a minimum time of 0.024080276489257812 us and a standard deviation of 0.04787882062373683 us. 2025-10-10T02:28:26.0154962Z Type WithTorchFunction had a minimum time of 0.0152587890625 us and a standard deviation of 0.023770822735968977 us. 2025-10-10T02:28:26.0155978Z Type SubWithTorchFunction had a minimum time of 0.020503997802734375 us and a standard deviation of 0.009609481821826193 us. 2025-10-10T02:28:26.3458637Z + python pyspybench.py Tensor -n 1 2025-10-10T02:28:28.0378447Z + python pyspybench.py SubTensor -n 1 2025-10-10T02:28:29.7196143Z + python pyspybench.py WithTorchFunction -n 1 2025-10-10T02:28:31.3908253Z + python pyspybench.py SubWithTorchFunction -n 1 2025-10-10T02:28:33.0745198Z + popd 2025-10-10T02:28:33.0745467Z ~/workspace 2025-10-10T02:28:33.0745704Z + assert_git_not_dirty 2025-10-10T02:28:33.0746059Z + [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 != *rocm* ]] 2025-10-10T02:28:33.0746483Z + [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 != *xla* ]] 2025-10-10T02:28:33.0752988Z ++ git status --porcelain 2025-10-10T02:28:33.0753302Z ++ grep -v '?? 
third_party' 2025-10-10T02:28:33.4866808Z ++ true 2025-10-10T02:28:33.4869221Z + git_status= 2025-10-10T02:28:33.4869517Z + [[ -n '' ]] 2025-10-10T02:28:33.4870145Z + sccache_epilogue 2025-10-10T02:28:33.4870533Z + echo '::group::Sccache Compilation Log' 2025-10-10T02:28:33.4871308Z ##[group]Sccache Compilation Log 2025-10-10T02:28:33.4871676Z + echo '=================== sccache compilation log ===================' 2025-10-10T02:28:33.4872100Z =================== sccache compilation log =================== 2025-10-10T02:28:33.4872704Z + python /var/lib/jenkins/workspace/.ci/pytorch/print_sccache_log.py /var/lib/jenkins/sccache_error.log 2025-10-10T02:28:33.5022192Z + echo '=========== If your build fails, please take a look at the log above for possible reasons ===========' 2025-10-10T02:28:33.5023058Z =========== If your build fails, please take a look at the log above for possible reasons =========== 2025-10-10T02:28:33.5023512Z + sccache --show-stats 2025-10-10T02:28:33.5061872Z Compile requests 1139 2025-10-10T02:28:33.5062470Z Compile requests executed 56 2025-10-10T02:28:33.5062881Z Cache hits 10 2025-10-10T02:28:33.5063232Z Cache hits (C/C++) 10 2025-10-10T02:28:33.5063531Z Cache misses 46 2025-10-10T02:28:33.5063829Z Cache misses (C/C++) 46 2025-10-10T02:28:33.5064134Z Cache hits rate 17.86 % 2025-10-10T02:28:33.5064450Z Cache hits rate (C/C++) 17.86 % 2025-10-10T02:28:33.5064787Z Cache timeouts 0 2025-10-10T02:28:33.5065115Z Cache read errors 0 2025-10-10T02:28:33.5065412Z Forced recaches 0 2025-10-10T02:28:33.5065715Z Cache write errors 0 2025-10-10T02:28:33.5066015Z Cache errors 0 2025-10-10T02:28:33.5066318Z Compilations 46 2025-10-10T02:28:33.5066616Z Compilation failures 0 2025-10-10T02:28:33.5066938Z Non-cacheable compilations 0 2025-10-10T02:28:33.5067259Z Non-cacheable calls 22 2025-10-10T02:28:33.5067570Z Non-compilation calls 1061 2025-10-10T02:28:33.5067978Z Unsupported compiler calls 0 2025-10-10T02:28:33.5068416Z Average cache write 0.065 s 
2025-10-10T02:28:33.5068836Z Average compiler 14.128 s 2025-10-10T02:28:33.5069337Z Average cache read hit 0.069 s 2025-10-10T02:28:33.5069730Z Failed distributed compilations 0 2025-10-10T02:28:33.5069942Z 2025-10-10T02:28:33.5070038Z Non-cacheable reasons: 2025-10-10T02:28:33.5070289Z -E 19 2025-10-10T02:28:33.5070593Z unknown source language 3 2025-10-10T02:28:33.5070788Z 2025-10-10T02:28:33.5071024Z Cache location s3, name: ossci-compiler-cache-circleci-v2, prefix: / 2025-10-10T02:28:33.5071443Z Version (client) 0.10.0 2025-10-10T02:28:33.5071745Z + sccache --stop-server 2025-10-10T02:28:33.5092263Z Stopping sccache server... 2025-10-10T02:28:33.5095959Z Compile requests 1139 2025-10-10T02:28:33.5096430Z Compile requests executed 56 2025-10-10T02:28:33.5096842Z Cache hits 10 2025-10-10T02:28:33.5097253Z Cache hits (C/C++) 10 2025-10-10T02:28:33.5097671Z Cache misses 46 2025-10-10T02:28:33.5098080Z Cache misses (C/C++) 46 2025-10-10T02:28:33.5098691Z Cache hits rate 17.86 % 2025-10-10T02:28:33.5099006Z Cache hits rate (C/C++) 17.86 % 2025-10-10T02:28:33.5099311Z Cache timeouts 0 2025-10-10T02:28:33.5099610Z Cache read errors 0 2025-10-10T02:28:33.5099908Z Forced recaches 0 2025-10-10T02:28:33.5100213Z Cache write errors 0 2025-10-10T02:28:33.5100513Z Cache errors 0 2025-10-10T02:28:33.5100818Z Compilations 46 2025-10-10T02:28:33.5101124Z Compilation failures 0 2025-10-10T02:28:33.5101443Z Non-cacheable compilations 0 2025-10-10T02:28:33.5101792Z Non-cacheable calls 22 2025-10-10T02:28:33.5102246Z Non-compilation calls 1061 2025-10-10T02:28:33.5102648Z Unsupported compiler calls 0 2025-10-10T02:28:33.5102968Z Average cache write 0.065 s 2025-10-10T02:28:33.5103394Z Average compiler 14.128 s 2025-10-10T02:28:33.5103825Z Average cache read hit 0.069 s 2025-10-10T02:28:33.5104252Z Failed distributed compilations 0 2025-10-10T02:28:33.5104469Z 2025-10-10T02:28:33.5104575Z Non-cacheable reasons: 2025-10-10T02:28:33.5104831Z -E 19 2025-10-10T02:28:33.5105133Z unknown 
source language 3 2025-10-10T02:28:33.5105330Z 2025-10-10T02:28:33.5105683Z Cache location s3, name: ossci-compiler-cache-circleci-v2, prefix: / 2025-10-10T02:28:33.5106167Z Version (client) 0.10.0 2025-10-10T02:28:33.5106458Z + echo ::endgroup:: 2025-10-10T02:28:33.5106906Z ##[endgroup] 2025-10-10T02:28:33.5107121Z + cleanup_workspace 2025-10-10T02:28:33.5107596Z + echo 'sudo may print the following warning message that can be ignored. The chown command will still run.' 2025-10-10T02:28:33.5108302Z sudo may print the following warning message that can be ignored. The chown command will still run. 2025-10-10T02:28:33.5108882Z + echo ' sudo: setrlimit(RLIMIT_STACK): Operation not permitted' 2025-10-10T02:28:33.5109318Z sudo: setrlimit(RLIMIT_STACK): Operation not permitted 2025-10-10T02:28:33.5109825Z + echo 'For more details refer to https://github.com/sudo-project/sudo/issues/42' 2025-10-10T02:28:33.5110371Z For more details refer to https://github.com/sudo-project/sudo/issues/42 2025-10-10T02:28:33.5110810Z + sudo chown -R 1000 /var/lib/jenkins/workspace 2025-10-10T02:28:34.5936062Z ##[group]Run pytorch/test-infra/.github/actions/upload-benchmark-results@main 2025-10-10T02:28:34.5936528Z with: 2025-10-10T02:28:34.5936771Z benchmark-results-dir: test/test-reports 2025-10-10T02:28:34.5937096Z dry-run: false 2025-10-10T02:28:34.5937328Z schema-version: v3 2025-10-10T02:28:34.5937741Z github-token: *** 2025-10-10T02:28:34.5937970Z env: 2025-10-10T02:28:34.5947643Z GIT_DEFAULT_BRANCH: main 2025-10-10T02:28:34.5948076Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-10-10T02:28:34.5948830Z DOCKER_CONTAINER_ID: 0d479bf7aa1028c1efe5abc00aba7c77fea2d669ee48fe0051d50c10c6eea1cb 2025-10-10T02:28:34.5949481Z ##[endgroup] 2025-10-10T02:28:34.5977757Z ##[group]Run set -eux 2025-10-10T02:28:34.5978031Z set -eux 2025-10-10T02:28:34.5978255Z  2025-10-10T02:28:34.5978464Z if [[ -n "" ]]; then 2025-10-10T02:28:34.5978730Z  source "" 2025-10-10T02:28:34.5978963Z fi 
2025-10-10T02:28:34.5979335Z python3 -mpip install boto3==1.35.33 psutil==7.0.0 pynvml==12.0.0 2025-10-10T02:28:34.5979732Z  2025-10-10T02:28:34.5979946Z DEVICE_NAME="" 2025-10-10T02:28:34.5980200Z DEVICE_TYPE="" 2025-10-10T02:28:34.5980432Z  2025-10-10T02:28:34.5980663Z if command -v nvidia-smi; then 2025-10-10T02:28:34.5981091Z  # NB: I'm using PyTorch here to get the device name, however, it needs to 2025-10-10T02:28:34.5981626Z  # install the correct version of PyTorch manually for now. Any PyTorch 2025-10-10T02:28:34.5982139Z  # version is fine, I just use 2.7.1 to satify PYPIDEP linter 2025-10-10T02:28:34.5982535Z  python3 -mpip install torch==2.7.1 2025-10-10T02:28:34.5982869Z elif command -v rocminfo; then 2025-10-10T02:28:34.5983281Z  # NB: Installing torch on ROCm runner with pip here causes CI to fail 2025-10-10T02:28:34.5983799Z  # with a memoryview is too large error only on MI300 runners. Is pip 2025-10-10T02:28:34.5984321Z  # version on ROCm runner there too old? As a workaround, let's use the 2025-10-10T02:28:34.5984773Z  # GPU device name coming from rocminfo instead 2025-10-10T02:28:34.5985137Z  DEVICE_NAME=rocm 2025-10-10T02:28:34.5985608Z  DEVICE_TYPE=$(rocminfo | grep "Marketing Name" | tail -n1 | awk -F':' '{print $2}' | xargs) 2025-10-10T02:28:34.5986064Z fi 2025-10-10T02:28:34.5986260Z  2025-10-10T02:28:34.5986520Z echo "DEVICE_NAME=$DEVICE_NAME" >> $GITHUB_ENV 2025-10-10T02:28:34.5986906Z echo "DEVICE_TYPE=$DEVICE_TYPE" >> $GITHUB_ENV 2025-10-10T02:28:34.6001778Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T02:28:34.6002127Z env: 2025-10-10T02:28:34.6002341Z GIT_DEFAULT_BRANCH: main 2025-10-10T02:28:34.6002662Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-10-10T02:28:34.6003208Z DOCKER_CONTAINER_ID: 0d479bf7aa1028c1efe5abc00aba7c77fea2d669ee48fe0051d50c10c6eea1cb 2025-10-10T02:28:34.6003686Z ##[endgroup] 2025-10-10T02:28:34.6041406Z + [[ -n '' ]] 2025-10-10T02:28:34.6041832Z + python3 -mpip install 
boto3==1.35.33 psutil==7.0.0 pynvml==12.0.0 2025-10-10T02:28:34.8463051Z Defaulting to user installation because normal site-packages is not writeable 2025-10-10T02:28:36.1129587Z Collecting boto3==1.35.33 2025-10-10T02:28:36.1321830Z Downloading boto3-1.35.33-py3-none-any.whl (139 kB) 2025-10-10T02:28:36.4880192Z Collecting psutil==7.0.0 2025-10-10T02:28:36.4925872Z Downloading psutil-7.0.0-cp36-abi3-manylinux_2_12_x86_64.manylinux2010_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (277 kB) 2025-10-10T02:28:36.5282401Z Collecting pynvml==12.0.0 2025-10-10T02:28:36.5331123Z Downloading pynvml-12.0.0-py3-none-any.whl (26 kB) 2025-10-10T02:28:36.5408409Z Requirement already satisfied: jmespath<2.0.0,>=0.7.1 in /usr/lib/python3.9/site-packages (from boto3==1.35.33) (0.10.0) 2025-10-10T02:28:37.8573599Z Collecting botocore<1.36.0,>=1.35.33 2025-10-10T02:28:37.8626038Z Downloading botocore-1.35.99-py3-none-any.whl (13.3 MB) 2025-10-10T02:28:38.0866695Z Collecting s3transfer<0.11.0,>=0.10.0 2025-10-10T02:28:38.0906996Z Downloading s3transfer-0.10.4-py3-none-any.whl (83 kB) 2025-10-10T02:28:38.1430229Z Collecting nvidia-ml-py<13.0.0a0,>=12.0.0 2025-10-10T02:28:38.1471330Z Downloading nvidia_ml_py-12.575.51-py3-none-any.whl (47 kB) 2025-10-10T02:28:38.1571274Z Requirement already satisfied: urllib3<1.27,>=1.25.4 in /usr/lib/python3.9/site-packages (from botocore<1.36.0,>=1.35.33->boto3==1.35.33) (1.25.10) 2025-10-10T02:28:38.1577539Z Requirement already satisfied: python-dateutil<3.0.0,>=2.1 in /usr/lib/python3.9/site-packages (from botocore<1.36.0,>=1.35.33->boto3==1.35.33) (2.8.1) 2025-10-10T02:28:38.2949083Z Requirement already satisfied: six>=1.5 in /usr/lib/python3.9/site-packages (from python-dateutil<3.0.0,>=2.1->botocore<1.36.0,>=1.35.33->boto3==1.35.33) (1.15.0) 2025-10-10T02:28:38.4315814Z Installing collected packages: botocore, s3transfer, nvidia-ml-py, pynvml, psutil, boto3 2025-10-10T02:28:38.9993484Z Attempting uninstall: nvidia-ml-py 
2025-10-10T02:28:38.9994458Z Found existing installation: nvidia-ml-py 11.525.84 2025-10-10T02:28:39.0010471Z Uninstalling nvidia-ml-py-11.525.84: 2025-10-10T02:28:39.0263925Z Successfully uninstalled nvidia-ml-py-11.525.84 2025-10-10T02:28:39.0880757Z Attempting uninstall: psutil 2025-10-10T02:28:39.0881243Z Found existing installation: psutil 5.9.8 2025-10-10T02:28:39.0969777Z Uninstalling psutil-5.9.8: 2025-10-10T02:28:39.0977206Z Successfully uninstalled psutil-5.9.8 2025-10-10T02:28:39.2690881Z Successfully installed boto3-1.35.33 botocore-1.35.99 nvidia-ml-py-12.575.51 psutil-7.0.0 pynvml-12.0.0 s3transfer-0.10.4 2025-10-10T02:28:39.3970257Z + DEVICE_NAME= 2025-10-10T02:28:39.3970616Z /usr/bin/nvidia-smi 2025-10-10T02:28:39.3970948Z + DEVICE_TYPE= 2025-10-10T02:28:39.3971201Z + command -v nvidia-smi 2025-10-10T02:28:39.3971508Z + python3 -mpip install torch==2.7.1 2025-10-10T02:28:39.6473734Z Defaulting to user installation because normal site-packages is not writeable 2025-10-10T02:28:39.9379209Z Collecting torch==2.7.1 2025-10-10T02:28:39.9590085Z Downloading torch-2.7.1-cp39-cp39-manylinux_2_28_x86_64.whl (821.1 MB) 2025-10-10T02:28:54.3570530Z Collecting nvidia-nvjitlink-cu12==12.6.85 2025-10-10T02:28:54.3612070Z Downloading nvidia_nvjitlink_cu12-12.6.85-py3-none-manylinux2010_x86_64.manylinux_2_12_x86_64.whl (19.7 MB) 2025-10-10T02:28:54.6367657Z Collecting nvidia-cuda-nvrtc-cu12==12.6.77 2025-10-10T02:28:54.6446259Z Downloading nvidia_cuda_nvrtc_cu12-12.6.77-py3-none-manylinux2014_x86_64.whl (23.7 MB) 2025-10-10T02:28:55.0209851Z Requirement already satisfied: jinja2 in /usr/lib/python3.9/site-packages (from torch==2.7.1) (2.11.3) 2025-10-10T02:28:55.0533580Z Collecting nvidia-cufft-cu12==11.3.0.4 2025-10-10T02:28:55.0612654Z Downloading nvidia_cufft_cu12-11.3.0.4-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (200.2 MB) 2025-10-10T02:28:58.5266731Z Collecting triton==3.3.1 2025-10-10T02:28:58.5312065Z Downloading 
triton-3.3.1-cp39-cp39-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (155.6 MB) 2025-10-10T02:29:00.0147560Z Collecting nvidia-curand-cu12==10.3.7.77 2025-10-10T02:29:00.0232033Z Downloading nvidia_curand_cu12-10.3.7.77-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (56.3 MB) 2025-10-10T02:29:01.0113716Z Collecting sympy>=1.13.3 2025-10-10T02:29:01.0152098Z Downloading sympy-1.14.0-py3-none-any.whl (6.3 MB) 2025-10-10T02:29:01.1094060Z Collecting nvidia-cuda-cupti-cu12==12.6.80 2025-10-10T02:29:01.1200913Z Downloading nvidia_cuda_cupti_cu12-12.6.80-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (8.9 MB) 2025-10-10T02:29:01.2485712Z Collecting nvidia-cublas-cu12==12.6.4.1 2025-10-10T02:29:01.2555506Z Downloading nvidia_cublas_cu12-12.6.4.1-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (393.1 MB) 2025-10-10T02:29:07.9930535Z Collecting nvidia-cufile-cu12==1.11.1.6 2025-10-10T02:29:08.0018750Z Downloading nvidia_cufile_cu12-1.11.1.6-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (1.1 MB) 2025-10-10T02:29:08.0166040Z Requirement already satisfied: typing-extensions>=4.10.0 in /home/ec2-user/.local/lib/python3.9/site-packages (from torch==2.7.1) (4.15.0) 2025-10-10T02:29:08.0431584Z Collecting nvidia-nccl-cu12==2.26.2 2025-10-10T02:29:08.0471651Z Downloading nvidia_nccl_cu12-2.26.2-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (201.3 MB) 2025-10-10T02:29:11.1365490Z Collecting nvidia-nvtx-cu12==12.6.77 2025-10-10T02:29:11.1504479Z Downloading nvidia_nvtx_cu12-12.6.77-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (89 kB) 2025-10-10T02:29:11.2201527Z Collecting fsspec 2025-10-10T02:29:11.2239700Z Downloading fsspec-2025.9.0-py3-none-any.whl (199 kB) 2025-10-10T02:29:11.2587602Z Collecting nvidia-cuda-runtime-cu12==12.6.77 2025-10-10T02:29:11.2660708Z Downloading nvidia_cuda_runtime_cu12-12.6.77-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (897 kB) 2025-10-10T02:29:11.3073447Z Collecting 
nvidia-cusparse-cu12==12.5.4.2 2025-10-10T02:29:11.3110911Z Downloading nvidia_cusparse_cu12-12.5.4.2-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (216.6 MB) 2025-10-10T02:29:15.1781689Z Collecting nvidia-cudnn-cu12==9.5.1.17 2025-10-10T02:29:15.1875950Z Downloading nvidia_cudnn_cu12-9.5.1.17-py3-none-manylinux_2_28_x86_64.whl (571.0 MB) 2025-10-10T02:29:25.0736043Z Collecting networkx 2025-10-10T02:29:25.0773833Z Downloading networkx-3.2.1-py3-none-any.whl (1.6 MB) 2025-10-10T02:29:25.1554811Z Collecting filelock 2025-10-10T02:29:25.1593023Z Downloading filelock-3.19.1-py3-none-any.whl (15 kB) 2025-10-10T02:29:25.1959513Z Collecting nvidia-cusparselt-cu12==0.6.3 2025-10-10T02:29:25.2006619Z Downloading nvidia_cusparselt_cu12-0.6.3-py3-none-manylinux2014_x86_64.whl (156.8 MB) 2025-10-10T02:29:27.4850452Z Collecting nvidia-cusolver-cu12==11.7.1.2 2025-10-10T02:29:27.4902916Z Downloading nvidia_cusolver_cu12-11.7.1.2-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (158.2 MB) 2025-10-10T02:29:29.4793766Z Requirement already satisfied: setuptools>=40.8.0 in /usr/lib/python3.9/site-packages (from triton==3.3.1->torch==2.7.1) (59.6.0) 2025-10-10T02:29:29.5099043Z Collecting mpmath<1.4,>=1.1.0 2025-10-10T02:29:29.5137783Z Downloading mpmath-1.3.0-py3-none-any.whl (536 kB) 2025-10-10T02:29:29.6079518Z Requirement already satisfied: MarkupSafe>=0.23 in /usr/lib64/python3.9/site-packages (from jinja2->torch==2.7.1) (1.1.1) 2025-10-10T02:29:29.9677706Z Installing collected packages: nvidia-nvjitlink-cu12, nvidia-cusparse-cu12, nvidia-cublas-cu12, mpmath, triton, sympy, nvidia-nvtx-cu12, nvidia-nccl-cu12, nvidia-cusparselt-cu12, nvidia-cusolver-cu12, nvidia-curand-cu12, nvidia-cufile-cu12, nvidia-cufft-cu12, nvidia-cudnn-cu12, nvidia-cuda-runtime-cu12, nvidia-cuda-nvrtc-cu12, nvidia-cuda-cupti-cu12, networkx, fsspec, filelock, torch 2025-10-10T02:29:39.3305508Z WARNING: The scripts proton and proton-viewer are installed in '/home/ec2-user/.local/bin' which 
is not on PATH. 2025-10-10T02:29:39.3306356Z Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location. 2025-10-10T02:29:43.4218610Z WARNING: The script isympy is installed in '/home/ec2-user/.local/bin' which is not on PATH. 2025-10-10T02:29:43.4219625Z Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location. 2025-10-10T02:30:14.9179703Z WARNING: The scripts torchfrtrace and torchrun are installed in '/home/ec2-user/.local/bin' which is not on PATH. 2025-10-10T02:30:14.9180571Z Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location. 2025-10-10T02:30:15.1282856Z Successfully installed filelock-3.19.1 fsspec-2025.9.0 mpmath-1.3.0 networkx-3.2.1 nvidia-cublas-cu12-12.6.4.1 nvidia-cuda-cupti-cu12-12.6.80 nvidia-cuda-nvrtc-cu12-12.6.77 nvidia-cuda-runtime-cu12-12.6.77 nvidia-cudnn-cu12-9.5.1.17 nvidia-cufft-cu12-11.3.0.4 nvidia-cufile-cu12-1.11.1.6 nvidia-curand-cu12-10.3.7.77 nvidia-cusolver-cu12-11.7.1.2 nvidia-cusparse-cu12-12.5.4.2 nvidia-cusparselt-cu12-0.6.3 nvidia-nccl-cu12-2.26.2 nvidia-nvjitlink-cu12-12.6.85 nvidia-nvtx-cu12-12.6.77 sympy-1.14.0 torch-2.7.1 triton-3.3.1 2025-10-10T02:30:15.8265149Z + echo DEVICE_NAME= 2025-10-10T02:30:15.8267411Z + echo DEVICE_TYPE= 2025-10-10T02:30:15.8293077Z ##[group]Run set -eux 2025-10-10T02:30:15.8293389Z set -eux 2025-10-10T02:30:15.8293618Z  2025-10-10T02:30:15.8293864Z if [[ -z "${GITHUB_TOKEN}" ]]; then 2025-10-10T02:30:15.8294362Z  echo "Missing github-token input" 2025-10-10T02:30:15.8294673Z  exit 1 2025-10-10T02:30:15.8294889Z fi 2025-10-10T02:30:15.8305960Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T02:30:15.8306320Z env: 2025-10-10T02:30:15.8306530Z GIT_DEFAULT_BRANCH: main 2025-10-10T02:30:15.8307042Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-10-10T02:30:15.8307593Z DOCKER_CONTAINER_ID: 
0d479bf7aa1028c1efe5abc00aba7c77fea2d669ee48fe0051d50c10c6eea1cb 2025-10-10T02:30:15.8308078Z DEVICE_NAME: 2025-10-10T02:30:15.8308295Z DEVICE_TYPE: 2025-10-10T02:30:15.8308710Z GITHUB_TOKEN: *** 2025-10-10T02:30:15.8321359Z ##[endgroup] 2025-10-10T02:30:15.8359618Z + [[ -z *** ]] 2025-10-10T02:30:15.8507106Z ##[group]Run pytorch/test-infra/.github/actions/get-workflow-job-id@main 2025-10-10T02:30:15.8507520Z with: 2025-10-10T02:30:15.8507883Z github-token: *** 2025-10-10T02:30:15.8508114Z env: 2025-10-10T02:30:15.8508327Z GIT_DEFAULT_BRANCH: main 2025-10-10T02:30:15.8508665Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-10-10T02:30:15.8509205Z DOCKER_CONTAINER_ID: 0d479bf7aa1028c1efe5abc00aba7c77fea2d669ee48fe0051d50c10c6eea1cb 2025-10-10T02:30:15.8509683Z DEVICE_NAME: 2025-10-10T02:30:15.8509903Z DEVICE_TYPE: 2025-10-10T02:30:15.8510122Z ##[endgroup] 2025-10-10T02:30:15.8566321Z ##[group]Run set -eux 2025-10-10T02:30:15.8566585Z set -eux 2025-10-10T02:30:15.8566823Z  2025-10-10T02:30:15.8567354Z python3 "${GITHUB_ACTION_PATH}/../../scripts/get_workflow_job_id.py" "${GITHUB_RUN_ID}" "${RUNNER_NAME}" 2025-10-10T02:30:15.8576599Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T02:30:15.8576989Z env: 2025-10-10T02:30:15.8577198Z GIT_DEFAULT_BRANCH: main 2025-10-10T02:30:15.8577522Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-10-10T02:30:15.8578043Z DOCKER_CONTAINER_ID: 0d479bf7aa1028c1efe5abc00aba7c77fea2d669ee48fe0051d50c10c6eea1cb 2025-10-10T02:30:15.8578601Z DEVICE_NAME: 2025-10-10T02:30:15.8578838Z DEVICE_TYPE: 2025-10-10T02:30:15.8579216Z GITHUB_TOKEN: *** 2025-10-10T02:30:15.8579445Z ##[endgroup] 2025-10-10T02:30:15.8611906Z + python3 /home/ec2-user/actions-runner/_work/_actions/pytorch/test-infra/main/.github/actions/get-workflow-job-id/../../scripts/get_workflow_job_id.py 18392306083 i-088ba17e0301f2c3f 2025-10-10T02:30:16.7045566Z setting job-id=52406799277 2025-10-10T02:30:16.7046145Z setting 
job-name=linux-jammy-cuda12.8-py3.10-gcc11-sm86 / test (slow, 2, 3, linux.g5.4xlarge.nvidia.gpu) 2025-10-10T02:30:16.7250104Z ##[group]Run set -eux 2025-10-10T02:30:16.7250397Z set -eux 2025-10-10T02:30:16.7250635Z  2025-10-10T02:30:16.7250868Z if [[ -n "" ]]; then 2025-10-10T02:30:16.7251147Z  source "" 2025-10-10T02:30:16.7251389Z fi 2025-10-10T02:30:16.7251620Z  2025-10-10T02:30:16.7252001Z python3 "${GITHUB_ACTION_PATH}/../../scripts/benchmarks/gather_metadata.py" \ 2025-10-10T02:30:16.7252511Z  --schema-version "${SCHEMA_VERSION}" \ 2025-10-10T02:30:16.7252853Z  --repo "${REPO}" \ 2025-10-10T02:30:16.7253165Z  --head-branch "${HEAD_BRANCH}" \ 2025-10-10T02:30:16.7253503Z  --head-sha "${HEAD_SHA}" \ 2025-10-10T02:30:16.7253843Z  --workflow-id "${WORKFLOW_RUN_ID}" \ 2025-10-10T02:30:16.7254194Z  --run-attempt "${RUN_ATTEMPT}" \ 2025-10-10T02:30:16.7254711Z  --job-id "${JOB_ID}" \ 2025-10-10T02:30:16.7255031Z  --job-name "${JOB_NAME}" 2025-10-10T02:30:16.7264346Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T02:30:16.7264710Z env: 2025-10-10T02:30:16.7264927Z GIT_DEFAULT_BRANCH: main 2025-10-10T02:30:16.7265266Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-10-10T02:30:16.7265978Z DOCKER_CONTAINER_ID: 0d479bf7aa1028c1efe5abc00aba7c77fea2d669ee48fe0051d50c10c6eea1cb 2025-10-10T02:30:16.7266537Z DEVICE_NAME: 2025-10-10T02:30:16.7266862Z DEVICE_TYPE: 2025-10-10T02:30:16.7267083Z SCHEMA_VERSION: v3 2025-10-10T02:30:16.7267327Z REPO: pytorch/pytorch 2025-10-10T02:30:16.7267582Z HEAD_BRANCH: refs/heads/main 2025-10-10T02:30:16.7267882Z HEAD_SHA: 344e6365a0068c2d2847fcec0c55dd53291d475e 2025-10-10T02:30:16.7268208Z WORKFLOW_RUN_ID: 18392306083 2025-10-10T02:30:16.7268469Z RUN_ATTEMPT: 1 2025-10-10T02:30:16.7268690Z JOB_ID: 52406799277 2025-10-10T02:30:16.7269144Z JOB_NAME: linux-jammy-cuda12.8-py3.10-gcc11-sm86 / test (slow, 2, 3, linux.g5.4xlarge.nvidia.gpu) 2025-10-10T02:30:16.7269639Z ##[endgroup] 2025-10-10T02:30:16.7300922Z + 
[[ -n '' ]] 2025-10-10T02:30:16.7302731Z + python3 /home/ec2-user/actions-runner/_work/_actions/pytorch/test-infra/main/.github/actions/upload-benchmark-results/../../scripts/benchmarks/gather_metadata.py --schema-version v3 --repo pytorch/pytorch --head-branch refs/heads/main --head-sha 344e6365a0068c2d2847fcec0c55dd53291d475e --workflow-id 18392306083 --run-attempt 1 --job-id 52406799277 --job-name 'linux-jammy-cuda12.8-py3.10-gcc11-sm86 / test (slow, 2, 3, linux.g5.4xlarge.nvidia.gpu)' 2025-10-10T02:30:16.7861591Z ##[group]Run set -eux 2025-10-10T02:30:16.7861865Z set -eux 2025-10-10T02:30:16.7862086Z  2025-10-10T02:30:16.7862307Z if [[ -n "" ]]; then 2025-10-10T02:30:16.7862584Z  source "" 2025-10-10T02:30:16.7862826Z fi 2025-10-10T02:30:16.7863042Z  2025-10-10T02:30:16.7863438Z python3 "${GITHUB_ACTION_PATH}/../../scripts/benchmarks/gather_runners_info.py" 2025-10-10T02:30:16.7872744Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T02:30:16.7873108Z env: 2025-10-10T02:30:16.7873321Z GIT_DEFAULT_BRANCH: main 2025-10-10T02:30:16.7873652Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-10-10T02:30:16.7874193Z DOCKER_CONTAINER_ID: 0d479bf7aa1028c1efe5abc00aba7c77fea2d669ee48fe0051d50c10c6eea1cb 2025-10-10T02:30:16.7874680Z DEVICE_NAME: 2025-10-10T02:30:16.7874919Z DEVICE_TYPE: 2025-10-10T02:30:16.7875146Z ##[endgroup] 2025-10-10T02:30:16.7906907Z + [[ -n '' ]] 2025-10-10T02:30:16.7907900Z + python3 /home/ec2-user/actions-runner/_work/_actions/pytorch/test-infra/main/.github/actions/upload-benchmark-results/../../scripts/benchmarks/gather_runners_info.py 2025-10-10T02:30:17.7430286Z /home/ec2-user/.local/lib/python3.9/site-packages/torch/_subclasses/functional_tensor.py:276: UserWarning: Failed to initialize NumPy: No module named 'numpy' (Triggered internally at /pytorch/torch/csrc/utils/tensor_numpy.cpp:81.) 
2025-10-10T02:30:17.7432847Z cpu = _conversion_method_template(device=torch.device("cpu")) 2025-10-10T02:30:18.7072558Z ##[group]Run set -eux 2025-10-10T02:30:18.7072835Z set -eux 2025-10-10T02:30:18.7073061Z  2025-10-10T02:30:18.7073314Z # TODO (huydhn): Implement this part 2025-10-10T02:30:18.7073689Z echo "dependencies={}" >> "${GITHUB_OUTPUT}" 2025-10-10T02:30:18.7095008Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T02:30:18.7095419Z env: 2025-10-10T02:30:18.7095645Z GIT_DEFAULT_BRANCH: main 2025-10-10T02:30:18.7096000Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-10-10T02:30:18.7096528Z DOCKER_CONTAINER_ID: 0d479bf7aa1028c1efe5abc00aba7c77fea2d669ee48fe0051d50c10c6eea1cb 2025-10-10T02:30:18.7096996Z DEVICE_NAME: 2025-10-10T02:30:18.7097216Z DEVICE_TYPE: 2025-10-10T02:30:18.7097423Z ##[endgroup] 2025-10-10T02:30:18.7236276Z + echo 'dependencies={}' 2025-10-10T02:30:18.7352438Z ##[group]Run set -eux 2025-10-10T02:30:18.7352738Z set -eux 2025-10-10T02:30:18.7352970Z  2025-10-10T02:30:18.7353190Z if [[ -n "" ]]; then 2025-10-10T02:30:18.7353473Z  source "" 2025-10-10T02:30:18.7353715Z fi 2025-10-10T02:30:18.7353937Z  2025-10-10T02:30:18.7354196Z if [[ ! 
-d "${BENCHMARK_RESULTS_DIR}" ]]; then 2025-10-10T02:30:18.7354623Z  echo "${BENCHMARK_RESULTS_DIR} does not exist, skipping" 2025-10-10T02:30:18.7355203Z  # We don't want the job to fail if the directory doesn't exist 2025-10-10T02:30:18.7355575Z  exit 0 2025-10-10T02:30:18.7355821Z fi 2025-10-10T02:30:18.7356052Z  2025-10-10T02:30:18.7356285Z if [[ "${DRY_RUN}" == "true" ]]; then 2025-10-10T02:30:18.7356737Z  python3 "${GITHUB_ACTION_PATH}/../../scripts/upload_benchmark_results.py" \ 2025-10-10T02:30:18.7357280Z  --benchmark-results-dir "${BENCHMARK_RESULTS_DIR}" \ 2025-10-10T02:30:18.7357687Z  --metadata "${BENCHMARK_METADATA}" \ 2025-10-10T02:30:18.7358037Z  --runners "${RUNNER_INFO}" \ 2025-10-10T02:30:18.7358389Z  --dependencies "${DEPENDENCIES}" \ 2025-10-10T02:30:18.7358716Z  --dry-run 2025-10-10T02:30:18.7358955Z else 2025-10-10T02:30:18.7359521Z  python3 "${GITHUB_ACTION_PATH}/../../scripts/upload_benchmark_results.py" \ 2025-10-10T02:30:18.7360045Z  --benchmark-results-dir "${BENCHMARK_RESULTS_DIR}" \ 2025-10-10T02:30:18.7360450Z  --metadata "${BENCHMARK_METADATA}" \ 2025-10-10T02:30:18.7360794Z  --runners "${RUNNER_INFO}" \ 2025-10-10T02:30:18.7361125Z  --dependencies "${DEPENDENCIES}" 2025-10-10T02:30:18.7361429Z fi 2025-10-10T02:30:18.7370762Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T02:30:18.7371111Z env: 2025-10-10T02:30:18.7371332Z GIT_DEFAULT_BRANCH: main 2025-10-10T02:30:18.7371652Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-10-10T02:30:18.7372176Z DOCKER_CONTAINER_ID: 0d479bf7aa1028c1efe5abc00aba7c77fea2d669ee48fe0051d50c10c6eea1cb 2025-10-10T02:30:18.7372648Z DEVICE_NAME: 2025-10-10T02:30:18.7372865Z DEVICE_TYPE: 2025-10-10T02:30:18.7373110Z BENCHMARK_RESULTS_DIR: test/test-reports 2025-10-10T02:30:18.7373407Z DRY_RUN: false 2025-10-10T02:30:18.7374608Z BENCHMARK_METADATA: {"timestamp": 1760063416, "schema_version": "v3", "name": "linux-jammy-cuda12.8-py3.10-gcc11-sm86 / test (slow, 2, 3, 
linux.g5.4xlarge.nvidia.gpu)", "repo": "pytorch/pytorch", "head_branch": "refs/heads/main", "head_sha": "344e6365a0068c2d2847fcec0c55dd53291d475e", "workflow_id": 18392306083, "run_attempt": 1, "job_id": 52406799277} 2025-10-10T02:30:18.7376307Z RUNNER_INFO: [{"cpu_info": "x86_64", "cpu_count": 16, "avail_mem_in_gb": 62, "extra_info": {"hostname": "ip-10-0-20-73.ec2.internal"}, "name": "cuda", "type": "NVIDIA A10G", "gpu_count": 1, "avail_gpu_mem_in_gb": 22}] 2025-10-10T02:30:18.7377139Z DEPENDENCIES: {} 2025-10-10T02:30:18.7377371Z ##[endgroup] 2025-10-10T02:30:18.7408719Z + [[ -n '' ]] 2025-10-10T02:30:18.7408998Z + [[ ! -d test/test-reports ]] 2025-10-10T02:30:18.7409291Z + [[ false == \t\r\u\e ]] 2025-10-10T02:30:18.7412216Z + python3 /home/ec2-user/actions-runner/_work/_actions/pytorch/test-infra/main/.github/actions/upload-benchmark-results/../../scripts/upload_benchmark_results.py --benchmark-results-dir test/test-reports --metadata '{"timestamp": 1760063416, "schema_version": "v3", "name": "linux-jammy-cuda12.8-py3.10-gcc11-sm86 / test (slow, 2, 3, linux.g5.4xlarge.nvidia.gpu)", "repo": "pytorch/pytorch", "head_branch": "refs/heads/main", "head_sha": "344e6365a0068c2d2847fcec0c55dd53291d475e", "workflow_id": 18392306083, "run_attempt": 1, "job_id": 52406799277}' --runners '[{"cpu_info": "x86_64", "cpu_count": 16, "avail_mem_in_gb": 62, "extra_info": {"hostname": "ip-10-0-20-73.ec2.internal"}, "name": "cuda", "type": "NVIDIA A10G", "gpu_count": 1, "avail_gpu_mem_in_gb": 22}]' --dependencies '{}' 2025-10-10T02:30:18.9426810Z /home/ec2-user/actions-runner/_work/_actions/pytorch/test-infra/main/.github/actions/upload-benchmark-results/../../scripts/upload_benchmark_results.py:236: UserWarning: {'included': [{'test_file': 'inductor/test_dependencies'}, {'test_file': 'inductor/test_efficient_conv_bn_eval'}, {'test_file': 'test_ops'}, {'test_file': 'test_ci_sanity_check_fail'}, {'test_file': 'test_torchfuzz_repros'}, {'test_file': 'test_opaque_obj'}, {'test_file': 
'test_testing'}, {'test_file': 'test_public_bindings'}, {'test_file': 'doctests'}, {'test_file': 'test_cpp_extensions_aot_ninja'}, {'test_file': 'test_cpp_extensions_aot_no_ninja'}, {'test_file': 'inductor/test_aot_inductor'}, {'test_file': 'inductor/test_torchinductor'}, {'test_file': 'inductor/test_triton_kernels'}, {'test_file': 'inductor/test_triton_heuristics'}, {'test_file': 'inductor/test_profiler'}, {'test_file': 'inductor/test_codecache'}, {'test_file': 'dynamo/test_structured_trace'}, {'test_file': 'inductor/test_flex_attention'}, {'test_file': 'inductor/test_torchinductor_strided_blocks'}, {'test_file': 'inductor/test_max_autotune'}, {'test_file': 'inductor/test_triton_cpu_backend'}, {'test_file': 'inductor/test_fxir_backend'}, {'test_file': 'inductor/test_best_config'}, {'test_file': 'inductor/test_torchinductor_opinfo'}, {'test_file': 'inductor/test_static_cuda_launcher'}, {'test_file': 'inductor/test_cooperative_reductions'}, {'test_file': 'inductor/test_async_compile'}, {'test_file': 'inductor/test_kernel_benchmark'}, {'test_file': 'inductor/test_cuda_repro'}, {'test_file': 'dynamo/test_callback'}, {'test_file': 'inductor/test_fp8'}, {'test_file': 'inductor/test_torchinductor_dynamic_shapes'}, {'test_file': 'inductor/test_analysis'}, {'test_file': 'inductor/test_triton_syntax'}, {'test_file': 'inductor/test_triton_extension_backend'}, {'test_file': 'inductor/test_utils'}, {'test_file': 'inductor/test_coordinate_descent_tuner'}, {'test_file': 'inductor/test_inplace_padding'}, {'test_file': 'inductor/test_template_heuristics_registry'}, {'test_file': 'inductor/test_halide'}, {'test_file': 'inductor/test_select_algorithm'}, {'test_file': 'inductor/test_extension_backend'}, {'test_file': 'inductor/test_inductor_scheduler'}, {'test_file': 'inductor/test_padding'}, {'test_file': 'inductor/test_codegen_triton'}, {'test_file': 'inductor/test_torchinductor_codegen_dynamic_shapes'}, {'test_file': 'inductor/test_triton_wrapper'}, {'test_file': 
'inductor/test_snode_runtime'}, {'test_file': 'inductor/test_pattern_matcher'}, {'test_file': 'inductor/test_cpp_wrapper_hipify'}, {'test_file': 'inductor/test_cudacodecache'}, {'test_file': 'dynamo/test_compile'}, {'test_file': 'dynamo/test_package'}, {'test_file': 'dynamo/test_utils'}, {'test_file': 'export/test_retraceability'}, {'test_file': 'inductor/test_external_callables'}, {'test_file': 'dynamo/test_dynamic_shapes'}, {'test_file': 'export/test_export_training_ir_to_run_decomp'}, {'test_file': 'inductor/test_indexing'}, {'test_file': 'inductor/test_minifier'}, {'test_file': 'inductor/test_perf'}, {'test_file': 'inductor/test_pad_mm'}, {'test_file': 'inductor/test_inductor_annotations'}, {'test_file': 'inductor/test_ck_backend'}, {'test_file': 'inductor/test_b2b_gemm'}, {'test_file': 'inductor/test_inductor_utils'}, {'test_file': 'inductor/test_op_completeness'}, {'test_file': 'inductor/test_multi_kernel'}, {'test_file': 'inductor/test_autoheuristic'}, {'test_file': 'export/test_serdes'}, {'test_file': 'inductor/test_cpu_select_algorithm'}, {'test_file': 'dynamo/test_deque_reconstruct'}, {'test_file': 'inductor/test_cuda_select_algorithm'}, {'test_file': 'export/test_strict_export_v2'}, {'test_file': 'inductor/test_deterministic'}, {'test_file': 'inductor/test_flex_decoding'}, {'test_file': 'export/test_unflatten_training_ir'}, {'test_file': 'inductor/test_aot_inductor_arrayref'}, {'test_file': 'dynamo/test_fx_passes_pre_grad'}, {'test_file': 'inductor/test_aot_inductor_windows'}, {'test_file': 'inductor/test_compiled_autograd'}, {'test_file': 'inductor/test_metrics'}, {'test_file': 'inductor/test_custom_post_grad_passes'}, {'test_file': 'inductor/test_aot_inductor_package'}, {'test_file': 'inductor/test_xpu_basic'}, {'test_file': 'inductor/test_provenance_tracing'}, {'test_file': 'inductor/test_fx_fusion'}, {'test_file': 'inductor/test_loop_ordering'}, {'test_file': 'inductor/test_benchmark_fusion'}, {'test_file': 'export/test_functionalized_assertions'}, 
{'test_file': 'inductor/test_segmented_tree'}, {'test_file': 'inductor/test_compiled_optimizers'}, {'test_file': 'inductor/test_decompose_mem_bound_mm'}, {'test_file': 'dynamo/test_base_output'}, {'test_file': 'dynamo/test_backends'}, {'test_file': 'dynamo/test_fx_graph_runnable'}, {'test_file': 'inductor/test_compile_worker'}, {'test_file': 'inductor/test_move_constructors_to_cuda'}, {'test_file': 'inductor/test_subgraph_choice'}, {'test_file': 'export/test_export_strict'}, {'test_file': 'inductor/test_cutedsl_template'}, {'test_file': 'dynamo/test_inline_and_install'}, {'test_file': 'export/test_tree_utils'}, {'test_file': 'dynamo/test_recompiles'}, {'test_file': 'dynamo/test_einops'}, {'test_file': 'inductor/test_foreach'}, {'test_file': 'inductor/test_minifier_utils'}, {'test_file': 'dynamo/test_sdpa'}, {'test_file': 'inductor/test_cutlass_backend'}, {'test_file': 'inductor/test_compile_subprocess'}, {'test_file': 'export/test_cpp_serdes'}, {'test_file': 'inductor/test_debug_trace'}, {'test_file': 'inductor/test_memory'}, {'test_file': 'dynamo/test_frame_init'}, {'test_file': 'inductor/test_kernel_optimization'}, {'test_file': 'inductor/test_combo_kernels'}, {'test_file': 'inductor/test_inplacing_pass'}, {'test_file': 'dynamo/test_skip_non_tensor'}, {'test_file': 'inductor/test_op_dtype_prop'}, {'test_file': 'dynamo/test_reconstruct'}, {'test_file': 'export/test_dynamic_shapes'}, {'test_file': 'inductor/test_remote_cache'}, {'test_file': 'dynamo/test_interop'}, {'test_file': 'inductor/test_device_assert'}, {'test_file': 'inductor/test_smoke'}, {'test_file': 'dynamo/test_skip_guard_eval_unsafe'}, {'test_file': 'export/test_tools'}, {'test_file': 'inductor/test_gpu_cpp_wrapper'}, {'test_file': 'export/test_export_with_inline_and_install'}, {'test_file': 'export/test_serialize'}, {'test_file': 'dynamo/test_functions'}, {'test_file': 'inductor/test_cpu_cpp_wrapper'}, {'test_file': 'inductor/test_benchmarking'}, {'test_file': 'inductor/test_quantization'}, 
{'test_file': 'inductor/test_aot_inductor_custom_ops'}, {'test_file': 'inductor/test_scatter_optimization'}, {'test_file': 'inductor/test_group_batch_fusion'}, {'test_file': 'inductor/test_split_cat_fx_passes'}, {'test_file': 'dynamo/test_view'}, {'test_file': 'dynamo/test_fx_annotate'}, {'test_file': 'inductor/test_control_deps'}, {'test_file': 'dynamo/test_pre_dispatch'}, {'test_file': 'dynamo/test_subgraphs'}, {'test_file': 'inductor/test_mkldnn_pattern_matcher'}, {'test_file': 'dynamo/test_decorators'}, {'test_file': 'dynamo/test_pgo'}, {'test_file': 'inductor/test_cutlass_evt'}, {'test_file': 'dynamo/test_buffers_override'}, {'test_file': 'inductor/test_online_softmax'}, {'test_file': 'inductor/test_mem_estimation'}, {'test_file': 'test_model_exports_to_core_aten'}, {'test_file': 'inductor/test_helion_kernels'}, {'test_file': 'inductor/test_aot_inductor_utils'}, {'test_file': 'export/test_package'}, {'test_file': 'dynamo/test_ctx_manager'}, {'test_file': 'inductor/test_cudagraph_trees'}, {'test_file': 'inductor/test_block_analysis'}, {'test_file': 'dynamo/test_autograd_function'}, {'test_file': 'dynamo/test_nops'}, {'test_file': 'dynamo/test_config'}, {'test_file': 'inductor/test_control_flow'}, {'test_file': 'export/test_db'}, {'test_file': 'inductor/test_unbacked_symints'}, {'test_file': 'inductor/test_fused_attention'}, {'test_file': 'dynamo/test_export_mutations'}, {'test_file': 'inductor/test_config'}, {'test_file': 'dynamo/test_guard_serialization'}, {'test_file': 'inductor/test_graph_transform_observer'}, {'test_file': 'dynamo/test_unittest'}, {'test_file': 'inductor/test_cache'}, {'test_file': 'dynamo/test_after_aot'}, {'test_file': 'inductor/test_compile'}, {'test_file': 'export/test_export_opinfo'}, {'test_file': 'inductor/test_custom_lowering'}, {'test_file': 'dynamo/test_graph_region_tracker'}, {'test_file': 'dynamo/test_dicts'}, {'test_file': 'inductor/test_fuzzer'}, {'test_file': 'dynamo/test_modules'}, {'test_file': 
'dynamo/test_metrics_context'}, {'test_file': 'dynamo/test_install_free_tensors'}, {'test_file': 'inductor/test_memory_planning'}, {'test_file': 'inductor/test_ordered_set'}, {'test_file': 'inductor/test_split_cat_fx_aten_passes'}, {'test_file': 'dynamo/test_activation_checkpointing'}, {'test_file': 'dynamo/test_compiler_bisector'}, {'test_file': 'dynamo/test_aot_compile'}, {'test_file': 'dynamo/test_modes'}, {'test_file': 'inductor/test_layout_optim'}, {'test_file': 'inductor/test_auto_functionalize'}, {'test_file': 'inductor/test_torchinductor_codegen_config_overrides'}, {'test_file': 'dynamo/test_profiler'}, {'test_file': 'dynamo/test_global'}, {'test_file': 'inductor/test_inductor_freezing'}, {'test_file': 'dynamo/test_model_output'}, {'test_file': 'export/test_torchbind'}, {'test_file': 'dynamo/test_nested_graph_breaks'}, {'test_file': 'dynamo/test_backward_higher_order_ops'}, {'test_file': 'export/test_passes'}, {'test_file': 'inductor/test_torchbind'}, {'test_file': 'inductor/test_custom_partitioner_fn'}, {'test_file': 'inductor/test_alignment'}, {'test_file': 'dynamo/test_sources'}, {'test_file': 'dynamo/test_resume'}, {'test_file': 'dynamo/test_debug_utils'}, {'test_file': 'export/test_swap'}, {'test_file': 'dynamo/test_aot_autograd_cache'}, {'test_file': 'inductor/test_binary_folding'}, {'test_file': 'dynamo/test_base_hop'}, {'test_file': 'dynamo/test_list'}, {'test_file': 'export/test_unflatten'}, {'test_file': 'inductor/test_needs_exact_strides'}, {'test_file': 'dynamo/test_verify_correctness'}, {'test_file': 'export/test_export'}, {'test_file': 'inductor/test_minifier_isolate'}, {'test_file': 'dynamo/test_logging'}, {'test_file': 'dynamo/test_deviceguard'}, {'test_file': 'dynamo/test_aot_autograd'}, {'test_file': 'inductor/test_augmented_graph_helper'}, {'test_file': 'dynamo/test_cudagraphs'}, {'test_file': 'inductor/test_caching'}, {'test_file': 'export/test_upgrader'}, {'test_file': 'dynamo/test_sets'}, {'test_file': 'dynamo/test_unspec'}, 
{'test_file': 'dynamo/test_python_dispatcher'}, {'test_file': 'dynamo/test_optimizers'}, {'test_file': 'dynamo/test_flat_apply'}, {'test_file': 'dynamo/test_higher_order_ops'}, {'test_file': 'export/test_nativert'}, {'test_file': 'inductor/test_cpu_repro'}, {'test_file': 'dynamo/test_graph_deduplication'}, {'test_file': 'dynamo/test_export'}, {'test_file': 'dynamo/test_error_messages'}, {'test_file': 'export/test_hop'}, {'test_file': 'dynamo/test_cudagraphs_expandable_segments'}, {'test_file': 'dynamo/test_recompile_ux'}, {'test_file': 'inductor/test_mmdecomp'}, {'test_file': 'dynamo/test_precompile_context'}, {'test_file': 'dynamo/test_bytecode_utils'}, {'test_file': 'export/test_pass_infra'}, {'test_file': 'dynamo/test_guard_manager'}, {'test_file': 'dynamo/test_minifier'}, {'test_file': 'export/test_converter'}, {'test_file': 'dynamo/test_torchrec'}, {'test_file': 'export/test_experimental'}, {'test_file': 'dynamo/test_input_attr_tracking'}, {'test_file': 'dynamo/test_exc'}, {'test_file': 'dynamo/test_hooks'}, {'test_file': 'dynamo/test_trace_rules'}, {'test_file': 'dynamo/test_exceptions'}, {'test_file': 'export/test_schema'}, {'test_file': 'inductor/test_distributed_patterns'}, {'test_file': 'inductor/test_mps_basic'}, {'test_file': 'inductor/test_cudagraph_trees_expandable_segments'}, {'test_file': 'dynamo/test_subclasses'}, {'test_file': 'dynamo/test_repros'}, {'test_file': 'dynamo/test_reorder_logs'}, {'test_file': 'dynamo/test_fake_distributed'}, {'test_file': 'dynamo/test_generator'}, {'test_file': 'export/test_lift_unlift'}, {'test_file': 'export/test_verifier'}, {'test_file': 'profiler/test_profiler'}, {'test_file': 'dynamo/test_misc'}, {'test_file': 'export/test_draft_export'}, {'test_file': 'export/test_sparse'}, {'test_file': 'dynamo/test_comptime'}, {'test_file': 'dynamo/test_python_autograd'}, {'test_file': 'test_torch'}, {'test_file': 'functorch/test_rearrange'}, {'test_file': 'functorch/test_parsing'}, {'test_file': 'test_package'}, {'test_file': 
'test_fx'}, {'test_file': 'test_privateuseone_python_backend'}, {'test_file': 'test_comparison_utils'}, {'test_file': 'test_mkl_verbose'}, {'test_file': 'functorch/test_ac_logging'}, {'test_file': 'test_mkldnn_verbose'}, {'test_file': 'test_cpp_api_parity'}, {'test_file': 'profiler/test_kineto'}, {'test_file': 'test_matmul_cuda'}, {'test_file': 'test_transformers'}, {'test_file': 'test_meta'}, {'test_file': 'test_hop_infra'}, {'test_file': 'test_modules'}, {'test_file': 'test_license'}, {'test_file': 'test_utils_config_module'}, {'test_file': 'test_decomp'}, {'test_file': 'xpu/test_fusion'}, {'test_file': 'test_appending_byte_serializer'}, {'test_file': 'test_autoload'}, {'test_file': 'test_rename_privateuse1_to_existing_device'}, {'test_file': 'test_proxy_tensor'}, {'test_file': 'test_ao_sparsity'}, {'test_file': 'test_cuda_expandable_segments'}, {'test_file': 'torch_np/test_binary_ufuncs'}, {'test_file': 'test_extension_utils'}, {'test_file': 'torch_np/test_unary_ufuncs'}, {'test_file': 'test_nestedtensor'}, {'test_file': 'test_functionalization'}, {'test_file': 'higher_order_ops/test_invoke_subgraph'}, {'test_file': 'test_foreach'}, {'test_file': 'higher_order_ops/test_local_map'}, {'test_file': 'test_cuda'}, {'test_file': 'nn/test_parametrization'}, {'test_file': 'test_type_hints'}, {'test_file': 'torch_np/test_dtype'}, {'test_file': 'backends/xeon/test_launch'}, {'test_file': 'test_fx_experimental'}, {'test_file': 'test_tensorexpr_pybind'}, {'test_file': 'test_file_check'}, {'test_file': 'test_expanded_weights'}, {'test_file': 'test_flop_counter'}, {'test_file': 'xpu/test_gemm'}, {'test_file': 'cpp_extensions/torch_stable_test_extension/torch_stable_test/test_torch_stable'}, {'test_file': 'nn/test_multihead_attention'}, {'test_file': 'test_show_pickle'}, {'test_file': 'test_module_tracker'}, {'test_file': 'test_fx_passes'}, {'test_file': 'typing/test_python_operators'}, {'test_file': 'functorch/test_minifier'}, {'test_file': 'test_jiterator'}, {'test_file': 
'torch_np/numpy_tests/core/test_scalarinherit'}, {'test_file': 'higher_order_ops/test_with_effects'}, {'test_file': 'torch_np/test_basic'}, {'test_file': 'test_openmp'}, {'test_file': 'higher_order_ops/test_invoke_quant'}, {'test_file': 'profiler/test_record_function'}, {'test_file': 'test_tensorexpr'}, {'test_file': 'test_native_functions'}, {'test_file': 'functorch/test_logging'}, {'test_file': 'test_jit_fuser_te'}, {'test_file': 'test_namedtensor'}, {'test_file': 'test_custom_ops'}, {'test_file': 'xpu/test_conv'}, {'test_file': 'torch_np/test_nep50_examples'}, {'test_file': 'functorch/test_ac_knapsack'}, {'test_file': 'test_weak'}, {'test_file': 'test_hub'}, {'test_file': 'lazy/test_bindings'}, {'test_file': 'test_complex'}, {'test_file': 'test_utils_filelock'}, {'test_file': 'test_content_store'}, {'test_file': 'torch_np/test_random'}, {'test_file': 'test_utils'}, {'test_file': 'lazy/test_functionalization'}, {'test_file': 'test_legacy_vmap'}, {'test_file': 'test_namedtuple_return_api'}, {'test_file': 'functorch/test_ac'}, {'test_file': 'test_set_default_mobile_cpu_allocator'}, {'test_file': 'test_typing'}, {'test_file': 'test_out_dtype_op'}, {'test_file': 'test_compile_benchmark_util'}, {'test_file': 'test_pytree'}, {'test_file': 'test_stateless'}, {'test_file': 'test_optim'}, {'test_file': 'profiler/test_cpp_thread'}, {'test_file': 'profiler/test_memory_profiler'}, {'test_file': 'test_binary_ufuncs'}, {'test_file': 'test_fx_reinplace_pass'}, {'test_file': 'torch_np/numpy_tests/core/test_einsum'}, {'test_file': 'test_multiprocessing'}, {'test_file': 'test_numpy_interop'}, {'test_file': 'lazy/test_step_closures'}, {'test_file': 'functorch/dim/test_getsetitem'}, {'test_file': 'test_per_overload_api'}, {'test_file': 'test_autograd'}, {'test_file': 'test_ops_jit'}, {'test_file': 'test_python_dispatch'}, {'test_file': 'test_jit'}, {'test_file': 'test_ops_fwd_gradients'}, {'test_file': 'functorch/test_dims'}, {'test_file': 'functorch/test_control_flow'}, 
{'test_file': 'torch_np/test_ufuncs_basic'}, {'test_file': 'test_autograd_fallback'}, {'test_file': 'test_ops_gradients'}, {'test_file': 'test_fake_tensor'}, {'test_file': 'test_functionalization_of_rng_ops'}, {'test_file': 'nn/test_packed_sequence'}, {'test_file': 'test_itt'}, {'test_file': 'test_segment_reductions'}, {'test_file': 'test_sparse_semi_structured'}, {'test_file': 'test_bundled_inputs'}, {'test_file': 'functorch/test_aot_joint_with_descriptors'}, {'test_file': 'functorch/test_ops'}, {'test_file': 'nn/test_lazy_modules'}, {'test_file': 'nn/test_pruning'}, {'test_file': 'test_pruning_op'}, {'test_file': 'test_mobile_optimizer'}, {'test_file': 'test_autocast'}, {'test_file': 'test_cuda_sanitizer'}, {'test_file': 'test_sympy_utils'}, {'test_file': 'test_cuda_multigpu'}, {'test_file': 'profiler/test_execution_trace'}, {'test_file': 'test_jit_disabled'}, {'test_file': 'test_monitor'}, {'test_file': 'functorch/test_memory_efficient_fusion'}, {'test_file': 'lazy/test_ts_opinfo'}, {'test_file': 'test_logging'}, {'test_file': 'functorch/test_vmap_registrations'}, {'test_file': 'test_masked'}, {'test_file': 'torch_np/numpy_tests/core/test_multiarray'}, {'test_file': 'test_subclass'}, {'test_file': 'test_mkldnn_fusion'}, {'test_file': 'torch_np/numpy_tests/lib/test_function_base'}, {'test_file': 'torch_np/numpy_tests/core/test_numeric'}, {'test_file': 'test_schema_check'}, {'test_file': 'test_import_stats'}, {'test_file': 'test_linalg'}, {'test_file': 'test_cpp_extensions_mtia_backend'}, {'test_file': 'optim/test_lrscheduler'}, {'test_file': 'test_dispatch'}, {'test_file': 'cpp_extensions/libtorch_agnostic_extension/test/test_libtorch_agnostic'}, {'test_file': 'cpp_extensions/python_agnostic_extension/test/test_python_agnostic'}, {'test_file': 'test_vulkan'}, {'test_file': 'torch_np/test_indexing'}, {'test_file': 'test_tensor_creation_ops'}, {'test_file': 'optim/test_swa_utils'}, {'test_file': 'nn/test_embedding'}, {'test_file': 'test_functional_optim'}, 
{'test_file': 'test_futures'}, {'test_file': 'test_cpp_extensions_stream_and_event'}, {'test_file': 'test_tensorboard'}, {'test_file': 'nn/test_dropout'}, {'test_file': 'test_maskedtensor'}, {'test_file': 'test_dynamic_shapes'}, {'test_file': 'functorch/dim/test_split'}, {'test_file': 'torch_np/numpy_tests/core/test_indexing'}, {'test_file': 'test_overrides'}, {'test_file': 'test_numba_integration'}, {'test_file': 'test_dataloader'}, {'test_file': 'test_datapipe'}, {'test_file': 'lazy/test_generator'}, {'test_file': 'torch_np/numpy_tests/lib/test_type_check'}, {'test_file': 'lazy/test_debug_util'}, {'test_file': 'test_jit_llga_fuser'}, {'test_file': 'test_numa_binding'}, {'test_file': 'torch_np/numpy_tests/lib/test_histograms'}, {'test_file': 'benchmark_utils/test_benchmark_utils'}, {'test_file': 'torch_np/numpy_tests/core/test_scalarmath'}, {'test_file': 'test_cpp_extensions_jit'}, {'test_file': 'test_indexing'}, {'test_file': 'profiler/test_torch_tidy'}, {'test_file': 'nn/test_module_hooks'}, {'test_file': 'test_native_mha'}, {'test_file': 'functorch/test_aotdispatch'}, {'test_file': 'nn/test_load_state_dict'}, {'test_file': 'torch_np/numpy_tests/linalg/test_linalg'}, {'test_file': 'test_shape_ops'}, {'test_file': 'torch_np/numpy_tests/core/test_shape_base'}, {'test_file': 'nn/test_convolution'}, {'test_file': 'torch_np/numpy_tests/core/test_dtype'}, {'test_file': 'test_unary_ufuncs'}, {'test_file': 'optim/test_optim'}, {'test_file': 'test_sparse_csr'}, {'test_file': 'test_scaled_matmul_cuda'}, {'test_file': 'test_sort_and_select'}, {'test_file': 'test_type_info'}, {'test_file': 'test_jit_autocast'}, {'test_file': 'test_xnnpack_integration'}, {'test_file': 'test_serialization'}, {'test_file': 'nn/test_pooling'}, {'test_file': 'torch_np/numpy_tests/lib/test_twodim_base'}, {'test_file': 'test_multiprocessing_spawn'}, {'test_file': 'test_function_schema'}, {'test_file': 'test_mkldnn'}, {'test_file': 'functorch/test_vmap'}, {'test_file': 
'torch_np/numpy_tests/lib/test_shape_base_'}, {'test_file': 'torch_np/numpy_tests/fft/test_pocketfft'}, {'test_file': 'test_nn'}, {'test_file': 'test_scatter_gather_ops'}, {'test_file': 'torch_np/test_ndarray_methods'}, {'test_file': 'test_view_ops'}, {'test_file': 'torch_np/numpy_tests/core/test_dlpack'}, {'test_file': 'torch_np/numpy_tests/core/test_getlimits'}, {'test_file': 'test_accelerator'}, {'test_file': 'lazy/test_reuse_ir'}, {'test_file': 'test_cuda_trace'}, {'test_file': 'torch_np/numpy_tests/lib/test_index_tricks'}, {'test_file': 'nn/test_init'}, {'test_file': 'test_cuda_nvml_based_avail'}, {'test_file': 'torch_np/numpy_tests/core/test_numerictypes'}, {'test_file': 'test_type_promotion'}, {'test_file': 'torch_np/numpy_tests/core/test_scalar_methods'}, {'test_file': 'torch_np/numpy_tests/fft/test_helper'}, {'test_file': 'torch_np/test_function_base'}, {'test_file': 'profiler/test_profiler_tree'}, {'test_file': 'functorch/test_eager_transforms'}, {'test_file': 'test_reductions'}, {'test_file': 'test_sparse'}, {'test_file': 'test_cuda_primary_ctx'}, {'test_file': 'test_dlpack'}, {'test_file': 'torch_np/numpy_tests/lib/test_arraysetops'}, {'test_file': 'torch_np/test_scalars_0D_arrays'}, {'test_file': 'torch_np/test_reductions'}, {'test_file': 'test_prims'}, {'test_file': 'torch_np/numpy_tests/core/test_scalar_ctors'}, {'test_file': 'test_functional_autograd_benchmark'}, {'test_file': 'torch_np/numpy_tests/lib/test_arraypad'}, {'test_file': 'test_spectral_ops'}, {'test_file': 'profiler/test_python_tracer'}, {'test_file': 'distributions/test_distributions'}, {'test_file': 'test_autoload_enable'}, {'test_file': 'test_autoload_disable'}, {'test_file': 'test_openreg'}], 'excluded': []} from test/test-reports/td_exclusions-39c139e5f89b1ec3c657.json is not a benchmark record, skipping 2025-10-10T02:30:18.9490048Z warn(f"{result} from {filepath} is not a benchmark record, skipping") 2025-10-10T02:30:18.9609141Z ##[group]Run cat test/**/*_toprint.log || true 
2025-10-10T02:30:18.9609524Z cat test/**/*_toprint.log || true 2025-10-10T02:30:18.9618004Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T02:30:18.9618366Z env: 2025-10-10T02:30:18.9618586Z GIT_DEFAULT_BRANCH: main 2025-10-10T02:30:18.9618903Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-10-10T02:30:18.9619437Z DOCKER_CONTAINER_ID: 0d479bf7aa1028c1efe5abc00aba7c77fea2d669ee48fe0051d50c10c6eea1cb 2025-10-10T02:30:18.9619917Z DEVICE_NAME: 2025-10-10T02:30:18.9620141Z DEVICE_TYPE: 2025-10-10T02:30:18.9620366Z ##[endgroup] 2025-10-10T02:30:18.9769191Z cat: 'test/**/*_toprint.log': No such file or directory 2025-10-10T02:30:18.9863671Z ##[group]Run kill "$MONITOR_SCRIPT_PID" 2025-10-10T02:30:18.9864019Z kill "$MONITOR_SCRIPT_PID" 2025-10-10T02:30:18.9872152Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T02:30:18.9872500Z env: 2025-10-10T02:30:18.9872709Z GIT_DEFAULT_BRANCH: main 2025-10-10T02:30:18.9873026Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-10-10T02:30:18.9873548Z DOCKER_CONTAINER_ID: 0d479bf7aa1028c1efe5abc00aba7c77fea2d669ee48fe0051d50c10c6eea1cb 2025-10-10T02:30:18.9874126Z DEVICE_NAME: 2025-10-10T02:30:18.9874351Z DEVICE_TYPE: 2025-10-10T02:30:18.9874586Z MONITOR_SCRIPT_PID: 59442 2025-10-10T02:30:18.9874842Z ##[endgroup] 2025-10-10T02:30:19.0025818Z Prepare all required actions 2025-10-10T02:30:19.0026207Z Getting action download info 2025-10-10T02:30:19.1627131Z Download action repository 'seemethere/upload-artifact-s3@v5' (SHA:baba72d0712b404f646cebe0730933554ebce96a) 2025-10-10T02:30:19.8060244Z Download action repository 'actions/upload-artifact@v4' (SHA:ea165f8d65b6e75b540449e92b4886f43607fa02) 2025-10-10T02:30:21.8071551Z ##[group]Run ./.github/actions/upload-test-artifacts 2025-10-10T02:30:21.8071902Z with: 2025-10-10T02:30:21.8072246Z file-suffix: test-slow-2-3-linux.g5.4xlarge.nvidia.gpu_52406799277 2025-10-10T02:30:21.8072671Z s3-bucket: gha-artifacts 
2025-10-10T02:30:21.8072929Z env: 2025-10-10T02:30:21.8073147Z GIT_DEFAULT_BRANCH: main 2025-10-10T02:30:21.8073482Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-10-10T02:30:21.8074034Z DOCKER_CONTAINER_ID: 0d479bf7aa1028c1efe5abc00aba7c77fea2d669ee48fe0051d50c10c6eea1cb 2025-10-10T02:30:21.8074515Z DEVICE_NAME: 2025-10-10T02:30:21.8074778Z DEVICE_TYPE: 2025-10-10T02:30:21.8075001Z ##[endgroup] 2025-10-10T02:30:21.8190670Z ##[group]Run # Remove any previous test jsons if they exist 2025-10-10T02:30:21.8191101Z # Remove any previous test jsons if they exist 2025-10-10T02:30:21.8191457Z rm -f test-jsons-*.zip 2025-10-10T02:30:21.8191973Z zip -r "test-jsons-${FILE_SUFFIX}.zip" test/test-reports -i '*.json' 2025-10-10T02:30:21.8201651Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T02:30:21.8201999Z env: 2025-10-10T02:30:21.8202206Z GIT_DEFAULT_BRANCH: main 2025-10-10T02:30:21.8202522Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-10-10T02:30:21.8203043Z DOCKER_CONTAINER_ID: 0d479bf7aa1028c1efe5abc00aba7c77fea2d669ee48fe0051d50c10c6eea1cb 2025-10-10T02:30:21.8203530Z DEVICE_NAME: 2025-10-10T02:30:21.8203752Z DEVICE_TYPE: 2025-10-10T02:30:21.8204086Z FILE_SUFFIX: test-slow-2-3-linux.g5.4xlarge.nvidia.gpu_52406799277 2025-10-10T02:30:21.8204473Z ##[endgroup] 2025-10-10T02:30:21.9060661Z adding: test/test-reports/td_exclusions-39c139e5f89b1ec3c657.json (deflated 82%) 2025-10-10T02:30:21.9175067Z ##[group]Run # Remove any previous test reports if they exist 2025-10-10T02:30:21.9175520Z # Remove any previous test reports if they exist 2025-10-10T02:30:21.9175895Z rm -f test-reports-*.zip 2025-10-10T02:30:21.9176349Z zip -r "test-reports-${FILE_SUFFIX}.zip" test/test-reports -i '*.xml' -i '*.csv' 2025-10-10T02:30:21.9185311Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T02:30:21.9185668Z env: 2025-10-10T02:30:21.9185885Z GIT_DEFAULT_BRANCH: main 2025-10-10T02:30:21.9186208Z GPU_FLAG: --gpus all -e 
NVIDIA_DRIVER_CAPABILITIES=all 2025-10-10T02:30:21.9186800Z DOCKER_CONTAINER_ID: 0d479bf7aa1028c1efe5abc00aba7c77fea2d669ee48fe0051d50c10c6eea1cb 2025-10-10T02:30:21.9187291Z DEVICE_NAME: 2025-10-10T02:30:21.9187521Z DEVICE_TYPE: 2025-10-10T02:30:21.9187870Z FILE_SUFFIX: test-slow-2-3-linux.g5.4xlarge.nvidia.gpu_52406799277 2025-10-10T02:30:21.9188263Z ##[endgroup] 2025-10-10T02:30:21.9461038Z adding: test/test-reports/python-pytest/inductor.test_dependencies/inductor.test_dependencies-4b7066fe46199c39.xml (deflated 27%) 2025-10-10T02:30:21.9462109Z adding: test/test-reports/python-pytest/inductor.test_dependencies/inductor.test_dependencies-5660debae7064c9d.xml (deflated 82%) 2025-10-10T02:30:21.9462991Z adding: test/test-reports/python-pytest/test_ops/test_ops-153fd416e97f907a.xml (deflated 28%) 2025-10-10T02:30:22.0221747Z adding: test/test-reports/python-pytest/test_ops/test_ops-26bdbc90861806fe.xml (deflated 98%) 2025-10-10T02:30:22.0222597Z adding: test/test-reports/python-pytest/test_torchfuzz_repros/test_torchfuzz_repros-26d55b0c6b42689f.xml (deflated 28%) 2025-10-10T02:30:22.0223498Z adding: test/test-reports/python-pytest/test_torchfuzz_repros/test_torchfuzz_repros-e63437035263cf6f.xml (deflated 87%) 2025-10-10T02:30:22.0224732Z adding: test/test-reports/python-pytest/test_opaque_obj/test_opaque_obj-f16007ccf588725f.xml (deflated 28%) 2025-10-10T02:30:22.0225542Z adding: test/test-reports/python-pytest/test_opaque_obj/test_opaque_obj-9737bcf04884f0b7.xml (deflated 85%) 2025-10-10T02:30:22.0226340Z adding: test/test-reports/python-pytest/test_opaque_obj/test_opaque_obj-fc7762df66962a95.xml (deflated 43%) 2025-10-10T02:30:22.0227145Z adding: test/test-reports/python-pytest/test_opaque_obj/test_opaque_obj-7b38027311ebcd3d.xml (deflated 85%) 2025-10-10T02:30:22.0227939Z adding: test/test-reports/python-pytest/test_opaque_obj/test_opaque_obj-d7344d8b55ee5f0d.xml (deflated 44%) 2025-10-10T02:30:22.0228733Z adding: 
test/test-reports/python-pytest/test_opaque_obj/test_opaque_obj-0ab798b2fefbd685.xml (deflated 85%) 2025-10-10T02:30:22.0229530Z adding: test/test-reports/python-pytest/test_opaque_obj/test_opaque_obj-635f7d94067fd76e.xml (deflated 43%) 2025-10-10T02:30:22.0230313Z adding: test/test-reports/python-pytest/test_opaque_obj/test_opaque_obj-245949ab9a0dd00e.xml (deflated 44%) 2025-10-10T02:30:22.0231106Z adding: test/test-reports/python-pytest/test_testing/test_testing-e6fc338eed1c2a0b.xml (deflated 28%) 2025-10-10T02:30:22.0277219Z adding: test/test-reports/python-pytest/test_testing/test_testing-cf9f108b60a282e0.xml (deflated 98%) 2025-10-10T02:30:22.0278076Z adding: test/test-reports/python-pytest/test_public_bindings/test_public_bindings-bac8245ef9ff336b.xml (deflated 28%) 2025-10-10T02:30:22.0279095Z adding: test/test-reports/python-pytest/test_public_bindings/test_public_bindings-183dbb9b8b90f2a1.xml (deflated 77%) 2025-10-10T02:30:22.0280035Z adding: test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-e54e74e2d063ebc1.xml (deflated 28%) 2025-10-10T02:30:22.0304503Z adding: test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-255c6dabde49ae1c.xml (deflated 96%) 2025-10-10T02:30:22.0305540Z adding: test/test-reports/python-pytest/inductor.test_torchinductor/inductor.test_torchinductor-31abbccf44e117e9.xml (deflated 45%) 2025-10-10T02:30:22.0331067Z adding: test/test-reports/python-pytest/inductor.test_torchinductor/inductor.test_torchinductor-fe21fa78af285fe1.xml (deflated 97%) 2025-10-10T02:30:22.0332416Z adding: test/test-reports/python-pytest/inductor.test_torchinductor_opinfo/inductor.test_torchinductor_opinfo-1f9f5bf35bea1f66.xml (deflated 28%) 2025-10-10T02:30:22.0333807Z adding: test/test-reports/python-pytest/inductor.test_torchinductor_opinfo/inductor.test_torchinductor_opinfo-74c42cdf4ad6e8c0.xml (deflated 29%) 2025-10-10T02:30:22.0335184Z adding: 
test/test-reports/python-pytest/inductor.test_torchinductor_opinfo/inductor.test_torchinductor_opinfo-ffd572b1f7fc351d.xml (deflated 28%) 2025-10-10T02:30:22.0336583Z adding: test/test-reports/python-pytest/inductor.test_torchinductor_opinfo/inductor.test_torchinductor_opinfo-11bbc75038e605c6.xml (deflated 28%) 2025-10-10T02:30:22.0337985Z adding: test/test-reports/python-pytest/inductor.test_torchinductor_opinfo/inductor.test_torchinductor_opinfo-798065ede04bd44a.xml (deflated 28%) 2025-10-10T02:30:22.0341575Z adding: test/test-reports/python-pytest/inductor.test_torchinductor_opinfo/inductor.test_torchinductor_opinfo-5fc96bb3bd666430.xml (deflated 97%) 2025-10-10T02:30:22.0350861Z adding: test/test-reports/python-pytest/inductor.test_torchinductor_opinfo/inductor.test_torchinductor_opinfo-e5519ff255c14020.xml (deflated 98%) 2025-10-10T02:30:22.0358493Z adding: test/test-reports/python-pytest/inductor.test_torchinductor_opinfo/inductor.test_torchinductor_opinfo-0ef0ab4212982b86.xml (deflated 98%) 2025-10-10T02:30:22.0367442Z adding: test/test-reports/python-pytest/inductor.test_torchinductor_opinfo/inductor.test_torchinductor_opinfo-19490f9e37dfa988.xml (deflated 98%) 2025-10-10T02:30:22.0375436Z adding: test/test-reports/python-pytest/inductor.test_torchinductor_opinfo/inductor.test_torchinductor_opinfo-3843c559f7694d3b.xml (deflated 98%) 2025-10-10T02:30:22.0376857Z adding: test/test-reports/python-pytest/inductor.test_static_cuda_launcher/inductor.test_static_cuda_launcher-6db5f7b8633f0085.xml (deflated 28%) 2025-10-10T02:30:22.0377962Z adding: test/test-reports/python-pytest/inductor.test_static_cuda_launcher/inductor.test_static_cuda_launcher-5d809f052ac18fb5.xml (deflated 91%) 2025-10-10T02:30:22.0379089Z adding: test/test-reports/python-pytest/inductor.test_cooperative_reductions/inductor.test_cooperative_reductions-675d18a1b1ad4605.xml (deflated 27%) 2025-10-10T02:30:22.0380983Z adding: 
test/test-reports/python-pytest/inductor.test_cooperative_reductions/inductor.test_cooperative_reductions-1386c5f11227e0ba.xml (deflated 98%) 2025-10-10T02:30:22.0382078Z adding: test/test-reports/python-pytest/inductor.test_async_compile/inductor.test_async_compile-c3aff3bddbf17c62.xml (deflated 28%) 2025-10-10T02:30:22.0383085Z adding: test/test-reports/python-pytest/inductor.test_async_compile/inductor.test_async_compile-75eb7c09097b5c01.xml (deflated 87%) 2025-10-10T02:30:22.0384134Z adding: test/test-reports/python-pytest/inductor.test_kernel_benchmark/inductor.test_kernel_benchmark-c61d22f6d6d5f2c5.xml (deflated 28%) 2025-10-10T02:30:22.0385195Z adding: test/test-reports/python-pytest/inductor.test_kernel_benchmark/inductor.test_kernel_benchmark-e0b3b80f7bbd9580.xml (deflated 90%) 2025-10-10T02:30:22.0386191Z adding: test/test-reports/python-pytest/inductor.test_cuda_repro/inductor.test_cuda_repro-235295736b5901a2.xml (deflated 28%) 2025-10-10T02:30:22.0387231Z adding: test/test-reports/python-pytest/inductor.test_cuda_repro/inductor.test_cuda_repro-75cd4e5eff65d4a8.xml (deflated 94%) 2025-10-10T02:30:22.0388167Z adding: test/test-reports/python-pytest/dynamo.test_callback/dynamo.test_callback-347acc17997e932d.xml (deflated 28%) 2025-10-10T02:30:22.0389078Z adding: test/test-reports/python-pytest/dynamo.test_callback/dynamo.test_callback-36cdb91df204ede1.xml (deflated 78%) 2025-10-10T02:30:22.0389963Z adding: test/test-reports/python-pytest/inductor.test_fp8/inductor.test_fp8-3d747be1da5c42ba.xml (deflated 28%) 2025-10-10T02:30:22.0394271Z adding: test/test-reports/python-pytest/inductor.test_fp8/inductor.test_fp8-b1cfa20e6c9b68ff.xml (deflated 98%) 2025-10-10T02:30:22.0395330Z adding: test/test-reports/python-pytest/inductor.test_torchinductor_dynamic_shapes/inductor.test_torchinductor_dynamic_shapes-eb8985d8e80ac1ba.xml (deflated 66%) 2025-10-10T02:30:22.0418590Z adding: 
test/test-reports/python-pytest/inductor.test_torchinductor_dynamic_shapes/inductor.test_torchinductor_dynamic_shapes-6c495ad3fbb50877.xml (deflated 96%) 2025-10-10T02:30:22.0419686Z adding: test/test-reports/python-pytest/inductor.test_analysis/inductor.test_analysis-016d485796707ffc.xml (deflated 28%) 2025-10-10T02:30:22.0420622Z adding: test/test-reports/python-pytest/inductor.test_analysis/inductor.test_analysis-544a89913d306103.xml (deflated 94%) 2025-10-10T02:30:22.0421602Z adding: test/test-reports/python-pytest/inductor.test_triton_syntax/inductor.test_triton_syntax-d88ac7b1f1bb44c1.xml (deflated 27%) 2025-10-10T02:30:22.0422616Z adding: test/test-reports/python-pytest/inductor.test_triton_syntax/inductor.test_triton_syntax-b4ebeb8d71cb1ed4.xml (deflated 45%) 2025-10-10T02:30:22.0423701Z adding: test/test-reports/python-pytest/inductor.test_triton_extension_backend/inductor.test_triton_extension_backend-69a73c63193d9db6.xml (deflated 28%) 2025-10-10T02:30:22.0424861Z adding: test/test-reports/python-pytest/inductor.test_triton_extension_backend/inductor.test_triton_extension_backend-14bd2ae2c44b0c53.xml (deflated 28%) 2025-10-10T02:30:22.0425896Z adding: test/test-reports/python-pytest/inductor.test_utils/inductor.test_utils-5335d43eab2e13d8.xml (deflated 28%) 2025-10-10T02:30:22.0426784Z adding: test/test-reports/python-pytest/inductor.test_utils/inductor.test_utils-13a5a0a687a66d53.xml (deflated 85%) 2025-10-10T02:30:22.0427816Z adding: test/test-reports/python-pytest/inductor.test_coordinate_descent_tuner/inductor.test_coordinate_descent_tuner-d41f6ba8b6df46e7.xml (deflated 28%) 2025-10-10T02:30:22.0429200Z adding: test/test-reports/python-pytest/inductor.test_coordinate_descent_tuner/inductor.test_coordinate_descent_tuner-244027fcb69c10d1.xml (deflated 82%) 2025-10-10T02:30:22.0430304Z adding: test/test-reports/python-pytest/inductor.test_inplace_padding/inductor.test_inplace_padding-db48dd8514a2fff4.xml (deflated 45%) 2025-10-10T02:30:22.0431342Z 
adding: test/test-reports/python-pytest/inductor.test_inplace_padding/inductor.test_inplace_padding-13c6240cd5686cd3.xml (deflated 83%) 2025-10-10T02:30:22.0432476Z adding: test/test-reports/python-pytest/inductor.test_template_heuristics_registry/inductor.test_template_heuristics_registry-12086d63aa19b7f9.xml (deflated 28%) 2025-10-10T02:30:22.0433723Z adding: test/test-reports/python-pytest/inductor.test_template_heuristics_registry/inductor.test_template_heuristics_registry-fc71b63e6cc8ca46.xml (deflated 82%) 2025-10-10T02:30:22.0434893Z adding: test/test-reports/python-pytest/inductor.test_select_algorithm/inductor.test_select_algorithm-165bf9605c405ed5.xml (deflated 29%) 2025-10-10T02:30:22.0435939Z adding: test/test-reports/python-pytest/inductor.test_select_algorithm/inductor.test_select_algorithm-e88483c9c840d1f5.xml (deflated 93%) 2025-10-10T02:30:22.0437046Z adding: test/test-reports/python-pytest/inductor.test_extension_backend/inductor.test_extension_backend-526c1eadd4c5429d.xml (deflated 28%) 2025-10-10T02:30:22.0438166Z adding: test/test-reports/python-pytest/inductor.test_extension_backend/inductor.test_extension_backend-fc22616d192f7dc7.xml (deflated 58%) 2025-10-10T02:30:22.0439238Z adding: test/test-reports/python-pytest/inductor.test_inductor_scheduler/inductor.test_inductor_scheduler-b84e8d38dc8f6093.xml (deflated 28%) 2025-10-10T02:30:22.0440320Z adding: test/test-reports/python-pytest/inductor.test_inductor_scheduler/inductor.test_inductor_scheduler-f85c90115262d487.xml (deflated 81%) 2025-10-10T02:30:22.0441312Z adding: test/test-reports/python-pytest/inductor.test_padding/inductor.test_padding-e693003044d29854.xml (deflated 65%) 2025-10-10T02:30:22.0442231Z adding: test/test-reports/python-pytest/inductor.test_padding/inductor.test_padding-975b19434aea76a7.xml (deflated 93%) 2025-10-10T02:30:22.0443205Z adding: test/test-reports/python-pytest/inductor.test_codegen_triton/inductor.test_codegen_triton-0ba3cc562562460c.xml (deflated 28%) 
2025-10-10T02:30:22.0444220Z adding: test/test-reports/python-pytest/inductor.test_codegen_triton/inductor.test_codegen_triton-62d3def9ec13ab0d.xml (deflated 45%) 2025-10-10T02:30:22.0445404Z adding: test/test-reports/python-pytest/inductor.test_torchinductor_codegen_dynamic_shapes/inductor.test_torchinductor_codegen_dynamic_shapes-79a5b6b3cf2660bf.xml (deflated 29%) 2025-10-10T02:30:22.0458272Z adding: test/test-reports/python-pytest/inductor.test_torchinductor_codegen_dynamic_shapes/inductor.test_torchinductor_codegen_dynamic_shapes-084631b21e175db2.xml (deflated 96%) 2025-10-10T02:30:22.0459562Z adding: test/test-reports/python-pytest/export.test_export_training_ir_to_run_decomp/export.test_export_training_ir_to_run_decomp-bf020b2a199ca86e.xml (deflated 28%) 2025-10-10T02:30:22.0485861Z adding: test/test-reports/python-pytest/export.test_export_training_ir_to_run_decomp/export.test_export_training_ir_to_run_decomp-beb303251857afa2.xml (deflated 96%) 2025-10-10T02:30:22.0486944Z adding: test/test-reports/python-pytest/inductor.test_indexing/inductor.test_indexing-b4aca46783ab2774.xml (deflated 29%) 2025-10-10T02:30:22.0487937Z adding: test/test-reports/python-pytest/inductor.test_indexing/inductor.test_indexing-1e90d010c10e6369.xml (deflated 93%) 2025-10-10T02:30:22.0488873Z adding: test/test-reports/python-pytest/inductor.test_minifier/inductor.test_minifier-daafe270927d12d5.xml (deflated 28%) 2025-10-10T02:30:22.0489813Z adding: test/test-reports/python-pytest/inductor.test_minifier/inductor.test_minifier-051ec8c08e37e23c.xml (deflated 91%) 2025-10-10T02:30:22.0490787Z adding: test/test-reports/python-pytest/inductor.test_perf/inductor.test_perf-ed2925961fabd754.xml (deflated 28%) 2025-10-10T02:30:22.0491776Z adding: test/test-reports/python-pytest/inductor.test_perf/inductor.test_perf-0687fd152a565684.xml (deflated 95%) 2025-10-10T02:30:22.0492659Z adding: test/test-reports/python-pytest/inductor.test_pad_mm/inductor.test_pad_mm-3fdc799a50d8680b.xml (deflated 
28%) 2025-10-10T02:30:22.0493544Z adding: test/test-reports/python-pytest/inductor.test_pad_mm/inductor.test_pad_mm-14725a1515df825c.xml (deflated 92%) 2025-10-10T02:30:22.0494547Z adding: test/test-reports/python-pytest/inductor.test_inductor_annotations/inductor.test_inductor_annotations-558ef0def864fc50.xml (deflated 28%) 2025-10-10T02:30:22.0507863Z adding: test/test-reports/python-pytest/inductor.test_inductor_annotations/inductor.test_inductor_annotations-02a272a72f67f043.xml (deflated 67%) 2025-10-10T02:30:22.0508988Z adding: test/test-reports/python-pytest/inductor.test_ck_backend/inductor.test_ck_backend-783b55162668161d.xml (deflated 28%) 2025-10-10T02:30:22.0509943Z adding: test/test-reports/python-pytest/inductor.test_ck_backend/inductor.test_ck_backend-9338c54623c67297.xml (deflated 94%) 2025-10-10T02:30:22.0510956Z adding: test/test-reports/python-pytest/inductor.test_inductor_utils/inductor.test_inductor_utils-a795f8737e2774e8.xml (deflated 28%) 2025-10-10T02:30:22.0511981Z adding: test/test-reports/python-pytest/inductor.test_inductor_utils/inductor.test_inductor_utils-2e778dcabac2bb81.xml (deflated 64%) 2025-10-10T02:30:22.0513126Z adding: test/test-reports/python-pytest/inductor.test_op_completeness/inductor.test_op_completeness-a75c6ebd13548147.xml (deflated 28%) 2025-10-10T02:30:22.0514170Z adding: test/test-reports/python-pytest/inductor.test_op_completeness/inductor.test_op_completeness-16461d9f03e94c05.xml (deflated 80%) 2025-10-10T02:30:22.0515187Z adding: test/test-reports/python-pytest/inductor.test_multi_kernel/inductor.test_multi_kernel-a0d1cdcb8d2b8f20.xml (deflated 28%) 2025-10-10T02:30:22.0516182Z adding: test/test-reports/python-pytest/inductor.test_multi_kernel/inductor.test_multi_kernel-a5fe25268d5bf91c.xml (deflated 92%) 2025-10-10T02:30:22.0517252Z adding: test/test-reports/python-pytest/inductor.test_autoheuristic/inductor.test_autoheuristic-190df5790bede2b3.xml (deflated 28%) 2025-10-10T02:30:22.0518278Z adding: 
test/test-reports/python-pytest/inductor.test_autoheuristic/inductor.test_autoheuristic-6e63983d4928d42c.xml (deflated 27%) 2025-10-10T02:30:22.0519232Z adding: test/test-reports/python-pytest/export.test_serdes/export.test_serdes-bd738fc4737a93b1.xml (deflated 28%) 2025-10-10T02:30:22.0523439Z adding: test/test-reports/python-pytest/export.test_serdes/export.test_serdes-555a65f1ed8ca2cb.xml (deflated 96%) 2025-10-10T02:30:22.0524407Z adding: test/test-reports/python-pytest/dynamo.test_deque_reconstruct/dynamo.test_deque_reconstruct-ee22e49d669fd011.xml (deflated 28%) 2025-10-10T02:30:22.0525450Z adding: test/test-reports/python-pytest/dynamo.test_deque_reconstruct/dynamo.test_deque_reconstruct-f534b1f6de57853e.xml (deflated 75%) 2025-10-10T02:30:22.0526543Z adding: test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-8442f772251b1e0c.xml (deflated 28%) 2025-10-10T02:30:22.0527756Z adding: test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-c0afe0a260234c65.xml (deflated 97%) 2025-10-10T02:30:22.0528825Z adding: test/test-reports/python-pytest/export.test_strict_export_v2/export.test_strict_export_v2-d3346334f93c748a.xml (deflated 28%) 2025-10-10T02:30:22.0539493Z adding: test/test-reports/python-pytest/export.test_strict_export_v2/export.test_strict_export_v2-65fbfe492ec8afc0.xml (deflated 96%) 2025-10-10T02:30:22.0540524Z adding: test/test-reports/python-pytest/inductor.test_deterministic/inductor.test_deterministic-56dcb5b397e086aa.xml (deflated 29%) 2025-10-10T02:30:22.0541659Z adding: test/test-reports/python-pytest/inductor.test_deterministic/inductor.test_deterministic-89cd92b287ea5750.xml (deflated 86%) 2025-10-10T02:30:22.0542795Z adding: test/test-reports/python-pytest/inductor.test_flex_decoding/inductor.test_flex_decoding-9cbda54d804d2498.xml (deflated 80%) 2025-10-10T02:30:22.0554016Z adding: 
test/test-reports/python-pytest/inductor.test_flex_decoding/inductor.test_flex_decoding-ea937a556f6da1d0.xml (deflated 98%) 2025-10-10T02:30:22.0555090Z adding: test/test-reports/python-pytest/export.test_unflatten_training_ir/export.test_unflatten_training_ir-b1e74d9ef47cb62b.xml (deflated 28%) 2025-10-10T02:30:22.0556184Z adding: test/test-reports/python-pytest/export.test_unflatten_training_ir/export.test_unflatten_training_ir-2f95045ef7f703de.xml (deflated 93%) 2025-10-10T02:30:22.0557341Z adding: test/test-reports/python-pytest/inductor.test_aot_inductor_arrayref/inductor.test_aot_inductor_arrayref-be8bedfd35fb2a04.xml (deflated 28%) 2025-10-10T02:30:22.0565036Z adding: test/test-reports/python-pytest/inductor.test_aot_inductor_arrayref/inductor.test_aot_inductor_arrayref-7d76b1fd4f153308.xml (deflated 97%) 2025-10-10T02:30:22.0566125Z adding: test/test-reports/python-pytest/dynamo.test_fx_passes_pre_grad/dynamo.test_fx_passes_pre_grad-72fb62f0dbf928dc.xml (deflated 28%) 2025-10-10T02:30:22.0567240Z adding: test/test-reports/python-pytest/dynamo.test_fx_passes_pre_grad/dynamo.test_fx_passes_pre_grad-31370142e45317a4.xml (deflated 44%) 2025-10-10T02:30:22.0568361Z adding: test/test-reports/python-pytest/inductor.test_aot_inductor_windows/inductor.test_aot_inductor_windows-496e08c87bd9c219.xml (deflated 28%) 2025-10-10T02:30:22.0569462Z adding: test/test-reports/python-pytest/inductor.test_aot_inductor_windows/inductor.test_aot_inductor_windows-e774e9b30b3811a9.xml (deflated 46%) 2025-10-10T02:30:22.0570551Z adding: test/test-reports/python-pytest/inductor.test_compiled_autograd/inductor.test_compiled_autograd-f0fb2e61210a8c85.xml (deflated 28%) 2025-10-10T02:30:22.0580704Z adding: test/test-reports/python-pytest/inductor.test_compiled_autograd/inductor.test_compiled_autograd-fe430f83a861b80c.xml (deflated 95%) 2025-10-10T02:30:22.0581714Z adding: test/test-reports/python-pytest/inductor.test_metrics/inductor.test_metrics-8730ac64254f3027.xml (deflated 28%) 
2025-10-10T02:30:22.0582624Z adding: test/test-reports/python-pytest/inductor.test_metrics/inductor.test_metrics-9729e358309edf59.xml (deflated 83%) 2025-10-10T02:30:22.0583659Z adding: test/test-reports/python-pytest/inductor.test_custom_post_grad_passes/inductor.test_custom_post_grad_passes-d69c905b1f878300.xml (deflated 29%) 2025-10-10T02:30:22.0584802Z adding: test/test-reports/python-pytest/inductor.test_custom_post_grad_passes/inductor.test_custom_post_grad_passes-7227800257933c76.xml (deflated 85%) 2025-10-10T02:30:22.0585915Z adding: test/test-reports/python-pytest/inductor.test_aot_inductor_package/inductor.test_aot_inductor_package-db7a843cadbc5960.xml (deflated 29%) 2025-10-10T02:30:22.0587016Z adding: test/test-reports/python-pytest/inductor.test_aot_inductor_package/inductor.test_aot_inductor_package-7e52784b1c1e392a.xml (deflated 97%) 2025-10-10T02:30:22.0588122Z adding: test/test-reports/python-pytest/inductor.test_provenance_tracing/inductor.test_provenance_tracing-67d7ab647c4d99c9.xml (deflated 29%) 2025-10-10T02:30:22.0589219Z adding: test/test-reports/python-pytest/inductor.test_provenance_tracing/inductor.test_provenance_tracing-fab4bc337e1ccf71.xml (deflated 88%) 2025-10-10T02:30:22.0590234Z adding: test/test-reports/python-pytest/inductor.test_fx_fusion/inductor.test_fx_fusion-2d89a075bc2674c8.xml (deflated 28%) 2025-10-10T02:30:22.0591185Z adding: test/test-reports/python-pytest/inductor.test_fx_fusion/inductor.test_fx_fusion-770ebb0d62d49fc8.xml (deflated 78%) 2025-10-10T02:30:22.0592163Z adding: test/test-reports/python-pytest/inductor.test_loop_ordering/inductor.test_loop_ordering-6020a7dafb0b41c8.xml (deflated 28%) 2025-10-10T02:30:22.0593174Z adding: test/test-reports/python-pytest/inductor.test_loop_ordering/inductor.test_loop_ordering-6e3a09499ce0fcb7.xml (deflated 94%) 2025-10-10T02:30:22.0594443Z adding: test/test-reports/python-pytest/export.test_functionalized_assertions/export.test_functionalized_assertions-6ee5512f4a8a4875.xml 
(deflated 28%) 2025-10-10T02:30:22.0595629Z adding: test/test-reports/python-pytest/export.test_functionalized_assertions/export.test_functionalized_assertions-4a9ad6439f53b898.xml (deflated 66%) 2025-10-10T02:30:22.0596779Z adding: test/test-reports/python-pytest/inductor.test_segmented_tree/inductor.test_segmented_tree-feec0076f2dc29cb.xml (deflated 28%) 2025-10-10T02:30:22.0597810Z adding: test/test-reports/python-pytest/inductor.test_segmented_tree/inductor.test_segmented_tree-6f80472f73a980cf.xml (deflated 89%) 2025-10-10T02:30:22.0599101Z adding: test/test-reports/python-pytest/inductor.test_compiled_optimizers/inductor.test_compiled_optimizers-b33a12afc3b44ca1.xml (deflated 28%) 2025-10-10T02:30:22.0611550Z adding: test/test-reports/python-pytest/inductor.test_compiled_optimizers/inductor.test_compiled_optimizers-8265bc7cce28c283.xml (deflated 98%) 2025-10-10T02:30:22.0612681Z adding: test/test-reports/python-pytest/inductor.test_decompose_mem_bound_mm/inductor.test_decompose_mem_bound_mm-9899c9e787795eda.xml (deflated 28%) 2025-10-10T02:30:22.0613807Z adding: test/test-reports/python-pytest/inductor.test_decompose_mem_bound_mm/inductor.test_decompose_mem_bound_mm-e82bb81ed02c5ecc.xml (deflated 96%) 2025-10-10T02:30:22.0614939Z adding: test/test-reports/python-pytest/dynamo.test_base_output/dynamo.test_base_output-9a60d41690a3b377.xml (deflated 29%) 2025-10-10T02:30:22.0615883Z adding: test/test-reports/python-pytest/dynamo.test_base_output/dynamo.test_base_output-92a6740affbfcfbc.xml (deflated 82%) 2025-10-10T02:30:22.0616817Z adding: test/test-reports/python-pytest/dynamo.test_backends/dynamo.test_backends-f60e0997ddcf41f1.xml (deflated 28%) 2025-10-10T02:30:22.0617729Z adding: test/test-reports/python-pytest/dynamo.test_backends/dynamo.test_backends-1560677a252d174c.xml (deflated 89%) 2025-10-10T02:30:22.0618699Z adding: test/test-reports/python-pytest/dynamo.test_fx_graph_runnable/dynamo.test_fx_graph_runnable-4b1736901e9e42ee.xml (deflated 28%) 
2025-10-10T02:30:22.0619727Z adding: test/test-reports/python-pytest/dynamo.test_fx_graph_runnable/dynamo.test_fx_graph_runnable-57cf965c8186284e.xml (deflated 90%) 2025-10-10T02:30:22.0620748Z adding: test/test-reports/python-pytest/inductor.test_compile_worker/inductor.test_compile_worker-4398b3f75a06f20e.xml (deflated 28%) 2025-10-10T02:30:22.0621773Z adding: test/test-reports/python-pytest/inductor.test_compile_worker/inductor.test_compile_worker-8b4cf90e0e83daef.xml (deflated 82%) 2025-10-10T02:30:22.0622876Z adding: test/test-reports/python-pytest/inductor.test_move_constructors_to_cuda/inductor.test_move_constructors_to_cuda-9297cc904ec6af95.xml (deflated 28%) 2025-10-10T02:30:22.0624057Z adding: test/test-reports/python-pytest/inductor.test_move_constructors_to_cuda/inductor.test_move_constructors_to_cuda-7fae4fd31fc26dfa.xml (deflated 85%) 2025-10-10T02:30:22.0625180Z adding: test/test-reports/python-pytest/inductor.test_subgraph_choice/inductor.test_subgraph_choice-a8d5a5e4f9f4690b.xml (deflated 28%) 2025-10-10T02:30:22.0626233Z adding: test/test-reports/python-pytest/inductor.test_subgraph_choice/inductor.test_subgraph_choice-48ed00a6d7e2fc82.xml (deflated 65%) 2025-10-10T02:30:22.0627243Z adding: test/test-reports/python-pytest/export.test_export_strict/export.test_export_strict-4000a01614739d7f.xml (deflated 28%) 2025-10-10T02:30:22.0632720Z adding: test/test-reports/python-pytest/export.test_export_strict/export.test_export_strict-70eb6d8bdd30f722.xml (deflated 95%) 2025-10-10T02:30:22.0633757Z adding: test/test-reports/python-pytest/inductor.test_cutedsl_template/inductor.test_cutedsl_template-9abff8bf21b76891.xml (deflated 28%) 2025-10-10T02:30:22.0634813Z adding: test/test-reports/python-pytest/inductor.test_cutedsl_template/inductor.test_cutedsl_template-d016408e952af5c3.xml (deflated 88%) 2025-10-10T02:30:22.0636064Z adding: test/test-reports/python-pytest/dynamo.test_inline_and_install/dynamo.test_inline_and_install-0bb40c424c64b499.xml 
(deflated 28%) 2025-10-10T02:30:22.0640047Z adding: test/test-reports/python-pytest/dynamo.test_inline_and_install/dynamo.test_inline_and_install-821e4f657e1050e0.xml (deflated 96%) 2025-10-10T02:30:22.0641041Z adding: test/test-reports/python-pytest/export.test_tree_utils/export.test_tree_utils-e9f384f61da60ea5.xml (deflated 28%) 2025-10-10T02:30:22.0641969Z adding: test/test-reports/python-pytest/export.test_tree_utils/export.test_tree_utils-817dc9e87d6f9ac8.xml (deflated 63%) 2025-10-10T02:30:22.0642910Z adding: test/test-reports/python-pytest/dynamo.test_recompiles/dynamo.test_recompiles-e118e3b393add75a.xml (deflated 28%) 2025-10-10T02:30:22.0643861Z adding: test/test-reports/python-pytest/dynamo.test_recompiles/dynamo.test_recompiles-8b4de8ba9f71b5f6.xml (deflated 90%) 2025-10-10T02:30:22.0644773Z adding: test/test-reports/python-pytest/dynamo.test_einops/dynamo.test_einops-0f0b4c1672595e2a.xml (deflated 28%) 2025-10-10T02:30:22.0645649Z adding: test/test-reports/python-pytest/dynamo.test_einops/dynamo.test_einops-86b81f07f05d5ddb.xml (deflated 71%) 2025-10-10T02:30:22.0646577Z adding: test/test-reports/python-pytest/inductor.test_foreach/inductor.test_foreach-58791f3918a8e0bd.xml (deflated 27%) 2025-10-10T02:30:22.0655478Z adding: test/test-reports/python-pytest/inductor.test_foreach/inductor.test_foreach-7542dff925a59516.xml (deflated 98%) 2025-10-10T02:30:22.0656528Z adding: test/test-reports/python-pytest/inductor.test_minifier_utils/inductor.test_minifier_utils-5511a7e9eb1feedd.xml (deflated 28%) 2025-10-10T02:30:22.0657553Z adding: test/test-reports/python-pytest/inductor.test_minifier_utils/inductor.test_minifier_utils-8fe469f340bfc19d.xml (deflated 72%) 2025-10-10T02:30:22.0658487Z adding: test/test-reports/python-pytest/dynamo.test_sdpa/dynamo.test_sdpa-5e9528f5f950bb1f.xml (deflated 28%) 2025-10-10T02:30:22.0659337Z adding: test/test-reports/python-pytest/dynamo.test_sdpa/dynamo.test_sdpa-c01bf031f2bafb90.xml (deflated 82%) 
2025-10-10T02:30:22.0660316Z adding: test/test-reports/python-pytest/inductor.test_compile_subprocess/inductor.test_compile_subprocess-905c2babee641077.xml (deflated 43%) 2025-10-10T02:30:22.0682820Z adding: test/test-reports/python-pytest/inductor.test_compile_subprocess/inductor.test_compile_subprocess-6bee1e385b1df38b.xml (deflated 97%) 2025-10-10T02:30:22.0684886Z adding: test/test-reports/python-pytest/export.test_cpp_serdes/export.test_cpp_serdes-d920bd36b5b9948c.xml (deflated 28%) 2025-10-10T02:30:22.0694084Z adding: test/test-reports/python-pytest/export.test_cpp_serdes/export.test_cpp_serdes-21d3fa5cd73ae78d.xml (deflated 95%) 2025-10-10T02:30:22.0695068Z adding: test/test-reports/python-pytest/inductor.test_debug_trace/inductor.test_debug_trace-b0a6eb6fe38ed155.xml (deflated 28%) 2025-10-10T02:30:22.0696041Z adding: test/test-reports/python-pytest/inductor.test_debug_trace/inductor.test_debug_trace-210075157d415e37.xml (deflated 73%) 2025-10-10T02:30:22.0697239Z adding: test/test-reports/python-pytest/inductor.test_memory/inductor.test_memory-8e550969f1fa5c17.xml (deflated 43%) 2025-10-10T02:30:22.0698148Z adding: test/test-reports/python-pytest/inductor.test_memory/inductor.test_memory-1b092aa0f9c7b233.xml (deflated 85%) 2025-10-10T02:30:22.0699222Z adding: test/test-reports/python-pytest/dynamo.test_frame_init/dynamo.test_frame_init-08db5de57a0ff1cf.xml (deflated 28%) 2025-10-10T02:30:22.0700132Z adding: test/test-reports/python-pytest/dynamo.test_frame_init/dynamo.test_frame_init-616a82280b069283.xml (deflated 45%) 2025-10-10T02:30:22.0701144Z adding: test/test-reports/python-pytest/inductor.test_kernel_optimization/inductor.test_kernel_optimization-95ce07e2fc31aeb4.xml (deflated 45%) 2025-10-10T02:30:22.0702253Z adding: test/test-reports/python-pytest/inductor.test_kernel_optimization/inductor.test_kernel_optimization-1e8f53bd06e4e876.xml (deflated 27%) 2025-10-10T02:30:22.0703541Z adding: 
test/test-reports/python-pytest/inductor.test_combo_kernels/inductor.test_combo_kernels-f33a3c652edcde5b.xml (deflated 28%) 2025-10-10T02:30:22.0704551Z adding: test/test-reports/python-pytest/inductor.test_combo_kernels/inductor.test_combo_kernels-eb6379aed77b6f7e.xml (deflated 92%) 2025-10-10T02:30:22.0705569Z adding: test/test-reports/python-pytest/inductor.test_inplacing_pass/inductor.test_inplacing_pass-f6b16ad949efb334.xml (deflated 28%) 2025-10-10T02:30:22.0706604Z adding: test/test-reports/python-pytest/inductor.test_inplacing_pass/inductor.test_inplacing_pass-5a3caf53cc9e7599.xml (deflated 93%) 2025-10-10T02:30:22.0707614Z adding: test/test-reports/python-pytest/dynamo.test_skip_non_tensor/dynamo.test_skip_non_tensor-96367c64556c2746.xml (deflated 28%) 2025-10-10T02:30:22.0708590Z adding: test/test-reports/python-pytest/dynamo.test_skip_non_tensor/dynamo.test_skip_non_tensor-84014846de79f779.xml (deflated 87%) 2025-10-10T02:30:22.0709571Z adding: test/test-reports/python-pytest/inductor.test_op_dtype_prop/inductor.test_op_dtype_prop-8addf1d4e05562a9.xml (deflated 29%) 2025-10-10T02:30:22.0716713Z adding: test/test-reports/python-pytest/inductor.test_op_dtype_prop/inductor.test_op_dtype_prop-594e0ddd1148eff5.xml (deflated 99%) 2025-10-10T02:30:22.0717702Z adding: test/test-reports/python-pytest/dynamo.test_reconstruct/dynamo.test_reconstruct-365bfba3bb4fb0ad.xml (deflated 28%) 2025-10-10T02:30:22.0718748Z adding: test/test-reports/python-pytest/dynamo.test_reconstruct/dynamo.test_reconstruct-3d8ddf356c2a28fd.xml (deflated 91%) 2025-10-10T02:30:22.0719726Z adding: test/test-reports/python-pytest/export.test_dynamic_shapes/export.test_dynamic_shapes-3d4d88ed62b200da.xml (deflated 28%) 2025-10-10T02:30:22.0720721Z adding: test/test-reports/python-pytest/export.test_dynamic_shapes/export.test_dynamic_shapes-643975afe30f76a8.xml (deflated 64%) 2025-10-10T02:30:22.0721703Z adding: 
test/test-reports/python-pytest/inductor.test_remote_cache/inductor.test_remote_cache-dd67cfe4b94ddffe.xml (deflated 28%) 2025-10-10T02:30:22.0722700Z adding: test/test-reports/python-pytest/inductor.test_remote_cache/inductor.test_remote_cache-a103b38e9367d937.xml (deflated 73%) 2025-10-10T02:30:22.0723645Z adding: test/test-reports/python-pytest/dynamo.test_interop/dynamo.test_interop-00883093f1b6320a.xml (deflated 28%) 2025-10-10T02:30:22.0724524Z adding: test/test-reports/python-pytest/dynamo.test_interop/dynamo.test_interop-339685b8847501b5.xml (deflated 81%) 2025-10-10T02:30:22.0725457Z adding: test/test-reports/python-pytest/inductor.test_device_assert/inductor.test_device_assert-c3719f698acc680b.xml (deflated 28%) 2025-10-10T02:30:22.0726467Z adding: test/test-reports/python-pytest/inductor.test_device_assert/inductor.test_device_assert-2548b06723a0e377.xml (deflated 88%) 2025-10-10T02:30:22.0727617Z adding: test/test-reports/python-pytest/dynamo.test_skip_guard_eval_unsafe/dynamo.test_skip_guard_eval_unsafe-dc85d5e29ab99cbc.xml (deflated 28%) 2025-10-10T02:30:22.0728697Z adding: test/test-reports/python-pytest/dynamo.test_skip_guard_eval_unsafe/dynamo.test_skip_guard_eval_unsafe-a19ac4053c49a6c1.xml (deflated 81%) 2025-10-10T02:30:22.0729654Z adding: test/test-reports/python-pytest/export.test_tools/export.test_tools-ac73f527143153ca.xml (deflated 28%) 2025-10-10T02:30:22.0730498Z adding: test/test-reports/python-pytest/export.test_tools/export.test_tools-6407b397b7b7b0e6.xml (deflated 64%) 2025-10-10T02:30:22.0731436Z adding: test/test-reports/python-pytest/inductor.test_gpu_cpp_wrapper/inductor.test_gpu_cpp_wrapper-11c2b9a80b5b103e.xml (deflated 28%) 2025-10-10T02:30:22.0732460Z adding: test/test-reports/python-pytest/inductor.test_gpu_cpp_wrapper/inductor.test_gpu_cpp_wrapper-1b51ec119e8be1bc.xml (deflated 97%) 2025-10-10T02:30:22.0733557Z adding: 
test/test-reports/python-pytest/export.test_export_with_inline_and_install/export.test_export_with_inline_and_install-a64b2222c84ec746.xml (deflated 28%) 2025-10-10T02:30:22.0746318Z adding: test/test-reports/python-pytest/export.test_export_with_inline_and_install/export.test_export_with_inline_and_install-ecedbc438bbd10e3.xml (deflated 96%) 2025-10-10T02:30:22.0747525Z adding: test/test-reports/python-pytest/export.test_serialize/export.test_serialize-af96d27fabb559dc.xml (deflated 28%) 2025-10-10T02:30:22.0749774Z adding: test/test-reports/python-pytest/export.test_serialize/export.test_serialize-d9b2f074d6147ab8.xml (deflated 95%) 2025-10-10T02:30:22.0750714Z adding: test/test-reports/python-pytest/dynamo.test_functions/dynamo.test_functions-9a7e8828c03d8d32.xml (deflated 28%) 2025-10-10T02:30:22.0761879Z adding: test/test-reports/python-pytest/dynamo.test_functions/dynamo.test_functions-ae856a9efa5f45dc.xml (deflated 97%) 2025-10-10T02:30:22.0763061Z adding: test/test-reports/python-pytest/inductor.test_benchmarking/inductor.test_benchmarking-3c577efac6ea0e47.xml (deflated 28%) 2025-10-10T02:30:22.0764081Z adding: test/test-reports/python-pytest/inductor.test_benchmarking/inductor.test_benchmarking-fc85ced706f96646.xml (deflated 91%) 2025-10-10T02:30:22.0765115Z adding: test/test-reports/python-pytest/inductor.test_quantization/inductor.test_quantization-8750149c4fb943ba.xml (deflated 28%) 2025-10-10T02:30:22.0766143Z adding: test/test-reports/python-pytest/inductor.test_quantization/inductor.test_quantization-a9cd1ed930d5fcd4.xml (deflated 66%) 2025-10-10T02:30:22.0767344Z adding: test/test-reports/python-pytest/inductor.test_aot_inductor_custom_ops/inductor.test_aot_inductor_custom_ops-ba0cf5ecf5897a7c.xml (deflated 28%) 2025-10-10T02:30:22.0768563Z adding: test/test-reports/python-pytest/inductor.test_aot_inductor_custom_ops/inductor.test_aot_inductor_custom_ops-940e2563cda3def2.xml (deflated 95%) 2025-10-10T02:30:22.0769702Z adding: 
test/test-reports/python-pytest/inductor.test_scatter_optimization/inductor.test_scatter_optimization-f28c0a15e90bc13d.xml (deflated 28%) 2025-10-10T02:30:22.0770839Z adding: test/test-reports/python-pytest/inductor.test_scatter_optimization/inductor.test_scatter_optimization-4a73ecf7daf08b10.xml (deflated 87%) 2025-10-10T02:30:22.0771952Z adding: test/test-reports/python-pytest/inductor.test_group_batch_fusion/inductor.test_group_batch_fusion-41fa4f1567a401a2.xml (deflated 28%) 2025-10-10T02:30:22.0773026Z adding: test/test-reports/python-pytest/inductor.test_group_batch_fusion/inductor.test_group_batch_fusion-ee3908fed0e70210.xml (deflated 89%) 2025-10-10T02:30:22.0774106Z adding: test/test-reports/python-pytest/inductor.test_split_cat_fx_passes/inductor.test_split_cat_fx_passes-c0e10ff2e296cf0d.xml (deflated 28%) 2025-10-10T02:30:22.0775193Z adding: test/test-reports/python-pytest/inductor.test_split_cat_fx_passes/inductor.test_split_cat_fx_passes-8e948439bbc16ab0.xml (deflated 89%) 2025-10-10T02:30:22.0776155Z adding: test/test-reports/python-pytest/dynamo.test_view/dynamo.test_view-a4c749a5df799e90.xml (deflated 28%) 2025-10-10T02:30:22.0777037Z adding: test/test-reports/python-pytest/dynamo.test_view/dynamo.test_view-784137add2979573.xml (deflated 84%) 2025-10-10T02:30:22.0778006Z adding: test/test-reports/python-pytest/dynamo.test_fx_annotate/dynamo.test_fx_annotate-9f6f3b595bc67128.xml (deflated 28%) 2025-10-10T02:30:22.0778951Z adding: test/test-reports/python-pytest/dynamo.test_fx_annotate/dynamo.test_fx_annotate-03773cea65f7ebc8.xml (deflated 78%) 2025-10-10T02:30:22.0779930Z adding: test/test-reports/python-pytest/inductor.test_control_deps/inductor.test_control_deps-101e1375d9700f8b.xml (deflated 28%) 2025-10-10T02:30:22.0780926Z adding: test/test-reports/python-pytest/inductor.test_control_deps/inductor.test_control_deps-1ba0247bb32e31a1.xml (deflated 46%) 2025-10-10T02:30:22.0781904Z adding: 
test/test-reports/python-pytest/dynamo.test_pre_dispatch/dynamo.test_pre_dispatch-cb033647123ac84f.xml (deflated 28%) 2025-10-10T02:30:22.0782860Z adding: test/test-reports/python-pytest/dynamo.test_pre_dispatch/dynamo.test_pre_dispatch-dffe9b43b1015674.xml (deflated 73%) 2025-10-10T02:30:22.0783892Z adding: test/test-reports/python-pytest/dynamo.test_subgraphs/dynamo.test_subgraphs-6f2730de035cdab2.xml (deflated 29%) 2025-10-10T02:30:22.0784906Z adding: test/test-reports/python-pytest/dynamo.test_subgraphs/dynamo.test_subgraphs-e0845c003996c17a.xml (deflated 95%) 2025-10-10T02:30:22.0785940Z adding: test/test-reports/python-pytest/inductor.test_mkldnn_pattern_matcher/inductor.test_mkldnn_pattern_matcher-55452247426eb2ab.xml (deflated 28%) 2025-10-10T02:30:22.0787084Z adding: test/test-reports/python-pytest/inductor.test_mkldnn_pattern_matcher/inductor.test_mkldnn_pattern_matcher-a9d6007d1721d25b.xml (deflated 98%) 2025-10-10T02:30:22.0788122Z adding: test/test-reports/python-pytest/dynamo.test_decorators/dynamo.test_decorators-07cf3a6ef6c98735.xml (deflated 28%) 2025-10-10T02:30:22.0789053Z adding: test/test-reports/python-pytest/dynamo.test_decorators/dynamo.test_decorators-ac720cab64dd7bc9.xml (deflated 94%) 2025-10-10T02:30:22.0789940Z adding: test/test-reports/python-pytest/dynamo.test_pgo/dynamo.test_pgo-0ae097a031992435.xml (deflated 28%) 2025-10-10T02:30:22.0790774Z adding: test/test-reports/python-pytest/dynamo.test_pgo/dynamo.test_pgo-8c70a58686edbcfd.xml (deflated 88%) 2025-10-10T02:30:22.0791669Z adding: test/test-reports/python-pytest/inductor.test_cutlass_evt/inductor.test_cutlass_evt-347ed6f91f2cb308.xml (deflated 28%) 2025-10-10T02:30:22.0792641Z adding: test/test-reports/python-pytest/inductor.test_cutlass_evt/inductor.test_cutlass_evt-6f4948408ba80ceb.xml (deflated 83%) 2025-10-10T02:30:22.0793683Z adding: test/test-reports/python-pytest/dynamo.test_buffers_override/dynamo.test_buffers_override-f315b1ae89c5f781.xml (deflated 28%) 
2025-10-10T02:30:22.0794704Z adding: test/test-reports/python-pytest/dynamo.test_buffers_override/dynamo.test_buffers_override-dbbc828878f1e640.xml (deflated 66%) 2025-10-10T02:30:22.0795721Z adding: test/test-reports/python-pytest/inductor.test_online_softmax/inductor.test_online_softmax-f53e372f04dec7f3.xml (deflated 28%) 2025-10-10T02:30:22.0796799Z adding: test/test-reports/python-pytest/inductor.test_online_softmax/inductor.test_online_softmax-baa627338584542c.xml (deflated 95%) 2025-10-10T02:30:22.0797825Z adding: test/test-reports/python-pytest/test_model_exports_to_core_aten/test_model_exports_to_core_aten-f24b142555c6c59d.xml (deflated 28%) 2025-10-10T02:30:22.0799037Z adding: test/test-reports/python-pytest/test_model_exports_to_core_aten/test_model_exports_to_core_aten-8fdb45b3e469824a.xml (deflated 58%) 2025-10-10T02:30:22.0800060Z adding: test/test-reports/python-pytest/inductor.test_helion_kernels/inductor.test_helion_kernels-70949e4de8d6a11b.xml (deflated 29%) 2025-10-10T02:30:22.0801078Z adding: test/test-reports/python-pytest/inductor.test_helion_kernels/inductor.test_helion_kernels-e9ab7fb66ed34149.xml (deflated 62%) 2025-10-10T02:30:22.0802126Z adding: test/test-reports/python-pytest/inductor.test_aot_inductor_utils/inductor.test_aot_inductor_utils-f5ee2f3d1c5ba49e.xml (deflated 29%) 2025-10-10T02:30:22.0803200Z adding: test/test-reports/python-pytest/inductor.test_aot_inductor_utils/inductor.test_aot_inductor_utils-e57da266abfce082.xml (deflated 28%) 2025-10-10T02:30:22.0804256Z adding: test/test-reports/python-pytest/export.test_package/export.test_package-9be8817d90f03336.xml (deflated 28%) 2025-10-10T02:30:22.0805172Z adding: test/test-reports/python-pytest/export.test_package/export.test_package-494db78087a31cfc.xml (deflated 78%) 2025-10-10T02:30:22.0806082Z adding: test/test-reports/python-pytest/dynamo.test_ctx_manager/dynamo.test_ctx_manager-697235e2cf7a1e54.xml (deflated 28%) 2025-10-10T02:30:22.0807165Z adding: 
test/test-reports/python-pytest/dynamo.test_ctx_manager/dynamo.test_ctx_manager-462fd48c3c6cc519.xml (deflated 95%) 2025-10-10T02:30:22.0808221Z adding: test/test-reports/python-pytest/inductor.test_cudagraph_trees/inductor.test_cudagraph_trees-dd082eaa152c4c4f.xml (deflated 28%) 2025-10-10T02:30:22.0809273Z adding: test/test-reports/python-pytest/inductor.test_cudagraph_trees/inductor.test_cudagraph_trees-69ef9e7eeda5c0f6.xml (deflated 95%) 2025-10-10T02:30:22.0810540Z adding: test/test-reports/python-pytest/inductor.test_block_analysis/inductor.test_block_analysis-f9182f2eaab40171.xml (deflated 28%) 2025-10-10T02:30:22.0811562Z adding: test/test-reports/python-pytest/inductor.test_block_analysis/inductor.test_block_analysis-1110a13f0c74af8e.xml (deflated 89%) 2025-10-10T02:30:22.0812585Z adding: test/test-reports/python-pytest/dynamo.test_autograd_function/dynamo.test_autograd_function-10012c905a4ad6ea.xml (deflated 28%) 2025-10-10T02:30:22.0813614Z adding: test/test-reports/python-pytest/dynamo.test_autograd_function/dynamo.test_autograd_function-5a37e4232d3977af.xml (deflated 93%) 2025-10-10T02:30:22.0814541Z adding: test/test-reports/python-pytest/dynamo.test_nops/dynamo.test_nops-8ff9102e33cedf1d.xml (deflated 29%) 2025-10-10T02:30:22.0815381Z adding: test/test-reports/python-pytest/dynamo.test_nops/dynamo.test_nops-d1e578c0ab7ef937.xml (deflated 78%) 2025-10-10T02:30:22.0816233Z adding: test/test-reports/python-pytest/dynamo.test_config/dynamo.test_config-52003610c75ebd65.xml (deflated 28%) 2025-10-10T02:30:22.0817092Z adding: test/test-reports/python-pytest/dynamo.test_config/dynamo.test_config-c0cc64f1063d68d8.xml (deflated 80%) 2025-10-10T02:30:22.0818019Z adding: test/test-reports/python-pytest/inductor.test_control_flow/inductor.test_control_flow-285ddd578d51726e.xml (deflated 28%) 2025-10-10T02:30:22.0826008Z adding: test/test-reports/python-pytest/inductor.test_control_flow/inductor.test_control_flow-594f702e08b686b6.xml (deflated 99%) 
2025-10-10T02:30:22.0827049Z adding: test/test-reports/python-pytest/export.test_db/export.test_db-a9a0df19af4667ed.xml (deflated 29%) 2025-10-10T02:30:22.0827861Z adding: test/test-reports/python-pytest/export.test_db/export.test_db-bcdffdb042b76b82.xml (deflated 94%) 2025-10-10T02:30:22.0828793Z adding: test/test-reports/python-pytest/inductor.test_unbacked_symints/inductor.test_unbacked_symints-b86170322675bf7e.xml (deflated 28%) 2025-10-10T02:30:22.0829862Z adding: test/test-reports/python-pytest/inductor.test_unbacked_symints/inductor.test_unbacked_symints-17e91df4f9a44c43.xml (deflated 93%) 2025-10-10T02:30:22.0830901Z adding: test/test-reports/python-pytest/inductor.test_fused_attention/inductor.test_fused_attention-9a5ce06cb0402574.xml (deflated 29%) 2025-10-10T02:30:22.0832327Z adding: test/test-reports/python-pytest/inductor.test_fused_attention/inductor.test_fused_attention-4a8f601702f13e54.xml (deflated 97%) 2025-10-10T02:30:22.0833364Z adding: test/test-reports/python-pytest/dynamo.test_export_mutations/dynamo.test_export_mutations-9cc2b6720457ca1a.xml (deflated 28%) 2025-10-10T02:30:22.0834381Z adding: test/test-reports/python-pytest/dynamo.test_export_mutations/dynamo.test_export_mutations-b26ee8073c6b1bc5.xml (deflated 83%) 2025-10-10T02:30:22.0835346Z adding: test/test-reports/python-pytest/inductor.test_config/inductor.test_config-0ac88aaf09e7290c.xml (deflated 29%) 2025-10-10T02:30:22.0836260Z adding: test/test-reports/python-pytest/inductor.test_config/inductor.test_config-7c88d27dfee23390.xml (deflated 90%) 2025-10-10T02:30:22.0837295Z adding: test/test-reports/python-pytest/dynamo.test_guard_serialization/dynamo.test_guard_serialization-66c0194aa6f39896.xml (deflated 28%) 2025-10-10T02:30:22.0838504Z adding: test/test-reports/python-pytest/dynamo.test_guard_serialization/dynamo.test_guard_serialization-b6d432e2cf372a00.xml (deflated 94%) 2025-10-10T02:30:22.0839633Z adding: 
test/test-reports/python-pytest/inductor.test_graph_transform_observer/inductor.test_graph_transform_observer-4322b7373f0ec9ca.xml (deflated 28%) 2025-10-10T02:30:22.0840812Z adding: test/test-reports/python-pytest/inductor.test_graph_transform_observer/inductor.test_graph_transform_observer-73c93a474105ff98.xml (deflated 46%) 2025-10-10T02:30:22.0841848Z adding: test/test-reports/python-pytest/dynamo.test_unittest/dynamo.test_unittest-a04b33a423684477.xml (deflated 27%) 2025-10-10T02:30:22.0842814Z adding: test/test-reports/python-pytest/dynamo.test_unittest/dynamo.test_unittest-6795a81e8a90e36c.xml (deflated 38%) 2025-10-10T02:30:22.0843795Z adding: test/test-reports/python-pytest/inductor.test_cache/inductor.test_cache-d18b6670d6fe3714.xml (deflated 28%) 2025-10-10T02:30:22.0855580Z adding: test/test-reports/python-pytest/inductor.test_cache/inductor.test_cache-c14d159307d706bf.xml (deflated 99%) 2025-10-10T02:30:22.0856479Z adding: test/test-reports/python-pytest/dynamo.test_after_aot/dynamo.test_after_aot-1e380c612f378f61.xml (deflated 28%) 2025-10-10T02:30:22.0857377Z adding: test/test-reports/python-pytest/dynamo.test_after_aot/dynamo.test_after_aot-085dbe6bb3752e2d.xml (deflated 63%) 2025-10-10T02:30:22.0858281Z adding: test/test-reports/python-pytest/inductor.test_compile/inductor.test_compile-8b297b2aaaeb2892.xml (deflated 28%) 2025-10-10T02:30:22.0859204Z adding: test/test-reports/python-pytest/inductor.test_compile/inductor.test_compile-7b8b1c83f3bf8aa7.xml (deflated 88%) 2025-10-10T02:30:22.0860146Z adding: test/test-reports/python-pytest/export.test_export_opinfo/export.test_export_opinfo-8f0899bfeed6c31a.xml (deflated 28%) 2025-10-10T02:30:22.0861124Z adding: test/test-reports/python-pytest/export.test_export_opinfo/export.test_export_opinfo-5eb8442a1a30e1d1.xml (deflated 88%) 2025-10-10T02:30:22.0862121Z adding: test/test-reports/python-pytest/inductor.test_custom_lowering/inductor.test_custom_lowering-e5632fb097033c51.xml (deflated 28%) 
2025-10-10T02:30:22.0863215Z adding: test/test-reports/python-pytest/inductor.test_custom_lowering/inductor.test_custom_lowering-de5900028a590a91.xml (deflated 81%) 2025-10-10T02:30:22.0864385Z adding: test/test-reports/python-pytest/dynamo.test_graph_region_tracker/dynamo.test_graph_region_tracker-a20a83f7b94904d2.xml (deflated 28%) 2025-10-10T02:30:22.0865455Z adding: test/test-reports/python-pytest/dynamo.test_graph_region_tracker/dynamo.test_graph_region_tracker-0fedeabd2b888f03.xml (deflated 90%) 2025-10-10T02:30:22.0866415Z adding: test/test-reports/python-pytest/dynamo.test_dicts/dynamo.test_dicts-7597b7ea1bdeeb3b.xml (deflated 28%) 2025-10-10T02:30:22.0867321Z adding: test/test-reports/python-pytest/dynamo.test_dicts/dynamo.test_dicts-4c31adca2269678e.xml (deflated 96%) 2025-10-10T02:30:22.0868287Z adding: test/test-reports/python-pytest/inductor.test_fuzzer/inductor.test_fuzzer-a902ccd5d51f31d2.xml (deflated 28%) 2025-10-10T02:30:22.0869198Z adding: test/test-reports/python-pytest/inductor.test_fuzzer/inductor.test_fuzzer-10fb3f37762ab2ab.xml (deflated 89%) 2025-10-10T02:30:22.0870102Z adding: test/test-reports/python-pytest/dynamo.test_modules/dynamo.test_modules-a6b7a19f1eac0e30.xml (deflated 28%) 2025-10-10T02:30:22.0870987Z adding: test/test-reports/python-pytest/dynamo.test_modules/dynamo.test_modules-53336f76d6608a43.xml (deflated 96%) 2025-10-10T02:30:22.0871912Z adding: test/test-reports/python-pytest/dynamo.test_metrics_context/dynamo.test_metrics_context-d0f897ceb7c784e0.xml (deflated 28%) 2025-10-10T02:30:22.0872928Z adding: test/test-reports/python-pytest/dynamo.test_metrics_context/dynamo.test_metrics_context-ac5e34ab569d9dde.xml (deflated 88%) 2025-10-10T02:30:22.0873969Z adding: test/test-reports/python-pytest/dynamo.test_install_free_tensors/dynamo.test_install_free_tensors-5ce44478338cdbf9.xml (deflated 28%) 2025-10-10T02:30:22.0875038Z adding: 
test/test-reports/python-pytest/dynamo.test_install_free_tensors/dynamo.test_install_free_tensors-70fe23d084f02fa6.xml (deflated 93%) 2025-10-10T02:30:22.0876085Z adding: test/test-reports/python-pytest/inductor.test_memory_planning/inductor.test_memory_planning-c28b645427819622.xml (deflated 28%) 2025-10-10T02:30:22.0877124Z adding: test/test-reports/python-pytest/inductor.test_memory_planning/inductor.test_memory_planning-34235c70faec1228.xml (deflated 79%) 2025-10-10T02:30:22.0878116Z adding: test/test-reports/python-pytest/inductor.test_ordered_set/inductor.test_ordered_set-3594e22ae8ef04bb.xml (deflated 28%) 2025-10-10T02:30:22.0879097Z adding: test/test-reports/python-pytest/inductor.test_ordered_set/inductor.test_ordered_set-f0a3a58df1e50dcc.xml (deflated 95%) 2025-10-10T02:30:22.0880307Z adding: test/test-reports/python-pytest/inductor.test_split_cat_fx_aten_passes/inductor.test_split_cat_fx_aten_passes-528869b85c420a55.xml (deflated 29%) 2025-10-10T02:30:22.0881447Z adding: test/test-reports/python-pytest/inductor.test_split_cat_fx_aten_passes/inductor.test_split_cat_fx_aten_passes-5c4849d5b5d64e05.xml (deflated 82%) 2025-10-10T02:30:22.0882601Z adding: test/test-reports/python-pytest/dynamo.test_activation_checkpointing/dynamo.test_activation_checkpointing-f04ea38813311ed6.xml (deflated 29%) 2025-10-10T02:30:22.0883767Z adding: test/test-reports/python-pytest/dynamo.test_activation_checkpointing/dynamo.test_activation_checkpointing-f0a45c3a7f4d96da.xml (deflated 92%) 2025-10-10T02:30:22.0884864Z adding: test/test-reports/python-pytest/dynamo.test_compiler_bisector/dynamo.test_compiler_bisector-0da12b95215f170b.xml (deflated 28%) 2025-10-10T02:30:22.0885903Z adding: test/test-reports/python-pytest/dynamo.test_compiler_bisector/dynamo.test_compiler_bisector-d70a06d55aa51ba8.xml (deflated 85%) 2025-10-10T02:30:22.0886898Z adding: test/test-reports/python-pytest/dynamo.test_aot_compile/dynamo.test_aot_compile-bde22014720bc017.xml (deflated 28%) 
2025-10-10T02:30:22.0887900Z adding: test/test-reports/python-pytest/dynamo.test_aot_compile/dynamo.test_aot_compile-858a5afa5e3bc920.xml (deflated 89%) 2025-10-10T02:30:22.0888873Z adding: test/test-reports/python-pytest/dynamo.test_modes/dynamo.test_modes-1e70902832469070.xml (deflated 28%) 2025-10-10T02:30:22.0889720Z adding: test/test-reports/python-pytest/dynamo.test_modes/dynamo.test_modes-5bfa0fca545d321c.xml (deflated 77%) 2025-10-10T02:30:22.0890694Z adding: test/test-reports/python-pytest/inductor.test_auto_functionalize/inductor.test_auto_functionalize-d16f08c3cbacb2f8.xml (deflated 27%) 2025-10-10T02:30:22.0891804Z adding: test/test-reports/python-pytest/inductor.test_auto_functionalize/inductor.test_auto_functionalize-b79d8f6d7746e610.xml (deflated 94%) 2025-10-10T02:30:22.0893062Z adding: test/test-reports/python-pytest/inductor.test_torchinductor_codegen_config_overrides/inductor.test_torchinductor_codegen_config_overrides-2869ad4234a984c3.xml (deflated 28%) 2025-10-10T02:30:22.0894460Z adding: test/test-reports/python-pytest/inductor.test_torchinductor_codegen_config_overrides/inductor.test_torchinductor_codegen_config_overrides-28c23ee32517f1c1.xml (deflated 76%) 2025-10-10T02:30:22.0895615Z adding: test/test-reports/python-pytest/dynamo.test_profiler/dynamo.test_profiler-4622e1ae6a97bd76.xml (deflated 28%) 2025-10-10T02:30:22.0896521Z adding: test/test-reports/python-pytest/dynamo.test_profiler/dynamo.test_profiler-680247c7aa8a0f57.xml (deflated 88%) 2025-10-10T02:30:22.0897498Z adding: test/test-reports/python-pytest/dynamo.test_global/dynamo.test_global-b9dff76eba44a6c9.xml (deflated 28%) 2025-10-10T02:30:22.0898543Z adding: test/test-reports/python-pytest/dynamo.test_global/dynamo.test_global-76b90f8dd4163dca.xml (deflated 90%) 2025-10-10T02:30:22.0899526Z adding: test/test-reports/python-pytest/inductor.test_inductor_freezing/inductor.test_inductor_freezing-399f1933306e0659.xml (deflated 28%) 2025-10-10T02:30:22.0900613Z adding: 
test/test-reports/python-pytest/inductor.test_inductor_freezing/inductor.test_inductor_freezing-322e64cb633bb2dd.xml (deflated 95%) 2025-10-10T02:30:22.0901627Z adding: test/test-reports/python-pytest/dynamo.test_model_output/dynamo.test_model_output-f4c36dc5c8ca1620.xml (deflated 28%) 2025-10-10T02:30:22.0902589Z adding: test/test-reports/python-pytest/dynamo.test_model_output/dynamo.test_model_output-47ae6059e6fa0101.xml (deflated 90%) 2025-10-10T02:30:22.0903512Z adding: test/test-reports/python-pytest/export.test_torchbind/export.test_torchbind-f8ca41696164bac2.xml (deflated 28%) 2025-10-10T02:30:22.0904426Z adding: test/test-reports/python-pytest/export.test_torchbind/export.test_torchbind-5af9ad46255f7546.xml (deflated 90%) 2025-10-10T02:30:22.0905497Z adding: test/test-reports/python-pytest/dynamo.test_nested_graph_breaks/dynamo.test_nested_graph_breaks-285c028d1b38c3b9.xml (deflated 28%) 2025-10-10T02:30:22.0906692Z adding: test/test-reports/python-pytest/dynamo.test_nested_graph_breaks/dynamo.test_nested_graph_breaks-90a1d67ef3ece01a.xml (deflated 91%) 2025-10-10T02:30:22.0907824Z adding: test/test-reports/python-pytest/dynamo.test_backward_higher_order_ops/dynamo.test_backward_higher_order_ops-03e0394af2396ae1.xml (deflated 28%) 2025-10-10T02:30:22.0908966Z adding: test/test-reports/python-pytest/dynamo.test_backward_higher_order_ops/dynamo.test_backward_higher_order_ops-a3e8e6685f8a5716.xml (deflated 86%) 2025-10-10T02:30:22.0909974Z adding: test/test-reports/python-pytest/export.test_passes/export.test_passes-c97d010c8ef7c66f.xml (deflated 28%) 2025-10-10T02:30:22.0910853Z adding: test/test-reports/python-pytest/export.test_passes/export.test_passes-e177007f22a45ab1.xml (deflated 92%) 2025-10-10T02:30:22.0911772Z adding: test/test-reports/python-pytest/inductor.test_torchbind/inductor.test_torchbind-a4a38981624bd82d.xml (deflated 28%) 2025-10-10T02:30:22.0912756Z adding: 
test/test-reports/python-pytest/inductor.test_torchbind/inductor.test_torchbind-3263ae5a3d90bf7a.xml (deflated 91%) 2025-10-10T02:30:22.0913807Z adding: test/test-reports/python-pytest/inductor.test_custom_partitioner_fn/inductor.test_custom_partitioner_fn-d53b70871a5d560c.xml (deflated 29%) 2025-10-10T02:30:22.0914998Z adding: test/test-reports/python-pytest/inductor.test_custom_partitioner_fn/inductor.test_custom_partitioner_fn-723ebd6cdbbe9e5c.xml (deflated 48%) 2025-10-10T02:30:22.0916043Z adding: test/test-reports/python-pytest/inductor.test_alignment/inductor.test_alignment-001e91ae49ecf215.xml (deflated 29%) 2025-10-10T02:30:22.0917005Z adding: test/test-reports/python-pytest/inductor.test_alignment/inductor.test_alignment-91e29dd0be61a4cd.xml (deflated 89%) 2025-10-10T02:30:22.0917928Z adding: test/test-reports/python-pytest/dynamo.test_sources/dynamo.test_sources-d778488a90079047.xml (deflated 28%) 2025-10-10T02:30:22.0918818Z adding: test/test-reports/python-pytest/dynamo.test_sources/dynamo.test_sources-4a3106905cda7ea4.xml (deflated 72%) 2025-10-10T02:30:22.0919693Z adding: test/test-reports/python-pytest/dynamo.test_resume/dynamo.test_resume-e64545eddee64622.xml (deflated 28%) 2025-10-10T02:30:22.0920550Z adding: test/test-reports/python-pytest/dynamo.test_resume/dynamo.test_resume-7481f53e5915470f.xml (deflated 44%) 2025-10-10T02:30:22.0921451Z adding: test/test-reports/python-pytest/dynamo.test_debug_utils/dynamo.test_debug_utils-5e0fc7443f25f4e1.xml (deflated 28%) 2025-10-10T02:30:22.0922379Z adding: test/test-reports/python-pytest/dynamo.test_debug_utils/dynamo.test_debug_utils-b80123a05eeb0140.xml (deflated 77%) 2025-10-10T02:30:22.0923261Z adding: test/test-reports/python-pytest/export.test_swap/export.test_swap-98b101b77f4f84af.xml (deflated 29%) 2025-10-10T02:30:22.0924105Z adding: test/test-reports/python-pytest/export.test_swap/export.test_swap-bd6baf51deb4b189.xml (deflated 93%) 2025-10-10T02:30:22.0925058Z adding: 
test/test-reports/python-pytest/dynamo.test_aot_autograd_cache/dynamo.test_aot_autograd_cache-fccbf587472f9675.xml (deflated 28%) 2025-10-10T02:30:22.0926089Z adding: test/test-reports/python-pytest/dynamo.test_aot_autograd_cache/dynamo.test_aot_autograd_cache-52a2bdc15002054b.xml (deflated 96%) 2025-10-10T02:30:22.0927229Z adding: test/test-reports/python-pytest/inductor.test_binary_folding/inductor.test_binary_folding-e3562ce53a146560.xml (deflated 28%) 2025-10-10T02:30:22.0928256Z adding: test/test-reports/python-pytest/inductor.test_binary_folding/inductor.test_binary_folding-25fbd66b0710caeb.xml (deflated 84%) 2025-10-10T02:30:22.0929210Z adding: test/test-reports/python-pytest/dynamo.test_base_hop/dynamo.test_base_hop-383a555ac7de57ea.xml (deflated 28%) 2025-10-10T02:30:22.0930109Z adding: test/test-reports/python-pytest/dynamo.test_base_hop/dynamo.test_base_hop-585cf5c41de49ca5.xml (deflated 89%) 2025-10-10T02:30:22.0931014Z adding: test/test-reports/python-pytest/dynamo.test_list/dynamo.test_list-16a3ae7810e9962a.xml (deflated 29%) 2025-10-10T02:30:22.0931936Z adding: test/test-reports/python-pytest/dynamo.test_list/dynamo.test_list-f7cdc2b4517897db.xml (deflated 95%) 2025-10-10T02:30:22.0932820Z adding: test/test-reports/python-pytest/export.test_unflatten/export.test_unflatten-098f2bfdde6f4a3e.xml (deflated 28%) 2025-10-10T02:30:22.0933745Z adding: test/test-reports/python-pytest/export.test_unflatten/export.test_unflatten-1508c14952a53cea.xml (deflated 92%) 2025-10-10T02:30:22.0934744Z adding: test/test-reports/python-pytest/inductor.test_needs_exact_strides/inductor.test_needs_exact_strides-404940086b58300b.xml (deflated 29%) 2025-10-10T02:30:22.0935830Z adding: test/test-reports/python-pytest/inductor.test_needs_exact_strides/inductor.test_needs_exact_strides-330a11d0751b7e6c.xml (deflated 66%) 2025-10-10T02:30:22.0936947Z adding: test/test-reports/python-pytest/dynamo.test_verify_correctness/dynamo.test_verify_correctness-f3126d75af01d800.xml 
(deflated 28%) 2025-10-10T02:30:22.0938014Z adding: test/test-reports/python-pytest/dynamo.test_verify_correctness/dynamo.test_verify_correctness-ce72172b3c622ccb.xml (deflated 79%) 2025-10-10T02:30:22.0938984Z adding: test/test-reports/python-pytest/export.test_export/export.test_export-4a49e3a6f14f8520.xml (deflated 28%) 2025-10-10T02:30:22.0939866Z adding: test/test-reports/python-pytest/export.test_export/export.test_export-f505a525e1e5644e.xml (deflated 95%) 2025-10-10T02:30:22.0940881Z adding: test/test-reports/python-pytest/inductor.test_minifier_isolate/inductor.test_minifier_isolate-85ff73e018fc22b3.xml (deflated 28%) 2025-10-10T02:30:22.0941932Z adding: test/test-reports/python-pytest/inductor.test_minifier_isolate/inductor.test_minifier_isolate-529ab44148197671.xml (deflated 93%) 2025-10-10T02:30:22.0942896Z adding: test/test-reports/python-pytest/dynamo.test_logging/dynamo.test_logging-2567741d9f7a93b2.xml (deflated 28%) 2025-10-10T02:30:22.0943780Z adding: test/test-reports/python-pytest/dynamo.test_logging/dynamo.test_logging-ae543bce28ee51de.xml (deflated 95%) 2025-10-10T02:30:22.0944705Z adding: test/test-reports/python-pytest/dynamo.test_deviceguard/dynamo.test_deviceguard-cdefbca63757898b.xml (deflated 28%) 2025-10-10T02:30:22.0945693Z adding: test/test-reports/python-pytest/dynamo.test_deviceguard/dynamo.test_deviceguard-f463a8252cc29786.xml (deflated 77%) 2025-10-10T02:30:22.0946660Z adding: test/test-reports/python-pytest/dynamo.test_aot_autograd/dynamo.test_aot_autograd-f5a10d758aa7c81a.xml (deflated 28%) 2025-10-10T02:30:22.0947606Z adding: test/test-reports/python-pytest/dynamo.test_aot_autograd/dynamo.test_aot_autograd-75394d9bf0bd84e7.xml (deflated 94%) 2025-10-10T02:30:22.0948649Z adding: test/test-reports/python-pytest/inductor.test_augmented_graph_helper/inductor.test_augmented_graph_helper-c7018a41ee6c6c68.xml (deflated 27%) 2025-10-10T02:30:22.0949795Z adding: 
test/test-reports/python-pytest/inductor.test_augmented_graph_helper/inductor.test_augmented_graph_helper-9910078e258ffce2.xml (deflated 85%) 2025-10-10T02:30:22.0950843Z adding: test/test-reports/python-pytest/dynamo.test_cudagraphs/dynamo.test_cudagraphs-cc781b321638982e.xml (deflated 29%) 2025-10-10T02:30:22.0951787Z adding: test/test-reports/python-pytest/dynamo.test_cudagraphs/dynamo.test_cudagraphs-48bfaf7c49a41ee5.xml (deflated 87%) 2025-10-10T02:30:22.0952724Z adding: test/test-reports/python-pytest/inductor.test_caching/inductor.test_caching-001b03d35efbb00a.xml (deflated 28%) 2025-10-10T02:30:22.0953652Z adding: test/test-reports/python-pytest/inductor.test_caching/inductor.test_caching-8177ca95d58d9519.xml (deflated 97%) 2025-10-10T02:30:22.0954578Z adding: test/test-reports/python-pytest/export.test_upgrader/export.test_upgrader-eff0fc52bd7d621e.xml (deflated 28%) 2025-10-10T02:30:22.0955497Z adding: test/test-reports/python-pytest/export.test_upgrader/export.test_upgrader-8dbc14b9d8163a35.xml (deflated 70%) 2025-10-10T02:30:22.0956419Z adding: test/test-reports/python-pytest/dynamo.test_sets/dynamo.test_sets-3446f3b33d90df8f.xml (deflated 28%) 2025-10-10T02:30:22.0957345Z adding: test/test-reports/python-pytest/dynamo.test_sets/dynamo.test_sets-7acc19564e4d8e8c.xml (deflated 97%) 2025-10-10T02:30:22.0958204Z adding: test/test-reports/python-pytest/dynamo.test_unspec/dynamo.test_unspec-992ed153d8cd50e3.xml (deflated 28%) 2025-10-10T02:30:22.0959083Z adding: test/test-reports/python-pytest/dynamo.test_unspec/dynamo.test_unspec-e936aeff2feb774c.xml (deflated 93%) 2025-10-10T02:30:22.0960050Z adding: test/test-reports/python-pytest/dynamo.test_python_dispatcher/dynamo.test_python_dispatcher-537930a9f50bfda7.xml (deflated 28%) 2025-10-10T02:30:22.0961095Z adding: test/test-reports/python-pytest/dynamo.test_python_dispatcher/dynamo.test_python_dispatcher-794fb951b8c9c761.xml (deflated 85%) 2025-10-10T02:30:22.0962088Z adding: 
test/test-reports/python-pytest/dynamo.test_optimizers/dynamo.test_optimizers-d282a3eb89fd761c.xml (deflated 28%) 2025-10-10T02:30:22.0963035Z adding: test/test-reports/python-pytest/dynamo.test_optimizers/dynamo.test_optimizers-00f050a737764742.xml (deflated 72%) 2025-10-10T02:30:22.0963959Z adding: test/test-reports/python-pytest/dynamo.test_flat_apply/dynamo.test_flat_apply-a7874794dd2ffad7.xml (deflated 28%) 2025-10-10T02:30:22.0964887Z adding: test/test-reports/python-pytest/dynamo.test_flat_apply/dynamo.test_flat_apply-a8e9fd4a622d6f56.xml (deflated 77%) 2025-10-10T02:30:22.0965902Z adding: test/test-reports/python-pytest/dynamo.test_higher_order_ops/dynamo.test_higher_order_ops-11272092879f19e7.xml (deflated 28%) 2025-10-10T02:30:22.0966901Z adding: test/test-reports/python-pytest/dynamo.test_higher_order_ops/dynamo.test_higher_order_ops-47462cccf45395d2.xml (deflated 96%) 2025-10-10T02:30:22.0967889Z adding: test/test-reports/python-pytest/export.test_nativert/export.test_nativert-5ee89056be0ecb88.xml (deflated 28%) 2025-10-10T02:30:22.0968781Z adding: test/test-reports/python-pytest/export.test_nativert/export.test_nativert-e089172b710930b8.xml (deflated 84%) 2025-10-10T02:30:22.0969700Z adding: test/test-reports/python-pytest/inductor.test_cpu_repro/inductor.test_cpu_repro-e4d1fc97b7c9f69c.xml (deflated 28%) 2025-10-10T02:30:22.0985695Z adding: test/test-reports/python-pytest/inductor.test_cpu_repro/inductor.test_cpu_repro-e8cde978303029e7.xml (deflated 98%) 2025-10-10T02:30:22.0986766Z adding: test/test-reports/python-pytest/dynamo.test_graph_deduplication/dynamo.test_graph_deduplication-8177a0f370db3d1b.xml (deflated 28%) 2025-10-10T02:30:22.0987845Z adding: test/test-reports/python-pytest/dynamo.test_graph_deduplication/dynamo.test_graph_deduplication-d3c4b30c66c66317.xml (deflated 92%) 2025-10-10T02:30:22.0988821Z adding: test/test-reports/python-pytest/dynamo.test_export/dynamo.test_export-4243f15ab27f3e3f.xml (deflated 28%) 
2025-10-10T02:30:22.0992353Z adding: test/test-reports/python-pytest/dynamo.test_export/dynamo.test_export-ccbc29d4d3be62af.xml (deflated 95%) 2025-10-10T02:30:22.0993287Z adding: test/test-reports/python-pytest/dynamo.test_error_messages/dynamo.test_error_messages-ae05157dfc1e1e12.xml (deflated 28%) 2025-10-10T02:30:22.0994266Z adding: test/test-reports/python-pytest/dynamo.test_error_messages/dynamo.test_error_messages-034cb0ba5316192e.xml (deflated 93%) 2025-10-10T02:30:22.0995164Z adding: test/test-reports/python-pytest/export.test_hop/export.test_hop-714bd9e48423675e.xml (deflated 28%) 2025-10-10T02:30:22.0995982Z adding: test/test-reports/python-pytest/export.test_hop/export.test_hop-1f797669ab47d81b.xml (deflated 96%) 2025-10-10T02:30:22.0997009Z adding: test/test-reports/python-pytest/dynamo.test_cudagraphs_expandable_segments/dynamo.test_cudagraphs_expandable_segments-3d6090e899d40087.xml (deflated 28%) 2025-10-10T02:30:22.0998236Z adding: test/test-reports/python-pytest/dynamo.test_cudagraphs_expandable_segments/dynamo.test_cudagraphs_expandable_segments-bc4be7c79c5ea5df.xml (deflated 87%) 2025-10-10T02:30:22.0999535Z adding: test/test-reports/python-pytest/dynamo.test_recompile_ux/dynamo.test_recompile_ux-f39090e20601022d.xml (deflated 28%) 2025-10-10T02:30:22.1000663Z adding: test/test-reports/python-pytest/dynamo.test_recompile_ux/dynamo.test_recompile_ux-474907756ba82428.xml (deflated 87%) 2025-10-10T02:30:22.1001611Z adding: test/test-reports/python-pytest/inductor.test_mmdecomp/inductor.test_mmdecomp-3e726062341de726.xml (deflated 28%) 2025-10-10T02:30:22.1002550Z adding: test/test-reports/python-pytest/inductor.test_mmdecomp/inductor.test_mmdecomp-79e48f4652b11a02.xml (deflated 95%) 2025-10-10T02:30:22.1003536Z adding: test/test-reports/python-pytest/dynamo.test_precompile_context/dynamo.test_precompile_context-912830df5149a2ab.xml (deflated 28%) 2025-10-10T02:30:22.1004583Z adding: 
test/test-reports/python-pytest/dynamo.test_precompile_context/dynamo.test_precompile_context-2ee9e2cdaf92d57d.xml (deflated 74%) 2025-10-10T02:30:22.1005599Z adding: test/test-reports/python-pytest/dynamo.test_bytecode_utils/dynamo.test_bytecode_utils-9b64c87eee98aa6e.xml (deflated 28%) 2025-10-10T02:30:22.1006576Z adding: test/test-reports/python-pytest/dynamo.test_bytecode_utils/dynamo.test_bytecode_utils-85bc36bd0255a56f.xml (deflated 89%) 2025-10-10T02:30:22.1007595Z adding: test/test-reports/python-pytest/export.test_pass_infra/export.test_pass_infra-9eb9bdcff6944887.xml (deflated 28%) 2025-10-10T02:30:22.1008524Z adding: test/test-reports/python-pytest/export.test_pass_infra/export.test_pass_infra-2dd9ecadc3f0bddd.xml (deflated 80%) 2025-10-10T02:30:22.1009535Z adding: test/test-reports/python-pytest/dynamo.test_guard_manager/dynamo.test_guard_manager-bef1c118ce3dadf7.xml (deflated 28%) 2025-10-10T02:30:22.1010496Z adding: test/test-reports/python-pytest/dynamo.test_guard_manager/dynamo.test_guard_manager-6890530f86293462.xml (deflated 93%) 2025-10-10T02:30:22.1011418Z adding: test/test-reports/python-pytest/dynamo.test_minifier/dynamo.test_minifier-524572d510b1607f.xml (deflated 28%) 2025-10-10T02:30:22.1012301Z adding: test/test-reports/python-pytest/dynamo.test_minifier/dynamo.test_minifier-d59747bf698eb9a9.xml (deflated 92%) 2025-10-10T02:30:22.1013211Z adding: test/test-reports/python-pytest/export.test_converter/export.test_converter-3965bfc50439a227.xml (deflated 28%) 2025-10-10T02:30:22.1014123Z adding: test/test-reports/python-pytest/export.test_converter/export.test_converter-832e93340671079a.xml (deflated 90%) 2025-10-10T02:30:22.1015064Z adding: test/test-reports/python-pytest/export.test_experimental/export.test_experimental-042ed765aee09bd5.xml (deflated 28%) 2025-10-10T02:30:22.1016032Z adding: test/test-reports/python-pytest/export.test_experimental/export.test_experimental-102333f387142dbb.xml (deflated 88%) 2025-10-10T02:30:22.1017047Z 
adding: test/test-reports/python-pytest/dynamo.test_input_attr_tracking/dynamo.test_input_attr_tracking-9942da21700978a6.xml (deflated 28%) 2025-10-10T02:30:22.1018096Z adding: test/test-reports/python-pytest/dynamo.test_input_attr_tracking/dynamo.test_input_attr_tracking-cea42017aff16b3a.xml (deflated 90%) 2025-10-10T02:30:22.1019028Z adding: test/test-reports/python-pytest/dynamo.test_exc/dynamo.test_exc-3f393378a5529ca7.xml (deflated 28%) 2025-10-10T02:30:22.1019835Z adding: test/test-reports/python-pytest/dynamo.test_exc/dynamo.test_exc-084325573969512e.xml (deflated 87%) 2025-10-10T02:30:22.1020658Z adding: test/test-reports/python-pytest/dynamo.test_hooks/dynamo.test_hooks-494369e5ce33ed1d.xml (deflated 28%) 2025-10-10T02:30:22.1021517Z adding: test/test-reports/python-pytest/dynamo.test_hooks/dynamo.test_hooks-f6f63aca7680f46d.xml (deflated 93%) 2025-10-10T02:30:22.1022406Z adding: test/test-reports/python-pytest/dynamo.test_trace_rules/dynamo.test_trace_rules-aec71f93b440109f.xml (deflated 28%) 2025-10-10T02:30:22.1023351Z adding: test/test-reports/python-pytest/dynamo.test_trace_rules/dynamo.test_trace_rules-af9d36511bf57a00.xml (deflated 81%) 2025-10-10T02:30:22.1024275Z adding: test/test-reports/python-pytest/dynamo.test_exceptions/dynamo.test_exceptions-c88855017741a92e.xml (deflated 29%) 2025-10-10T02:30:22.1025375Z adding: test/test-reports/python-pytest/dynamo.test_exceptions/dynamo.test_exceptions-6c86a34bbd71f32b.xml (deflated 94%) 2025-10-10T02:30:22.1026292Z adding: test/test-reports/python-pytest/export.test_schema/export.test_schema-d1b35aded46b2b4b.xml (deflated 28%) 2025-10-10T02:30:22.1036290Z adding: test/test-reports/python-pytest/export.test_schema/export.test_schema-0ee969f5717f8bdc.xml (deflated 81%) 2025-10-10T02:30:22.1037464Z adding: test/test-reports/python-pytest/inductor.test_cudagraph_trees_expandable_segments/inductor.test_cudagraph_trees_expandable_segments-2aae40613de67616.xml (deflated 28%) 2025-10-10T02:30:22.1038804Z 
adding: test/test-reports/python-pytest/inductor.test_cudagraph_trees_expandable_segments/inductor.test_cudagraph_trees_expandable_segments-11efc026c98107ce.xml (deflated 96%) 2025-10-10T02:30:22.1039931Z adding: test/test-reports/python-pytest/dynamo.test_subclasses/dynamo.test_subclasses-a58f7278660cdf62.xml (deflated 28%) 2025-10-10T02:30:22.1040873Z adding: test/test-reports/python-pytest/dynamo.test_subclasses/dynamo.test_subclasses-2423bd2d42bfa8cb.xml (deflated 95%) 2025-10-10T02:30:22.1041770Z adding: test/test-reports/python-pytest/dynamo.test_repros/dynamo.test_repros-c96234560ddf8d1a.xml (deflated 63%) 2025-10-10T02:30:22.1042626Z adding: test/test-reports/python-pytest/dynamo.test_repros/dynamo.test_repros-1cf914744212a161.xml (deflated 94%) 2025-10-10T02:30:22.1043605Z adding: test/test-reports/python-pytest/dynamo.test_reorder_logs/dynamo.test_reorder_logs-566afc1437bca4f8.xml (deflated 28%) 2025-10-10T02:30:22.1044536Z adding: test/test-reports/python-pytest/dynamo.test_reorder_logs/dynamo.test_reorder_logs-02113d0603cb056b.xml (deflated 91%) 2025-10-10T02:30:22.1045459Z adding: test/test-reports/python-pytest/dynamo.test_generator/dynamo.test_generator-2e8a2b87f69fc587.xml (deflated 29%) 2025-10-10T02:30:22.1046374Z adding: test/test-reports/python-pytest/dynamo.test_generator/dynamo.test_generator-bce4d558e62b7916.xml (deflated 95%) 2025-10-10T02:30:22.1047380Z adding: test/test-reports/python-pytest/export.test_lift_unlift/export.test_lift_unlift-dbd653d06ee79fe7.xml (deflated 28%) 2025-10-10T02:30:22.1048306Z adding: test/test-reports/python-pytest/export.test_lift_unlift/export.test_lift_unlift-c40f22ca421b5ad3.xml (deflated 64%) 2025-10-10T02:30:22.1049213Z adding: test/test-reports/python-pytest/export.test_verifier/export.test_verifier-32393dcc36aba2d7.xml (deflated 28%) 2025-10-10T02:30:22.1050095Z adding: test/test-reports/python-pytest/export.test_verifier/export.test_verifier-44db4cf583627736.xml (deflated 88%) 
2025-10-10T02:30:22.1051006Z adding: test/test-reports/python-pytest/profiler.test_profiler/profiler.test_profiler-3719d148f5dda097.xml (deflated 90%) 2025-10-10T02:30:22.1051938Z adding: test/test-reports/python-pytest/profiler.test_profiler/profiler.test_profiler-e3a5c4c656e7e716.xml (deflated 93%) 2025-10-10T02:30:22.1052819Z adding: test/test-reports/python-pytest/dynamo.test_misc/dynamo.test_misc-7a397b9abba1fe8d.xml (deflated 28%) 2025-10-10T02:30:22.1064672Z adding: test/test-reports/python-pytest/dynamo.test_misc/dynamo.test_misc-5cf970d380c442ea.xml (deflated 95%) 2025-10-10T02:30:22.1065570Z adding: test/test-reports/python-pytest/export.test_draft_export/export.test_draft_export-e2d3b4294a9b15d0.xml (deflated 28%) 2025-10-10T02:30:22.1066522Z adding: test/test-reports/python-pytest/export.test_draft_export/export.test_draft_export-5f43f4a5fd1143d0.xml (deflated 91%) 2025-10-10T02:30:22.1067421Z adding: test/test-reports/python-pytest/export.test_sparse/export.test_sparse-97d39374fb561a32.xml (deflated 28%) 2025-10-10T02:30:22.1071477Z adding: test/test-reports/python-pytest/export.test_sparse/export.test_sparse-b01d78c49bd03191.xml (deflated 98%) 2025-10-10T02:30:22.1072363Z adding: test/test-reports/python-pytest/dynamo.test_comptime/dynamo.test_comptime-c25f42b44b794c5e.xml (deflated 28%) 2025-10-10T02:30:22.1073331Z adding: test/test-reports/python-pytest/dynamo.test_comptime/dynamo.test_comptime-074392352ffc39d4.xml (deflated 90%) 2025-10-10T02:30:22.1074354Z adding: test/test-reports/python-pytest/dynamo.test_python_autograd/dynamo.test_python_autograd-1a349f5d165c6693.xml (deflated 28%) 2025-10-10T02:30:22.1075348Z adding: test/test-reports/python-pytest/dynamo.test_python_autograd/dynamo.test_python_autograd-ad0751c0717a80d1.xml (deflated 82%) 2025-10-10T02:30:22.1076330Z adding: test/test-reports/python-pytest/functorch.test_rearrange/functorch.test_rearrange-c0b535585314010e.xml (deflated 28%) 2025-10-10T02:30:22.1077301Z adding: 
test/test-reports/python-pytest/functorch.test_rearrange/functorch.test_rearrange-b873b9fdc3d30191.xml (deflated 88%) 2025-10-10T02:30:22.1078253Z adding: test/test-reports/python-pytest/functorch.test_parsing/functorch.test_parsing-aac81057f06c1334.xml (deflated 29%) 2025-10-10T02:30:22.1079184Z adding: test/test-reports/python-pytest/functorch.test_parsing/functorch.test_parsing-cd20d7167311feaa.xml (deflated 88%) 2025-10-10T02:30:22.1080036Z adding: test/test-reports/python-pytest/test_package/test_package-9db1317f119faf18.xml (deflated 28%) 2025-10-10T02:30:22.1080809Z adding: test/test-reports/python-pytest/test_package/test_package-8a979256ed3569b2.xml (deflated 94%) 2025-10-10T02:30:22.1081643Z adding: test/test-reports/python-pytest/test_comparison_utils/test_comparison_utils-df17a335a704974b.xml (deflated 28%) 2025-10-10T02:30:22.1082593Z adding: test/test-reports/python-pytest/test_comparison_utils/test_comparison_utils-cf213432d3740aeb.xml (deflated 86%) 2025-10-10T02:30:22.1083449Z adding: test/test-reports/python-pytest/test_mkl_verbose/test_mkl_verbose-2e6c19d2a539c880.xml (deflated 28%) 2025-10-10T02:30:22.1084254Z adding: test/test-reports/python-pytest/test_mkl_verbose/test_mkl_verbose-a06b646b7f250e68.xml (deflated 64%) 2025-10-10T02:30:22.1085144Z adding: test/test-reports/python-pytest/functorch.test_ac_logging/functorch.test_ac_logging-ef6d94b9e0b78f6b.xml (deflated 28%) 2025-10-10T02:30:22.1086129Z adding: test/test-reports/python-pytest/functorch.test_ac_logging/functorch.test_ac_logging-3539f0c6a8e7c7d1.xml (deflated 64%) 2025-10-10T02:30:22.1087104Z adding: test/test-reports/python-pytest/test_mkldnn_verbose/test_mkldnn_verbose-500b68b9df0df697.xml (deflated 28%) 2025-10-10T02:30:22.1087958Z adding: test/test-reports/python-pytest/test_mkldnn_verbose/test_mkldnn_verbose-4f508d333f6e0922.xml (deflated 64%) 2025-10-10T02:30:22.1088842Z adding: test/test-reports/python-pytest/profiler.test_kineto/profiler.test_kineto-4bbc70f3e9ffb4af.xml 
(deflated 28%) 2025-10-10T02:30:22.1089754Z adding: test/test-reports/python-pytest/profiler.test_kineto/profiler.test_kineto-804eafad7766b8fd.xml (deflated 57%) 2025-10-10T02:30:22.1090609Z adding: test/test-reports/python-pytest/test_matmul_cuda/test_matmul_cuda-e3da992f5ade6992.xml (deflated 27%) 2025-10-10T02:30:22.1114040Z adding: test/test-reports/python-pytest/test_matmul_cuda/test_matmul_cuda-1900ab685e05b4e4.xml (deflated 99%) 2025-10-10T02:30:22.1114901Z adding: test/test-reports/python-pytest/test_transformers/test_transformers-84de00f8bdca0df3.xml (deflated 28%) 2025-10-10T02:30:22.1420418Z adding: test/test-reports/python-pytest/test_transformers/test_transformers-70b4c89bd184b4eb.xml (deflated 99%) 2025-10-10T02:30:22.1421223Z adding: test/test-reports/python-pytest/test_meta/test_meta-0362512724aefaa0.xml (deflated 28%) 2025-10-10T02:30:22.2322856Z adding: test/test-reports/python-pytest/test_meta/test_meta-a5cf10940c7c2444.xml (deflated 99%) 2025-10-10T02:30:22.2323644Z adding: test/test-reports/python-pytest/test_license/test_license-d862fae8e301e802.xml (deflated 27%) 2025-10-10T02:30:22.2324422Z adding: test/test-reports/python-pytest/test_license/test_license-4c0a7fa17f8a4924.xml (deflated 57%) 2025-10-10T02:30:22.2325288Z adding: test/test-reports/python-pytest/test_utils_config_module/test_utils_config_module-17d19d718510dcee.xml (deflated 28%) 2025-10-10T02:30:22.2326359Z adding: test/test-reports/python-pytest/test_utils_config_module/test_utils_config_module-c7c41e7b2597cb10.xml (deflated 93%) 2025-10-10T02:30:22.2327433Z adding: test/test-reports/python-pytest/test_decomp/test_decomp-02207e15769d81d5.xml (deflated 27%) 2025-10-10T02:30:22.2328196Z adding: test/test-reports/python-pytest/test_decomp/test_decomp-f9b76dda40ad8453.xml (deflated 28%) 2025-10-10T02:30:22.2328932Z adding: test/test-reports/python-pytest/test_decomp/test_decomp-89905771e30af52c.xml (deflated 28%) 2025-10-10T02:30:22.2329667Z adding: 
test/test-reports/python-pytest/test_decomp/test_decomp-687164141f9e09d0.xml (deflated 28%) 2025-10-10T02:30:22.2330408Z adding: test/test-reports/python-pytest/test_decomp/test_decomp-772d80de11a753f4.xml (deflated 28%) 2025-10-10T02:30:22.2331153Z adding: test/test-reports/python-pytest/test_decomp/test_decomp-27bb89642f654c31.xml (deflated 28%) 2025-10-10T02:30:22.2341362Z adding: test/test-reports/python-pytest/test_decomp/test_decomp-1d52521b40f2ebb7.xml (deflated 97%) 2025-10-10T02:30:22.2354924Z adding: test/test-reports/python-pytest/test_decomp/test_decomp-b5c95241f43c5fdc.xml (deflated 97%) 2025-10-10T02:30:22.2368690Z adding: test/test-reports/python-pytest/test_decomp/test_decomp-8de4ed866e7b200b.xml (deflated 97%) 2025-10-10T02:30:22.2383091Z adding: test/test-reports/python-pytest/test_decomp/test_decomp-0779d2267fefd8a3.xml (deflated 98%) 2025-10-10T02:30:22.2398045Z adding: test/test-reports/python-pytest/test_decomp/test_decomp-01d3f721ea9d8c7e.xml (deflated 98%) 2025-10-10T02:30:22.2413141Z adding: test/test-reports/python-pytest/test_decomp/test_decomp-12043a5bc4de5758.xml (deflated 98%) 2025-10-10T02:30:22.2413917Z adding: test/test-reports/python-pytest/xpu.test_conv/xpu.test_conv-7b325020df32d69f.xml (deflated 28%) 2025-10-10T02:30:22.2414688Z adding: test/test-reports/python-pytest/xpu.test_conv/xpu.test_conv-77e791878581b6a1.xml (deflated 28%) 2025-10-10T02:30:22.2415511Z adding: test/test-reports/python-pytest/functorch.test_ops/functorch.test_ops-a0c46d32e921392f.xml (deflated 28%) 2025-10-10T02:30:22.2550966Z adding: test/test-reports/python-pytest/functorch.test_ops/functorch.test_ops-c3ac24a2c74ed16a.xml (deflated 98%) 2025-10-10T02:30:22.2551809Z adding: test/test-reports/python-pytest/test_datapipe/test_datapipe-f8778458abd7f7c4.xml (deflated 28%) 2025-10-10T02:30:22.2553798Z adding: test/test-reports/python-pytest/test_datapipe/test_datapipe-db608672fc653878.xml (deflated 94%) 2025-10-10T02:30:22.2554650Z adding: 
test/test-reports/python-pytest/lazy.test_generator/lazy.test_generator-16f603851415bae0.xml (deflated 27%) 2025-10-10T02:30:22.2555517Z adding: test/test-reports/python-pytest/lazy.test_generator/lazy.test_generator-8b144860bbb396be.xml (deflated 64%) 2025-10-10T02:30:22.2556515Z adding: test/test-reports/python-pytest/torch_np.numpy_tests.lib.test_type_check/torch_np.numpy_tests.lib.test_type_check-37cc3ec74e671b2b.xml (deflated 28%) 2025-10-10T02:30:22.2557668Z adding: test/test-reports/python-pytest/torch_np.numpy_tests.lib.test_type_check/torch_np.numpy_tests.lib.test_type_check-f42dc3d67fa1f7aa.xml (deflated 95%) 2025-10-10T02:30:22.2558696Z adding: test/test-reports/python-pytest/lazy.test_debug_util/lazy.test_debug_util-dfae58adf5b093ff.xml (deflated 27%) 2025-10-10T02:30:22.2559572Z adding: test/test-reports/python-pytest/lazy.test_debug_util/lazy.test_debug_util-a566f67277c30119.xml (deflated 43%) 2025-10-10T02:30:22.2560432Z adding: test/test-reports/python-pytest/test_jit_llga_fuser/test_jit_llga_fuser-18666ef36f0ca72f.xml (deflated 28%) 2025-10-10T02:30:22.2561278Z adding: test/test-reports/python-pytest/test_jit_llga_fuser/test_jit_llga_fuser-df93d912809b79fd.xml (deflated 96%) 2025-10-10T02:30:22.2562116Z adding: test/test-reports/python-pytest/test_numa_binding/test_numa_binding-0030bbdd6814defa.xml (deflated 27%) 2025-10-10T02:30:22.2562944Z adding: test/test-reports/python-pytest/test_numa_binding/test_numa_binding-f0b255215ec965b6.xml (deflated 90%) 2025-10-10T02:30:22.2564186Z adding: test/test-reports/python-pytest/torch_np.numpy_tests.lib.test_histograms/torch_np.numpy_tests.lib.test_histograms-9e4295238b86e193.xml (deflated 27%) 2025-10-10T02:30:22.2565355Z adding: test/test-reports/python-pytest/torch_np.numpy_tests.lib.test_histograms/torch_np.numpy_tests.lib.test_histograms-40b37ce1b5526aef.xml (deflated 95%) 2025-10-10T02:30:22.2566504Z adding: 
test/test-reports/python-pytest/benchmark_utils.test_benchmark_utils/benchmark_utils.test_benchmark_utils-d8def58fac009f27.xml (deflated 29%) 2025-10-10T02:30:22.2567681Z adding: test/test-reports/python-pytest/benchmark_utils.test_benchmark_utils/benchmark_utils.test_benchmark_utils-b824b04b3a5260cc.xml (deflated 85%) 2025-10-10T02:30:22.2568839Z adding: test/test-reports/python-pytest/torch_np.numpy_tests.core.test_scalarmath/torch_np.numpy_tests.core.test_scalarmath-ce2e774dea9eb55f.xml (deflated 28%) 2025-10-10T02:30:22.2571451Z adding: test/test-reports/python-pytest/torch_np.numpy_tests.core.test_scalarmath/torch_np.numpy_tests.core.test_scalarmath-52867daaa2230389.xml (deflated 97%) 2025-10-10T02:30:22.2572450Z adding: test/test-reports/python-pytest/test_indexing/test_indexing-1fa619615429c841.xml (deflated 43%) 2025-10-10T02:30:22.2576319Z adding: test/test-reports/python-pytest/test_indexing/test_indexing-8cbfdd6cdf30d351.xml (deflated 97%) 2025-10-10T02:30:22.2577201Z adding: test/test-reports/python-pytest/profiler.test_torch_tidy/profiler.test_torch_tidy-ea8c3b9784e1e746.xml (deflated 28%) 2025-10-10T02:30:22.2578259Z adding: test/test-reports/python-pytest/profiler.test_torch_tidy/profiler.test_torch_tidy-fb5b5f22e0e7ea34.xml (deflated 90%) 2025-10-10T02:30:22.2579170Z adding: test/test-reports/python-pytest/nn.test_module_hooks/nn.test_module_hooks-9c4d8a3f90fbb390.xml (deflated 28%) 2025-10-10T02:30:22.2580035Z adding: test/test-reports/python-pytest/nn.test_module_hooks/nn.test_module_hooks-a8cc9149f6d50f31.xml (deflated 95%) 2025-10-10T02:30:22.2580979Z adding: test/test-reports/python-pytest/functorch.test_aotdispatch/functorch.test_aotdispatch-e2db1fe26341d79a.xml (deflated 28%) 2025-10-10T02:30:22.2595223Z adding: test/test-reports/python-pytest/functorch.test_aotdispatch/functorch.test_aotdispatch-b8748981c4bcaceb.xml (deflated 96%) 2025-10-10T02:30:22.2596393Z adding: 
test/test-reports/python-pytest/nn.test_load_state_dict/nn.test_load_state_dict-bd46f06fdec8f2fb.xml (deflated 28%) 2025-10-10T02:30:22.2597479Z adding: test/test-reports/python-pytest/nn.test_load_state_dict/nn.test_load_state_dict-4bb9e81843cca73d.xml (deflated 94%) 2025-10-10T02:30:22.2598816Z adding: test/test-reports/python-pytest/torch_np.numpy_tests.linalg.test_linalg/torch_np.numpy_tests.linalg.test_linalg-9719fbad4ebe3bdb.xml (deflated 28%) 2025-10-10T02:30:22.2603667Z adding: test/test-reports/python-pytest/torch_np.numpy_tests.linalg.test_linalg/torch_np.numpy_tests.linalg.test_linalg-22b38fdbbc0d10a0.xml (deflated 97%) 2025-10-10T02:30:22.2604644Z adding: test/test-reports/python-pytest/test_shape_ops/test_shape_ops-90cd4d6f9df6b25c.xml (deflated 28%) 2025-10-10T02:30:22.2605930Z adding: test/test-reports/python-pytest/test_shape_ops/test_shape_ops-6833784cf87b90fb.xml (deflated 97%) 2025-10-10T02:30:22.2606968Z adding: test/test-reports/python-pytest/torch_np.numpy_tests.core.test_shape_base/torch_np.numpy_tests.core.test_shape_base-d3c9465b95e40816.xml (deflated 28%) 2025-10-10T02:30:22.2610265Z adding: test/test-reports/python-pytest/torch_np.numpy_tests.core.test_shape_base/torch_np.numpy_tests.core.test_shape_base-da08c3053e260425.xml (deflated 97%) 2025-10-10T02:30:22.2611411Z adding: test/test-reports/python-pytest/torch_np.numpy_tests.core.test_dtype/torch_np.numpy_tests.core.test_dtype-8e499884f82ac558.xml (deflated 28%) 2025-10-10T02:30:22.2613603Z adding: test/test-reports/python-pytest/torch_np.numpy_tests.core.test_dtype/torch_np.numpy_tests.core.test_dtype-57ea51090a78bd5c.xml (deflated 96%) 2025-10-10T02:30:22.2614585Z adding: test/test-reports/python-pytest/test_unary_ufuncs/test_unary_ufuncs-3b441e6d1aee50d8.xml (deflated 28%) 2025-10-10T02:30:22.3190001Z adding: test/test-reports/python-pytest/test_unary_ufuncs/test_unary_ufuncs-0fe5f3b295036103.xml (deflated 99%) 2025-10-10T02:30:22.3190861Z adding: 
test/test-reports/python-pytest/test_sparse_csr/test_sparse_csr-5b4d5a840c05be44.xml (deflated 28%) 2025-10-10T02:30:22.3250597Z adding: test/test-reports/python-pytest/test_sparse_csr/test_sparse_csr-69740c807dff84f3.xml (deflated 99%) 2025-10-10T02:30:22.3251458Z adding: test/test-reports/python-pytest/test_serialization/test_serialization-b02d4423c53bd17b.xml (deflated 66%) 2025-10-10T02:30:22.3256566Z adding: test/test-reports/python-pytest/test_serialization/test_serialization-5e9cd5cdb14bd81c.xml (deflated 96%) 2025-10-10T02:30:22.3257607Z adding: test/test-reports/python-pytest/torch_np.numpy_tests.lib.test_twodim_base/torch_np.numpy_tests.lib.test_twodim_base-aa903c44c6ece71d.xml (deflated 28%) 2025-10-10T02:30:22.3258779Z adding: test/test-reports/python-pytest/torch_np.numpy_tests.lib.test_twodim_base/torch_np.numpy_tests.lib.test_twodim_base-03e8798a341aef6f.xml (deflated 94%) 2025-10-10T02:30:22.3259810Z adding: test/test-reports/python-pytest/test_function_schema/test_function_schema-842d655c67c24f98.xml (deflated 28%) 2025-10-10T02:30:22.3260704Z adding: test/test-reports/python-pytest/test_function_schema/test_function_schema-5c1d193a0001aecd.xml (deflated 90%) 2025-10-10T02:30:22.3261691Z adding: test/test-reports/python-pytest/functorch.test_vmap/functorch.test_vmap-15bed35c499bf800.xml (deflated 28%) 2025-10-10T02:30:22.3314221Z adding: test/test-reports/python-pytest/functorch.test_vmap/functorch.test_vmap-0cada83fe8a8129b.xml (deflated 98%) 2025-10-10T02:30:22.3315259Z adding: test/test-reports/python-pytest/torch_np.numpy_tests.lib.test_shape_base_/torch_np.numpy_tests.lib.test_shape_base_-802992f9616e2c6e.xml (deflated 27%) 2025-10-10T02:30:22.3316634Z adding: test/test-reports/python-pytest/torch_np.numpy_tests.lib.test_shape_base_/torch_np.numpy_tests.lib.test_shape_base_-850325a48b8da611.xml (deflated 95%) 2025-10-10T02:30:22.3317789Z adding: 
test/test-reports/python-pytest/torch_np.numpy_tests.fft.test_pocketfft/torch_np.numpy_tests.fft.test_pocketfft-b3743fe391809694.xml (deflated 28%) 2025-10-10T02:30:22.3319443Z adding: test/test-reports/python-pytest/torch_np.numpy_tests.fft.test_pocketfft/torch_np.numpy_tests.fft.test_pocketfft-d9f9dadc550b2f19.xml (deflated 97%) 2025-10-10T02:30:22.3320498Z adding: test/test-reports/python-pytest/test_scatter_gather_ops/test_scatter_gather_ops-31d592167816728c.xml (deflated 28%) 2025-10-10T02:30:22.3321778Z adding: test/test-reports/python-pytest/test_scatter_gather_ops/test_scatter_gather_ops-0e5602e08edacb7d.xml (deflated 97%) 2025-10-10T02:30:22.3322877Z adding: test/test-reports/python-pytest/torch_np.test_ndarray_methods/torch_np.test_ndarray_methods-8f27b830842802e4.xml (deflated 28%) 2025-10-10T02:30:22.3330885Z adding: test/test-reports/python-pytest/torch_np.test_ndarray_methods/torch_np.test_ndarray_methods-23699fd423c77a57.xml (deflated 98%) 2025-10-10T02:30:22.3331796Z adding: test/test-reports/python-pytest/test_view_ops/test_view_ops-e8240103d2dce8f8.xml (deflated 28%) 2025-10-10T02:30:22.3336992Z adding: test/test-reports/python-pytest/test_view_ops/test_view_ops-8b8404ba5b309a7c.xml (deflated 97%) 2025-10-10T02:30:22.3337954Z adding: test/test-reports/python-pytest/torch_np.numpy_tests.core.test_dlpack/torch_np.numpy_tests.core.test_dlpack-38629dadc7572d6a.xml (deflated 28%) 2025-10-10T02:30:22.3339094Z adding: test/test-reports/python-pytest/torch_np.numpy_tests.core.test_dlpack/torch_np.numpy_tests.core.test_dlpack-d661f70cbb3a78d1.xml (deflated 96%) 2025-10-10T02:30:22.3340239Z adding: test/test-reports/python-pytest/torch_np.numpy_tests.core.test_getlimits/torch_np.numpy_tests.core.test_getlimits-1bae7bee1e6be544.xml (deflated 29%) 2025-10-10T02:30:22.3341419Z adding: test/test-reports/python-pytest/torch_np.numpy_tests.core.test_getlimits/torch_np.numpy_tests.core.test_getlimits-474cf0f940e72bb0.xml (deflated 89%) 2025-10-10T02:30:22.3342648Z 
adding: test/test-reports/python-pytest/test_accelerator/test_accelerator-31a2b9dc6d048b14.xml (deflated 28%) 2025-10-10T02:30:22.3343493Z adding: test/test-reports/python-pytest/test_accelerator/test_accelerator-18527848017ced2f.xml (deflated 87%) 2025-10-10T02:30:22.3344342Z adding: test/test-reports/python-pytest/lazy.test_reuse_ir/lazy.test_reuse_ir-580a6a5f417c7548.xml (deflated 28%) 2025-10-10T02:30:22.3345194Z adding: test/test-reports/python-pytest/lazy.test_reuse_ir/lazy.test_reuse_ir-cba203177571aa94.xml (deflated 78%) 2025-10-10T02:30:22.3346214Z adding: test/test-reports/python-pytest/torch_np.numpy_tests.lib.test_index_tricks/torch_np.numpy_tests.lib.test_index_tricks-e4ba57c8706ee4b1.xml (deflated 28%) 2025-10-10T02:30:22.3347394Z adding: test/test-reports/python-pytest/torch_np.numpy_tests.lib.test_index_tricks/torch_np.numpy_tests.lib.test_index_tricks-7804d28d756edb17.xml (deflated 94%) 2025-10-10T02:30:22.3348385Z adding: test/test-reports/python-pytest/nn.test_init/nn.test_init-527755aaa52fa395.xml (deflated 28%) 2025-10-10T02:30:22.3349148Z adding: test/test-reports/python-pytest/nn.test_init/nn.test_init-34c76aa08f344f50.xml (deflated 93%) 2025-10-10T02:30:22.3350127Z adding: test/test-reports/python-pytest/torch_np.numpy_tests.core.test_numerictypes/torch_np.numpy_tests.core.test_numerictypes-477a0f5535124f61.xml (deflated 28%) 2025-10-10T02:30:22.3351420Z adding: test/test-reports/python-pytest/torch_np.numpy_tests.core.test_numerictypes/torch_np.numpy_tests.core.test_numerictypes-6b685ebfa366d5b4.xml (deflated 95%) 2025-10-10T02:30:22.3352470Z adding: test/test-reports/python-pytest/test_type_promotion/test_type_promotion-8ba940f8d50cd14b.xml (deflated 28%) 2025-10-10T02:30:22.3358201Z adding: test/test-reports/python-pytest/test_type_promotion/test_type_promotion-1aa103f105d0f98e.xml (deflated 98%) 2025-10-10T02:30:22.3359291Z adding: 
test/test-reports/python-pytest/torch_np.numpy_tests.core.test_scalar_methods/torch_np.numpy_tests.core.test_scalar_methods-21e64f64ab97055e.xml (deflated 28%) 2025-10-10T02:30:22.3360695Z adding: test/test-reports/python-pytest/torch_np.numpy_tests.core.test_scalar_methods/torch_np.numpy_tests.core.test_scalar_methods-d8752f2d16eb312d.xml (deflated 96%) 2025-10-10T02:30:22.3361859Z adding: test/test-reports/python-pytest/torch_np.numpy_tests.fft.test_helper/torch_np.numpy_tests.fft.test_helper-22df787bcf6b6284.xml (deflated 29%) 2025-10-10T02:30:22.3362964Z adding: test/test-reports/python-pytest/torch_np.numpy_tests.fft.test_helper/torch_np.numpy_tests.fft.test_helper-1bec3ed1570d094b.xml (deflated 86%) 2025-10-10T02:30:22.3364010Z adding: test/test-reports/python-pytest/torch_np.test_function_base/torch_np.test_function_base-e1dea90ad78a2bfa.xml (deflated 28%) 2025-10-10T02:30:22.3364976Z adding: test/test-reports/python-pytest/torch_np.test_function_base/torch_np.test_function_base-836d74c0ee35ff55.xml (deflated 45%) 2025-10-10T02:30:22.3365974Z adding: test/test-reports/python-pytest/profiler.test_profiler_tree/profiler.test_profiler_tree-f7d728e5b40f3335.xml (deflated 28%) 2025-10-10T02:30:22.3366974Z adding: test/test-reports/python-pytest/profiler.test_profiler_tree/profiler.test_profiler_tree-8f8e22be94090ea8.xml (deflated 86%) 2025-10-10T02:30:22.3368066Z adding: test/test-reports/python-pytest/functorch.test_eager_transforms/functorch.test_eager_transforms-8eaea86b345f8a26.xml (deflated 28%) 2025-10-10T02:30:22.3374136Z adding: test/test-reports/python-pytest/functorch.test_eager_transforms/functorch.test_eager_transforms-d7227992633ec0ea.xml (deflated 96%) 2025-10-10T02:30:22.3375065Z adding: test/test-reports/python-pytest/test_sparse/test_sparse-f2f59bfba1db9191.xml (deflated 28%) 2025-10-10T02:30:22.3445029Z adding: test/test-reports/python-pytest/test_sparse/test_sparse-9fb94c449d7e8fe0.xml (deflated 99%) 2025-10-10T02:30:22.3553948Z ##[group]Run # 
Remove any previous usage logs if they exist 2025-10-10T02:30:22.3554493Z # Remove any previous usage logs if they exist 2025-10-10T02:30:22.3554846Z rm -f logs-*.zip 2025-10-10T02:30:22.3555346Z zip "logs-${FILE_SUFFIX}.zip" 'usage_log.txt' || true 2025-10-10T02:30:22.3555844Z zip -r "logs-${FILE_SUFFIX}.zip" test/test-reports -i '*.log' || true 2025-10-10T02:30:22.3565005Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T02:30:22.3565383Z env: 2025-10-10T02:30:22.3565616Z GIT_DEFAULT_BRANCH: main 2025-10-10T02:30:22.3565946Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-10-10T02:30:22.3566493Z DOCKER_CONTAINER_ID: 0d479bf7aa1028c1efe5abc00aba7c77fea2d669ee48fe0051d50c10c6eea1cb 2025-10-10T02:30:22.3567030Z DEVICE_NAME: 2025-10-10T02:30:22.3567370Z DEVICE_TYPE: 2025-10-10T02:30:22.3567730Z FILE_SUFFIX: test-slow-2-3-linux.g5.4xlarge.nvidia.gpu_52406799277 2025-10-10T02:30:22.3568127Z ##[endgroup] 2025-10-10T02:30:22.3925476Z adding: usage_log.txt (deflated 96%) 2025-10-10T02:30:22.4064713Z adding: test/test-reports/export.test_export_1.1_724dc9b2e00543d9_.log (deflated 50%) 2025-10-10T02:30:22.4065429Z adding: test/test-reports/inductor.test_dependencies_1.1_86844299a4665816_.log (deflated 51%) 2025-10-10T02:30:22.4066045Z adding: test/test-reports/test_ops_1.1_a290887c7b969568_.log (deflated 48%) 2025-10-10T02:30:22.4066647Z adding: test/test-reports/dynamo.test_logging_1.1_39f690b2466cf642_.log (deflated 50%) 2025-10-10T02:30:22.4067444Z adding: test/test-reports/test_torchfuzz_repros_1.1_2bc72c91a6a54132_.log (deflated 50%) 2025-10-10T02:30:22.4068074Z adding: test/test-reports/test_mkl_verbose_1.1_ae2a4d433f1b52f2_.log (deflated 49%) 2025-10-10T02:30:22.4068669Z adding: test/test-reports/test_opaque_obj_1.1_8b7b66c8590c136e_.log (deflated 49%) 2025-10-10T02:30:22.4069274Z adding: test/test-reports/test_matmul_cuda_1.1_4b2ba523bc20eaa9_.log (deflated 49%) 2025-10-10T02:30:22.4069875Z adding: 
test/test-reports/test_testing_1.1_9d31ddad2b6fecdc_.log (deflated 49%) 2025-10-10T02:30:22.4070497Z adding: test/test-reports/dynamo.test_deviceguard_1.1_9c4ce377a52b32cc_.log (deflated 50%) 2025-10-10T02:30:22.4071151Z adding: test/test-reports/test_public_bindings_1.1_531123992d7ab0b9_.log (deflated 50%) 2025-10-10T02:30:22.4071805Z adding: test/test-reports/dynamo.test_aot_autograd_1.1_ffd5f025d73e0db2_.log (deflated 50%) 2025-10-10T02:30:22.4072480Z adding: test/test-reports/inductor.test_aot_inductor_1.1_ab2d17034a493267_.log (deflated 51%) 2025-10-10T02:30:22.4073150Z adding: test/test-reports/dynamo.test_cudagraphs_1.1_3c21fc8a1903df8c_.log (deflated 50%) 2025-10-10T02:30:22.4073830Z adding: test/test-reports/inductor.test_torchinductor_1.1_2ef4abe840cd09c4_.log (deflated 50%) 2025-10-10T02:30:22.4074494Z adding: test/test-reports/export.test_package_1.1_f570fed3cdc48302_.log (deflated 50%) 2025-10-10T02:30:22.4075196Z adding: test/test-reports/inductor.test_torchinductor_opinfo_2.11_27d27947181c8c8b_.log (deflated 52%) 2025-10-10T02:30:22.4075911Z adding: test/test-reports/dynamo.test_ctx_manager_1.1_8af0268dcaa93e75_.log (deflated 50%) 2025-10-10T02:30:22.4076626Z adding: test/test-reports/inductor.test_torchinductor_opinfo_5.11_32aafc5ce9df02d7_.log (deflated 52%) 2025-10-10T02:30:22.4077353Z adding: test/test-reports/inductor.test_cudagraph_trees_1.1_36197e1b3d795a8a_.log (deflated 51%) 2025-10-10T02:30:22.4078089Z adding: test/test-reports/inductor.test_torchinductor_opinfo_6.11_ca27bac7090278d9_.log (deflated 52%) 2025-10-10T02:30:22.4078827Z adding: test/test-reports/inductor.test_block_analysis_1.1_8cf98248b21bf3e8_.log (deflated 51%) 2025-10-10T02:30:22.4079558Z adding: test/test-reports/inductor.test_torchinductor_opinfo_8.11_eee00d2276de4801_.log (deflated 52%) 2025-10-10T02:30:22.4080271Z adding: test/test-reports/dynamo.test_pre_dispatch_1.1_b28ca5840659f754_.log (deflated 50%) 2025-10-10T02:30:22.4080985Z adding: 
test/test-reports/inductor.test_torchinductor_opinfo_11.11_a2dda46409c7457c_.log (deflated 52%) 2025-10-10T02:30:22.4081823Z adding: test/test-reports/dynamo.test_guard_serialization_1.1_029a3c89ee56da9f_.log (deflated 51%) 2025-10-10T02:30:22.4082701Z adding: test/test-reports/inductor.test_static_cuda_launcher_1.1_ac8e94e2359fb972_.log (deflated 51%) 2025-10-10T02:30:22.4083406Z adding: test/test-reports/dynamo.test_subgraphs_1.1_b700ddb9c915a228_.log (deflated 50%) 2025-10-10T02:30:22.4084125Z adding: test/test-reports/inductor.test_cooperative_reductions_1.1_a72977059c179a93_.log (deflated 54%) 2025-10-10T02:30:22.4084832Z adding: test/test-reports/inductor.test_caching_1.1_934c7ab345d15da7_.log (deflated 50%) 2025-10-10T02:30:22.4085517Z adding: test/test-reports/inductor.test_async_compile_1.1_28bead571da43287_.log (deflated 51%) 2025-10-10T02:30:22.4086202Z adding: test/test-reports/dynamo.test_unittest_1.1_ea31b9b08c93d789_.log (deflated 50%) 2025-10-10T02:30:22.4086885Z adding: test/test-reports/inductor.test_kernel_benchmark_1.1_e20112867c7299ff_.log (deflated 51%) 2025-10-10T02:30:22.4087652Z adding: test/test-reports/export.test_upgrader_1.1_a584e0b480bb7748_.log (deflated 50%) 2025-10-10T02:30:22.4088312Z adding: test/test-reports/inductor.test_cuda_repro_1.1_1c4a1bb51e1f44f1_.log (deflated 50%) 2025-10-10T02:30:22.4088955Z adding: test/test-reports/dynamo.test_sets_1.1_a22f3589f545c3b1_.log (deflated 49%) 2025-10-10T02:30:22.4089583Z adding: test/test-reports/dynamo.test_callback_1.1_ec30c8e2a2630ac1_.log (deflated 50%) 2025-10-10T02:30:22.4090274Z adding: test/test-reports/test_transformers_1.1_e9070de377699fe8_.log (deflated 50%) 2025-10-10T02:30:22.4090899Z adding: test/test-reports/inductor.test_fp8_1.1_dccbc2d4b2904e41_.log (deflated 50%) 2025-10-10T02:30:22.4091590Z adding: test/test-reports/inductor.test_mkldnn_pattern_matcher_1.1_061658ea68b99485_.log (deflated 52%) 2025-10-10T02:30:22.4092383Z adding: 
test/test-reports/inductor.test_torchinductor_dynamic_shapes_1.2_3153119e2a46dd4e_.log (deflated 62%) 2025-10-10T02:30:22.4093106Z adding: test/test-reports/dynamo.test_unspec_1.1_cc72ce2c42a7b79d_.log (deflated 50%) 2025-10-10T02:30:22.4093769Z adding: test/test-reports/inductor.test_analysis_1.1_6c0f12a0794524ba_.log (deflated 50%) 2025-10-10T02:30:22.4094425Z adding: test/test-reports/dynamo.test_optimizers_1.1_74e86154988e6b15_.log (deflated 50%) 2025-10-10T02:30:22.4095105Z adding: test/test-reports/inductor.test_triton_syntax_1.1_12d372c47ed78b3f_.log (deflated 51%) 2025-10-10T02:30:22.4095783Z adding: test/test-reports/dynamo.test_decorators_1.1_a5f9b702d0f35d3d_.log (deflated 50%) 2025-10-10T02:30:22.4096504Z adding: test/test-reports/inductor.test_triton_extension_backend_1.1_99d06730dc11f751_.log (deflated 51%) 2025-10-10T02:30:22.4097177Z adding: test/test-reports/test_meta_1.1_ae20740f2a153e4a_.log (deflated 50%) 2025-10-10T02:30:22.4097778Z adding: test/test-reports/inductor.test_utils_1.1_bc93c486a0357eda_.log (deflated 50%) 2025-10-10T02:30:22.4098642Z adding: test/test-reports/dynamo.test_pgo_1.1_266517fadd492868_.log (deflated 49%) 2025-10-10T02:30:22.4099380Z adding: test/test-reports/inductor.test_coordinate_descent_tuner_1.1_2bd1307ad4f49e17_.log (deflated 52%) 2025-10-10T02:30:22.4100092Z adding: test/test-reports/inductor.test_cache_1.1_63f053c389aa26d9_.log (deflated 50%) 2025-10-10T02:30:22.4100782Z adding: test/test-reports/inductor.test_inplace_padding_1.1_f0da5bd278d946af_.log (deflated 51%) 2025-10-10T02:30:22.4101490Z adding: test/test-reports/dynamo.test_buffers_override_1.1_a875e247e8fe79d6_.log (deflated 51%) 2025-10-10T02:30:22.4102258Z adding: test/test-reports/inductor.test_template_heuristics_registry_1.1_d5ca3c0d864dab48_.log (deflated 53%) 2025-10-10T02:30:22.4103002Z adding: test/test-reports/dynamo.test_after_aot_1.1_c155ea426d214603_.log (deflated 50%) 2025-10-10T02:30:22.4103681Z adding: 
test/test-reports/inductor.test_select_algorithm_1.1_83e526ed83f06358_.log (deflated 51%) 2025-10-10T02:30:22.4104365Z adding: test/test-reports/inductor.test_compile_1.1_8ea83f9bfd18d05a_.log (deflated 50%) 2025-10-10T02:30:22.4105150Z adding: test/test-reports/inductor.test_extension_backend_1.1_255ea3527a1f2060_.log (deflated 51%) 2025-10-10T02:30:22.4105947Z adding: test/test-reports/export.test_export_opinfo_1.1_e16b3c6cd7888b49_.log (deflated 51%) 2025-10-10T02:30:22.4106687Z adding: test/test-reports/inductor.test_inductor_scheduler_1.1_3756edb8ff2c895c_.log (deflated 52%) 2025-10-10T02:30:22.4107412Z adding: test/test-reports/dynamo.test_flat_apply_1.1_69cc1488eab16041_.log (deflated 50%) 2025-10-10T02:30:22.4108065Z adding: test/test-reports/inductor.test_padding_1.1_58bb0e4694895dda_.log (deflated 50%) 2025-10-10T02:30:22.4108747Z adding: test/test-reports/inductor.test_custom_lowering_1.1_fcd0bceaa7572fd4_.log (deflated 51%) 2025-10-10T02:30:22.4109452Z adding: test/test-reports/inductor.test_codegen_triton_1.1_48a804e657ff6a4c_.log (deflated 51%) 2025-10-10T02:30:22.4110128Z adding: test/test-reports/dynamo.test_fx_annotate_1.1_fda785a6cccff9bd_.log (deflated 50%) 2025-10-10T02:30:22.4110906Z adding: test/test-reports/inductor.test_torchinductor_codegen_dynamic_shapes_1.2_3a27e00f8710b930_.log (deflated 53%) 2025-10-10T02:30:22.4111693Z adding: test/test-reports/inductor.test_control_deps_1.1_f992cb451f92b502_.log (deflated 51%) 2025-10-10T02:30:22.4112436Z adding: test/test-reports/export.test_export_training_ir_to_run_decomp_1.1_34018bf7d12e1028_.log (deflated 53%) 2025-10-10T02:30:22.4113157Z adding: test/test-reports/export.test_nativert_1.1_2b7aabf5b8a15e01_.log (deflated 50%) 2025-10-10T02:30:22.4113868Z adding: test/test-reports/inductor.test_indexing_1.1_928d9225831fb74b_.log (deflated 50%) 2025-10-10T02:30:22.4114527Z adding: test/test-reports/inductor.test_cpu_repro_1.1_4d201b196322547c_.log (deflated 50%) 2025-10-10T02:30:22.4115186Z adding: 
test/test-reports/inductor.test_minifier_1.1_21f38f15d7c8d51c_.log (deflated 50%) 2025-10-10T02:30:22.4115827Z adding: test/test-reports/test_license_1.1_c7db97813d50b504_.log (deflated 49%) 2025-10-10T02:30:22.4116442Z adding: test/test-reports/inductor.test_perf_1.1_f99cd6efcb248e75_.log (deflated 50%) 2025-10-10T02:30:22.4117068Z adding: test/test-reports/dynamo.test_export_1.1_81862530a67f5ce9_.log (deflated 50%) 2025-10-10T02:30:22.4117702Z adding: test/test-reports/inductor.test_pad_mm_1.1_8bd42738a715a2f4_.log (deflated 50%) 2025-10-10T02:30:22.4118383Z adding: test/test-reports/dynamo.test_graph_region_tracker_1.1_64b79f2ed725064b_.log (deflated 51%) 2025-10-10T02:30:22.4119127Z adding: test/test-reports/inductor.test_inductor_annotations_1.1_5e8539a8796a1261_.log (deflated 52%) 2025-10-10T02:30:22.4119844Z adding: test/test-reports/dynamo.test_error_messages_1.1_b759ce749620e66c_.log (deflated 51%) 2025-10-10T02:30:22.4120520Z adding: test/test-reports/inductor.test_ck_backend_1.1_a9a51f6c723d1e69_.log (deflated 50%) 2025-10-10T02:30:22.4121168Z adding: test/test-reports/dynamo.test_dicts_1.1_37a8d4e5c2055746_.log (deflated 49%) 2025-10-10T02:30:22.4121832Z adding: test/test-reports/inductor.test_inductor_utils_1.1_9e536dd58eb7f3fe_.log (deflated 51%) 2025-10-10T02:30:22.4122525Z adding: test/test-reports/inductor.test_fuzzer_1.1_ff0415bf194a3291_.log (deflated 50%) 2025-10-10T02:30:22.4123193Z adding: test/test-reports/inductor.test_op_completeness_1.1_dda0816a3578d294_.log (deflated 51%) 2025-10-10T02:30:22.4123859Z adding: test/test-reports/export.test_hop_1.1_4ba4ac4a2bfe39f3_.log (deflated 49%) 2025-10-10T02:30:22.4124513Z adding: test/test-reports/inductor.test_multi_kernel_1.1_0d87f3b493366d85_.log (deflated 51%) 2025-10-10T02:30:22.4125193Z adding: test/test-reports/dynamo.test_recompile_ux_1.1_9a97e2c3c96dcdb5_.log (deflated 50%) 2025-10-10T02:30:22.4125895Z adding: test/test-reports/inductor.test_autoheuristic_1.1_9740c07f37095603_.log (deflated 
50%) 2025-10-10T02:30:22.4126531Z adding: test/test-reports/test_decomp_1.16_bac203bb45e89c92_.log (deflated 48%) 2025-10-10T02:30:22.4127188Z adding: test/test-reports/export.test_serdes_1.1_38a488be1c228665_.log (deflated 50%) 2025-10-10T02:30:22.4127863Z adding: test/test-reports/dynamo.test_modules_1.1_91040889e26f427c_.log (deflated 50%) 2025-10-10T02:30:22.4128625Z adding: test/test-reports/dynamo.test_deque_reconstruct_1.1_c8d5fe5ab5327b72_.log (deflated 51%) 2025-10-10T02:30:22.4129331Z adding: test/test-reports/dynamo.test_metrics_context_1.1_d7a7ab83aa180c96_.log (deflated 51%) 2025-10-10T02:30:22.4130058Z adding: test/test-reports/inductor.test_cuda_select_algorithm_1.1_753ed3b429a73664_.log (deflated 52%) 2025-10-10T02:30:22.4130799Z adding: test/test-reports/dynamo.test_install_free_tensors_1.1_353e80a51212289c_.log (deflated 51%) 2025-10-10T02:30:22.4131502Z adding: test/test-reports/export.test_strict_export_v2_1.1_91b992c3521eba99_.log (deflated 51%) 2025-10-10T02:30:22.4132181Z adding: test/test-reports/inductor.test_mmdecomp_1.1_fee16a1a1a3043d0_.log (deflated 50%) 2025-10-10T02:30:22.4132865Z adding: test/test-reports/inductor.test_deterministic_1.1_58a0a0d408e8c482_.log (deflated 51%) 2025-10-10T02:30:22.4133562Z adding: test/test-reports/dynamo.test_bytecode_utils_1.1_61bea430f17802bb_.log (deflated 51%) 2025-10-10T02:30:22.4134264Z adding: test/test-reports/inductor.test_flex_decoding_1.1_f91f8e38d95283e7_.log (deflated 65%) 2025-10-10T02:30:22.4134973Z adding: test/test-reports/inductor.test_memory_planning_1.1_11d097bb98f41602_.log (deflated 51%) 2025-10-10T02:30:22.4135701Z adding: test/test-reports/export.test_unflatten_training_ir_1.1_a765755a6e7d8f4a_.log (deflated 51%) 2025-10-10T02:30:22.4136459Z adding: test/test-reports/inductor.test_ordered_set_1.1_8b17c9f23a4ad726_.log (deflated 51%) 2025-10-10T02:30:22.4137232Z adding: test/test-reports/inductor.test_aot_inductor_arrayref_1.1_5de7dff66346f4c6_.log (deflated 52%) 
2025-10-10T02:30:22.4137970Z adding: test/test-reports/dynamo.test_compiler_bisector_1.1_d811bb94219d5c18_.log (deflated 51%) 2025-10-10T02:30:22.4138690Z adding: test/test-reports/dynamo.test_fx_passes_pre_grad_1.1_853ee5fd75a2e1ff_.log (deflated 51%) 2025-10-10T02:30:22.4139378Z adding: test/test-reports/dynamo.test_aot_compile_1.1_707a03b1c52226d9_.log (deflated 50%) 2025-10-10T02:30:22.4152337Z adding: test/test-reports/inductor.test_aot_inductor_windows_1.1_292693d894285bbe_.log (deflated 52%) 2025-10-10T02:30:22.4153090Z adding: test/test-reports/dynamo.test_modes_1.1_97b8a2c162a9bae1_.log (deflated 57%) 2025-10-10T02:30:22.4153779Z adding: test/test-reports/inductor.test_compiled_autograd_1.2_c61db975b7b0ad5e_.log (deflated 51%) 2025-10-10T02:30:22.4154497Z adding: test/test-reports/export.test_pass_infra_1.1_524fe1187e997278_.log (deflated 50%) 2025-10-10T02:30:22.4155144Z adding: test/test-reports/inductor.test_metrics_1.1_368586a87e605888_.log (deflated 50%) 2025-10-10T02:30:22.4155806Z adding: test/test-reports/inductor.test_cutlass_evt_1.1_6014a37fda186289_.log (deflated 50%) 2025-10-10T02:30:22.4156521Z adding: test/test-reports/inductor.test_custom_post_grad_passes_1.1_6392b39a1bd376f3_.log (deflated 52%) 2025-10-10T02:30:22.4157270Z adding: test/test-reports/inductor.test_auto_functionalize_1.1_e064c39f6c122394_.log (deflated 51%) 2025-10-10T02:30:22.4158018Z adding: test/test-reports/inductor.test_aot_inductor_package_1.1_19fa980ee4863f37_.log (deflated 52%) 2025-10-10T02:30:22.4158712Z adding: test/test-reports/dynamo.test_profiler_1.1_0aeeb6348edbd2f6_.log (deflated 50%) 2025-10-10T02:30:22.4159409Z adding: test/test-reports/inductor.test_provenance_tracing_1.1_d9075b58c4ec6bbc_.log (deflated 51%) 2025-10-10T02:30:22.4160122Z adding: test/test-reports/dynamo.test_guard_manager_1.1_b259dedf9d0333a0_.log (deflated 51%) 2025-10-10T02:30:22.4160790Z adding: test/test-reports/inductor.test_fx_fusion_1.1_b67642a4a0de600d_.log (deflated 50%) 
2025-10-10T02:30:22.4161441Z adding: test/test-reports/dynamo.test_minifier_1.1_b7d446ff20d6291d_.log (deflated 50%) 2025-10-10T02:30:22.4162103Z adding: test/test-reports/inductor.test_loop_ordering_1.1_e7749674ccf3fa1e_.log (deflated 51%) 2025-10-10T02:30:22.4162803Z adding: test/test-reports/inductor.test_online_softmax_1.1_f92e4c9d93f8dff8_.log (deflated 51%) 2025-10-10T02:30:22.4163732Z adding: test/test-reports/export.test_functionalized_assertions_1.1_6b07771784c40230_.log (deflated 52%) 2025-10-10T02:30:22.4164439Z adding: test/test-reports/dynamo.test_global_1.1_55c8d690278f14ce_.log (deflated 49%) 2025-10-10T02:30:22.4165108Z adding: test/test-reports/inductor.test_segmented_tree_1.1_1db7f12bde08ea58_.log (deflated 51%) 2025-10-10T02:30:22.4165825Z adding: test/test-reports/inductor.test_inductor_freezing_1.1_af7fe6b4124975f9_.log (deflated 51%) 2025-10-10T02:30:22.4166626Z adding: test/test-reports/inductor.test_compiled_optimizers_1.1_2eda2e723a1555d4_.log (deflated 52%) 2025-10-10T02:30:22.4167444Z adding: test/test-reports/test_model_exports_to_core_aten_1.1_5eee626c1d181ea9_.log (deflated 51%) 2025-10-10T02:30:22.4168183Z adding: test/test-reports/inductor.test_decompose_mem_bound_mm_1.1_7dfe8ee45c6c8d04_.log (deflated 54%) 2025-10-10T02:30:22.4168895Z adding: test/test-reports/export.test_converter_1.1_65303ca9ac99b1a5_.log (deflated 50%) 2025-10-10T02:30:22.4169551Z adding: test/test-reports/dynamo.test_base_output_1.1_d74408fd8046e317_.log (deflated 50%) 2025-10-10T02:30:22.4170227Z adding: test/test-reports/export.test_experimental_1.1_6db50cb38a50aa59_.log (deflated 51%) 2025-10-10T02:30:22.4170892Z adding: test/test-reports/dynamo.test_backends_1.1_fd5e0c4be04e2890_.log (deflated 50%) 2025-10-10T02:30:22.4171610Z adding: test/test-reports/dynamo.test_model_output_1.1_8b0b27e549cfe40b_.log (deflated 50%) 2025-10-10T02:30:22.4172298Z adding: test/test-reports/dynamo.test_fx_graph_runnable_1.1_bd2f1fdee4aa4ab5_.log (deflated 51%) 
2025-10-10T02:30:22.4172965Z adding: test/test-reports/export.test_torchbind_1.1_29905478f8b2fa76_.log (deflated 50%) 2025-10-10T02:30:22.4173638Z adding: test/test-reports/inductor.test_compile_worker_1.1_696a6569d36d6303_.log (deflated 51%) 2025-10-10T02:30:22.4174315Z adding: test/test-reports/inductor.test_config_1.1_afa4ffd6000fd00a_.log (deflated 50%) 2025-10-10T02:30:22.4175035Z adding: test/test-reports/inductor.test_move_constructors_to_cuda_1.1_2230c6148ed3f6a3_.log (deflated 52%) 2025-10-10T02:30:22.4175792Z adding: test/test-reports/dynamo.test_nested_graph_breaks_1.1_4f9ce98bfa3c98ad_.log (deflated 51%) 2025-10-10T02:30:22.4176508Z adding: test/test-reports/inductor.test_subgraph_choice_1.1_c90408b39edde7b1_.log (deflated 51%) 2025-10-10T02:30:22.4177170Z adding: test/test-reports/dynamo.test_exc_1.1_2bc6ab6e3f64bc31_.log (deflated 49%) 2025-10-10T02:30:22.4177819Z adding: test/test-reports/export.test_export_strict_1.1_c706f27fec9b2874_.log (deflated 51%) 2025-10-10T02:30:22.4178474Z adding: test/test-reports/export.test_passes_1.1_cdd42d00b3e5d75e_.log (deflated 50%) 2025-10-10T02:30:22.4179156Z adding: test/test-reports/inductor.test_cutedsl_template_1.1_52ff6371b196bb76_.log (deflated 51%) 2025-10-10T02:30:22.4179846Z adding: test/test-reports/inductor.test_torchbind_1.1_12ea036a9ef561ae_.log (deflated 50%) 2025-10-10T02:30:22.4180537Z adding: test/test-reports/dynamo.test_inline_and_install_1.1_d898458fd215403d_.log (deflated 51%) 2025-10-10T02:30:22.4181209Z adding: test/test-reports/dynamo.test_hooks_1.1_733aca6bce54ec7d_.log (deflated 49%) 2025-10-10T02:30:22.4181845Z adding: test/test-reports/export.test_tree_utils_1.1_7f35ac8ca390c8f7_.log (deflated 50%) 2025-10-10T02:30:22.4182502Z adding: test/test-reports/dynamo.test_trace_rules_1.1_6bc15433e41e8ed1_.log (deflated 50%) 2025-10-10T02:30:22.4183154Z adding: test/test-reports/dynamo.test_recompiles_1.1_602bfd9804f296f2_.log (deflated 50%) 2025-10-10T02:30:22.4183772Z adding: 
test/test-reports/test_decomp_6.16_899ef757ec9ac3b7_.log (deflated 48%) 2025-10-10T02:30:22.4184401Z adding: test/test-reports/dynamo.test_einops_1.1_f716b5119420a8f3_.log (deflated 49%) 2025-10-10T02:30:22.4185046Z adding: test/test-reports/dynamo.test_exceptions_1.1_3d46ecb5ecb75b62_.log (deflated 50%) 2025-10-10T02:30:22.4185700Z adding: test/test-reports/inductor.test_foreach_1.1_a5086df914c27c80_.log (deflated 50%) 2025-10-10T02:30:22.4186470Z adding: test/test-reports/inductor.test_alignment_1.1_294af1c25d8ef692_.log (deflated 50%) 2025-10-10T02:30:22.4187164Z adding: test/test-reports/inductor.test_minifier_utils_1.1_bbab12902e300986_.log (deflated 51%) 2025-10-10T02:30:22.4187808Z adding: test/test-reports/test_decomp_7.16_b5d68edc381e6c1e_.log (deflated 48%) 2025-10-10T02:30:22.4188411Z adding: test/test-reports/dynamo.test_sdpa_1.1_1001b4ea05f1d684_.log (deflated 49%) 2025-10-10T02:30:22.4189032Z adding: test/test-reports/dynamo.test_sources_1.1_f0a2850bee915e22_.log (deflated 50%) 2025-10-10T02:30:22.4189726Z adding: test/test-reports/inductor.test_compile_subprocess_1.1_d4ac5c4eb9ab9c0c_.log (deflated 51%) 2025-10-10T02:30:22.4190406Z adding: test/test-reports/export.test_schema_1.1_66d8fa029e329358_.log (deflated 50%) 2025-10-10T02:30:22.4191046Z adding: test/test-reports/export.test_cpp_serdes_1.1_86a1441da7c7f0aa_.log (deflated 50%) 2025-10-10T02:30:22.4191685Z adding: test/test-reports/inductor.test_mps_basic_1.1_8119c91943ef6807_.log (stored 0%) 2025-10-10T02:30:22.4192335Z adding: test/test-reports/inductor.test_debug_trace_1.1_2e9e05d1909b21ee_.log (deflated 50%) 2025-10-10T02:30:22.4193003Z adding: test/test-reports/dynamo.test_subclasses_1.1_c98fc60eaeb1ce49_.log (deflated 50%) 2025-10-10T02:30:22.4193651Z adding: test/test-reports/inductor.test_memory_1.1_8a3ccfe8335f2942_.log (deflated 50%) 2025-10-10T02:30:22.4194325Z adding: test/test-reports/dynamo.test_repros_1.1_1309813200270afc_.log (deflated 51%) 2025-10-10T02:30:22.4194958Z adding: 
test/test-reports/dynamo.test_frame_init_1.1_a259ebe95855bee9_.log (deflated 50%) 2025-10-10T02:30:22.4195583Z adding: test/test-reports/dynamo.test_resume_1.1_e95342750787047f_.log (deflated 50%) 2025-10-10T02:30:22.4196265Z adding: test/test-reports/inductor.test_kernel_optimization_1.1_875b521802e1a907_.log (deflated 55%) 2025-10-10T02:30:22.4196974Z adding: test/test-reports/dynamo.test_reorder_logs_1.1_58b593a8ffd10ed0_.log (deflated 50%) 2025-10-10T02:30:22.4197665Z adding: test/test-reports/inductor.test_combo_kernels_1.1_6591cab9fa30e71f_.log (deflated 51%) 2025-10-10T02:30:22.4198550Z adding: test/test-reports/dynamo.test_debug_utils_1.1_149e5bb7f0d00d3c_.log (deflated 50%) 2025-10-10T02:30:22.4199281Z adding: test/test-reports/inductor.test_inplacing_pass_1.1_28ade21d154ae588_.log (deflated 55%) 2025-10-10T02:30:22.4199967Z adding: test/test-reports/dynamo.test_generator_1.1_7f94c085e85a6bdd_.log (deflated 50%) 2025-10-10T02:30:22.4200640Z adding: test/test-reports/dynamo.test_skip_non_tensor_1.1_c910ca93aaa343e7_.log (deflated 51%) 2025-10-10T02:30:22.4201325Z adding: test/test-reports/export.test_lift_unlift_1.1_3e93da37ecca511d_.log (deflated 50%) 2025-10-10T02:30:22.4202014Z adding: test/test-reports/inductor.test_op_dtype_prop_1.1_f48fbff8ae399887_.log (deflated 51%) 2025-10-10T02:30:22.4202701Z adding: test/test-reports/export.test_verifier_1.1_9a11d0dc143e42b0_.log (deflated 50%) 2025-10-10T02:30:22.4203379Z adding: test/test-reports/dynamo.test_reconstruct_1.1_bf0a71b14c1b6d65_.log (deflated 50%) 2025-10-10T02:30:22.4204055Z adding: test/test-reports/profiler.test_profiler_1.1_5bb9c96b84d7a8fd_.log (deflated 80%) 2025-10-10T02:30:22.4204737Z adding: test/test-reports/export.test_dynamic_shapes_1.1_2eca05b0a663d698_.log (deflated 51%) 2025-10-10T02:30:22.4205393Z adding: test/test-reports/dynamo.test_misc_1.1_0fcf8f36cf60d109_.log (deflated 49%) 2025-10-10T02:30:22.4206043Z adding: 
test/test-reports/inductor.test_remote_cache_1.1_7fb455d90679ab6b_.log (deflated 53%) 2025-10-10T02:30:22.4206687Z adding: test/test-reports/test_decomp_10.16_ed0b201de6866abe_.log (deflated 48%) 2025-10-10T02:30:22.4207351Z adding: test/test-reports/dynamo.test_interop_1.1_7d4750a428fbe66e_.log (deflated 50%) 2025-10-10T02:30:22.4208007Z adding: test/test-reports/export.test_draft_export_1.1_ddf78549364c320a_.log (deflated 51%) 2025-10-10T02:30:22.4208785Z adding: test/test-reports/inductor.test_device_assert_1.1_5f5c8031fe7fd613_.log (deflated 51%) 2025-10-10T02:30:22.4209525Z adding: test/test-reports/test_decomp_15.16_c7290e8d28a49368_.log (deflated 48%) 2025-10-10T02:30:22.4210133Z adding: test/test-reports/inductor.test_smoke_1.1_36c7ed0ea59e0c17_.log (stored 0%) 2025-10-10T02:30:22.4210758Z adding: test/test-reports/export.test_swap_1.1_eae475bae73f52d8_.log (deflated 49%) 2025-10-10T02:30:22.4211442Z adding: test/test-reports/dynamo.test_skip_guard_eval_unsafe_1.1_5ad321e463e6138c_.log (deflated 51%) 2025-10-10T02:30:22.4212123Z adding: test/test-reports/test_decomp_16.16_2d7cab34934f216c_.log (deflated 48%) 2025-10-10T02:30:22.4212728Z adding: test/test-reports/export.test_tools_1.1_674db0ccef8f5279_.log (deflated 50%) 2025-10-10T02:30:22.4213409Z adding: test/test-reports/dynamo.test_aot_autograd_cache_1.1_c9fc592fb1e9cfca_.log (deflated 51%) 2025-10-10T02:30:22.4214127Z adding: test/test-reports/inductor.test_gpu_cpp_wrapper_1.1_a5d4f1de56f24715_.log (deflated 51%) 2025-10-10T02:30:22.4214843Z adding: test/test-reports/inductor.test_helion_kernels_1.1_50b9ef0405d03e5b_.log (deflated 51%) 2025-10-10T02:30:22.4215602Z adding: test/test-reports/export.test_export_with_inline_and_install_1.1_34da667e96b6ae57_.log (deflated 53%) 2025-10-10T02:30:22.4216326Z adding: test/test-reports/export.test_sparse_1.1_c298a1c849959967_.log (deflated 50%) 2025-10-10T02:30:22.4216974Z adding: test/test-reports/export.test_serialize_1.1_f9d33d6cf6abfd6d_.log (deflated 50%) 
2025-10-10T02:30:22.4217691Z adding: test/test-reports/dynamo.test_comptime_1.1_8834e9fe44cb9fd1_.log (deflated 50%) 2025-10-10T02:30:22.4218338Z adding: test/test-reports/dynamo.test_functions_1.1_4f466802dc375aa7_.log (deflated 50%) 2025-10-10T02:30:22.4219017Z adding: test/test-reports/functorch.test_rearrange_1.1_032e07978013a7d3_.log (deflated 50%) 2025-10-10T02:30:22.4219712Z adding: test/test-reports/inductor.test_benchmarking_1.1_fffb1721d5aea822_.log (deflated 51%) 2025-10-10T02:30:22.4220417Z adding: test/test-reports/functorch.test_parsing_1.1_4bf25a31d01113af_.log (deflated 50%) 2025-10-10T02:30:22.4221116Z adding: test/test-reports/inductor.test_quantization_1.1_e36eaddf111f1674_.log (deflated 50%) 2025-10-10T02:30:22.4221835Z adding: test/test-reports/inductor.test_aot_inductor_utils_1.1_bbc230082af08e51_.log (deflated 51%) 2025-10-10T02:30:22.4222582Z adding: test/test-reports/inductor.test_aot_inductor_custom_ops_1.1_a1b0b9e83de4eaa8_.log (deflated 52%) 2025-10-10T02:30:22.4223323Z adding: test/test-reports/inductor.test_binary_folding_1.1_50a87f671e977c21_.log (deflated 51%) 2025-10-10T02:30:22.4224058Z adding: test/test-reports/inductor.test_scatter_optimization_1.1_4f5fbb2443ce81e7_.log (deflated 52%) 2025-10-10T02:30:22.4224766Z adding: test/test-reports/dynamo.test_base_hop_1.1_c59e586c1ba8f88d_.log (deflated 50%) 2025-10-10T02:30:22.4225459Z adding: test/test-reports/inductor.test_group_batch_fusion_1.1_617d3421dd6ff8fc_.log (deflated 73%) 2025-10-10T02:30:22.4226127Z adding: test/test-reports/dynamo.test_list_1.1_565a77db26f170c2_.log (deflated 49%) 2025-10-10T02:30:22.4226811Z adding: test/test-reports/inductor.test_split_cat_fx_passes_1.1_432d448a30c72404_.log (deflated 51%) 2025-10-10T02:30:22.4227483Z adding: test/test-reports/xpu.test_conv_1.1_ddda2c623f8f88c8_.log (deflated 48%) 2025-10-10T02:30:22.4228092Z adding: test/test-reports/dynamo.test_view_1.1_08f3df60b96c4425_.log (deflated 49%) 2025-10-10T02:30:22.4228769Z adding: 
test/test-reports/dynamo.test_autograd_function_1.1_cca3560f02bb7ebc_.log (deflated 51%) 2025-10-10T02:30:22.4229445Z adding: test/test-reports/functorch.test_ops_2.2_b412cbf486341313_.log (deflated 49%) 2025-10-10T02:30:22.4230074Z adding: test/test-reports/dynamo.test_nops_1.1_8a38f8b5813cdc19_.log (deflated 49%) 2025-10-10T02:30:22.4230697Z adding: test/test-reports/test_datapipe_1.1_31e17acc5475cd43_.log (deflated 49%) 2025-10-10T02:30:22.4231304Z adding: test/test-reports/dynamo.test_config_1.1_7222ccd924774084_.log (deflated 50%) 2025-10-10T02:30:22.4231974Z adding: test/test-reports/test_package_1.1_aab5f1fc1d9ca256_.log (deflated 56%) 2025-10-10T02:30:22.4232693Z adding: test/test-reports/inductor.test_control_flow_1.1_68ec3f826aff6336_.log (deflated 51%) 2025-10-10T02:30:22.4233355Z adding: test/test-reports/test_numa_binding_1.1_233808eeddf7a589_.log (deflated 49%) 2025-10-10T02:30:22.4233977Z adding: test/test-reports/export.test_db_1.1_d0b2865c718ec5bb_.log (deflated 49%) 2025-10-10T02:30:22.4234615Z adding: test/test-reports/export.test_unflatten_1.1_3699c48a5529a289_.log (deflated 50%) 2025-10-10T02:30:22.4235320Z adding: test/test-reports/inductor.test_unbacked_symints_1.1_64bff71131014153_.log (deflated 51%) 2025-10-10T02:30:22.4236053Z adding: test/test-reports/inductor.test_needs_exact_strides_1.1_e12db4474a6455f2_.log (deflated 51%) 2025-10-10T02:30:22.4236785Z adding: test/test-reports/inductor.test_fused_attention_1.1_9fa22b5ab214ad5c_.log (deflated 51%) 2025-10-10T02:30:22.4237560Z adding: test/test-reports/dynamo.test_verify_correctness_1.1_602a54c0909bb888_.log (deflated 51%) 2025-10-10T02:30:22.4238275Z adding: test/test-reports/dynamo.test_export_mutations_1.1_24946587e6c48654_.log (deflated 51%) 2025-10-10T02:30:22.4243315Z adding: test/test-reports/test_testing_1.1_34083e212ec9e153_.log (deflated 96%) 2025-10-10T02:30:22.4244005Z adding: test/test-reports/inductor.test_graph_transform_observer_1.1_5cc53b82a0adea3d_.log (deflated 52%) 
2025-10-10T02:30:22.4244874Z adding: test/test-reports/inductor.test_split_cat_fx_aten_passes_1.1_995a123916a0dee9_.log (deflated 72%) 2025-10-10T02:30:22.4245640Z adding: test/test-reports/dynamo.test_activation_checkpointing_1.1_33e757f29d78aa3e_.log (deflated 52%) 2025-10-10T02:30:22.4246485Z adding: test/test-reports/inductor.test_torchinductor_codegen_config_overrides_1.1_96747b8b0bb4f6b1_.log (deflated 54%) 2025-10-10T02:30:22.4247366Z adding: test/test-reports/dynamo.test_backward_higher_order_ops_1.1_1ee262ea4923d2bc_.log (deflated 52%) 2025-10-10T02:30:22.4248128Z adding: test/test-reports/inductor.test_custom_partitioner_fn_1.1_32e3a895042458ce_.log (deflated 52%) 2025-10-10T02:30:22.4248868Z adding: test/test-reports/inductor.test_minifier_isolate_1.1_7ae75bb0ee2fb4b8_.log (deflated 51%) 2025-10-10T02:30:22.4249607Z adding: test/test-reports/inductor.test_augmented_graph_helper_1.1_83972a4b6596f628_.log (deflated 52%) 2025-10-10T02:30:22.4250350Z adding: test/test-reports/dynamo.test_python_dispatcher_1.1_442c1b7022abc30b_.log (deflated 51%) 2025-10-10T02:30:22.4251061Z adding: test/test-reports/dynamo.test_higher_order_ops_1.1_fde87208f5bc3db9_.log (deflated 51%) 2025-10-10T02:30:22.4251774Z adding: test/test-reports/dynamo.test_graph_deduplication_1.1_ac4062b893bd7e24_.log (deflated 51%) 2025-10-10T02:30:22.4252540Z adding: test/test-reports/dynamo.test_cudagraphs_expandable_segments_1.1_97154a2c7fe6c4f3_.log (deflated 52%) 2025-10-10T02:30:22.4253318Z adding: test/test-reports/dynamo.test_precompile_context_1.1_fca9df25493b29ac_.log (deflated 51%) 2025-10-10T02:30:22.4254050Z adding: test/test-reports/dynamo.test_input_attr_tracking_1.1_3e484bca3e8d2089_.log (deflated 51%) 2025-10-10T02:30:22.4254854Z adding: test/test-reports/inductor.test_cudagraph_trees_expandable_segments_1.1_a7072ab6f4d3014d_.log (deflated 53%) 2025-10-10T02:30:22.4255641Z adding: test/test-reports/dynamo.test_python_autograd_1.1_6398ff528c26342f_.log (deflated 51%) 
2025-10-10T02:30:22.4256302Z adding: test/test-reports/test_comparison_utils_1.1_711c3b9129d799fc_.log (deflated 50%) 2025-10-10T02:30:22.4256973Z adding: test/test-reports/functorch.test_ac_logging_1.1_2db252e11b8c73ef_.log (deflated 50%) 2025-10-10T02:30:22.4257623Z adding: test/test-reports/test_mkldnn_verbose_1.1_6666ce1ef0660778_.log (deflated 49%) 2025-10-10T02:30:22.4258258Z adding: test/test-reports/profiler.test_kineto_1.1_7c2d1ad2b6df02a9_.log (deflated 50%) 2025-10-10T02:30:22.4258916Z adding: test/test-reports/test_utils_config_module_1.1_0d5baaa873f9c4f8_.log (deflated 50%) 2025-10-10T02:30:22.4259602Z adding: test/test-reports/lazy.test_generator_1.1_467765b11eb670d9_.log (deflated 50%) 2025-10-10T02:30:22.4260372Z adding: test/test-reports/torch_np.numpy_tests.lib.test_type_check_1.1_c92581c692d414a3_.log (deflated 53%) 2025-10-10T02:30:22.4261074Z adding: test/test-reports/lazy.test_debug_util_1.1_a82fed25b63a5da1_.log (deflated 50%) 2025-10-10T02:30:22.4261699Z adding: test/test-reports/test_jit_llga_fuser_1.1_9952245cc515fc15_.log (deflated 50%) 2025-10-10T02:30:22.4262392Z adding: test/test-reports/torch_np.numpy_tests.lib.test_histograms_1.1_1622275eb060a396_.log (deflated 52%) 2025-10-10T02:30:22.4263153Z adding: test/test-reports/benchmark_utils.test_benchmark_utils_1.1_b626e7c59322f5dc_.log (deflated 52%) 2025-10-10T02:30:22.4263920Z adding: test/test-reports/torch_np.numpy_tests.core.test_scalarmath_1.1_661233fe8cec6b8b_.log (deflated 53%) 2025-10-10T02:30:22.4264599Z adding: test/test-reports/test_indexing_1.1_5dac5f241f91bc60_.log (deflated 49%) 2025-10-10T02:30:22.4265227Z adding: test/test-reports/profiler.test_torch_tidy_1.1_bad9600caf6d79ab_.log (deflated 51%) 2025-10-10T02:30:22.4265870Z adding: test/test-reports/nn.test_module_hooks_1.1_6e7195e2b954f228_.log (deflated 50%) 2025-10-10T02:30:22.4266524Z adding: test/test-reports/functorch.test_aotdispatch_1.1_f0abdf06aa543a1b_.log (deflated 52%) 2025-10-10T02:30:22.4267205Z adding: 
test/test-reports/nn.test_load_state_dict_1.1_0123cd304723ba25_.log (deflated 50%) 2025-10-10T02:30:22.4267961Z adding: test/test-reports/torch_np.numpy_tests.linalg.test_linalg_1.1_301a27dd55a2ff4c_.log (deflated 52%) 2025-10-10T02:30:22.4268639Z adding: test/test-reports/test_shape_ops_1.1_86b9c16c33599f91_.log (deflated 49%) 2025-10-10T02:30:22.4269316Z adding: test/test-reports/torch_np.numpy_tests.core.test_shape_base_1.1_409da463f80a2b72_.log (deflated 53%) 2025-10-10T02:30:22.4270084Z adding: test/test-reports/torch_np.numpy_tests.core.test_dtype_1.1_64c41a8cf03300ff_.log (deflated 52%) 2025-10-10T02:30:22.4270761Z adding: test/test-reports/test_unary_ufuncs_1.1_9717922c87452941_.log (deflated 50%) 2025-10-10T02:30:22.4271370Z adding: test/test-reports/optim.test_optim_1.1_7ce4bd7a6b4c88ff_.log (deflated 7%) 2025-10-10T02:30:22.4271964Z adding: test/test-reports/test_sparse_csr_1.2_b189d890dc16c7dd_.log (deflated 49%) 2025-10-10T02:30:22.4272569Z adding: test/test-reports/test_serialization_1.1_8385150271c87539_.log (deflated 65%) 2025-10-10T02:30:22.4273265Z adding: test/test-reports/torch_np.numpy_tests.lib.test_twodim_base_1.1_803ee025fe5c7886_.log (deflated 53%) 2025-10-10T02:30:22.4273980Z adding: test/test-reports/test_function_schema_1.1_6d9807b902f2a45b_.log (deflated 50%) 2025-10-10T02:30:22.4274613Z adding: test/test-reports/functorch.test_vmap_1.1_13cbcd2562c0b9de_.log (deflated 50%) 2025-10-10T02:30:22.4275307Z adding: test/test-reports/torch_np.numpy_tests.lib.test_shape_base__1.1_19a060713c4e9774_.log (deflated 52%) 2025-10-10T02:30:22.4276075Z adding: test/test-reports/torch_np.numpy_tests.fft.test_pocketfft_1.1_7a3eb9285227325a_.log (deflated 52%) 2025-10-10T02:30:22.4276788Z adding: test/test-reports/test_scatter_gather_ops_1.1_c34a081d5539f2ac_.log (deflated 50%) 2025-10-10T02:30:22.4277466Z adding: test/test-reports/torch_np.test_ndarray_methods_1.1_7fa620bd1dd39880_.log (deflated 51%) 2025-10-10T02:30:22.4278108Z adding: 
test/test-reports/test_view_ops_1.1_2f352a06f209e475_.log (deflated 49%) 2025-10-10T02:30:22.4278785Z adding: test/test-reports/torch_np.numpy_tests.core.test_dlpack_1.1_63140e98a93e5810_.log (deflated 52%) 2025-10-10T02:30:22.4279548Z adding: test/test-reports/torch_np.numpy_tests.core.test_getlimits_1.1_ed6e61d089606395_.log (deflated 53%) 2025-10-10T02:30:22.4280237Z adding: test/test-reports/test_accelerator_1.1_54ab8aa03a1c43a1_.log (deflated 49%) 2025-10-10T02:30:22.4280851Z adding: test/test-reports/lazy.test_reuse_ir_1.1_99fff47802e98d89_.log (deflated 50%) 2025-10-10T02:30:22.4281595Z adding: test/test-reports/torch_np.numpy_tests.lib.test_index_tricks_1.1_73c04ebe9ceec85c_.log (deflated 53%) 2025-10-10T02:30:22.4282371Z adding: test/test-reports/nn.test_init_1.1_b542e5b73dcd92c9_.log (deflated 49%) 2025-10-10T02:30:22.4283065Z adding: test/test-reports/torch_np.numpy_tests.core.test_numerictypes_1.1_d08cc253be27ed78_.log (deflated 53%) 2025-10-10T02:30:22.4283774Z adding: test/test-reports/test_type_promotion_1.1_18dcd77cea18637f_.log (deflated 50%) 2025-10-10T02:30:22.4284495Z adding: test/test-reports/torch_np.numpy_tests.core.test_scalar_methods_1.1_ab42ef4cbf8efd10_.log (deflated 53%) 2025-10-10T02:30:22.4285274Z adding: test/test-reports/torch_np.numpy_tests.fft.test_helper_1.1_ac41356c39097f3f_.log (deflated 52%) 2025-10-10T02:30:22.4285979Z adding: test/test-reports/torch_np.test_function_base_1.1_f5ab374d2797df7d_.log (deflated 51%) 2025-10-10T02:30:22.4286662Z adding: test/test-reports/profiler.test_profiler_tree_1.1_235d4b965cf216e8_.log (deflated 51%) 2025-10-10T02:30:22.4287420Z adding: test/test-reports/functorch.test_eager_transforms_1.1_5fba5476cfdee925_.log (deflated 51%) 2025-10-10T02:30:22.4288077Z adding: test/test-reports/test_sparse_1.1_4deaf8f100a8d8d0_.log (deflated 49%) 2025-10-10T02:30:22.4288682Z adding: test/test-reports/inductor.test_config_1.1_948d16f61fbab907_.log (deflated 79%) 2025-10-10T02:30:22.4289346Z adding: 
test/test-reports/inductor.test_dependencies_1.1_9d5bdfe4dc267a99_.log (deflated 69%) 2025-10-10T02:30:22.4290072Z adding: test/test-reports/test_torchfuzz_repros_1.1_e455325673a5a69c_.log (deflated 86%) 2025-10-10T02:30:22.4290696Z adding: test/test-reports/export.test_hop_1.1_44ab375d262f0764_.log (deflated 90%) 2025-10-10T02:30:22.4291293Z adding: test/test-reports/test_opaque_obj_1.1_83a0045b8ec005b5_.log (deflated 92%) 2025-10-10T02:30:22.4291908Z adding: test/test-reports/test_public_bindings_1.1_b11c263fd0c85ed8_.log (deflated 64%) 2025-10-10T02:30:22.4302053Z adding: test/test-reports/inductor.test_op_dtype_prop_1.1_e430cf9b6161cf23_.log (deflated 96%) 2025-10-10T02:30:22.4306896Z adding: test/test-reports/dynamo.test_export_1.1_015a8a0a286ca800_.log (deflated 91%) 2025-10-10T02:30:22.4932453Z adding: test/test-reports/test_ops_1.1_70958335944aa2bf_.log (deflated 96%) 2025-10-10T02:30:22.4933626Z adding: test/test-reports/inductor.test_benchmarking_1.1_77bdfa0c43db741c_.log (deflated 82%) 2025-10-10T02:30:22.4958162Z adding: test/test-reports/inductor.test_aot_inductor_1.1_3445a10b048e315b_.log (deflated 94%) 2025-10-10T02:30:22.4959535Z adding: test/test-reports/inductor.test_quantization_1.1_0ffd05ffbea88812_.log (deflated 59%) 2025-10-10T02:30:22.4968441Z adding: test/test-reports/inductor.test_torchinductor_opinfo_5.11_2db6fc2ad4709efb_.log (deflated 93%) 2025-10-10T02:30:22.4969168Z adding: test/test-reports/inductor.test_control_deps_1.1_d2340aff53f38c07_.log (deflated 52%) 2025-10-10T02:30:22.4991301Z adding: test/test-reports/inductor.test_torchinductor_1.1_ccc835528304573b_.log (deflated 93%) 2025-10-10T02:30:22.4992173Z adding: test/test-reports/inductor.test_group_batch_fusion_1.1_25811d4046ce28f8_.log (deflated 82%) 2025-10-10T02:30:22.5003187Z adding: test/test-reports/inductor.test_torchinductor_opinfo_2.11_d749cdb92972a508_.log (deflated 94%) 2025-10-10T02:30:22.5025456Z adding: 
test/test-reports/inductor.test_compile_subprocess_1.1_7f685742ec8c765f_.log (deflated 94%) 2025-10-10T02:30:22.5034365Z adding: test/test-reports/inductor.test_torchinductor_opinfo_8.11_eee6125272166a05_.log (deflated 94%) 2025-10-10T02:30:22.5035208Z adding: test/test-reports/dynamo.test_fx_annotate_1.1_107d311c25f70a2b_.log (deflated 65%) 2025-10-10T02:30:22.5036054Z adding: test/test-reports/inductor.test_static_cuda_launcher_1.1_fecdd3b2b7645033_.log (deflated 83%) 2025-10-10T02:30:22.5036940Z adding: test/test-reports/dynamo.test_skip_guard_eval_unsafe_1.1_c4c5134a33f1089a_.log (deflated 69%) 2025-10-10T02:30:22.5041382Z adding: test/test-reports/inductor.test_cooperative_reductions_1.1_a58b03e9f2224605_.log (deflated 95%) 2025-10-10T02:30:22.5042206Z adding: test/test-reports/dynamo.test_unittest_1.1_f48ae0c7ef7bfb97_.log (deflated 51%) 2025-10-10T02:30:22.5042971Z adding: test/test-reports/inductor.test_async_compile_1.1_9a7f52f541016830_.log (deflated 75%) 2025-10-10T02:30:22.5043625Z adding: test/test-reports/dynamo.test_view_1.1_94455d928243c478_.log (deflated 69%) 2025-10-10T02:30:22.5044303Z adding: test/test-reports/inductor.test_kernel_benchmark_1.1_0019b7cb7ae5a001_.log (deflated 82%) 2025-10-10T02:30:22.5055483Z adding: test/test-reports/inductor.test_torchinductor_opinfo_11.11_5fb5715d736a3c9c_.log (deflated 94%) 2025-10-10T02:30:22.5056341Z adding: test/test-reports/dynamo.test_reconstruct_1.1_e3846bbbc7021b42_.log (deflated 81%) 2025-10-10T02:30:22.5057164Z adding: test/test-reports/inductor.test_fuzzer_1.1_2aab71feeee138a2_.log (deflated 78%) 2025-10-10T02:30:22.5059591Z adding: test/test-reports/inductor.test_cuda_repro_1.1_52865d9256534d61_.log (deflated 89%) 2025-10-10T02:30:22.5060255Z adding: test/test-reports/dynamo.test_callback_1.1_a27e80d13cd2d8fa_.log (deflated 63%) 2025-10-10T02:30:22.5060925Z adding: test/test-reports/dynamo.test_skip_non_tensor_1.1_e54bcb5cd15b165c_.log (deflated 74%) 2025-10-10T02:30:22.5077747Z adding: 
test/test-reports/inductor.test_cache_1.1_6802b59f2684bb57_.log (deflated 96%) 2025-10-10T02:30:22.5084656Z adding: test/test-reports/inductor.test_fp8_1.1_c75c2f35def64f67_.log (deflated 95%) 2025-10-10T02:30:22.5088104Z adding: test/test-reports/dynamo.test_modules_1.1_de1820bf1c4fdb49_.log (deflated 91%) 2025-10-10T02:30:22.5089074Z adding: test/test-reports/inductor.test_analysis_1.1_105daf17aeb27a7b_.log (deflated 87%) 2025-10-10T02:30:22.5089959Z adding: test/test-reports/export.test_export_opinfo_1.1_6357e7fe4ed14a50_.log (deflated 78%) 2025-10-10T02:30:22.5090889Z adding: test/test-reports/inductor.test_triton_syntax_1.1_96dd9f0f5728fc7a_.log (deflated 51%) 2025-10-10T02:30:22.5099140Z adding: test/test-reports/inductor.test_gpu_cpp_wrapper_1.1_c98ea84e751491e2_.log (deflated 95%) 2025-10-10T02:30:22.5100050Z adding: test/test-reports/inductor.test_triton_extension_backend_1.1_3354f39e07ca9e6b_.log (deflated 52%) 2025-10-10T02:30:22.5102765Z adding: test/test-reports/dynamo.test_sets_1.1_9a443ccf099885fe_.log (deflated 91%) 2025-10-10T02:30:22.5103525Z adding: test/test-reports/inductor.test_utils_1.1_4cc7b136308b4869_.log (deflated 71%) 2025-10-10T02:30:22.5104323Z adding: test/test-reports/inductor.test_scatter_optimization_1.1_04a138b69f0c8004_.log (deflated 75%) 2025-10-10T02:30:22.5105203Z adding: test/test-reports/inductor.test_coordinate_descent_tuner_1.1_6a2934ec14281cf7_.log (deflated 70%) 2025-10-10T02:30:22.5107322Z adding: test/test-reports/dynamo.test_decorators_1.1_b780ec96ddf1bf8a_.log (deflated 89%) 2025-10-10T02:30:22.5108141Z adding: test/test-reports/inductor.test_inplace_padding_1.1_55b2e80ffac4002c_.log (deflated 71%) 2025-10-10T02:30:22.5108910Z adding: test/test-reports/export.test_tools_1.1_8d0107df3e998ceb_.log (deflated 56%) 2025-10-10T02:30:22.5109734Z adding: test/test-reports/inductor.test_template_heuristics_registry_1.1_61d7bb43414b2b23_.log (deflated 71%) 2025-10-10T02:30:22.5110687Z adding: 
test/test-reports/dynamo.test_subgraphs_1.1_1b4c6992c9e6c62e_.log (deflated 87%) 2025-10-10T02:30:22.5111808Z adding: test/test-reports/inductor.test_select_algorithm_1.1_c9f1f2b80a4ec2ff_.log (deflated 85%) 2025-10-10T02:30:22.5112614Z adding: test/test-reports/dynamo.test_pre_dispatch_1.1_5c1414542ae058c3_.log (deflated 61%) 2025-10-10T02:30:22.5113352Z adding: test/test-reports/inductor.test_extension_backend_1.1_0691e68051c267cd_.log (deflated 57%) 2025-10-10T02:30:22.5116551Z adding: test/test-reports/dynamo.test_ctx_manager_1.1_d120947205c9e8cc_.log (deflated 90%) 2025-10-10T02:30:22.5117423Z adding: test/test-reports/inductor.test_inductor_scheduler_1.1_5969c9d2b44781db_.log (deflated 70%) 2025-10-10T02:30:22.5118207Z adding: test/test-reports/inductor.test_compile_1.1_fa8e60346a4a10c2_.log (deflated 76%) 2025-10-10T02:30:22.5119767Z adding: test/test-reports/inductor.test_padding_1.1_540e9d8848716689_.log (deflated 90%) 2025-10-10T02:30:22.5121158Z adding: test/test-reports/inductor.test_online_softmax_1.1_796aa6511e0a418c_.log (deflated 88%) 2025-10-10T02:30:22.5121978Z adding: test/test-reports/inductor.test_codegen_triton_1.1_785ab94d7e55a2f2_.log (deflated 51%) 2025-10-10T02:30:22.5122742Z adding: test/test-reports/dynamo.test_interop_1.1_17fde887b302c825_.log (deflated 65%) 2025-10-10T02:30:22.5149190Z adding: test/test-reports/inductor.test_torchinductor_codegen_dynamic_shapes_1.2_7641e235d1e3f55b_.log (deflated 95%) 2025-10-10T02:30:22.5150075Z adding: test/test-reports/inductor.test_device_assert_1.1_f5dbfde030818860_.log (deflated 78%) 2025-10-10T02:30:22.5180627Z adding: test/test-reports/export.test_export_training_ir_to_run_decomp_1.1_a47bfa21cc01caa5_.log (deflated 95%) 2025-10-10T02:30:22.5183519Z adding: test/test-reports/dynamo.test_dicts_1.1_4854fe43cf57b271_.log (deflated 91%) 2025-10-10T02:30:22.5184339Z adding: test/test-reports/inductor.test_indexing_1.1_4a79f373842475ab_.log (deflated 84%) 2025-10-10T02:30:22.5192631Z adding: 
test/test-reports/inductor.test_ordered_set_1.1_381dadbf74517f24_.log (deflated 92%) 2025-10-10T02:30:22.5193390Z adding: test/test-reports/inductor.test_minifier_1.1_4ac24adc1396a082_.log (deflated 80%) 2025-10-10T02:30:22.5195248Z adding: test/test-reports/dynamo.test_unspec_1.1_36fcfd22ed8d330c_.log (deflated 87%) 2025-10-10T02:30:22.5197169Z adding: test/test-reports/inductor.test_perf_1.1_a44bacdf904975be_.log (deflated 89%) 2025-10-10T02:30:22.5197816Z adding: test/test-reports/dynamo.test_global_1.1_43e046d083585564_.log (deflated 78%) 2025-10-10T02:30:22.5198780Z adding: test/test-reports/inductor.test_pad_mm_1.1_681b4dfd3c846ba9_.log (deflated 81%) 2025-10-10T02:30:22.5200389Z adding: test/test-reports/dynamo.test_pgo_1.1_f31f1f6b2207dd56_.log (deflated 73%) 2025-10-10T02:30:22.5201085Z adding: test/test-reports/inductor.test_inductor_annotations_1.1_fc532dfee78e7c3b_.log (deflated 60%) 2025-10-10T02:30:22.5201808Z adding: test/test-reports/dynamo.test_aot_compile_1.1_724050367795ef40_.log (deflated 78%) 2025-10-10T02:30:22.5203201Z adding: test/test-reports/inductor.test_ck_backend_1.1_12aa5978a80d628f_.log (deflated 90%) 2025-10-10T02:30:22.5203973Z adding: test/test-reports/inductor.test_cutlass_evt_1.1_35b013441ff83e6b_.log (deflated 69%) 2025-10-10T02:30:22.5204674Z adding: test/test-reports/inductor.test_inductor_utils_1.1_767b338c1b49423d_.log (deflated 57%) 2025-10-10T02:30:22.5205577Z adding: test/test-reports/dynamo.test_buffers_override_1.1_83626aa4367f16b7_.log (deflated 58%) 2025-10-10T02:30:22.5206387Z adding: test/test-reports/inductor.test_op_completeness_1.1_f2ce10d138763311_.log (deflated 68%) 2025-10-10T02:30:22.5207172Z adding: test/test-reports/dynamo.test_model_output_1.1_ed271d65aed54ae0_.log (deflated 79%) 2025-10-10T02:30:22.5208175Z adding: test/test-reports/inductor.test_multi_kernel_1.1_bd204fef91aabefa_.log (deflated 83%) 2025-10-10T02:30:22.5209433Z adding: test/test-reports/dynamo.test_modes_1.1_4dd8be0f57b8efb5_.log 
(deflated 80%) 2025-10-10T02:30:22.5210207Z adding: test/test-reports/inductor.test_autoheuristic_1.1_5f05e505b6ae1521_.log (deflated 50%) 2025-10-10T02:30:22.5210962Z adding: test/test-reports/export.test_schema_1.1_97e06027c4d60131_.log (deflated 66%) 2025-10-10T02:30:22.5238557Z adding: test/test-reports/export.test_serdes_1.1_2fd2a67c3686ad62_.log (deflated 93%) 2025-10-10T02:30:22.5243428Z adding: test/test-reports/inductor.test_cudagraph_trees_1.1_225a38602cd2e7c5_.log (deflated 92%) 2025-10-10T02:30:22.5244360Z adding: test/test-reports/dynamo.test_deque_reconstruct_1.1_f31cab565b18dfe5_.log (deflated 64%) 2025-10-10T02:30:22.5245060Z adding: test/test-reports/test_model_exports_to_core_aten_1.1_fc7755e53b31eadc_.log (deflated 52%) 2025-10-10T02:30:22.5246555Z adding: test/test-reports/inductor.test_cuda_select_algorithm_1.1_8d0470f03f49756e_.log (deflated 94%) 2025-10-10T02:30:22.5247730Z adding: test/test-reports/inductor.test_helion_kernels_1.1_8542007f207ba381_.log (deflated 57%) 2025-10-10T02:30:22.5261792Z adding: test/test-reports/export.test_strict_export_v2_1.1_31b0726b944d3506_.log (deflated 93%) 2025-10-10T02:30:22.5262597Z adding: test/test-reports/dynamo.test_profiler_1.1_cb36058f70d351c9_.log (deflated 77%) 2025-10-10T02:30:22.5263374Z adding: test/test-reports/inductor.test_deterministic_1.1_d504b2f8497fdcf1_.log (deflated 74%) 2025-10-10T02:30:22.5284051Z adding: test/test-reports/export.test_torchbind_1.1_85b5c57de673f6fb_.log (deflated 96%) 2025-10-10T02:30:22.5298850Z adding: test/test-reports/inductor.test_flex_decoding_1.1_014e239ef74bb15f_.log (deflated 96%) 2025-10-10T02:30:22.5299705Z adding: test/test-reports/inductor.test_aot_inductor_utils_1.1_07176475e9629f9a_.log (deflated 51%) 2025-10-10T02:30:22.5300818Z adding: test/test-reports/export.test_unflatten_training_ir_1.1_db15d3e124fd5785_.log (deflated 89%) 2025-10-10T02:30:22.5301617Z adding: test/test-reports/export.test_package_1.1_28253410963a8533_.log (deflated 62%) 
2025-10-10T02:30:22.5327677Z adding: test/test-reports/inductor.test_torchinductor_dynamic_shapes_1.2_57e050d15a0ef298_.log (deflated 95%) 2025-10-10T02:30:22.5328635Z adding: test/test-reports/inductor.test_block_analysis_1.1_9e68007ecbb52a8b_.log (deflated 78%) 2025-10-10T02:30:22.5329373Z adding: test/test-reports/dynamo.test_fx_passes_pre_grad_1.1_04f0e6e8e98d46bf_.log (deflated 51%) 2025-10-10T02:30:22.5330226Z adding: test/test-reports/dynamo.test_autograd_function_1.1_ec9e3b5f11242efb_.log (deflated 87%) 2025-10-10T02:30:22.5331045Z adding: test/test-reports/inductor.test_aot_inductor_windows_1.1_42a69d4772d58dd2_.log (deflated 52%) 2025-10-10T02:30:22.5331837Z adding: test/test-reports/export.test_swap_1.1_70b480f71e377bb0_.log (deflated 83%) 2025-10-10T02:30:22.5343059Z adding: test/test-reports/inductor.test_aot_inductor_arrayref_1.1_9e8fa47aec179ee7_.log (deflated 94%) 2025-10-10T02:30:22.5343792Z adding: test/test-reports/inductor.test_torchbind_1.1_ec4c357d6bd3d8e1_.log (deflated 81%) 2025-10-10T02:30:22.5344460Z adding: test/test-reports/inductor.test_metrics_1.1_2736f0076b1f6312_.log (deflated 68%) 2025-10-10T02:30:22.5345160Z adding: test/test-reports/inductor.test_split_cat_fx_passes_1.1_84695daa9b39dc00_.log (deflated 79%) 2025-10-10T02:30:22.5346153Z adding: test/test-reports/inductor.test_custom_post_grad_passes_1.1_c3f5292f4425e102_.log (deflated 74%) 2025-10-10T02:30:22.5355941Z adding: test/test-reports/inductor.test_torchinductor_opinfo_6.11_49c72f5e054f3bbb_.log (deflated 94%) 2025-10-10T02:30:22.5360705Z adding: test/test-reports/export.test_sparse_1.1_468d13a07751fdee_.log (deflated 95%) 2025-10-10T02:30:22.5361453Z adding: test/test-reports/inductor.test_smoke_1.1_a2bd5e7e4ebc2a0d_.log (stored 0%) 2025-10-10T02:30:22.5363521Z adding: test/test-reports/inductor.test_aot_inductor_package_1.1_d6ff9edb97f5fbc0_.log (deflated 93%) 2025-10-10T02:30:22.5364413Z adding: test/test-reports/inductor.test_provenance_tracing_1.1_0c5bb1bcda5cff74_.log 
(deflated 80%) 2025-10-10T02:30:22.5365235Z adding: test/test-reports/export.test_dynamic_shapes_1.1_cc27332ded76651d_.log (deflated 56%) 2025-10-10T02:30:22.5365931Z adding: test/test-reports/export.test_passes_1.1_f2eef6b4e8045550_.log (deflated 84%) 2025-10-10T02:30:22.5366696Z adding: test/test-reports/inductor.test_fx_fusion_1.1_ba8b15390ffb14b4_.log (deflated 64%) 2025-10-10T02:30:22.5382911Z adding: test/test-reports/export.test_export_with_inline_and_install_1.1_2000574adcf3ce6d_.log (deflated 94%) 2025-10-10T02:30:22.5384078Z adding: test/test-reports/inductor.test_loop_ordering_1.1_9ae8f1f2b3081ac0_.log (deflated 88%) 2025-10-10T02:30:22.5384938Z adding: test/test-reports/export.test_functionalized_assertions_1.1_924a8cfb1a37b86d_.log (deflated 60%) 2025-10-10T02:30:22.5386111Z adding: test/test-reports/inductor.test_inplacing_pass_1.1_09815c94dc7811ce_.log (deflated 84%) 2025-10-10T02:30:22.5407608Z adding: test/test-reports/inductor.test_control_flow_1.1_e42998d1f616f85f_.log (deflated 97%) 2025-10-10T02:30:22.5408462Z adding: test/test-reports/inductor.test_segmented_tree_1.1_b3c8364b31241fe7_.log (deflated 78%) 2025-10-10T02:30:22.5411626Z adding: test/test-reports/export.test_serialize_1.1_95f8a40fb26f9781_.log (deflated 90%) 2025-10-10T02:30:22.5412794Z adding: test/test-reports/inductor.test_decompose_mem_bound_mm_1.1_a391d2cf64967b9d_.log (deflated 91%) 2025-10-10T02:30:22.5413493Z adding: test/test-reports/dynamo.test_sources_1.1_0c3a88f8f06b429e_.log (deflated 59%) 2025-10-10T02:30:22.5414309Z adding: test/test-reports/dynamo.test_base_output_1.1_47daba58d208aa02_.log (deflated 67%) 2025-10-10T02:30:22.5415020Z adding: test/test-reports/inductor.test_alignment_1.1_470237ed92fcad0c_.log (deflated 79%) 2025-10-10T02:30:22.5416191Z adding: test/test-reports/dynamo.test_backends_1.1_8303145e9a12b9c7_.log (deflated 80%) 2025-10-10T02:30:22.5416944Z adding: test/test-reports/dynamo.test_after_aot_1.1_4aea24d445cc5326_.log (deflated 55%) 
2025-10-10T02:30:22.5417981Z adding: test/test-reports/dynamo.test_fx_graph_runnable_1.1_47dd4b448534f9a4_.log (deflated 81%) 2025-10-10T02:30:22.5418740Z adding: test/test-reports/dynamo.test_nops_1.1_81834f8cf3f4fbc6_.log (deflated 61%) 2025-10-10T02:30:22.5419486Z adding: test/test-reports/inductor.test_compile_worker_1.1_154742363e488171_.log (deflated 68%) 2025-10-10T02:30:22.5430400Z adding: test/test-reports/dynamo.test_functions_1.1_b2cf2692cce73a00_.log (deflated 92%) 2025-10-10T02:30:22.5431231Z adding: test/test-reports/inductor.test_move_constructors_to_cuda_1.1_12d7a03e610f0010_.log (deflated 74%) 2025-10-10T02:30:22.5432123Z adding: test/test-reports/dynamo.test_config_1.1_c2c4080c336cde5e_.log (deflated 66%) 2025-10-10T02:30:22.5432875Z adding: test/test-reports/inductor.test_subgraph_choice_1.1_27f3d74e8c36f856_.log (deflated 58%) 2025-10-10T02:30:22.5433559Z adding: test/test-reports/dynamo.test_debug_utils_1.1_3a6d2a9374af50e4_.log (deflated 65%) 2025-10-10T02:30:22.5446910Z adding: test/test-reports/export.test_export_strict_1.1_a3e324381a9966d5_.log (deflated 92%) 2025-10-10T02:30:22.5448425Z adding: test/test-reports/dynamo.test_guard_serialization_1.1_735fa62e05e2bf95_.log (deflated 88%) 2025-10-10T02:30:22.5449236Z adding: test/test-reports/inductor.test_cutedsl_template_1.1_08913ad265b5d1c7_.log (deflated 77%) 2025-10-10T02:30:22.5450571Z adding: test/test-reports/export.test_db_1.1_392d050764ebdc4b_.log (deflated 87%) 2025-10-10T02:30:22.5457460Z adding: test/test-reports/dynamo.test_inline_and_install_1.1_b99f352f2c845190_.log (deflated 94%) 2025-10-10T02:30:22.5458133Z adding: test/test-reports/dynamo.test_resume_1.1_a7f6314d2933b47a_.log (deflated 49%) 2025-10-10T02:30:22.5458763Z adding: test/test-reports/export.test_tree_utils_1.1_043345f8551ff430_.log (deflated 55%) 2025-10-10T02:30:22.5459398Z adding: test/test-reports/dynamo.test_base_hop_1.1_543d11ff3d3850da_.log (deflated 76%) 2025-10-10T02:30:22.5460213Z adding: 
test/test-reports/dynamo.test_recompiles_1.1_521468b3edd9c16b_.log (deflated 81%) 2025-10-10T02:30:22.5460905Z adding: test/test-reports/dynamo.test_exc_1.1_ec59563413aec021_.log (deflated 76%) 2025-10-10T02:30:22.5461566Z adding: test/test-reports/dynamo.test_einops_1.1_42e8d60b337a2722_.log (deflated 59%) 2025-10-10T02:30:22.5463667Z adding: test/test-reports/dynamo.test_aot_autograd_1.1_bd6312e2ff94e5c7_.log (deflated 89%) 2025-10-10T02:30:22.5474775Z adding: test/test-reports/inductor.test_foreach_1.1_134f2aabe256d0e2_.log (deflated 95%) 2025-10-10T02:30:22.5475694Z adding: test/test-reports/inductor.test_unbacked_symints_1.1_3854c4a618a0b316_.log (deflated 87%) 2025-10-10T02:30:22.5476403Z adding: test/test-reports/inductor.test_minifier_utils_1.1_75acc1efa6172974_.log (deflated 61%) 2025-10-10T02:30:22.5477678Z adding: test/test-reports/dynamo.test_hooks_1.1_b526d113aa1fb3f5_.log (deflated 86%) 2025-10-10T02:30:22.5478562Z adding: test/test-reports/dynamo.test_sdpa_1.1_8e823cdb223e529d_.log (deflated 67%) 2025-10-10T02:30:22.5482192Z adding: test/test-reports/inductor.test_fused_attention_1.1_0bcb9e0021e051e2_.log (deflated 94%) 2025-10-10T02:30:22.5499476Z adding: test/test-reports/inductor.test_compiled_optimizers_1.1_8320ff453d4d719c_.log (deflated 96%) 2025-10-10T02:30:22.5500906Z adding: test/test-reports/dynamo.test_logging_1.1_53ed6fd2cc9e4943_.log (deflated 91%) 2025-10-10T02:30:22.5514296Z adding: test/test-reports/export.test_cpp_serdes_1.1_e86dd5e5bb44350b_.log (deflated 93%) 2025-10-10T02:30:22.5515198Z adding: test/test-reports/dynamo.test_list_1.1_4248bb7d7c30055b_.log (deflated 86%) 2025-10-10T02:30:22.5515953Z adding: test/test-reports/inductor.test_debug_trace_1.1_8370b203a1624b92_.log (deflated 61%) 2025-10-10T02:30:22.5517116Z adding: test/test-reports/export.test_unflatten_1.1_de28863d6eab6ae8_.log (deflated 84%) 2025-10-10T02:30:22.5517846Z adding: test/test-reports/inductor.test_memory_1.1_93bb42c254f3d933_.log (deflated 73%) 
2025-10-10T02:30:22.5531203Z adding: test/test-reports/export.test_export_1.1_799d906e592648bc_.log (deflated 90%) 2025-10-10T02:30:22.5532089Z adding: test/test-reports/dynamo.test_frame_init_1.1_b9ced74bf7b0c34e_.log (deflated 50%) 2025-10-10T02:30:22.5532881Z adding: test/test-reports/dynamo.test_export_mutations_1.1_dc71709ddd4ec781_.log (deflated 73%) 2025-10-10T02:30:22.5533627Z adding: test/test-reports/inductor.test_kernel_optimization_1.1_e3492474b9ed8890_.log (deflated 54%) 2025-10-10T02:30:22.5534490Z adding: test/test-reports/export.test_upgrader_1.1_7c1498cc0631c845_.log (deflated 67%) 2025-10-10T02:30:22.5546291Z adding: test/test-reports/inductor.test_combo_kernels_1.1_071e2ab93f2c127a_.log (deflated 85%) 2025-10-10T02:30:22.5547227Z adding: test/test-reports/inductor.test_remote_cache_1.1_da154ba4977184df_.log (deflated 62%) 2025-10-10T02:30:22.5548210Z adding: test/test-reports/inductor.test_aot_inductor_custom_ops_1.1_53dbf80fa4b7d1b0_.log (deflated 90%) 2025-10-10T02:30:22.5549253Z adding: test/test-reports/inductor.test_mkldnn_pattern_matcher_1.1_444bf6b07f3059e5_.log (deflated 96%) 2025-10-10T02:30:22.5550255Z adding: test/test-reports/inductor.test_graph_transform_observer_1.1_4599af1e98fa3111_.log (deflated 53%) 2025-10-10T02:30:22.5551273Z adding: test/test-reports/dynamo.test_graph_region_tracker_1.1_7d8fde614f127a2e_.log (deflated 81%) 2025-10-10T02:30:22.5552239Z adding: test/test-reports/inductor.test_custom_lowering_1.1_5a6704fc4dde7e87_.log (deflated 69%) 2025-10-10T02:30:22.5553074Z adding: test/test-reports/dynamo.test_metrics_context_1.1_a00681eaf9860da7_.log (deflated 75%) 2025-10-10T02:30:22.5553901Z adding: test/test-reports/dynamo.test_install_free_tensors_1.1_66d267faa80ec138_.log (deflated 86%) 2025-10-10T02:30:22.5554864Z adding: test/test-reports/inductor.test_memory_planning_1.1_09ecd4d1a24e9a08_.log (deflated 65%) 2025-10-10T02:30:22.5555705Z adding: 
test/test-reports/inductor.test_split_cat_fx_aten_passes_1.1_a8aba974cbdd3456_.log (deflated 77%) 2025-10-10T02:30:22.5556760Z adding: test/test-reports/dynamo.test_activation_checkpointing_1.1_20f97b959dffeba5_.log (deflated 87%) 2025-10-10T02:30:22.5557674Z adding: test/test-reports/dynamo.test_compiler_bisector_1.1_2b248deb7f14704d_.log (deflated 72%) 2025-10-10T02:30:22.5558484Z adding: test/test-reports/inductor.test_auto_functionalize_1.1_d3b7a61204f59239_.log (deflated 89%) 2025-10-10T02:30:22.5559589Z adding: test/test-reports/inductor.test_torchinductor_codegen_config_overrides_1.1_ed505f213deffd89_.log (deflated 67%) 2025-10-10T02:30:22.5560715Z adding: test/test-reports/inductor.test_inductor_freezing_1.1_74a06e8cebe9e633_.log (deflated 90%) 2025-10-10T02:30:22.5561541Z adding: test/test-reports/dynamo.test_nested_graph_breaks_1.1_df6a649fed81d0dc_.log (deflated 82%) 2025-10-10T02:30:22.5562265Z adding: test/test-reports/dynamo.test_backward_higher_order_ops_1.1_c7ef207536f78c55_.log (deflated 75%) 2025-10-10T02:30:22.5576514Z adding: test/test-reports/inductor.test_compiled_autograd_1.2_4ebc3f903fd9dde5_.log (deflated 92%) 2025-10-10T02:30:22.5577649Z adding: test/test-reports/inductor.test_custom_partitioner_fn_1.1_5695bb31f732dee1_.log (deflated 54%) 2025-10-10T02:30:22.5579986Z adding: test/test-reports/dynamo.test_aot_autograd_cache_1.1_4e168b7c37066d68_.log (deflated 92%) 2025-10-10T02:30:22.5580957Z adding: test/test-reports/inductor.test_binary_folding_1.1_a1218328b8bd8e7c_.log (deflated 72%) 2025-10-10T02:30:22.5581931Z adding: test/test-reports/inductor.test_needs_exact_strides_1.1_1964273ba08a4005_.log (deflated 59%) 2025-10-10T02:30:22.5582907Z adding: test/test-reports/dynamo.test_verify_correctness_1.1_d0753fae103be4ae_.log (deflated 66%) 2025-10-10T02:30:22.5583862Z adding: test/test-reports/dynamo.test_deviceguard_1.1_9a49e3f89bda1ff3_.log (deflated 63%) 2025-10-10T02:30:22.5584731Z adding: 
test/test-reports/inductor.test_augmented_graph_helper_1.1_ef0ac65fad0c43d6_.log (deflated 80%) 2025-10-10T02:30:22.5585448Z adding: test/test-reports/dynamo.test_cudagraphs_1.1_9f1060c0752f2213_.log (deflated 73%) 2025-10-10T02:30:22.5588627Z adding: test/test-reports/inductor.test_caching_1.1_8d229fa359ca9a0d_.log (deflated 94%) 2025-10-10T02:30:22.5589566Z adding: test/test-reports/dynamo.test_python_dispatcher_1.1_31d628dd1efd6e18_.log (deflated 72%) 2025-10-10T02:30:22.5590483Z adding: test/test-reports/dynamo.test_optimizers_1.1_5a96456b2482bde0_.log (deflated 59%) 2025-10-10T02:30:22.5591349Z adding: test/test-reports/dynamo.test_flat_apply_1.1_e475d81eda6118a4_.log (deflated 64%) 2025-10-10T02:30:22.5597169Z adding: test/test-reports/dynamo.test_higher_order_ops_1.1_0c512c3b1d5a9285_.log (deflated 92%) 2025-10-10T02:30:22.5598094Z adding: test/test-reports/export.test_nativert_1.1_ea78fa49e0fcb5eb_.log (deflated 67%) 2025-10-10T02:30:22.5599255Z adding: test/test-reports/dynamo.test_graph_deduplication_1.1_fc2a050026445f44_.log (deflated 84%) 2025-10-10T02:30:22.5600233Z adding: test/test-reports/dynamo.test_error_messages_1.1_052d645fe43b2353_.log (deflated 87%) 2025-10-10T02:30:22.5601261Z adding: test/test-reports/dynamo.test_cudagraphs_expandable_segments_1.1_a7bea9564038dfdc_.log (deflated 76%) 2025-10-10T02:30:22.5602282Z adding: test/test-reports/dynamo.test_recompile_ux_1.1_fcd2102d267c6683_.log (deflated 76%) 2025-10-10T02:30:22.5603134Z adding: test/test-reports/inductor.test_mmdecomp_1.1_a1642879794c1991_.log (deflated 87%) 2025-10-10T02:30:22.5604085Z adding: test/test-reports/dynamo.test_precompile_context_1.1_aa671318c4c80024_.log (deflated 62%) 2025-10-10T02:30:22.5605020Z adding: test/test-reports/inductor.test_minifier_isolate_1.1_1234750d7788ee70_.log (deflated 57%) 2025-10-10T02:30:22.5627588Z adding: test/test-reports/inductor.test_cpu_repro_1.1_8c0256faf5df4382_.log (deflated 97%) 2025-10-10T02:30:22.5628517Z adding: 
test/test-reports/dynamo.test_bytecode_utils_1.1_67cc7b7c051e6119_.log (deflated 80%) 2025-10-10T02:30:22.5629426Z adding: test/test-reports/export.test_pass_infra_1.1_10f22a096d017b60_.log (deflated 66%) 2025-10-10T02:30:22.5639985Z adding: test/test-reports/functorch.test_eager_transforms_1.1_979ad6ef0ef55061_.log (deflated 93%) 2025-10-10T02:30:22.5641004Z adding: test/test-reports/dynamo.test_guard_manager_1.1_15db491b2515d834_.log (deflated 86%) 2025-10-10T02:30:22.5641912Z adding: test/test-reports/dynamo.test_minifier_1.1_2347d38b77264edb_.log (deflated 83%) 2025-10-10T02:30:22.5642812Z adding: test/test-reports/export.test_experimental_1.1_d2bd020472bbf1da_.log (deflated 76%) 2025-10-10T02:30:22.5644059Z adding: test/test-reports/dynamo.test_input_attr_tracking_1.1_355e67848140ae83_.log (deflated 81%) 2025-10-10T02:30:22.5645719Z adding: test/test-reports/export.test_converter_1.1_dd2e25c8f5e52a44_.log (deflated 87%) 2025-10-10T02:30:22.5646659Z adding: test/test-reports/dynamo.test_trace_rules_1.1_b2b5e22c651d65d4_.log (deflated 69%) 2025-10-10T02:30:22.5648246Z adding: test/test-reports/dynamo.test_exceptions_1.1_4112bd861ff8dc84_.log (deflated 88%) 2025-10-10T02:30:22.5706214Z adding: test/test-reports/test_sparse_csr_1.2_e71d1888f8d6e313_.log (deflated 97%) 2025-10-10T02:30:22.5707268Z adding: test/test-reports/inductor.test_mps_basic_1.1_7bb770612b0fea89_.log (stored 0%) 2025-10-10T02:30:22.5710425Z adding: test/test-reports/dynamo.test_subclasses_1.1_14d5f88b9fc210d4_.log (deflated 91%) 2025-10-10T02:30:22.5715396Z adding: test/test-reports/inductor.test_cudagraph_trees_expandable_segments_1.1_bdc9ffac270e9589_.log (deflated 93%) 2025-10-10T02:30:22.5724618Z adding: test/test-reports/dynamo.test_repros_1.1_a2f8d394a273405c_.log (deflated 89%) 2025-10-10T02:30:22.5725512Z adding: test/test-reports/dynamo.test_reorder_logs_1.1_16aea957efe8413a_.log (deflated 82%) 2025-10-10T02:30:22.5727924Z adding: 
test/test-reports/dynamo.test_generator_1.1_3a0ec19c03d8910f_.log (deflated 90%) 2025-10-10T02:30:22.5791244Z adding: test/test-reports/test_sparse_1.1_c4a4003efa99e723_.log (deflated 97%) 2025-10-10T02:30:22.5792110Z adding: test/test-reports/export.test_lift_unlift_1.1_0661475e62a3ad5d_.log (deflated 62%) 2025-10-10T02:30:22.5792977Z adding: test/test-reports/export.test_verifier_1.1_960dacb2e6b526fd_.log (deflated 76%) 2025-10-10T02:30:22.5795159Z adding: test/test-reports/profiler.test_profiler_1.1_3bfa04acb9efaa7e_.log (deflated 87%) 2025-10-10T02:30:22.5796081Z adding: test/test-reports/export.test_draft_export_1.1_a4ad47099ea70073_.log (deflated 83%) 2025-10-10T02:30:22.5812545Z adding: test/test-reports/dynamo.test_misc_1.1_4ba039d9e46540fc_.log (deflated 90%) 2025-10-10T02:30:22.5813429Z adding: test/test-reports/dynamo.test_comptime_1.1_12f0992c6518ae4b_.log (deflated 77%) 2025-10-10T02:30:22.5814344Z adding: test/test-reports/dynamo.test_python_autograd_1.1_baf474ecf16b23fa_.log (deflated 68%) 2025-10-10T02:30:22.5815301Z adding: test/test-reports/functorch.test_rearrange_1.1_0bdbdf8367fdfb7a_.log (deflated 75%) 2025-10-10T02:30:22.5816199Z adding: test/test-reports/functorch.test_parsing_1.1_5f2548c18dc309d7_.log (deflated 77%) 2025-10-10T02:30:22.5819976Z adding: test/test-reports/test_package_1.1_cbd9eae8a849bf4d_.log (deflated 90%) 2025-10-10T02:30:22.5820842Z adding: test/test-reports/test_comparison_utils_1.1_abb85c565d058c55_.log (deflated 72%) 2025-10-10T02:30:22.5821694Z adding: test/test-reports/test_mkl_verbose_1.1_e7c198c0f1e698db_.log (deflated 54%) 2025-10-10T02:30:22.5822564Z adding: test/test-reports/functorch.test_ac_logging_1.1_d0124ae9a3552e21_.log (deflated 63%) 2025-10-10T02:30:22.5823456Z adding: test/test-reports/test_mkldnn_verbose_1.1_202119cb08eecedf_.log (deflated 55%) 2025-10-10T02:30:22.5824111Z adding: test/test-reports/profiler.test_kineto_1.1_bd1f24c6684fac74_.log (deflated 51%) 2025-10-10T02:30:22.5856242Z adding: 
test/test-reports/test_matmul_cuda_1.1_4558612648566e2a_.log (deflated 97%) 2025-10-10T02:30:22.5857048Z adding: test/test-reports/test_license_1.1_b3d15cbbd76cd71b_.log (deflated 51%) 2025-10-10T02:30:22.5857870Z adding: test/test-reports/test_utils_config_module_1.1_86078f5a5cf42ecc_.log (deflated 83%) 2025-10-10T02:30:22.6246659Z adding: test/test-reports/test_transformers_1.1_1181ea920dbf7d81_.log (deflated 98%) 2025-10-10T02:30:22.6985540Z adding: test/test-reports/test_meta_1.1_310c899992f4fd8e_.log (deflated 97%) 2025-10-10T02:30:22.6998964Z adding: test/test-reports/test_decomp_6.16_b6aa06fa43228dc4_.log (deflated 93%) 2025-10-10T02:30:22.7013396Z adding: test/test-reports/test_decomp_7.16_b85ff1c0038feadd_.log (deflated 93%) 2025-10-10T02:30:22.7026291Z adding: test/test-reports/test_decomp_15.16_674eae0c73a8ecc0_.log (deflated 93%) 2025-10-10T02:30:22.7040667Z adding: test/test-reports/test_decomp_10.16_c2717449774e4579_.log (deflated 93%) 2025-10-10T02:30:22.7041579Z adding: test/test-reports/xpu.test_conv_1.1_c0e4f3c7aae6dfed_.log (deflated 48%) 2025-10-10T02:30:22.7055365Z adding: test/test-reports/test_decomp_16.16_bf1d120015c9ac2d_.log (deflated 93%) 2025-10-10T02:30:22.7057947Z adding: test/test-reports/test_datapipe_1.1_dc5c5546e5068041_.log (deflated 89%) 2025-10-10T02:30:22.7058943Z adding: test/test-reports/lazy.test_generator_1.1_3db541494829351f_.log (deflated 56%) 2025-10-10T02:30:22.7059949Z adding: test/test-reports/torch_np.numpy_tests.lib.test_type_check_1.1_d4097b0680d9fabe_.log (deflated 89%) 2025-10-10T02:30:22.7060929Z adding: test/test-reports/lazy.test_debug_util_1.1_782c9fff9d33d04b_.log (deflated 49%) 2025-10-10T02:30:22.7063584Z adding: test/test-reports/test_jit_llga_fuser_1.1_95e48d3d30174322_.log (deflated 91%) 2025-10-10T02:30:22.7064450Z adding: test/test-reports/test_numa_binding_1.1_ef4381c1447d526b_.log (deflated 81%) 2025-10-10T02:30:22.7066103Z adding: 
test/test-reports/torch_np.numpy_tests.lib.test_histograms_1.1_3b65da280a49a1f9_.log (deflated 90%) 2025-10-10T02:30:22.7080900Z adding: test/test-reports/test_decomp_1.16_6946e487729ae572_.log (deflated 93%) 2025-10-10T02:30:22.7086416Z adding: test/test-reports/torch_np.numpy_tests.core.test_scalarmath_1.1_fc9dd0dc170e7c55_.log (deflated 94%) 2025-10-10T02:30:22.7090794Z adding: test/test-reports/test_indexing_1.1_1aabc744c25f1d2e_.log (deflated 93%) 2025-10-10T02:30:22.7091676Z adding: test/test-reports/profiler.test_torch_tidy_1.1_b59fbc7b8680e5c8_.log (deflated 82%) 2025-10-10T02:30:22.7093379Z adding: test/test-reports/nn.test_module_hooks_1.1_e2c8b40e464fd47e_.log (deflated 89%) 2025-10-10T02:30:22.7108710Z adding: test/test-reports/functorch.test_aotdispatch_1.1_1e441b16cfa8619c_.log (deflated 94%) 2025-10-10T02:30:22.7109786Z adding: test/test-reports/nn.test_load_state_dict_1.1_6cdbff59400f14c3_.log (deflated 87%) 2025-10-10T02:30:22.7116338Z adding: test/test-reports/torch_np.numpy_tests.linalg.test_linalg_1.1_72b5754c3ec219f3_.log (deflated 93%) 2025-10-10T02:30:22.7118506Z adding: test/test-reports/test_shape_ops_1.1_baf0e023aba1b3cf_.log (deflated 91%) 2025-10-10T02:30:22.7122012Z adding: test/test-reports/torch_np.numpy_tests.core.test_shape_base_1.1_734bdded1185faf0_.log (deflated 93%) 2025-10-10T02:30:22.7125017Z adding: test/test-reports/torch_np.numpy_tests.core.test_dtype_1.1_71068497346246e2_.log (deflated 91%) 2025-10-10T02:30:22.7598283Z adding: test/test-reports/test_unary_ufuncs_1.1_40b935a37851e832_.log (deflated 97%) 2025-10-10T02:30:22.7707456Z adding: test/test-reports/functorch.test_ops_2.2_7be5e9b7b2602bbb_.log (deflated 95%) 2025-10-10T02:30:22.7708336Z adding: test/test-reports/optim.test_optim_1.1_54604825053500b8_.log (deflated 7%) 2025-10-10T02:30:22.7713074Z adding: test/test-reports/test_serialization_1.1_cc7d12474b1a1e9e_.log (deflated 92%) 2025-10-10T02:30:22.7714157Z adding: 
test/test-reports/torch_np.numpy_tests.lib.test_twodim_base_1.1_f2b6b88ee147953d_.log (deflated 87%) 2025-10-10T02:30:22.7715402Z adding: test/test-reports/test_function_schema_1.1_f798fa934797b399_.log (deflated 81%) 2025-10-10T02:30:22.7764692Z adding: test/test-reports/functorch.test_vmap_1.1_63b66feb7af2a617_.log (deflated 95%) 2025-10-10T02:30:22.7766417Z adding: test/test-reports/torch_np.numpy_tests.lib.test_shape_base__1.1_407426e7354b1940_.log (deflated 90%) 2025-10-10T02:30:22.7769174Z adding: test/test-reports/torch_np.numpy_tests.fft.test_pocketfft_1.1_1b99f28b2aa0788e_.log (deflated 93%) 2025-10-10T02:30:22.7770852Z adding: test/test-reports/test_scatter_gather_ops_1.1_35b67cd1046f2ac5_.log (deflated 92%) 2025-10-10T02:30:22.7778985Z adding: test/test-reports/torch_np.test_ndarray_methods_1.1_a3bcb645e12a6c00_.log (deflated 95%) 2025-10-10T02:30:22.7785056Z adding: test/test-reports/test_view_ops_1.1_add3b985aeb53ec3_.log (deflated 94%) 2025-10-10T02:30:22.7786123Z adding: test/test-reports/torch_np.numpy_tests.core.test_dlpack_1.1_b0ab03ce3acae62e_.log (deflated 90%) 2025-10-10T02:30:22.7787193Z adding: test/test-reports/torch_np.numpy_tests.core.test_getlimits_1.1_fcd04f230de9bbf6_.log (deflated 79%) 2025-10-10T02:30:22.7788154Z adding: test/test-reports/test_accelerator_1.1_7503bad570e8177e_.log (deflated 75%) 2025-10-10T02:30:22.7788994Z adding: test/test-reports/lazy.test_reuse_ir_1.1_4dd9796a057338cd_.log (deflated 62%) 2025-10-10T02:30:22.7790669Z adding: test/test-reports/torch_np.numpy_tests.lib.test_index_tricks_1.1_9dc3800ac10e145e_.log (deflated 88%) 2025-10-10T02:30:22.7791749Z adding: test/test-reports/benchmark_utils.test_benchmark_utils_1.1_fd1d46f532803740_.log (deflated 74%) 2025-10-10T02:30:22.7792659Z adding: test/test-reports/nn.test_init_1.1_09bb5d4c6c81d927_.log (deflated 84%) 2025-10-10T02:30:22.7793774Z adding: test/test-reports/torch_np.numpy_tests.core.test_numerictypes_1.1_98cad5f0df449ff9_.log (deflated 89%) 
2025-10-10T02:30:22.7804735Z adding: test/test-reports/test_type_promotion_1.1_f7173feb7ff71173_.log (deflated 96%) 2025-10-10T02:30:22.7807169Z adding: test/test-reports/torch_np.numpy_tests.core.test_scalar_methods_1.1_13b953159f4dde16_.log (deflated 92%) 2025-10-10T02:30:22.7808263Z adding: test/test-reports/torch_np.numpy_tests.fft.test_helper_1.1_4d788950329a8b0e_.log (deflated 74%) 2025-10-10T02:30:22.7809236Z adding: test/test-reports/torch_np.test_function_base_1.1_2939ee6ae0a89997_.log (deflated 50%) 2025-10-10T02:30:22.7810023Z adding: test/test-reports/profiler.test_profiler_tree_1.1_df669bd43050df5d_.log (deflated 77%) 2025-10-10T02:30:22.7952313Z ##[group]Run # Remove any previous debugging artifacts if they exist 2025-10-10T02:30:22.7952820Z # Remove any previous debugging artifacts if they exist 2025-10-10T02:30:22.7953217Z rm -f debug-*.zip 2025-10-10T02:30:22.7953620Z if [ -d 'test/debug' ]; then 2025-10-10T02:30:22.7953974Z  zip -r "debug-${FILE_SUFFIX}.zip" test/debug 2025-10-10T02:30:22.7954309Z fi 2025-10-10T02:30:22.7963588Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T02:30:22.7963954Z env: 2025-10-10T02:30:22.7964172Z GIT_DEFAULT_BRANCH: main 2025-10-10T02:30:22.7964498Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-10-10T02:30:22.7965046Z DOCKER_CONTAINER_ID: 0d479bf7aa1028c1efe5abc00aba7c77fea2d669ee48fe0051d50c10c6eea1cb 2025-10-10T02:30:22.7965544Z DEVICE_NAME: 2025-10-10T02:30:22.7965768Z DEVICE_TYPE: 2025-10-10T02:30:22.7966130Z FILE_SUFFIX: test-slow-2-3-linux.g5.4xlarge.nvidia.gpu_52406799277 2025-10-10T02:30:22.7966533Z ##[endgroup] 2025-10-10T02:30:22.8153107Z ##[group]Run seemethere/upload-artifact-s3@v5 2025-10-10T02:30:22.8153441Z with: 2025-10-10T02:30:22.8153679Z s3-bucket: gha-artifacts 2025-10-10T02:30:22.8154010Z s3-prefix: pytorch/pytorch/18392306083/1/artifact 2025-10-10T02:30:22.8154376Z retention-days: 14 2025-10-10T02:30:22.8154634Z if-no-files-found: warn 
2025-10-10T02:30:22.8154917Z path: test-jsons-*.zip 2025-10-10T02:30:22.8155183Z name: artifact 2025-10-10T02:30:22.8155425Z region: us-east-1 2025-10-10T02:30:22.8155652Z env: 2025-10-10T02:30:22.8155872Z GIT_DEFAULT_BRANCH: main 2025-10-10T02:30:22.8156214Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-10-10T02:30:22.8156775Z DOCKER_CONTAINER_ID: 0d479bf7aa1028c1efe5abc00aba7c77fea2d669ee48fe0051d50c10c6eea1cb 2025-10-10T02:30:22.8157270Z DEVICE_NAME: 2025-10-10T02:30:22.8157506Z DEVICE_TYPE: 2025-10-10T02:30:22.8157745Z ##[endgroup] 2025-10-10T02:30:23.3719034Z NOTE: s3-prefix specified, ignoring name parameter 2025-10-10T02:30:23.3719637Z With the provided path, there will be 1 file uploaded 2025-10-10T02:30:23.3720203Z Uploading to s3 prefix: pytorch/pytorch/18392306083/1/artifact 2025-10-10T02:30:23.3791672Z Starting upload of test-jsons-test-slow-2-3-linux.g5.4xlarge.nvidia.gpu_52406799277.zip 2025-10-10T02:30:23.5193985Z Finished upload of test-jsons-test-slow-2-3-linux.g5.4xlarge.nvidia.gpu_52406799277.zip 2025-10-10T02:30:23.5587637Z ##[group]Run seemethere/upload-artifact-s3@v5 2025-10-10T02:30:23.5587978Z with: 2025-10-10T02:30:23.5588197Z s3-bucket: gha-artifacts 2025-10-10T02:30:23.5588521Z s3-prefix: pytorch/pytorch/18392306083/1/artifact 2025-10-10T02:30:23.5588865Z retention-days: 14 2025-10-10T02:30:23.5589125Z if-no-files-found: error 2025-10-10T02:30:23.5589519Z path: test-reports-*.zip 2025-10-10T02:30:23.5589789Z name: artifact 2025-10-10T02:30:23.5590024Z region: us-east-1 2025-10-10T02:30:23.5590246Z env: 2025-10-10T02:30:23.5590622Z GIT_DEFAULT_BRANCH: main 2025-10-10T02:30:23.5590964Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-10-10T02:30:23.5591520Z DOCKER_CONTAINER_ID: 0d479bf7aa1028c1efe5abc00aba7c77fea2d669ee48fe0051d50c10c6eea1cb 2025-10-10T02:30:23.5592011Z DEVICE_NAME: 2025-10-10T02:30:23.5592248Z DEVICE_TYPE: 2025-10-10T02:30:23.5592472Z ##[endgroup] 2025-10-10T02:30:24.1633457Z NOTE: s3-prefix 
specified, ignoring name parameter 2025-10-10T02:30:24.1634103Z With the provided path, there will be 1 file uploaded 2025-10-10T02:30:24.1634707Z Uploading to s3 prefix: pytorch/pytorch/18392306083/1/artifact 2025-10-10T02:30:24.1707279Z Starting upload of test-reports-test-slow-2-3-linux.g5.4xlarge.nvidia.gpu_52406799277.zip 2025-10-10T02:30:24.3916012Z Finished upload of test-reports-test-slow-2-3-linux.g5.4xlarge.nvidia.gpu_52406799277.zip 2025-10-10T02:30:24.4318462Z ##[group]Run seemethere/upload-artifact-s3@v5 2025-10-10T02:30:24.4318794Z with: 2025-10-10T02:30:24.4319030Z s3-bucket: gha-artifacts 2025-10-10T02:30:24.4319343Z s3-prefix: pytorch/pytorch/18392306083/1/artifact 2025-10-10T02:30:24.4319680Z retention-days: 14 2025-10-10T02:30:24.4319933Z if-no-files-found: ignore 2025-10-10T02:30:24.4320195Z path: logs-*.zip 2025-10-10T02:30:24.4320425Z name: artifact 2025-10-10T02:30:24.4320776Z region: us-east-1 2025-10-10T02:30:24.4321001Z env: 2025-10-10T02:30:24.4321207Z GIT_DEFAULT_BRANCH: main 2025-10-10T02:30:24.4321539Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-10-10T02:30:24.4322077Z DOCKER_CONTAINER_ID: 0d479bf7aa1028c1efe5abc00aba7c77fea2d669ee48fe0051d50c10c6eea1cb 2025-10-10T02:30:24.4322560Z DEVICE_NAME: 2025-10-10T02:30:24.4322779Z DEVICE_TYPE: 2025-10-10T02:30:24.4322992Z ##[endgroup] 2025-10-10T02:30:24.7768638Z NOTE: s3-prefix specified, ignoring name parameter 2025-10-10T02:30:24.7769114Z With the provided path, there will be 1 file uploaded 2025-10-10T02:30:24.7769580Z Uploading to s3 prefix: pytorch/pytorch/18392306083/1/artifact 2025-10-10T02:30:24.7841922Z Starting upload of logs-test-slow-2-3-linux.g5.4xlarge.nvidia.gpu_52406799277.zip 2025-10-10T02:30:24.9560138Z Finished upload of logs-test-slow-2-3-linux.g5.4xlarge.nvidia.gpu_52406799277.zip 2025-10-10T02:30:24.9959107Z ##[group]Run seemethere/upload-artifact-s3@v5 2025-10-10T02:30:24.9959446Z with: 2025-10-10T02:30:24.9959718Z s3-bucket: gha-artifacts 
2025-10-10T02:30:24.9960069Z s3-prefix: pytorch/pytorch/18392306083/1/artifact 2025-10-10T02:30:24.9960404Z retention-days: 14 2025-10-10T02:30:24.9960650Z if-no-files-found: ignore 2025-10-10T02:30:24.9960914Z path: debug-*.zip 2025-10-10T02:30:24.9961144Z name: artifact 2025-10-10T02:30:24.9961372Z region: us-east-1 2025-10-10T02:30:24.9961587Z env: 2025-10-10T02:30:24.9961794Z GIT_DEFAULT_BRANCH: main 2025-10-10T02:30:24.9962130Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-10-10T02:30:24.9962666Z DOCKER_CONTAINER_ID: 0d479bf7aa1028c1efe5abc00aba7c77fea2d669ee48fe0051d50c10c6eea1cb 2025-10-10T02:30:24.9963153Z DEVICE_NAME: 2025-10-10T02:30:24.9963376Z DEVICE_TYPE: 2025-10-10T02:30:24.9963594Z ##[endgroup] 2025-10-10T02:30:25.3237290Z No files were found with the provided path: debug-*.zip. No artifacts will be uploaded. 2025-10-10T02:30:25.4051197Z ##[group]Run # shellcheck disable=SC2156 2025-10-10T02:30:25.4051576Z # shellcheck disable=SC2156 2025-10-10T02:30:25.4052120Z find . 
-iname "core.[1-9]*" -exec docker exec "${DOCKER_CONTAINER_ID}" sh -c "gdb python {} -ex 'bt' -ex 'q'" \; 2025-10-10T02:30:25.4063002Z shell: /usr/bin/bash -e {0} 2025-10-10T02:30:25.4063269Z env: 2025-10-10T02:30:25.4063484Z GIT_DEFAULT_BRANCH: main 2025-10-10T02:30:25.4063815Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-10-10T02:30:25.4064362Z DOCKER_CONTAINER_ID: 0d479bf7aa1028c1efe5abc00aba7c77fea2d669ee48fe0051d50c10c6eea1cb 2025-10-10T02:30:25.4064957Z DEVICE_NAME: 2025-10-10T02:30:25.4065186Z DEVICE_TYPE: 2025-10-10T02:30:25.4065411Z ##[endgroup] 2025-10-10T02:30:25.8339301Z Prepare all required actions 2025-10-10T02:30:25.8339729Z Getting action download info 2025-10-10T02:30:25.9638132Z ##[group]Run ./.github/actions/upload-utilization-stats 2025-10-10T02:30:25.9638480Z with: 2025-10-10T02:30:25.9638690Z job_id: 52406799277 2025-10-10T02:30:25.9639152Z job_name: linux-jammy-cuda12.8-py3.10-gcc11-sm86 / test (slow, 2, 3, linux.g5.4xlarge.nvidia.gpu) 2025-10-10T02:30:25.9639655Z workflow_name: slow 2025-10-10T02:30:25.9639903Z workflow_run_id: 18392306083 2025-10-10T02:30:25.9640172Z workflow_attempt: 1 2025-10-10T02:30:25.9640407Z env: 2025-10-10T02:30:25.9640615Z GIT_DEFAULT_BRANCH: main 2025-10-10T02:30:25.9640931Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-10-10T02:30:25.9641465Z DOCKER_CONTAINER_ID: 0d479bf7aa1028c1efe5abc00aba7c77fea2d669ee48fe0051d50c10c6eea1cb 2025-10-10T02:30:25.9641948Z DEVICE_NAME: 2025-10-10T02:30:25.9642171Z DEVICE_TYPE: 2025-10-10T02:30:25.9642382Z ##[endgroup] 2025-10-10T02:30:25.9790268Z ##[group]Run echo "workflow_id: 18392306083" 2025-10-10T02:30:25.9790616Z echo "workflow_id: 18392306083" 2025-10-10T02:30:25.9790925Z echo "workflow_attempt: 1" 2025-10-10T02:30:25.9791221Z echo "workflow_Name: slow" 2025-10-10T02:30:25.9791511Z echo "job_id: 52406799277" 2025-10-10T02:30:25.9792146Z echo "job_name: linux-jammy-cuda12.8-py3.10-gcc11-sm86 / test (slow, 2, 3, linux.g5.4xlarge.nvidia.gpu)" 
2025-10-10T02:30:25.9792708Z echo "artifact_prefix: " 2025-10-10T02:30:25.9792996Z python3 --version 2025-10-10T02:30:25.9802686Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T02:30:25.9803037Z env: 2025-10-10T02:30:25.9803252Z GIT_DEFAULT_BRANCH: main 2025-10-10T02:30:25.9803572Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-10-10T02:30:25.9804107Z DOCKER_CONTAINER_ID: 0d479bf7aa1028c1efe5abc00aba7c77fea2d669ee48fe0051d50c10c6eea1cb 2025-10-10T02:30:25.9804580Z DEVICE_NAME: 2025-10-10T02:30:25.9804811Z DEVICE_TYPE: 2025-10-10T02:30:25.9805031Z ##[endgroup] 2025-10-10T02:30:25.9834304Z workflow_id: 18392306083 2025-10-10T02:30:25.9834580Z workflow_attempt: 1 2025-10-10T02:30:25.9834824Z workflow_Name: slow 2025-10-10T02:30:25.9835057Z job_id: 52406799277 2025-10-10T02:30:25.9835515Z job_name: linux-jammy-cuda12.8-py3.10-gcc11-sm86 / test (slow, 2, 3, linux.g5.4xlarge.nvidia.gpu) 2025-10-10T02:30:25.9836007Z artifact_prefix: 2025-10-10T02:30:25.9849290Z Python 3.9.23 2025-10-10T02:30:25.9977027Z ##[group]Run nick-fields/retry@v3.0.0 2025-10-10T02:30:25.9977325Z with: 2025-10-10T02:30:25.9977522Z shell: bash 2025-10-10T02:30:25.9977739Z timeout_minutes: 5 2025-10-10T02:30:25.9977977Z max_attempts: 5 2025-10-10T02:30:25.9978202Z retry_wait_seconds: 30 2025-10-10T02:30:25.9978721Z command: set -eu python3 -m pip install python-dateutil==2.8.2 boto3==1.35.42 pandas==2.1.3 dataclasses_json==0.6.7 2025-10-10T02:30:25.9979283Z polling_interval_seconds: 1 2025-10-10T02:30:25.9979564Z warning_on_retry: true 2025-10-10T02:30:25.9979819Z continue_on_error: false 2025-10-10T02:30:25.9980058Z env: 2025-10-10T02:30:25.9980265Z GIT_DEFAULT_BRANCH: main 2025-10-10T02:30:25.9980577Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-10-10T02:30:25.9981107Z DOCKER_CONTAINER_ID: 0d479bf7aa1028c1efe5abc00aba7c77fea2d669ee48fe0051d50c10c6eea1cb 2025-10-10T02:30:25.9981581Z DEVICE_NAME: 2025-10-10T02:30:25.9981802Z DEVICE_TYPE: 
2025-10-10T02:30:25.9982015Z ##[endgroup] 2025-10-10T02:30:26.3716043Z Defaulting to user installation because normal site-packages is not writeable 2025-10-10T02:30:26.5004882Z Collecting python-dateutil==2.8.2 2025-10-10T02:30:26.5172422Z Downloading python_dateutil-2.8.2-py2.py3-none-any.whl (247 kB) 2025-10-10T02:30:27.7075161Z Collecting boto3==1.35.42 2025-10-10T02:30:27.7113908Z Downloading boto3-1.35.42-py3-none-any.whl (139 kB) 2025-10-10T02:30:28.3759249Z Collecting pandas==2.1.3 2025-10-10T02:30:28.3816085Z Downloading pandas-2.1.3-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (12.3 MB) 2025-10-10T02:30:28.6966254Z Requirement already satisfied: dataclasses_json==0.6.7 in /home/ec2-user/.local/lib/python3.9/site-packages (0.6.7) 2025-10-10T02:30:28.6981743Z Requirement already satisfied: six>=1.5 in /usr/lib/python3.9/site-packages (from python-dateutil==2.8.2) (1.15.0) 2025-10-10T02:30:28.7027614Z Requirement already satisfied: s3transfer<0.11.0,>=0.10.0 in /home/ec2-user/.local/lib/python3.9/site-packages (from boto3==1.35.42) (0.10.4) 2025-10-10T02:30:28.7033694Z Requirement already satisfied: botocore<1.36.0,>=1.35.42 in /home/ec2-user/.local/lib/python3.9/site-packages (from boto3==1.35.42) (1.35.99) 2025-10-10T02:30:28.7036660Z Requirement already satisfied: jmespath<2.0.0,>=0.7.1 in /usr/lib/python3.9/site-packages (from boto3==1.35.42) (0.10.0) 2025-10-10T02:30:29.8086114Z Collecting numpy<2,>=1.22.4 2025-10-10T02:30:29.8122738Z Downloading numpy-1.26.4-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (18.2 MB) 2025-10-10T02:30:30.3173266Z Collecting tzdata>=2022.1 2025-10-10T02:30:30.3214274Z Downloading tzdata-2025.2-py2.py3-none-any.whl (347 kB) 2025-10-10T02:30:30.3401274Z Requirement already satisfied: pytz>=2020.1 in /usr/lib/python3.9/site-packages (from pandas==2.1.3) (2022.7.1) 2025-10-10T02:30:30.3433243Z Requirement already satisfied: typing-inspect<1,>=0.4.0 in /home/ec2-user/.local/lib/python3.9/site-packages (from 
dataclasses_json==0.6.7) (0.9.0) 2025-10-10T02:30:30.3437178Z Requirement already satisfied: marshmallow<4.0.0,>=3.18.0 in /home/ec2-user/.local/lib/python3.9/site-packages (from dataclasses_json==0.6.7) (3.26.1) 2025-10-10T02:30:30.3491685Z Requirement already satisfied: urllib3<1.27,>=1.25.4 in /usr/lib/python3.9/site-packages (from botocore<1.36.0,>=1.35.42->boto3==1.35.42) (1.25.10) 2025-10-10T02:30:30.3623889Z Requirement already satisfied: packaging>=17.0 in /home/ec2-user/.local/lib/python3.9/site-packages (from marshmallow<4.0.0,>=3.18.0->dataclasses_json==0.6.7) (25.0) 2025-10-10T02:30:30.3730539Z Requirement already satisfied: typing-extensions>=3.7.4 in /home/ec2-user/.local/lib/python3.9/site-packages (from typing-inspect<1,>=0.4.0->dataclasses_json==0.6.7) (4.15.0) 2025-10-10T02:30:30.3734105Z Requirement already satisfied: mypy-extensions>=0.3.0 in /home/ec2-user/.local/lib/python3.9/site-packages (from typing-inspect<1,>=0.4.0->dataclasses_json==0.6.7) (1.1.0) 2025-10-10T02:30:30.7002917Z Installing collected packages: python-dateutil, tzdata, numpy, pandas, boto3 2025-10-10T02:30:36.2038447Z Attempting uninstall: boto3 2025-10-10T02:30:36.2039509Z Found existing installation: boto3 1.35.33 2025-10-10T02:30:36.2168307Z Uninstalling boto3-1.35.33: 2025-10-10T02:30:36.2184333Z Successfully uninstalled boto3-1.35.33 2025-10-10T02:30:36.3365238Z Successfully installed boto3-1.35.42 numpy-1.26.4 pandas-2.1.3 python-dateutil-2.8.2 tzdata-2025.2 2025-10-10T02:30:37.0854148Z Command completed after 1 attempt(s). 
2025-10-10T02:30:37.1036845Z ##[group]Run python3 -m tools.stats.upload_utilization_stats.upload_utilization_stats \ 2025-10-10T02:30:37.1037712Z python3 -m tools.stats.upload_utilization_stats.upload_utilization_stats \ 2025-10-10T02:30:37.1038208Z  --workflow-run-id "18392306083" \ 2025-10-10T02:30:37.1038537Z  --workflow-name "slow" \ 2025-10-10T02:30:37.1038868Z  --workflow-run-attempt "1" \ 2025-10-10T02:30:37.1039183Z  --job-id "52406799277" \ 2025-10-10T02:30:37.1039722Z  --job-name "linux-jammy-cuda12.8-py3.10-gcc11-sm86 / test (slow, 2, 3, linux.g5.4xlarge.nvidia.gpu)" \ 2025-10-10T02:30:37.1040316Z  --local-path "" \ 2025-10-10T02:30:37.1040620Z  --artifact-prefix "" 2025-10-10T02:30:37.1051294Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T02:30:37.1051777Z env: 2025-10-10T02:30:37.1051986Z GIT_DEFAULT_BRANCH: main 2025-10-10T02:30:37.1052300Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-10-10T02:30:37.1053013Z DOCKER_CONTAINER_ID: 0d479bf7aa1028c1efe5abc00aba7c77fea2d669ee48fe0051d50c10c6eea1cb 2025-10-10T02:30:37.1053493Z DEVICE_NAME: 2025-10-10T02:30:37.1053727Z DEVICE_TYPE: 2025-10-10T02:30:37.1053954Z ##[endgroup] 2025-10-10T02:30:40.6361152Z repo: pytorch/pytorch 2025-10-10T02:30:40.6361573Z Search for test log in s3 bucket: ossci-utilization 2025-10-10T02:30:40.6362073Z Downloading logs-test-slow-2-3-linux.g5.4xlarge.nvidia.gpu_52406799277.zip 2025-10-10T02:30:40.6362746Z extracting usage_log.txt from zip file logs-test-slow-2-3-linux.g5.4xlarge.nvidia.gpu_52406799277.zip 2025-10-10T02:30:40.6363295Z Converted Log Model: UtilizationMetadata: 2025-10-10T02:30:40.6364595Z UtilizationMetadata(level='metadata', workflow_id='18392306083', job_id='52406799277', workflow_name='slow', job_name='linux-jammy-cuda12.8-py3.10-gcc11-sm86 / test (slow, 2, 3, linux.g5.4xlarge.nvidia.gpu)', usage_collect_interval=1.0, data_model_version=1.5, start_at=1760057531, gpu_count=1, cpu_count=16, gpu_type='pynvml', error=None) 
2025-10-10T02:30:40.6365955Z [Db Segments] detected pytest cmd: 26, generated segments: 26 2025-10-10T02:30:40.6366344Z [db model] Peek db timeseries 2025-10-10T02:30:40.6366609Z :{ 2025-10-10T02:30:40.6366809Z "created_at": 1760063439, 2025-10-10T02:30:40.6367163Z "type": "utilization", 2025-10-10T02:30:40.6367651Z "tags": [ 2025-10-10T02:30:40.6367870Z "record" 2025-10-10T02:30:40.6368083Z ], 2025-10-10T02:30:40.6368301Z "time_stamp": 1760057531, 2025-10-10T02:30:40.6368586Z "repo": "pytorch/pytorch", 2025-10-10T02:30:40.6368865Z "workflow_id": 18392306083, 2025-10-10T02:30:40.6369130Z "run_attempt": 1, 2025-10-10T02:30:40.6369379Z "job_id": 52406799277, 2025-10-10T02:30:40.6369644Z "workflow_name": "slow", 2025-10-10T02:30:40.6370137Z "job_name": "linux-jammy-cuda12.8-py3.10-gcc11-sm86 / test (slow, 2, 3, linux.g5.4xlarge.nvidia.gpu)", 2025-10-10T02:30:40.6370645Z "json_data": "{}" 2025-10-10T02:30:40.6370879Z } 2025-10-10T02:30:40.6371358Z Writing 1 documents to S3 ossci-utilization/util_metadata/v_1.5/pytorch/pytorch/18392306083/1/52406799277/metadata 2025-10-10T02:30:40.6372212Z Done! Finish writing document to S3 ossci-utilization/util_metadata/v_1.5/pytorch/pytorch/18392306083/1/52406799277/metadata 2025-10-10T02:30:40.6373091Z Writing 1176 documents to S3 ossci-utilization/util_timeseries/v_1.5/pytorch/pytorch/18392306083/1/52406799277/time_series 2025-10-10T02:30:40.6373981Z Done! 
Finish writing document to S3 ossci-utilization/util_timeseries/v_1.5/pytorch/pytorch/18392306083/1/52406799277/time_series 2025-10-10T02:30:40.7627454Z ##[group]Run pytorch/test-infra/.github/actions/teardown-linux@main 2025-10-10T02:30:40.7627869Z with: 2025-10-10T02:30:40.7628072Z env: 2025-10-10T02:30:40.7628287Z GIT_DEFAULT_BRANCH: main 2025-10-10T02:30:40.7628612Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-10-10T02:30:40.7629153Z DOCKER_CONTAINER_ID: 0d479bf7aa1028c1efe5abc00aba7c77fea2d669ee48fe0051d50c10c6eea1cb 2025-10-10T02:30:40.7629627Z DEVICE_NAME: 2025-10-10T02:30:40.7629846Z DEVICE_TYPE: 2025-10-10T02:30:40.7630053Z ##[endgroup] 2025-10-10T02:30:40.7743360Z ##[group]Run set -eou pipefail 2025-10-10T02:30:40.7743667Z set -eou pipefail 2025-10-10T02:30:40.7743928Z  2025-10-10T02:30:40.7744280Z echo "Holding runner for 2 hours until all ssh sessions have logged out" 2025-10-10T02:30:40.7744719Z for _ in $(seq 1440); do 2025-10-10T02:30:40.7745041Z  # Break if no ssh session exists anymore 2025-10-10T02:30:40.7745402Z  if [ "$(who)" = "" ]; then 2025-10-10T02:30:40.7745683Z  break 2025-10-10T02:30:40.7745915Z  fi 2025-10-10T02:30:40.7746137Z  echo "." 
2025-10-10T02:30:40.7746375Z  sleep 5 2025-10-10T02:30:40.7746603Z done 2025-10-10T02:30:40.7755650Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T02:30:40.7756003Z env: 2025-10-10T02:30:40.7756214Z GIT_DEFAULT_BRANCH: main 2025-10-10T02:30:40.7756529Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-10-10T02:30:40.7757072Z DOCKER_CONTAINER_ID: 0d479bf7aa1028c1efe5abc00aba7c77fea2d669ee48fe0051d50c10c6eea1cb 2025-10-10T02:30:40.7757547Z DEVICE_NAME: 2025-10-10T02:30:40.7757773Z DEVICE_TYPE: 2025-10-10T02:30:40.7757995Z ##[endgroup] 2025-10-10T02:30:40.7803214Z Holding runner for 2 hours until all ssh sessions have logged out 2025-10-10T02:30:40.8284189Z ##[group]Run # ignore expansion of "docker ps -q" since it could be empty 2025-10-10T02:30:40.8284733Z # ignore expansion of "docker ps -q" since it could be empty 2025-10-10T02:30:40.8285138Z # shellcheck disable=SC2046 2025-10-10T02:30:40.8285464Z docker stop $(docker ps -q) || true 2025-10-10T02:30:40.8285805Z # Prune all of the docker images 2025-10-10T02:30:40.8286127Z docker system prune -af 2025-10-10T02:30:40.8294641Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T02:30:40.8294996Z env: 2025-10-10T02:30:40.8295214Z GIT_DEFAULT_BRANCH: main 2025-10-10T02:30:40.8295530Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-10-10T02:30:40.8296065Z DOCKER_CONTAINER_ID: 0d479bf7aa1028c1efe5abc00aba7c77fea2d669ee48fe0051d50c10c6eea1cb 2025-10-10T02:30:40.8296655Z DEVICE_NAME: 2025-10-10T02:30:40.8296873Z DEVICE_TYPE: 2025-10-10T02:30:40.8297083Z ##[endgroup] 2025-10-10T02:30:52.0727449Z 0d479bf7aa10 2025-10-10T02:30:55.3822430Z Deleted Containers: 2025-10-10T02:30:55.3822907Z 0d479bf7aa1028c1efe5abc00aba7c77fea2d669ee48fe0051d50c10c6eea1cb 2025-10-10T02:30:55.3823257Z 2025-10-10T02:31:09.8709809Z Deleted Images: 2025-10-10T02:31:09.8710659Z untagged: 
308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-d8be0384e085f551506bd739678109fa0f5ee7ac 2025-10-10T02:31:09.8711968Z untagged: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image@sha256:02f790fd120c15e1b00388d4689c10992f2a801ea5dd956622d4a5688e229be6 2025-10-10T02:31:09.8713106Z deleted: sha256:2b326b7b17db730c6c973cebcc035ac7bd2de93f7608c304eb88a054c020cb51 2025-10-10T02:31:09.8713725Z deleted: sha256:c7339a08804e6178af8496c8c04289002b5a817b869b75058b3e7523c87c485b 2025-10-10T02:31:09.8714338Z deleted: sha256:e8fc21d4a8d7b00ebec57d1bb66787e87e1c088bce9db28cfe9d1a2eb4f603b0 2025-10-10T02:31:09.8714963Z deleted: sha256:c15e4ceb1a1ec10190472b9d822135b0475d53d8c1e88f280cf5605fc5ae4726 2025-10-10T02:31:09.8715568Z deleted: sha256:4b33aef3351bd5776c5590d55ae631a132294ed0f9788fe75c80e1702c6cbdbc 2025-10-10T02:31:09.8716595Z deleted: sha256:4bdbf0b3870e0beb2b5152a0dd7109a24fab816dba6f825deb9d191b2ba5542c 2025-10-10T02:31:09.8717284Z deleted: sha256:4d65c68079425a7a71b0b1409f96bb12a05f65dadfe5c085c7cf2120063c1a8d 2025-10-10T02:31:09.8717963Z deleted: sha256:e1a0c2201a8c71f9d4e7525428dec10c376da8ca32dcd3ff6f5cff2ff3a4cdc8 2025-10-10T02:31:09.8718564Z deleted: sha256:a24e87bd237175c2868256806df20d84a7ef565b16818c67ece97a10bc7a164f 2025-10-10T02:31:09.8719150Z deleted: sha256:d676a392554f84ce5cd1659940e1166716e722ae657597cf7e819a52d3d6e272 2025-10-10T02:31:09.8719744Z deleted: sha256:96df7aa5fc2d402333a07c6e81b8738174b183b2225da431b9fed9c6cbc2b176 2025-10-10T02:31:09.8720345Z deleted: sha256:9017b49736a417dc6c1078c7fcd99b270b58930bd00177b082ebd9f5d41fc5c1 2025-10-10T02:31:09.8720935Z deleted: sha256:d3486a70bf19f832bc326610645c315f37239dde72816d4313089ab5fbfdfbfe 2025-10-10T02:31:09.8721528Z deleted: sha256:42a5e595b2210bc44c8834e9443f65ca1facbd99091a1ebedf8d326b7714ffcb 2025-10-10T02:31:09.8722124Z deleted: sha256:cb48674e4c04ea5701073e7008717e14347362c23c738f866e11a6990eccbaa3 
2025-10-10T02:31:09.8722723Z deleted: sha256:cbe87446c33b21f3fda5b4805900afc8cf2efbfcc68c55d76eae96cf76b19587 2025-10-10T02:31:09.8723343Z deleted: sha256:9b9acd04a153a774e0fc4df7b241a92bf76e586be20af85f455878ff1b5249e4 2025-10-10T02:31:09.8724053Z deleted: sha256:951c31057f613889fa7ef638d2c770f91c24766df682bd260665bee314efaedc 2025-10-10T02:31:09.8724655Z deleted: sha256:4c8aebb8a87f643b032b7035befed934f7f6e214e01575775f500c21d9dd96b9 2025-10-10T02:31:09.8725266Z deleted: sha256:dbf14f2516298ece9f929bc8b286afe9def3df3e02d050e4e05437c1955604b4 2025-10-10T02:31:09.8725881Z deleted: sha256:aaaf32f38a617483f30074d028c2f99e7dda48419984eff5ae8ee2ce0a05658f 2025-10-10T02:31:09.8726487Z deleted: sha256:bdfa4e3f9383a5ed2e7e973b13a19baad2512be4a735b7fa7083cf2ded6845bd 2025-10-10T02:31:09.8727239Z deleted: sha256:e52fcbd97bc6b1d84ee5af9e62f5ac08771db76cc6efd31e0dacc97c3d0427e9 2025-10-10T02:31:09.8727955Z deleted: sha256:42f403ff2fde8cc9be4ccf6cb01ac58300ce93b78d2f40d250b5f92c28e24e69 2025-10-10T02:31:09.8728562Z deleted: sha256:4ef1120d921c52f9b59f33808f0f3f29c63444a4c6d0db34194d88e63ed108e8 2025-10-10T02:31:09.8729159Z deleted: sha256:daec0673eb6df39b27151ba7fa0839a78e0d4de9f36841aba4073b27ba3e8c69 2025-10-10T02:31:09.8729758Z deleted: sha256:fdd50ee3397208057e292f9c8662f47cc7b98579b5cce46f67799cf8707f43bd 2025-10-10T02:31:09.8730357Z deleted: sha256:cf2d2647c196d0f543ce92df79a260921b14f9baeda7d7d499d49ff94b8b86d3 2025-10-10T02:31:09.8730970Z deleted: sha256:3b81a4fa5b63fcbd5d0fa80bf25ddd552af2e1d3d54abc8c0b84414feaa40637 2025-10-10T02:31:09.8731581Z deleted: sha256:5afc757631825c97e090df9f746f2e6da115223177f555ba1b25b8cd7afcc0d1 2025-10-10T02:31:09.8732181Z deleted: sha256:800e3209a4f368dc9ff7e4eb6524a02fbb75c3eafb2c44c063622f8f2af6c5ed 2025-10-10T02:31:09.8732895Z deleted: sha256:6da8ba5395b65c4f75fa2419109932d9e9da4d251d596a9f07c0af987c1118a7 2025-10-10T02:31:09.8733508Z deleted: sha256:e21ede4bd44a010ee1aed2eb2d00465ee4929601cd344068020e52d223904fd1 
2025-10-10T02:31:09.8734129Z deleted: sha256:7aa39b517b5dcc59e3ed8fe30362697486a90e3ae779117c9efc814c060afe5b 2025-10-10T02:31:09.8734728Z deleted: sha256:fd389cc5537028828912fa4507563c21806ce25f9e53e2a14f5809180376531c 2025-10-10T02:31:09.8735307Z deleted: sha256:a1908491b2d5f10a107f8c859183701002b310d95f06008065f208df6642b747 2025-10-10T02:31:09.8735956Z deleted: sha256:ec9d62d5c9322c49ed267f960e1eaded3739fedc41e3db0758b54eafab6bd24c 2025-10-10T02:31:09.8736590Z deleted: sha256:2723d20718d232de161f15430128ae2a8c02f64497e85cb93abb09a89828fcf7 2025-10-10T02:31:09.8737385Z deleted: sha256:0f405efaab03760c5be488fffdbffad61f0460547eba4ffcddffc2a23ceb18b1 2025-10-10T02:31:09.8738257Z deleted: sha256:d01349298429411393ef357d7f95e27f214b79ada7a7043e628eb52e025a5d02 2025-10-10T02:31:09.8739050Z deleted: sha256:c192af2171a41d08adecfaaf87ad5d98565a9678043bf06081a9a34c7544fb30 2025-10-10T02:31:09.8739765Z deleted: sha256:c47a6ac2132e9bae253ecc14c9d3fc356197d06dc0e1a3233abd455acff80ec6 2025-10-10T02:31:09.8740524Z deleted: sha256:3da2807a14384fe9879af55ce32ad1f40234bd10f474ecbbad2372ede8097efc 2025-10-10T02:31:09.8741142Z deleted: sha256:5b04652c3b6ba8cb60a3ad6b4e7cab069e6423619b57bf20f7b6fd62a151d921 2025-10-10T02:31:09.8741764Z deleted: sha256:cdf6379eb46b510a8dfebddea3e5b0700174bc1dcf249e02ed82e31e17e98e01 2025-10-10T02:31:09.8742369Z deleted: sha256:867d164267ff967b7c6e635a75978dc5c7d61b1d5bd3677adc3d74a714be6f9a 2025-10-10T02:31:09.8742982Z deleted: sha256:a5c3ee4ad3b418575cd2cbdc47cc27606ea301219e5398b74dc5129b37b395bd 2025-10-10T02:31:09.8743593Z deleted: sha256:b6eae4c9bac3dc3429599664b19098177f944aa6090924e0538a476a8bb4bd94 2025-10-10T02:31:09.8744200Z deleted: sha256:b2915108edad138dc091cb5bad8fe0a3dbe2a723316c2452d9a852ae55a2037f 2025-10-10T02:31:09.8744802Z deleted: sha256:83e3e29103efa58171bd7d5861fc08f77555d073b55d99427a6d54190095f99a 2025-10-10T02:31:09.8745403Z deleted: sha256:e663d44bb0ac71e519dd1b24bc564f4fc7a6669b220a1c34b19b4ad2560c6086 
2025-10-10T02:31:09.8746026Z deleted: sha256:987213e4d8ddb0b7cef0adb84a66d3bc8d8fa2e7d3242844a04a692927fd1924 2025-10-10T02:31:09.8746630Z deleted: sha256:465d904413d460ea8f2c3b17e465595dd1dcdb2fcc8213ce48a08f535a680ec6 2025-10-10T02:31:09.8747236Z deleted: sha256:6e7d8fefa48596db2ea21114efbc2032bf991f1381dd9f197451df8b654d4f42 2025-10-10T02:31:09.8747880Z deleted: sha256:067ad0e533b02897d787f3f836cdd5e6e98b58fe61bdce3e46af83db8f81e913 2025-10-10T02:31:09.8748479Z deleted: sha256:2447363d9316cb41977e6f3dabfc8cbe612e20e5f1f369bb5b4a179d5bc0b6a1 2025-10-10T02:31:09.8749074Z deleted: sha256:70c772625c7c7c6b5a93b30538beab0fcafebf048a47c739136e7575e5a0ef5f 2025-10-10T02:31:09.8749671Z deleted: sha256:9b743670c9bd3f82b0ba65ca02874578b7138fd4354e01aef7e12ab97d8deff8 2025-10-10T02:31:09.8750261Z deleted: sha256:12e5eadd8890e6f48e467ad3c21a3f294e72bcb29418a6c9c2d36880c5e414a7 2025-10-10T02:31:09.8750915Z deleted: sha256:8de5123f53a0ef01cc11fe1830680b32ed9ae93af771ae00f89bfb77cdbdcdf9 2025-10-10T02:31:09.8751517Z deleted: sha256:15346907c9e3755ee3a2fada9e9fdc56d5ccdea441838a5d17f0a4d712ad1855 2025-10-10T02:31:09.8752140Z deleted: sha256:a427ee1813fec9b2b54d23d0fa8cfa11c6531662f8b370638b23fed482d7688a 2025-10-10T02:31:09.8752728Z deleted: sha256:6577fa074b0a862f97673563c80035f7624b454369a3af45e147f4eb0174e4c1 2025-10-10T02:31:09.8753314Z deleted: sha256:4f16ba02e699c87df9363968913a8462fbd3eaf5d102d82bc6963a1af8f819aa 2025-10-10T02:31:09.8753965Z deleted: sha256:1f1952179d6334d85e4a489175dfd079b7562443ccd0faa29ee46da518bb8457 2025-10-10T02:31:09.8754568Z deleted: sha256:fd3f825446ad9fb557f04959ba4bf3eea3414dbfbdc5c1574fa171505b6fbfb4 2025-10-10T02:31:09.8755157Z deleted: sha256:6f3d6e680369339310537fc08094101e35c94f77270262fca3b9a9e0319df259 2025-10-10T02:31:09.8755742Z deleted: sha256:ad807d0c9bd52cf74bb57bafa667705415e6e02f9082068cfa1a696b671b0678 2025-10-10T02:31:09.8756386Z deleted: sha256:0902ea59963416b7921b462e4bbb98b05a29a0a52699c10990e83a5d44dba96e 
2025-10-10T02:31:09.8756975Z deleted: sha256:776ac80f3f29b72ff408c403c60c49b6476a3ab4377433c3ca3ccbba0ea8eb8f 2025-10-10T02:31:09.8757582Z deleted: sha256:a10d1ffba4470ea3d57d904f53fc64d3bb89d90a0041aabf9483d38a77ee73f7 2025-10-10T02:31:09.8758196Z deleted: sha256:fc3fd4d17cb5de0bb1071c683ac609a8272ae96ac2ace09d716ab1b5a7428bf8 2025-10-10T02:31:09.8758791Z deleted: sha256:4c2df861b92027af87d580ff99d9841e21a599800836490e0cd0829e40e1e125 2025-10-10T02:31:09.8759396Z deleted: sha256:e45346cdafbffff90fea5a59b6721cf154ef0e831a6c37be07c89091a94fc0ac 2025-10-10T02:31:09.8760020Z deleted: sha256:60e3b4085deb3cf3cbdada6fc0a40676b8dd422b14d4aadd65d8c3490fcbdcec 2025-10-10T02:31:09.8760642Z deleted: sha256:446ab4c8b212a06ef0faff16b651256addacd44fe2dd59d682fdf6447050708e 2025-10-10T02:31:09.8761240Z deleted: sha256:767e56ba346ae714b6e6b816baa839051145ed78cfa0e4524a86cc287b0c4b00 2025-10-10T02:31:09.8761742Z untagged: public.ecr.aws/docker/library/python:3.13 2025-10-10T02:31:09.8762393Z untagged: public.ecr.aws/docker/library/python@sha256:4889af0e45f04b7c5dd741421a1280919499d38d3125d714b69fa86b23b1052a 2025-10-10T02:31:09.8763212Z deleted: sha256:6c82e3449d7794702180419555c0a0e1687ea79a0c665b250436286924681a55 2025-10-10T02:31:09.8763800Z deleted: sha256:68a0419cb3069ed43905ab41b911f2b7248601df62c854ae65e8c8a0342dbb30 2025-10-10T02:31:09.8764387Z deleted: sha256:b258354078ead7184c2f6d72eb3d5db1855162c0f80d164c09e794b21f30f48b 2025-10-10T02:31:09.8765001Z deleted: sha256:3e65e3c281dedcfdb54cb848bd29efd0e832cf5f29dec4b6b9849cd7420266cb 2025-10-10T02:31:09.8765612Z deleted: sha256:42f4cd5b256627f333ad4537462aac85c359e741da4f02d1cb68600c128841c5 2025-10-10T02:31:09.8766212Z deleted: sha256:4e7df8e345c749980c75fd48e7b2ef15e63dc912b467ffa446284f0dbcc5aa33 2025-10-10T02:31:09.8766803Z deleted: sha256:345f9c4d6fe93d61688b6f1a607137261d7983d3788b5d88e8791b6ebeb8a920 2025-10-10T02:31:09.8767512Z deleted: sha256:a5ec5ec9d16c5551ce8889cbc03af0609b92cf8a8d60b32e72a7eabb8378eaec 
2025-10-10T02:31:09.8767881Z 2025-10-10T02:31:09.8768002Z Total reclaimed space: 37.62GB 2025-10-10T02:31:09.8855646Z Post job cleanup. 2025-10-10T02:31:09.8919258Z Post job cleanup. 2025-10-10T02:31:09.9950877Z [command]/usr/bin/git version 2025-10-10T02:31:10.0011562Z git version 2.50.1 2025-10-10T02:31:10.0049914Z Copying '/home/ec2-user/.gitconfig' to '/home/ec2-user/actions-runner/_work/_temp/d666947d-6abb-460b-a249-50f04173d58e/.gitconfig' 2025-10-10T02:31:10.0059923Z Temporarily overriding HOME='/home/ec2-user/actions-runner/_work/_temp/d666947d-6abb-460b-a249-50f04173d58e' before making global git config changes 2025-10-10T02:31:10.0060951Z Adding repository directory to the temporary git global config as a safe directory 2025-10-10T02:31:10.0065664Z [command]/usr/bin/git config --global --add safe.directory /home/ec2-user/actions-runner/_work/pytorch/pytorch 2025-10-10T02:31:10.0111712Z [command]/usr/bin/git config --local --name-only --get-regexp core\.sshCommand 2025-10-10T02:31:10.0162355Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'core\.sshCommand' && git config --local --unset-all 'core.sshCommand' || :" 2025-10-10T02:31:10.0584916Z Entering 'android/libs/fbjni' 2025-10-10T02:31:10.0667120Z Entering 'third_party/FP16' 2025-10-10T02:31:10.0750019Z Entering 'third_party/FXdiv' 2025-10-10T02:31:10.0835205Z Entering 'third_party/NNPACK' 2025-10-10T02:31:10.0919098Z Entering 'third_party/NVTX' 2025-10-10T02:31:10.1002521Z Entering 'third_party/VulkanMemoryAllocator' 2025-10-10T02:31:10.1086000Z Entering 'third_party/XNNPACK' 2025-10-10T02:31:10.1187940Z Entering 'third_party/aiter' 2025-10-10T02:31:10.1275805Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-10-10T02:31:10.1367460Z Entering 'third_party/benchmark' 2025-10-10T02:31:10.1452421Z Entering 'third_party/composable_kernel' 2025-10-10T02:31:10.1546085Z Entering 'third_party/cpp-httplib' 2025-10-10T02:31:10.1629495Z Entering 
'third_party/cpuinfo' 2025-10-10T02:31:10.1712540Z Entering 'third_party/cudnn_frontend' 2025-10-10T02:31:10.1796444Z Entering 'third_party/cutlass' 2025-10-10T02:31:10.1890781Z Entering 'third_party/fbgemm' 2025-10-10T02:31:10.1975541Z Entering 'third_party/fbgemm/external/asmjit' 2025-10-10T02:31:10.2058075Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-10-10T02:31:10.2145989Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-10-10T02:31:10.2224012Z Entering 'third_party/fbgemm/external/cutlass' 2025-10-10T02:31:10.2313177Z Entering 'third_party/fbgemm/external/googletest' 2025-10-10T02:31:10.2392145Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-10-10T02:31:10.2471932Z Entering 'third_party/fbgemm/external/json' 2025-10-10T02:31:10.2559759Z Entering 'third_party/flash-attention' 2025-10-10T02:31:10.2640218Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-10-10T02:31:10.2725036Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-10-10T02:31:10.2815495Z Entering 'third_party/flatbuffers' 2025-10-10T02:31:10.2902613Z Entering 'third_party/fmt' 2025-10-10T02:31:10.2984328Z Entering 'third_party/gemmlowp/gemmlowp' 2025-10-10T02:31:10.3067372Z Entering 'third_party/gloo' 2025-10-10T02:31:10.3149636Z Entering 'third_party/googletest' 2025-10-10T02:31:10.3231815Z Entering 'third_party/ideep' 2025-10-10T02:31:10.3317184Z Entering 'third_party/ideep/mkl-dnn' 2025-10-10T02:31:10.3410330Z Entering 'third_party/ittapi' 2025-10-10T02:31:10.3491595Z Entering 'third_party/kineto' 2025-10-10T02:31:10.3571416Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-10-10T02:31:10.3650328Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-10-10T02:31:10.3736494Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-10-10T02:31:10.3822233Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-10-10T02:31:10.3902133Z Entering 
'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-10-10T02:31:10.3979519Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-10-10T02:31:10.4062727Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-10-10T02:31:10.4142260Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-10-10T02:31:10.4221681Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-10-10T02:31:10.4303204Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-10-10T02:31:10.4382875Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-10-10T02:31:10.4461544Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-10-10T02:31:10.4547440Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-10-10T02:31:10.4635139Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-10-10T02:31:10.4712811Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-10-10T02:31:10.4796846Z Entering 'third_party/kleidiai' 2025-10-10T02:31:10.4885421Z Entering 'third_party/mimalloc' 2025-10-10T02:31:10.4968774Z Entering 'third_party/nlohmann' 2025-10-10T02:31:10.5051764Z Entering 'third_party/onnx' 2025-10-10T02:31:10.5148791Z Entering 'third_party/onnx/third_party/pybind11' 2025-10-10T02:31:10.5233962Z Entering 'third_party/opentelemetry-cpp' 2025-10-10T02:31:10.5315010Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-10-10T02:31:10.5395516Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-10-10T02:31:10.5476187Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-10-10T02:31:10.5554987Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-10-10T02:31:10.5634049Z Entering 
'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-10-10T02:31:10.5709387Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-10-10T02:31:10.5786536Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-10-10T02:31:10.5863698Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-10-10T02:31:10.5943291Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-10-10T02:31:10.6025053Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-10-10T02:31:10.6125585Z Entering 'third_party/pocketfft' 2025-10-10T02:31:10.6206715Z Entering 'third_party/protobuf' 2025-10-10T02:31:10.6288704Z Entering 'third_party/protobuf/third_party/benchmark' 2025-10-10T02:31:10.6368357Z Entering 'third_party/protobuf/third_party/googletest' 2025-10-10T02:31:10.6452311Z Entering 'third_party/psimd' 2025-10-10T02:31:10.6534630Z Entering 'third_party/pthreadpool' 2025-10-10T02:31:10.6615204Z Entering 'third_party/pybind11' 2025-10-10T02:31:10.6700097Z Entering 'third_party/python-peachpy' 2025-10-10T02:31:10.6781503Z Entering 'third_party/sleef' 2025-10-10T02:31:10.6862173Z Entering 'third_party/tensorpipe' 2025-10-10T02:31:10.6941150Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-10-10T02:31:10.7019866Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-10-10T02:31:10.7099380Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-10-10T02:31:10.7176565Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-10-10T02:31:10.7251465Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-10-10T02:31:10.7360662Z [command]/usr/bin/git config --local --name-only --get-regexp http\.https\:\/\/github\.com\/\.extraheader 2025-10-10T02:31:10.7390973Z http.https://github.com/.extraheader 2025-10-10T02:31:10.7404550Z [command]/usr/bin/git config --local --unset-all http.https://github.com/.extraheader 
2025-10-10T02:31:10.7443716Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'http\.https\:\/\/github\.com\/\.extraheader' && git config --local --unset-all 'http.https://github.com/.extraheader' || :" 2025-10-10T02:31:10.7844601Z Entering 'android/libs/fbjni' 2025-10-10T02:31:10.7900445Z http.https://github.com/.extraheader 2025-10-10T02:31:10.7951692Z Entering 'third_party/FP16' 2025-10-10T02:31:10.8005246Z http.https://github.com/.extraheader 2025-10-10T02:31:10.8056178Z Entering 'third_party/FXdiv' 2025-10-10T02:31:10.8109481Z http.https://github.com/.extraheader 2025-10-10T02:31:10.8158485Z Entering 'third_party/NNPACK' 2025-10-10T02:31:10.8210155Z http.https://github.com/.extraheader 2025-10-10T02:31:10.8261600Z Entering 'third_party/NVTX' 2025-10-10T02:31:10.8313885Z http.https://github.com/.extraheader 2025-10-10T02:31:10.8364643Z Entering 'third_party/VulkanMemoryAllocator' 2025-10-10T02:31:10.8422135Z http.https://github.com/.extraheader 2025-10-10T02:31:10.8471673Z Entering 'third_party/XNNPACK' 2025-10-10T02:31:10.8523713Z http.https://github.com/.extraheader 2025-10-10T02:31:10.8588082Z Entering 'third_party/aiter' 2025-10-10T02:31:10.8640408Z http.https://github.com/.extraheader 2025-10-10T02:31:10.8692453Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-10-10T02:31:10.8745606Z http.https://github.com/.extraheader 2025-10-10T02:31:10.8807401Z Entering 'third_party/benchmark' 2025-10-10T02:31:10.8863102Z http.https://github.com/.extraheader 2025-10-10T02:31:10.8912765Z Entering 'third_party/composable_kernel' 2025-10-10T02:31:10.8964369Z http.https://github.com/.extraheader 2025-10-10T02:31:10.9023634Z Entering 'third_party/cpp-httplib' 2025-10-10T02:31:10.9075635Z http.https://github.com/.extraheader 2025-10-10T02:31:10.9125571Z Entering 'third_party/cpuinfo' 2025-10-10T02:31:10.9176970Z http.https://github.com/.extraheader 2025-10-10T02:31:10.9226497Z Entering 
'third_party/cudnn_frontend' 2025-10-10T02:31:10.9278456Z http.https://github.com/.extraheader 2025-10-10T02:31:10.9328616Z Entering 'third_party/cutlass' 2025-10-10T02:31:10.9380314Z http.https://github.com/.extraheader 2025-10-10T02:31:10.9443085Z Entering 'third_party/fbgemm' 2025-10-10T02:31:10.9505725Z http.https://github.com/.extraheader 2025-10-10T02:31:10.9555702Z Entering 'third_party/fbgemm/external/asmjit' 2025-10-10T02:31:10.9607741Z http.https://github.com/.extraheader 2025-10-10T02:31:10.9659140Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-10-10T02:31:10.9709526Z http.https://github.com/.extraheader 2025-10-10T02:31:10.9766063Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-10-10T02:31:10.9818327Z http.https://github.com/.extraheader 2025-10-10T02:31:10.9867743Z Entering 'third_party/fbgemm/external/cutlass' 2025-10-10T02:31:10.9919946Z http.https://github.com/.extraheader 2025-10-10T02:31:10.9979009Z Entering 'third_party/fbgemm/external/googletest' 2025-10-10T02:31:11.0031273Z http.https://github.com/.extraheader 2025-10-10T02:31:11.0080936Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-10-10T02:31:11.0132423Z http.https://github.com/.extraheader 2025-10-10T02:31:11.0182086Z Entering 'third_party/fbgemm/external/json' 2025-10-10T02:31:11.0233915Z http.https://github.com/.extraheader 2025-10-10T02:31:11.0290874Z Entering 'third_party/flash-attention' 2025-10-10T02:31:11.0344912Z http.https://github.com/.extraheader 2025-10-10T02:31:11.0394315Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-10-10T02:31:11.0444255Z http.https://github.com/.extraheader 2025-10-10T02:31:11.0503258Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-10-10T02:31:11.0554506Z http.https://github.com/.extraheader 2025-10-10T02:31:11.0618981Z Entering 'third_party/flatbuffers' 2025-10-10T02:31:11.0670222Z http.https://github.com/.extraheader 2025-10-10T02:31:11.0723714Z Entering 'third_party/fmt' 
2025-10-10T02:31:11.0775632Z http.https://github.com/.extraheader 2025-10-10T02:31:11.0829160Z Entering 'third_party/gemmlowp/gemmlowp' 2025-10-10T02:31:11.0881539Z http.https://github.com/.extraheader 2025-10-10T02:31:11.0931468Z Entering 'third_party/gloo' 2025-10-10T02:31:11.0982844Z http.https://github.com/.extraheader 2025-10-10T02:31:11.1033922Z Entering 'third_party/googletest' 2025-10-10T02:31:11.1085770Z http.https://github.com/.extraheader 2025-10-10T02:31:11.1135969Z Entering 'third_party/ideep' 2025-10-10T02:31:11.1187801Z http.https://github.com/.extraheader 2025-10-10T02:31:11.1235637Z Entering 'third_party/ideep/mkl-dnn' 2025-10-10T02:31:11.1287497Z http.https://github.com/.extraheader 2025-10-10T02:31:11.1347936Z Entering 'third_party/ittapi' 2025-10-10T02:31:11.1401085Z http.https://github.com/.extraheader 2025-10-10T02:31:11.1451143Z Entering 'third_party/kineto' 2025-10-10T02:31:11.1504624Z http.https://github.com/.extraheader 2025-10-10T02:31:11.1551907Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-10-10T02:31:11.1608813Z http.https://github.com/.extraheader 2025-10-10T02:31:11.1656885Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-10-10T02:31:11.1708868Z http.https://github.com/.extraheader 2025-10-10T02:31:11.1760474Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-10-10T02:31:11.1815660Z http.https://github.com/.extraheader 2025-10-10T02:31:11.1866121Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-10-10T02:31:11.1918047Z http.https://github.com/.extraheader 2025-10-10T02:31:11.1969355Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-10-10T02:31:11.2025657Z http.https://github.com/.extraheader 2025-10-10T02:31:11.2074412Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-10-10T02:31:11.2127403Z http.https://github.com/.extraheader 
2025-10-10T02:31:11.2182406Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-10-10T02:31:11.2235269Z http.https://github.com/.extraheader 2025-10-10T02:31:11.2283986Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-10-10T02:31:11.2341494Z http.https://github.com/.extraheader 2025-10-10T02:31:11.2394234Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-10-10T02:31:11.2446927Z http.https://github.com/.extraheader 2025-10-10T02:31:11.2499750Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-10-10T02:31:11.2551575Z http.https://github.com/.extraheader 2025-10-10T02:31:11.2602760Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-10-10T02:31:11.2653579Z http.https://github.com/.extraheader 2025-10-10T02:31:11.2703857Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-10-10T02:31:11.2755322Z http.https://github.com/.extraheader 2025-10-10T02:31:11.2810064Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-10-10T02:31:11.2862202Z http.https://github.com/.extraheader 2025-10-10T02:31:11.2922720Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-10-10T02:31:11.2974537Z http.https://github.com/.extraheader 2025-10-10T02:31:11.3023279Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-10-10T02:31:11.3074818Z http.https://github.com/.extraheader 2025-10-10T02:31:11.3131283Z Entering 'third_party/kleidiai' 2025-10-10T02:31:11.3183583Z http.https://github.com/.extraheader 2025-10-10T02:31:11.3234997Z Entering 'third_party/mimalloc' 2025-10-10T02:31:11.3287038Z http.https://github.com/.extraheader 2025-10-10T02:31:11.3339861Z Entering 'third_party/nlohmann' 2025-10-10T02:31:11.3391345Z http.https://github.com/.extraheader 2025-10-10T02:31:11.3443643Z 
Entering 'third_party/onnx' 2025-10-10T02:31:11.3498961Z http.https://github.com/.extraheader 2025-10-10T02:31:11.3565822Z Entering 'third_party/onnx/third_party/pybind11' 2025-10-10T02:31:11.3622737Z http.https://github.com/.extraheader 2025-10-10T02:31:11.3680170Z Entering 'third_party/opentelemetry-cpp' 2025-10-10T02:31:11.3735473Z http.https://github.com/.extraheader 2025-10-10T02:31:11.3787076Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-10-10T02:31:11.3837770Z http.https://github.com/.extraheader 2025-10-10T02:31:11.3886433Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-10-10T02:31:11.3942273Z http.https://github.com/.extraheader 2025-10-10T02:31:11.3992360Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-10-10T02:31:11.4045092Z http.https://github.com/.extraheader 2025-10-10T02:31:11.4094439Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-10-10T02:31:11.4145466Z http.https://github.com/.extraheader 2025-10-10T02:31:11.4196427Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-10-10T02:31:11.4248067Z http.https://github.com/.extraheader 2025-10-10T02:31:11.4296894Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-10-10T02:31:11.4347816Z http.https://github.com/.extraheader 2025-10-10T02:31:11.4395921Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-10-10T02:31:11.4447640Z http.https://github.com/.extraheader 2025-10-10T02:31:11.4494585Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-10-10T02:31:11.4545756Z http.https://github.com/.extraheader 2025-10-10T02:31:11.4598198Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-10-10T02:31:11.4649465Z http.https://github.com/.extraheader 2025-10-10T02:31:11.4703609Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-10-10T02:31:11.4754915Z 
http.https://github.com/.extraheader 2025-10-10T02:31:11.4828671Z Entering 'third_party/pocketfft' 2025-10-10T02:31:11.4880401Z http.https://github.com/.extraheader 2025-10-10T02:31:11.4935753Z Entering 'third_party/protobuf' 2025-10-10T02:31:11.4988319Z http.https://github.com/.extraheader 2025-10-10T02:31:11.5041237Z Entering 'third_party/protobuf/third_party/benchmark' 2025-10-10T02:31:11.5094872Z http.https://github.com/.extraheader 2025-10-10T02:31:11.5146138Z Entering 'third_party/protobuf/third_party/googletest' 2025-10-10T02:31:11.5196503Z http.https://github.com/.extraheader 2025-10-10T02:31:11.5250994Z Entering 'third_party/psimd' 2025-10-10T02:31:11.5304143Z http.https://github.com/.extraheader 2025-10-10T02:31:11.5354596Z Entering 'third_party/pthreadpool' 2025-10-10T02:31:11.5409457Z http.https://github.com/.extraheader 2025-10-10T02:31:11.5459198Z Entering 'third_party/pybind11' 2025-10-10T02:31:11.5512294Z http.https://github.com/.extraheader 2025-10-10T02:31:11.5562875Z Entering 'third_party/python-peachpy' 2025-10-10T02:31:11.5619217Z http.https://github.com/.extraheader 2025-10-10T02:31:11.5670611Z Entering 'third_party/sleef' 2025-10-10T02:31:11.5723695Z http.https://github.com/.extraheader 2025-10-10T02:31:11.5773814Z Entering 'third_party/tensorpipe' 2025-10-10T02:31:11.5826276Z http.https://github.com/.extraheader 2025-10-10T02:31:11.5874853Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-10-10T02:31:11.5928989Z http.https://github.com/.extraheader 2025-10-10T02:31:11.5979471Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-10-10T02:31:11.6032538Z http.https://github.com/.extraheader 2025-10-10T02:31:11.6083600Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-10-10T02:31:11.6134538Z http.https://github.com/.extraheader 2025-10-10T02:31:11.6182652Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-10-10T02:31:11.6233000Z http.https://github.com/.extraheader 2025-10-10T02:31:11.6278718Z Entering 
'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-10-10T02:31:11.6330522Z http.https://github.com/.extraheader 2025-10-10T02:31:11.6505617Z A job completed hook has been configured by the self-hosted runner administrator 2025-10-10T02:31:11.6546870Z ##[group]Run '/home/ec2-user/runner-scripts/after_job.sh' 2025-10-10T02:31:11.6555191Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-10-10T02:31:11.6555563Z ##[endgroup] 2025-10-10T02:31:11.6676583Z [!ALERT!] Swap in detected! [!ALERT!] 2025-10-10T02:31:30.6934747Z Cleaning up orphan processes